Thread: Junk date getting uploaded into date field

Junk date getting uploaded into date field

From
bsreejithin
Date:
We have a csv file which we upload into postgres DB. If there are some
errors, like a data mismatch with the database table columns, postgres
should raise and error and upload should fail.

What is happening now is that, in case we get some junk date in the upload
file, postgres does auto-correction and does not raise an error!

A wrong date like ,say, 33-Oct-2013 gets converted as 2-Nov-2013 when the
data is loaded into a date field. No error raised.Data gets uploaded!

How can I prevent this ? I don't want this junk data to get loaded.



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Junk-date-getting-uploaded-into-date-field-tp5776969.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: Junk date getting uploaded into date field

From
John R Pierce
Date:
On 11/5/2013 1:04 AM, bsreejithin wrote:
> A wrong date like ,say, 33-Oct-2013 gets converted as 2-Nov-2013 when the
> data is loaded into a date field. No error raised.Data gets uploaded!

ummmmm.   postgresql won't do that conversion

postgres=# select '33-Oct-2013'::date;
ERROR:  date/time field value out of range: "33-Oct-2013"
LINE 1: select '33-Oct-2013'::date;
                ^
.

how exactly are you inserting this CSV data into postgres ?

--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



Re: Junk date getting uploaded into date field

From
Thomas Kellerer
Date:
bsreejithin, 05.11.2013 10:04:
> We have a csv file which we upload into postgres DB. If there are some
> errors, like a data mismatch with the database table columns, postgres
> should raise and error and upload should fail.
>
> What is happening now is that, in case we get some junk date in the upload
> file, postgres does auto-correction and does not raise an error!
>
> A wrong date like ,say, 33-Oct-2013 gets converted as 2-Nov-2013 when the
> data is loaded into a date field. No error raised.Data gets uploaded!
>
> How can I prevent this ? I don't want this junk data to get loaded.
>

I'm pretty sure Postgres will not allow that.

If I had to guess: you are using a Java program which uses SimpleDateFormat and the "lenient" parsing was not disabled.

Because that's exactly what happens with a SimpleDateFormat in it's default configuration.

http://docs.oracle.com/javase/6/docs/api/java/text/DateFormat.html#setLenient%28boolean%29




Re: Junk date getting uploaded into date field

From
bsreejithin
Date:
Not able to post the attached details as a comment in the reply box, so
attaching it as an image file :
<http://postgresql.1045698.n5.nabble.com/file/n5776987/Untitled.png>



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Junk-date-getting-uploaded-into-date-field-tp5776969p5776987.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: Junk date getting uploaded into date field

From
Thomas Kellerer
Date:
bsreejithin, 05.11.2013 13:14:
> Not able to post the attached details as a comment in the reply box, so
> attaching it as an image file :
> <http://postgresql.1045698.n5.nabble.com/file/n5776987/Untitled.png>

It would have much easier if you had simply used copy & paste to post a text version of that SQL.
Does your mail client not allow you to do that?

But your test case is essentially this:

   select to_date('33-OCT-2013', 'dd-mon-yyyy')

which indeed returns 2013-11-02 (using 9.3.1)

I don't know if this is inteded or actually a bug - I can't find anything in the docs relating to that behaviour.

Thomas


Re: Junk date getting uploaded into date field

From
bsreejithin
Date:
I am not using any mail client.I was directly trying to post the content I
attached.Got a connection reset by peer error on submit.Thought some issue
with the browser - so tried with firefox, chrome as well, in addition to IE
- got the same error there also. Any way, that's not the issue.

I am not sure why : select to_date('33-OCT-2013', 'dd-mon-yyyy')

is returning 2013-11-02.

For cases like the issue I am facing, where we need to raise an error saying
the data is wrong, DB manipulating the data is not proper.

May be there could be some uses cases for such data manipulation. But then,
in that case, it would have been better to have a parameter which can switch
ON/OFF this behavior.



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Junk-date-getting-uploaded-into-date-field-tp5776969p5776992.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: Junk date getting uploaded into date field

From
Edson Richter
Date:
Em 05/11/2013 10:36, Thomas Kellerer escreveu:
> bsreejithin, 05.11.2013 13:14:
>> Not able to post the attached details as a comment in the reply box, so
>> attaching it as an image file :
>> <http://postgresql.1045698.n5.nabble.com/file/n5776987/Untitled.png>
> It would have much easier if you had simply used copy & paste to post a text version of that SQL.
> Does your mail client not allow you to do that?
>
> But your test case is essentially this:
>
>     select to_date('33-OCT-2013', 'dd-mon-yyyy')
>
> which indeed returns 2013-11-02 (using 9.3.1)
>
> I don't know if this is inteded or actually a bug - I can't find anything in the docs relating to that behaviour.
>
> Thomas
>
>
>
>
I also don't know if this is intended or a bug, but for me, it seems to
be right: 2013-11-02 is the 33 day counting from 2013-10-01.

Edson


Re: Junk date getting uploaded into date field

From
Albe Laurenz
Date:
Thomas Kellerer wrote:
> bsreejithin, 05.11.2013 13:14:
>> Not able to post the attached details as a comment in the reply box, so
>> attaching it as an image file :
>> <http://postgresql.1045698.n5.nabble.com/file/n5776987/Untitled.png>
> 
> It would have much easier if you had simply used copy & paste to post a text version of that SQL.
> Does your mail client not allow you to do that?
> 
> But your test case is essentially this:
> 
>    select to_date('33-OCT-2013', 'dd-mon-yyyy')
> 
> which indeed returns 2013-11-02 (using 9.3.1)
> 
> I don't know if this is inteded or actually a bug - I can't find anything in the docs relating to that
> behaviour.

There is a comment in utils/adt/formatting.c:

 * This function does very little error checking, e.g.
 * to_timestamp('20096040','YYYYMMDD') works

So at least this is not by accident.

On the other hand, I have always thought that these functions
are for Oracle compatibility, and sqlplus says:

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production

SQL> SELECT to_date('20096040','YYYYMMDD') FROM dual;
SELECT to_date('20096040','YYYYMMDD') FROM dual
               *
ERROR at line 1:
ORA-01843: not a valid month


I don't know if that should be fixed, but fixing it might break SQL
that deliberately uses the current behaviour.

Yours,
Laurenz Albe

Re: Junk date getting uploaded into date field

From
Reid Thompson
Date:
On Tue, 2013-11-05 at 04:14 -0800, bsreejithin wrote:

> Not able to post the attached details as a comment in the reply box, so
> attaching it as an image file : 
> <http://postgresql.1045698.n5.nabble.com/file/n5776987/Untitled.png> 
> 
> 
> 
> --
> View this message in context: http://postgresql.1045698.n5.nabble.com/Junk-date-getting-uploaded-into-date-field-tp5776969p5776987.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
> 
> 

to achieve what you want.... bdate needs to be defined as a date, not a
string.

test=# create table temptabl(bdate date);
CREATE TABLE
Time: 239.358 ms
test=# insert into temptabl values('33-OCT-2013');
ERROR:  date/time field value out of range: "33-OCT-2013"
LINE 1: insert into temptabl values('33-OCT-2013');                                   ^
HINT:  Perhaps you need a different "datestyle" setting.
Time: 0.288 ms
test=# 


Re: Junk date getting uploaded into date field

From
Tom Lane
Date:
Albe Laurenz <laurenz.albe@wien.gv.at> writes:
> There is a comment in utils/adt/formatting.c:

>  * This function does very little error checking, e.g.
>  * to_timestamp('20096040','YYYYMMDD') works

> So at least this is not by accident.

No, it isn't.  This is in fact the traditional behavior of Unix time
conversion utilities such as mktime(3).  The glibc man page gives the
specific example that "40 October" will be converted to "9 November"
rather than throwing an error.  The POSIX and C standards are not so
explicit, saying only that the inputs are not restricted to the normal
ranges (which I think would entitle an implementation to change 40
October to 31 October instead; but I've never heard of anyone doing
it that way).

            regards, tom lane


Re: Junk date getting uploaded into date field

From
Tom Lane
Date:
bsreejithin <bsreejithin@gmail.com> writes:
> I am not sure why : select to_date('33-OCT-2013', 'dd-mon-yyyy')
> is returning 2013-11-02.
> For cases like the issue I am facing, where we need to raise an error saying
> the data is wrong, DB manipulating the data is not proper.

Then don't use to_date().  Just entering the string through the normal date
type input function will do what you want.  to_date() is meant for trying
to extract data from weird input formats.

            regards, tom lane


Re: Junk date getting uploaded into date field

From
Michael Nolan
Date:
On 11/5/13, bsreejithin <bsreejithin@gmail.com> wrote:
>
> I am not sure why : select to_date('33-OCT-2013', 'dd-mon-yyyy')
>
> is returning 2013-11-02.
>
> For cases like the issue I am facing, where we need to raise an error
> saying
> the data is wrong, DB manipulating the data is not proper.

Try using a cast to date instead:

select '33-oct-2013'::date throws an error.
--
Mike Nolan


Re: Junk date getting uploaded into date field

From
Steve Crawford
Date:
On 11/05/2013 05:29 AM, Albe Laurenz wrote:
> Thomas Kellerer wrote:
>> bsreejithin, 05.11.2013 13:14:
>>> Not able to post the attached details as a comment in the reply box, so
>>> attaching it as an image file :
>>> <http://postgresql.1045698.n5.nabble.com/file/n5776987/Untitled.png>
>> It would have much easier if you had simply used copy & paste to post a text version of that SQL.
>> Does your mail client not allow you to do that?
>>
>> But your test case is essentially this:
>>
>>     select to_date('33-OCT-2013', 'dd-mon-yyyy')
>>
>> which indeed returns 2013-11-02 (using 9.3.1)
>>
>> I don't know if this is inteded or actually a bug - I can't find anything in the docs relating to that
>> behaviour.
> There is a comment in utils/adt/formatting.c:
>
>   * This function does very little error checking, e.g.
>   * to_timestamp('20096040','YYYYMMDD') works
>
>
I think the place for such warnings in addition to the source-code is in
the documentation. This or similar issues with to_date have popped up on
the lists a number of times.

Perhaps a "see warnings below" by the to_date description in table:
http://www.postgresql.org/docs/9.3/static/functions-formatting.html#FUNCTIONS-FORMATTING-TABLE

Then under usage notes something like:

The to_date and to_timestamp functions do minimal input error-checking
and are intended for conversion of non-standard formats that cannot be
handled by casting. These functions will attempt to convert illegal
dates to the best of their ability, e.g. to_date('33-OCT-2013',
'dd-mon-yyyy') will return 2013-11-02. Users of these functions are
advised to perform whatever external error-checking they deem prudent.

Cheers,
Steve



Re: Junk date getting uploaded into date field

From
Albe Laurenz
Date:
Steve Crawford wrote:
>> There is a comment in utils/adt/formatting.c:
>>
>>   * This function does very little error checking, e.g.
>>   * to_timestamp('20096040','YYYYMMDD') works
>>
>>
> I think the place for such warnings in addition to the source-code is in
> the documentation. This or similar issues with to_date have popped up on
> the lists a number of times.
> 
> Perhaps a "see warnings below" by the to_date description in table:
> http://www.postgresql.org/docs/9.3/static/functions-formatting.html#FUNCTIONS-FORMATTING-TABLE
> 
> Then under usage notes something like:
> 
> The to_date and to_timestamp functions do minimal input error-checking
> and are intended for conversion of non-standard formats that cannot be
> handled by casting. These functions will attempt to convert illegal
> dates to the best of their ability, e.g. to_date('33-OCT-2013',
> 'dd-mon-yyyy') will return 2013-11-02. Users of these functions are
> advised to perform whatever external error-checking they deem prudent.

I like that.

Would you write a patch and add it to the commitfest?

Yours,
Laurenz Albe

Re: Junk date getting uploaded into date field

From
John R Pierce
Date:
On 11/5/2013 10:29 AM, Steve Crawford wrote:
> The to_date and to_timestamp functions do minimal input error-checking
> and are intended for conversion of non-standard formats that cannot be
> handled by casting. These functions will attempt to convert illegal
> dates to the best of their ability, e.g. to_date('33-OCT-2013',
> 'dd-mon-yyyy') will return 2013-11-02. Users of these functions are
> advised to perform whatever external error-checking they deem prudent.


I think this should also point out that this behavior is different than
cast, which is much stricter.

the fact that these two methods are polar opposites in this behavior is
troublesome.


--
john r pierce                                      37N 122W
somewhere on the middle of the left coast