Thread: numeric_to_number() function skipping some digits

numeric_to_number() function skipping some digits

From
Jeevan Chalke
Date:
Hi,<br /><br />With PG84, I have tried something like this which seem incorrect to me.<br /><br /># SELECT '' AS
to_number_2, to_number('-347,58', '99G999');<br /> to_number_2 | to_number <br />-------------+-----------<br />
            |     -3458<br />(1 row)<br /><br />After browsing the code (numeric_to_number), I have found that number
stringis processed according to the number format. Which means, when 'G' is found in the format we have moved the
numberpointer to next character. This is why the digit '7' is skipped. And while processing ',', as it is not a digit
itis ignored.<br /><br />Is this expected behaviour? or a bug?<br /><br />Thanks<br /><br />-- <br />Jeevan B Chalke<br
/>EnterpriseDBSoftware India Private Limited, Pune<br />Visit us at: <a
href="http://www.enterprisedb.com">www.enterprisedb.com</a><br/> --- <br />If better is possible, then good is not
enough<br/><br /> 

Re: numeric_to_number() function skipping some digits

From
Tom Lane
Date:
Jeevan Chalke <jeevan.chalke@enterprisedb.com> writes:
> With PG84, I have tried something like this which seem incorrect to me.

> # SELECT '' AS to_number_2,  to_number('-347,58', '99G999');

Well, the input doesn't actually match the format, so I'm not totally
excited about this.  You do get sane results from:

regression=# select to_number('-347,58', '999G99');to_number 
-----------   -34758
(1 row)

Should we have it throw an error if the input corresponding to a G
symbol doesn't match the expected group separator?  I'm concerned that
that would break applications that work okay today.
        regards, tom lane


Re: numeric_to_number() function skipping some digits

From
Brendan Jurd
Date:
2009/9/19 Tom Lane <tgl@sss.pgh.pa.us>:
> Should we have it throw an error if the input corresponding to a G
> symbol doesn't match the expected group separator?  I'm concerned that
> that would break applications that work okay today.
>

It would be a substantial change to the behaviour, and to do it
properly we'd have to change to_date() to actually parse separator
characters as well.

That is, you can currently write to_date('2009/09/19', 'YYYY-MM-DD')
-- it doesn't matter what the separator characters actually look like,
since per the format pattern they cannot affect the date outcome.

This naturally leads to the question we always have to ask with these
functions: What Does Oracle Do?

But FWIW, a -1 from me for changing this.

Cheers,
BJ


Re: numeric_to_number() function skipping some digits

From
Jeevan Chalke
Date:
Hi,

On Sat, Sep 19, 2009 at 1:52 AM, Brendan Jurd <direvus@gmail.com> wrote:
2009/9/19 Tom Lane <tgl@sss.pgh.pa.us>:
> Should we have it throw an error if the input corresponding to a G
> symbol doesn't match the expected group separator?  I'm concerned that
> that would break applications that work okay today.
>

It would be a substantial change to the behaviour, and to do it
properly we'd have to change to_date() to actually parse separator
characters as well.

That is, you can currently write to_date('2009/09/19', 'YYYY-MM-DD')
-- it doesn't matter what the separator characters actually look like,
since per the format pattern they cannot affect the date outcome.

This naturally leads to the question we always have to ask with these
functions: What Does Oracle Do?

Oracle returns "19-SEP-09" irrespective of the format.
Here in PG, we have getting the proper date irrespective of the format as Oracle. But in the case to to_number the returned value is wrong. For example following query returns '340' on PG where as it returns '3450' on Oracle.

select to_number('34,50','999,99') from dual;


But FWIW, a -1 from me for changing this.

Do you mean this is the expected behaviour then?
 

Cheers,
BJ



--
Jeevan B Chalke
EnterpriseDB Software India Private Limited, Pune
Visit us at: www.enterprisedb.com
---
If better is possible, then good is not enough

Re: numeric_to_number() function skipping some digits

From
Brendan Jurd
Date:
2009/9/21 Jeevan Chalke <jeevan.chalke@enterprisedb.com>:
> Oracle returns "19-SEP-09" irrespective of the format.
> Here in PG, we have getting the proper date irrespective of the format as
> Oracle. But in the case to to_number the returned value is wrong. For
> example following query returns '340' on PG where as it returns '3450' on
> Oracle.
>
> select to_number('34,50','999,99') from dual;
>

Hi Jeevan,

Thanks for checking up on the Oracle behaviour.  It appears to
silently disregard grouping characters in the format pattern, and also
disregard them wherever they appear in the input string (or else it
reads the string from right-to-left?).

It seems that, to match Oracle, we'd need to teach the code that 'G'
and ',' are no-ops for to_number(), and also that such characters
should be ignored in the input.

To be honest, though, I'm not sure it's worth pursuing.  If you want
to feed in numbers that have decorative characters all through them,
it's far more predictable to just regex out the cruft and use ordinary
numeric parsing than to use to_number(), which is infamous for its
idiosyncrasies:

# SELECT regexp_replace('34,50', E'[\\d.]', '', 'g')::numeric;
3450

Cheers,
BJ


Re: numeric_to_number() function skipping some digits

From
Brendan Jurd
Date:
2009/9/21 Brendan Jurd <direvus@gmail.com>:
>
> # SELECT regexp_replace('34,50', E'[\\d.]', '', 'g')::numeric;
> 3450
>

Sorry, that regex ought to have read E'[^\\d.]'.


Re: numeric_to_number() function skipping some digits

From
Jeevan Chalke
Date:
Hi,

On Mon, Sep 21, 2009 at 12:36 PM, Brendan Jurd <direvus@gmail.com> wrote:
2009/9/21 Jeevan Chalke <jeevan.chalke@enterprisedb.com>:
> Oracle returns "19-SEP-09" irrespective of the format.
> Here in PG, we have getting the proper date irrespective of the format as
> Oracle. But in the case to to_number the returned value is wrong. For
> example following query returns '340' on PG where as it returns '3450' on
> Oracle.
>
> select to_number('34,50','999,99') from dual;
>

Hi Jeevan,

Thanks for checking up on the Oracle behaviour.  It appears to
silently disregard grouping characters in the format pattern, and also
disregard them wherever they appear in the input string (or else it
reads the string from right-to-left?).

It seems that Oracle reads formatting string from right-to-left. Here are few results:
('number','format') ==> Oracle          PG
--------------------------------------------
('34,50','999,99')  ==> 3450            340
('34,50','99,99')   ==> 3450            3450
('34,50','99,999')  ==> Invalid Number  3450
('34,50','999,999') ==> Invalid Number  340
 

It seems that, to match Oracle, we'd need to teach the code that 'G'
and ',' are no-ops for to_number(), and also that such characters
should be ignored in the input.

That means we cannot simply ignore such characters from the input. Rather we can process the string R-L. But yes this will definitely going to break the current applications running today.


To be honest, though, I'm not sure it's worth pursuing.  If you want
to feed in numbers that have decorative characters all through them,
it's far more predictable to just regex out the cruft and use ordinary
numeric parsing than to use to_number(), which is infamous for its
idiosyncrasies:

# SELECT regexp_replace('34,50', E'[\\d.]', '', 'g')::numeric;
3450

This (with E'[^\\d.]') ignores/replaces all the characters except digits from the input which we certainly not wishing to do. Instead we can continue with the current implementation. But IMHO, somewhere in the time-line we need to fix this.


Cheers,
BJ


Thanks
--
Jeevan B Chalke
EnterpriseDB Software India Private Limited, Pune
Visit us at: www.enterprisedb.com
---
If better is possible, then good is not enough

Re: numeric_to_number() function skipping some digits

From
Euler Taveira de Oliveira
Date:
Jeevan Chalke escreveu:
> That means we cannot simply ignore such characters from the input.
> Rather we can process the string R-L. But yes this will definitely going
> to break the current applications running today.
> 
IIRC we tight the to_char() and to_timestamp() input for 8.4. Why don't we do
it for to_number() too? The fact of breaking application is not a strong
argument against having compatibility with Oracle to_*() functions. People
that are building applications around these two databases and are using the
to_*() functions are expecting that the behavior being the same (i didn't
remember any complaints about that but ...).

So +1 for this TODO.


--  Euler Taveira de Oliveira http://www.timbira.com/


Re: numeric_to_number() function skipping some digits

From
daveg
Date:
On Tue, Sep 22, 2009 at 10:27:19AM +0530, Jeevan Chalke wrote:
> 
> It seems that Oracle reads formatting string from right-to-left. Here are
> few results:
> ('number','format') ==> Oracle          PG
> --------------------------------------------
> ('34,50','999,99')  ==> 3450            340
> ('34,50','99,99')   ==> 3450            3450
> ('34,50','99,999')  ==> Invalid Number  3450
> ('34,50','999,999') ==> Invalid Number  340

It seems worse to to give a wrong answer silently then to throw an error.
What we do now seems sort of MySqlish.

-dg

-- 
David Gould       daveg@sonic.net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.


Re: numeric_to_number() function skipping some digits

From
"Kevin Grittner"
Date:
daveg <daveg@sonic.net> wrote: 
> On Tue, Sep 22, 2009 at 10:27:19AM +0530, Jeevan Chalke wrote:
>> 
>> It seems that Oracle reads formatting string from right-to-left.
>> Here are few results:
>> ('number','format') ==> Oracle          PG
>> --------------------------------------------
>> ('34,50','999,99')  ==> 3450            340
>> ('34,50','99,99')   ==> 3450            3450
>> ('34,50','99,999')  ==> Invalid Number  3450
>> ('34,50','999,999') ==> Invalid Number  340
> 
> It seems worse to to give a wrong answer silently then to throw an
> error.  What we do now seems sort of MySqlish.
I agree with David on that.
Further, it seems to me that functions which are there only for Oracle
compatibility (i.e., they are not part of the SQL standard and are not
particularly sensible, but are popular in Oracle) should behave like
the corresponding Oracle function.  An argument could even be made for
"bug compatibility" for such functions, at least to some extent;
although we're not talking about that here -- the Oracle results seem
more sane than current PostgreSQL behavior..
-Kevin


Re: numeric_to_number() function skipping some digits

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> daveg <daveg@sonic.net> wrote: 
>> On Tue, Sep 22, 2009 at 10:27:19AM +0530, Jeevan Chalke wrote:
>>> It seems that Oracle reads formatting string from right-to-left.

>> It seems worse to to give a wrong answer silently then to throw an
>> error.  What we do now seems sort of MySqlish.
> I agree with David on that.

So who is volunteering to do the presumably rather major rewrite
involved?
        regards, tom lane


Re: numeric_to_number() function skipping some digits

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote: 
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> daveg <daveg@sonic.net> wrote: 
>>> On Tue, Sep 22, 2009 at 10:27:19AM +0530, Jeevan Chalke wrote:
>>>> It seems that Oracle reads formatting string from right-to-left.
> 
>>> It seems worse to to give a wrong answer silently then to throw an
>>> error.  What we do now seems sort of MySqlish.
>  
>> I agree with David on that.
> 
> So who is volunteering to do the presumably rather major rewrite
> involved?
We don't typically identify an author before putting something on the
TODO list.
That said, this one seems like it would be within my skill set.  I
seem to have trouble finding tasks that are for which we can get
consensus, so I just might pick this one up if we achieve such
consensus on it. The biggest problem, should I take it on, would be
that I don't currently have access to an Oracle database, so someone
would have to supply me with accurate specs for how it should behave,
or point to unambiguous documentation.
-Kevin


Re: numeric_to_number() function skipping some digits

From
"Joshua D. Drake"
Date:
On Wed, 2009-09-30 at 19:08 -0500, Kevin Grittner wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> >> daveg <daveg@sonic.net> wrote:
> >>> On Tue, Sep 22, 2009 at 10:27:19AM +0530, Jeevan Chalke wrote:
> >>>> It seems that Oracle reads formatting string from right-to-left.
> >
> >>> It seems worse to to give a wrong answer silently then to throw an
> >>> error.  What we do now seems sort of MySqlish.
> >
> >> I agree with David on that.
> >
> > So who is volunteering to do the presumably rather major rewrite
> > involved?
>
> We don't typically identify an author before putting something on the
> TODO list.
>
> That said, this one seems like it would be within my skill set.  I
> seem to have trouble finding tasks that are for which we can get
> consensus, so I just might pick this one up if we achieve such
> consensus on it. The biggest problem, should I take it on, would be
> that I don't currently have access to an Oracle database, so someone
> would have to supply me with accurate specs for how it should behave,
> or point to unambiguous documentation.

Just download developer edition?

>
> -Kevin
>
--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
If the world pushes look it in the eye and GRR. Then push back harder. - Salamander

Re: numeric_to_number() function skipping some digits

From
"Kevin Grittner"
Date:
"Joshua D. Drake" <jd@commandprompt.com> wrote: 
> On Wed, 2009-09-30 at 19:08 -0500, Kevin Grittner wrote:
>> I don't currently have access to an Oracle database
> Just download developer edition?
[quick google search]
Looks like that would do it.  Thanks.
-Kevin


Re: numeric_to_number() function skipping some digits

From
"Joshua D. Drake"
Date:
On Wed, 2009-09-30 at 19:08 -0500, Kevin Grittner wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote: 
> > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> >> daveg <daveg@sonic.net> wrote: 
> >>> On Tue, Sep 22, 2009 at 10:27:19AM +0530, Jeevan Chalke wrote:
> >>>> It seems that Oracle reads formatting string from right-to-left.
> > 
> >>> It seems worse to to give a wrong answer silently then to throw an
> >>> error.  What we do now seems sort of MySqlish.
> >  
> >> I agree with David on that.
> > 
> > So who is volunteering to do the presumably rather major rewrite
> > involved?
>  
> We don't typically identify an author before putting something on the
> TODO list.
>  
> That said, this one seems like it would be within my skill set.  I
> seem to have trouble finding tasks that are for which we can get
> consensus, so I just might pick this one up if we achieve such
> consensus on it. The biggest problem, should I take it on, would be
> that I don't currently have access to an Oracle database, so someone
> would have to supply me with accurate specs for how it should behave,
> or point to unambiguous documentation.

Just download developer edition?

>  
> -Kevin
> 
-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
If the world pushes look it in the eye and GRR. Then push back harder. - Salamander



Re: numeric_to_number() function skipping some digits

From
Bruce Momjian
Date:
Added to TODO:
|Fix to_number() handling for values not matching the format string



---------------------------------------------------------------------------

Jeevan Chalke wrote:
> Hi,
> 
> On Mon, Sep 21, 2009 at 12:36 PM, Brendan Jurd <direvus@gmail.com> wrote:
> 
> > 2009/9/21 Jeevan Chalke <jeevan.chalke@enterprisedb.com>:
> > > Oracle returns "19-SEP-09" irrespective of the format.
> > > Here in PG, we have getting the proper date irrespective of the format as
> > > Oracle. But in the case to to_number the returned value is wrong. For
> > > example following query returns '340' on PG where as it returns '3450' on
> > > Oracle.
> > >
> > > select to_number('34,50','999,99') from dual;
> > >
> >
> > Hi Jeevan,
> >
> > Thanks for checking up on the Oracle behaviour.  It appears to
> > silently disregard grouping characters in the format pattern, and also
> > disregard them wherever they appear in the input string (or else it
> > reads the string from right-to-left?).
> >
> 
> It seems that Oracle reads formatting string from right-to-left. Here are
> few results:
> ('number','format') ==> Oracle          PG
> --------------------------------------------
> ('34,50','999,99')  ==> 3450            340
> ('34,50','99,99')   ==> 3450            3450
> ('34,50','99,999')  ==> Invalid Number  3450
> ('34,50','999,999') ==> Invalid Number  340
> 
> 
> >
> > It seems that, to match Oracle, we'd need to teach the code that 'G'
> > and ',' are no-ops for to_number(), and also that such characters
> > should be ignored in the input.
> >
> 
> That means we cannot simply ignore such characters from the input. Rather we
> can process the string R-L. But yes this will definitely going to break the
> current applications running today.
> 
> 
> > To be honest, though, I'm not sure it's worth pursuing.  If you want
> > to feed in numbers that have decorative characters all through them,
> > it's far more predictable to just regex out the cruft and use ordinary
> > numeric parsing than to use to_number(), which is infamous for its
> > idiosyncrasies:
> >
> > # SELECT regexp_replace('34,50', E'[\\d.]', '', 'g')::numeric;
> > 3450
> >
> 
> This (with E'[^\\d.]') ignores/replaces all the characters except digits
> from the input which we certainly not wishing to do. Instead we can continue
> with the current implementation. But IMHO, somewhere in the time-line we
> need to fix this.
> 
> 
> > Cheers,
> > BJ
> >
> 
> 
> Thanks
> -- 
> Jeevan B Chalke
> EnterpriseDB Software India Private Limited, Pune
> Visit us at: www.enterprisedb.com
> ---
> If better is possible, then good is not enough

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.comPG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do + If your life is a hard
drive,Christ can be your backup. +