Thread: WIP: to_char, support for EEEE format
Hello I was surprised so PostgreSQL doesn't support this basic output format. Regards Pavel Stehule
Attachment
On Sat, Apr 11, 2009 at 2:16 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > > I was surprised so PostgreSQL doesn't support this basic output format. > Hi Pavel, I had a look through your patch and would like to suggest improvements to the new error messages you've introduced. 1. "invalid using of format EEEE" This message occurs several times in the patch. For one thing, the grammar is wrong; it should be "use", not "using". Additionally, this message on its own is not very helpful. If I was trying to use to_char and got "invalid use of format" my first thought would be "Invalid how?" The message should at minimum have a DETAIL, and possibly a HINT as well to make it effective. 2. "cannot use EEEE and others" The wording on this message is a bit awkward. I think what you meant to say is that EEEE cannot be used with certain other formatting codes, but this should be made explicit in the message. Cheers, BJ
Hi, I know very well, so all texts in my patches should be translated to English. My language skills are really minimal. So, please, if you can, propose these error messages (with hints)- result will be much better. Thank you Pavel 2009/4/10 Brendan Jurd <direvus@gmail.com>: > On Sat, Apr 11, 2009 at 2:16 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> >> I was surprised so PostgreSQL doesn't support this basic output format. >> > > Hi Pavel, > > I had a look through your patch and would like to suggest improvements > to the new error messages you've introduced. > > 1. "invalid using of format EEEE" > > This message occurs several times in the patch. For one thing, the > grammar is wrong; it should be "use", not "using". > > Additionally, this message on its own is not very helpful. If I was > trying to use to_char and got "invalid use of format" my first thought > would be "Invalid how?" The message should at minimum have a DETAIL, > and possibly a HINT as well to make it effective. > > 2. "cannot use EEEE and others" > > The wording on this message is a bit awkward. I think what you meant > to say is that EEEE cannot be used with certain other formatting > codes, but this should be made explicit in the message. > > Cheers, > BJ >
On Sat, Apr 11, 2009 at 3:51 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > So, please, if you can, propose these error messages (with hints)- > result will be much better. > Hi Pavel, I was doing some work on rewording these error messages, and I noticed that the following code segment occurs identically in four different locations: numstr = orgnum = (char *) palloc(MAXDOUBLEWIDTH + 1); if (Num.pre != 1) ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), errmsg("invalid using of format EEEE"))); Rather than rewording all four copies of the message, I wonder if this test might be better factored out into a separate function? Cheers, BJ
2009/4/21 Brendan Jurd <direvus@gmail.com>: > On Sat, Apr 11, 2009 at 3:51 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> So, please, if you can, propose these error messages (with hints)- >> result will be much better. >> > > Hi Pavel, > > I was doing some work on rewording these error messages, and I noticed > that the following code segment occurs identically in four different > locations: > > numstr = orgnum = (char *) palloc(MAXDOUBLEWIDTH + 1); > if (Num.pre != 1) > ereport(ERROR, > (errcode(ERRCODE_SYNTAX_ERROR), > errmsg("invalid using of format EEEE"))); > > Rather than rewording all four copies of the message, I wonder if this > test might be better factored out into a separate function? maybe macro is better - it is too short and without any semantic for function, but maybe not. The length of source code is not problem - the short function will be inlined, so total length will be same. What should be name for this function or for this macro? It hasn't any semantic. There should be readable macro only for ereport function - some #define RAISE_EEEE_FORMAT_ERROR ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), errmsg("invalid using of format EEEE"))); regards Pavel Stehule > > Cheers, > BJ >
On Wed, Apr 22, 2009 at 10:13 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > 2009/4/21 Brendan Jurd <direvus@gmail.com>: >> numstr = orgnum = (char *) palloc(MAXDOUBLEWIDTH + 1); >> if (Num.pre != 1) >> ereport(ERROR, >> (errcode(ERRCODE_SYNTAX_ERROR), >> errmsg("invalid using of format EEEE"))); >> >> Rather than rewording all four copies of the message, I wonder if this >> test might be better factored out into a separate function? > > maybe macro is better - it is too short and without any semantic for > function, but maybe not. The length of source code is not problem - > the short function will be inlined, so total length will be same. What > should be name for this function or for this macro? It hasn't any > semantic. There should be readable macro only for ereport function - > some I was thinking of factoring out the *test*, not just the error message. If I've been reading this code correctly, the purpose of if (Num.pre != 1) is to make sure that the numeric format has been given with one digit before the decimal place (so 9.99EEEE would be acceptable but 99.999EEEE would cause the ERROR). Because this check is made from various places in the code, it makes sense to me that it should be a function. Duplicated code makes me itchy. Perhaps called something like sci_notation_check_format(NUMDesc *). Cheers, BJ
On Sat, Apr 11, 2009 at 3:51 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > So, please, if you can, propose these error messages (with hints)- > result will be much better. > Hi Pavel, hackers. I've done some work updating Pavel's sci notation patch for to_char(). I've attached a patch against HEAD (EEEE_2.diff.bz2) and against Pavel's patch as originally submitted to the list (EEEE_1-to-2.diff.bz2). A summary of my changes: * Improve the wording of error messages, and add some detail/hint parts. * Update the documentation. * Move the duplicated "one digit before decimal point" test into a macro. * Fix a bug in the int8_to_char() implementation (was using the wrong variable). Hope you find this useful. Cheers, BJ
Attachment
Hi Brendan this looks well, so please, add it to commitfest page regards Pavel Stehule 2009/4/26 Brendan Jurd <direvus@gmail.com>: > On Sat, Apr 11, 2009 at 3:51 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> So, please, if you can, propose these error messages (with hints)- >> result will be much better. >> > > Hi Pavel, hackers. > > I've done some work updating Pavel's sci notation patch for to_char(). > I've attached a patch against HEAD (EEEE_2.diff.bz2) and against > Pavel's patch as originally submitted to the list > (EEEE_1-to-2.diff.bz2). > > A summary of my changes: > > * Improve the wording of error messages, and add some detail/hint parts. > * Update the documentation. > * Move the duplicated "one digit before decimal point" test into a macro. > * Fix a bug in the int8_to_char() implementation (was using the wrong > variable). > > Hope you find this useful. > > Cheers, > BJ >
2009/4/26 Brendan Jurd <direvus@gmail.com>: > I've done some work updating Pavel's sci notation patch for to_char(). That patch again, now with a couple of minor tweaks to make it apply cleanly against the current HEAD. Cheers, BJ
Attachment
Brendan Jurd escreveu: > 2009/4/26 Brendan Jurd <direvus@gmail.com>: >> I've done some work updating Pavel's sci notation patch for to_char(). > > That patch again, now with a couple of minor tweaks to make it apply > cleanly against the current HEAD. > Here is my review. The patch applied without problems. The docs and regression tests are included. Both of them worked as expected. Also, you included a fix in RN format, do it in another patch. The behavior is not the same as Oracle. Oracle accepts an invalid scientific notation '999.9EEEE'. Will we support it too? I think so. euler=# SELECT to_char(1234.56789, '999.9EEEE'); ERRO: invalid format for scientific notation DETALHE: "EEEE" requires exactly one digit before the decimal point. DICA: For example, "9.999EEEE" is a valid format. TO_CHAR(1234.56789,'999.9EEEE') ------------------------------- 1.2E+03 1 rows selected The '9.999eeee' format error message is misleading. euler=# select to_char(123, '9.999eeee'); ERRO: cannot use "EEEE" twice You could include an example in manual too. You could add the two failing cases above in regression tests too. -- Euler Taveira de Oliveira http://www.timbira.com/
2009/7/24 Euler Taveira de Oliveira <euler@timbira.com>: > Here is my review. The patch applied without problems. The docs and regression > tests are included. Both of them worked as expected. Also, you included a fix > in RN format, do it in another patch. > Well, I updated an error message for RN to keep it consistent with the change I made to the nearby EEEE error message. Neither RN or EEEE is supported for input, and the error messages were vague on this point (they just said "not supported"). I understand that separate improvements should be submitted as separate patches, but this is really part of the one improvement. Implementing EEEE required improving the error messages, and consistency required that we improve the RN error message also. > The behavior is not the same as Oracle. Oracle accepts an invalid scientific > notation '999.9EEEE'. Will we support it too? I think so. > > euler=# SELECT to_char(1234.56789, '999.9EEEE'); > ERRO: invalid format for scientific notation > DETALHE: "EEEE" requires exactly one digit before the decimal point. > DICA: For example, "9.999EEEE" is a valid format. > > TO_CHAR(1234.56789,'999.9EEEE') > ------------------------------- > 1.2E+03 *shakes fist at Oracle* yes, I suppose we had better follow suit. > The '9.999eeee' format error message is misleading. > > euler=# select to_char(123, '9.999eeee'); > ERRO: cannot use "EEEE" twice Ah, thanks for picking up on this. This was a bug in the original patch. Looks like we forgot to update the formatting keyword for lowercase "e". > > You could include an example in manual too. You could add the two failing > cases above in regression tests too. > I had already added an example to the manual. Please find attached version 4 of the patch, and incremental diff from version 3. It fixes the "eeee" bug ("eeee" is now accepted as a valid form of "EEEE"), and lifts the restriction on only having one digit before the decimal point. Cheers, BJ
Attachment
On Sat, Jul 25, 2009 at 2:08 AM, Brendan Jurd<direvus@gmail.com> wrote: > 2009/7/24 Euler Taveira de Oliveira <euler@timbira.com>: >> Here is my review. The patch applied without problems. The docs and regression >> tests are included. Both of them worked as expected. Also, you included a fix >> in RN format, do it in another patch. >> > > Well, I updated an error message for RN to keep it consistent with the > change I made to the nearby EEEE error message. > > Neither RN or EEEE is supported for input, and the error messages were > vague on this point (they just said "not supported"). > > I understand that separate improvements should be submitted as > separate patches, but this is really part of the one improvement. > Implementing EEEE required improving the error messages, and > consistency required that we improve the RN error message also. > >> The behavior is not the same as Oracle. Oracle accepts an invalid scientific >> notation '999.9EEEE'. Will we support it too? I think so. >> >> euler=# SELECT to_char(1234.56789, '999.9EEEE'); >> ERRO: invalid format for scientific notation >> DETALHE: "EEEE" requires exactly one digit before the decimal point. >> DICA: For example, "9.999EEEE" is a valid format. >> >> TO_CHAR(1234.56789,'999.9EEEE') >> ------------------------------- >> 1.2E+03 > > *shakes fist at Oracle* yes, I suppose we had better follow suit. > >> The '9.999eeee' format error message is misleading. >> >> euler=# select to_char(123, '9.999eeee'); >> ERRO: cannot use "EEEE" twice > > Ah, thanks for picking up on this. This was a bug in the original > patch. Looks like we forgot to update the formatting keyword for > lowercase "e". > >> >> You could include an example in manual too. You could add the two failing >> cases above in regression tests too. >> > > I had already added an example to the manual. > > Please find attached version 4 of the patch, and incremental diff from > version 3. It fixes the "eeee" bug ("eeee" is now accepted as a valid > form of "EEEE"), and lifts the restriction on only having one digit > before the decimal point. Euler, Are you going to review this again? We need to know if it is Ready for Committer. ...Robert
Brendan Jurd escreveu: > Please find attached version 4 of the patch, and incremental diff from > version 3. It fixes the "eeee" bug ("eeee" is now accepted as a valid > form of "EEEE"), and lifts the restriction on only having one digit > before the decimal point. > Looks better but I did some tests and caught some strange behaviors. SQL> SELECT to_char(1234.56789, '8.999EEEE') FROM DUAL; SELECT to_char(1234.56789, '8.999EEEE') FROM DUAL ERROR at line 1: ORA-01481: invalid number format model SQL> SELECT to_char(1234.56789, '9.080EEEE') FROM DUAL; SELECT to_char(1234.56789, '9.080EEEE') FROM DUAL ERROR at line 1: ORA-01481: invalid number format model This is not a problem with your patch but something that needs to be fixed in PostgreSQL to match Oracle behavior. The following example should emit an error. IMHO, filling the string with # is very strange. TODO? euler=# SELECT to_char(1234.56789, '9.080');to_char --------- #.#8# (1 row) Couldn't the following code be put inside switch clause? If not, you should add a comment why the validation is outside switch. + if (IS_EEEE(num) && n->key->id != NUM_E) + { + NUM_cache_remove(last_NUMCacheEntry); + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg("\"EEEE\" must be the last pattern used"))); + } + switch (n->key->id) { case NUM_9: Oracle has a diferent overflow limit [1] but I think we could stay with the PostgreSQL one. But the #.#### is not the intended behavior. IIRC you're limited to 99 exponent. SQL> SELECT to_char(1.234567E+308, '9.999EEEE'); SELECT to_char(1.234567E+308, '9.999EEEE') ERROR at line 1: ORA-01426: numeric overflow euler=# SELECT to_char(1.234567E+308, '9.999EEEE'); to_char -----------#.####### (1 row) The problem is in numeric_to_char() and float8_to_char(). You could fix it with the following code. Besides that I think you should comment why '5' or '6' in the other *_to_char() functions. + /* 6 means '.' (decimal point), 'E', '+', and 3 exponent digits */ + if (isnan(value) || is_infinite(value) || + len > Num.pre + Num.post + 6) + { + numstr = (char *) palloc(Num.pre + Num.post + 7); + fill_str(numstr, '#', Num.pre + Num.post + 6); + *(numstr + Num.pre) = '.'; + } I can't see more problems in your patch. When you fix it, it'll be ready for a committer. [1] http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/sql_elements001.htm#sthref80 -- Euler Taveira de Oliveira http://www.timbira.com/
2009/7/29 Euler Taveira de Oliveira <euler@timbira.com>: > Looks better but I did some tests and caught some strange behaviors. > Pavel, Any comment on these cases? When I looked at the patch I wasn't really sure why it filled the string with '#' either, but I didn't question it. Cheers, BJ
2009/7/29 Brendan Jurd <direvus@gmail.com>: > 2009/7/29 Euler Taveira de Oliveira <euler@timbira.com>: >> Looks better but I did some tests and caught some strange behaviors. >> > > Pavel, > > Any comment on these cases? When I looked at the patch I wasn't > really sure why it filled the string with '#' either, but I didn't > question it. This is consistent with postgres. Pavel > > Cheers, > BJ >
2009/7/29 Euler Taveira de Oliveira <euler@timbira.com>: > This is not a problem with your patch but something that needs to be fixed in > PostgreSQL to match Oracle behavior. The following example should emit an > error. IMHO, filling the string with # is very strange. TODO? > > euler=# SELECT to_char(1234.56789, '9.080'); > to_char > --------- > #.#8# > (1 row) The formatting functions have a lot of weird corner cases. I've been trying to clean up some of the more bizarre behaviours in the date/time formatting functions, but haven't touched the numeric formatting because I haven't ever needed to use it. Filling unused characters in the string with "#" may be strange, but changing it would require a much broader patch that covers all of the numeric formatting styles, not just EEEE. A TODO is probably the way to go. > > Couldn't the following code be put inside switch clause? If not, you should > add a comment why the validation is outside switch. > > + if (IS_EEEE(num) && n->key->id != NUM_E) > + { > + NUM_cache_remove(last_NUMCacheEntry); > + ereport(ERROR, > + (errcode(ERRCODE_SYNTAX_ERROR), > + errmsg("\"EEEE\" must be the last pattern used"))); > + } > + > switch (n->key->id) > { > case NUM_9: The switch is on (n->key->id), but the test you mentioned above is looking for any keywords *other than* the EEEE keyword, where EEEE has previously been parsed. So if you put the test inside the switch, it would need to appear in every single branch of the switch except for the NUM_E one. I'm confused about why you think this needs a comment. Perhaps I misunderstood you? > > Oracle has a diferent overflow limit [1] but I think we could stay with the > PostgreSQL one. But the #.#### is not the intended behavior. IIRC you're > limited to 99 exponent. > > SQL> SELECT to_char(1.234567E+308, '9.999EEEE'); > SELECT to_char(1.234567E+308, '9.999EEEE') > > ERROR at line 1: > ORA-01426: numeric overflow > > euler=# SELECT to_char(1.234567E+308, '9.999EEEE'); > to_char > ----------- > #.####### > (1 row) I don't see any problem with extending this to allow up to 3 exponent digits ... Pavel, any comment? > > The problem is in numeric_to_char() and float8_to_char(). You could fix it > with the following code. Besides that I think you should comment why '5' or > '6' in the other *_to_char() functions. > > + /* 6 means '.' (decimal point), 'E', '+', and 3 exponent digits */ Agreed about the comment; I'll add it. Cheers, BJ
2009/7/29 Brendan Jurd <direvus@gmail.com>: > 2009/7/29 Euler Taveira de Oliveira <euler@timbira.com>: >> This is not a problem with your patch but something that needs to be fixed in >> PostgreSQL to match Oracle behavior. The following example should emit an >> error. IMHO, filling the string with # is very strange. TODO? >> >> euler=# SELECT to_char(1234.56789, '9.080'); >> to_char >> --------- >> #.#8# >> (1 row) > > The formatting functions have a lot of weird corner cases. I've been > trying to clean up some of the more bizarre behaviours in the > date/time formatting functions, but haven't touched the numeric > formatting because I haven't ever needed to use it. > > Filling unused characters in the string with "#" may be strange, but > changing it would require a much broader patch that covers all of the > numeric formatting styles, not just EEEE. A TODO is probably the way > to go. > >> >> Couldn't the following code be put inside switch clause? If not, you should >> add a comment why the validation is outside switch. >> >> + if (IS_EEEE(num) && n->key->id != NUM_E) >> + { >> + NUM_cache_remove(last_NUMCacheEntry); >> + ereport(ERROR, >> + (errcode(ERRCODE_SYNTAX_ERROR), >> + errmsg("\"EEEE\" must be the last pattern used"))); >> + } >> + >> switch (n->key->id) >> { >> case NUM_9: > > The switch is on (n->key->id), but the test you mentioned above is > looking for any keywords *other than* the EEEE keyword, where EEEE has > previously been parsed. > > So if you put the test inside the switch, it would need to appear in > every single branch of the switch except for the NUM_E one. I'm > confused about why you think this needs a comment. Perhaps I > misunderstood you? > >> >> Oracle has a diferent overflow limit [1] but I think we could stay with the >> PostgreSQL one. But the #.#### is not the intended behavior. IIRC you're >> limited to 99 exponent. >> >> SQL> SELECT to_char(1.234567E+308, '9.999EEEE'); >> SELECT to_char(1.234567E+308, '9.999EEEE') >> >> ERROR at line 1: >> ORA-01426: numeric overflow >> >> euler=# SELECT to_char(1.234567E+308, '9.999EEEE'); >> to_char >> ----------- >> #.####### >> (1 row) > > I don't see any problem with extending this to allow up to 3 exponent > digits ... Pavel, any comment? > I am not sure - this function should be used in reports witl fixed line's width. And I am thinking, so it's should be problem - I prefer showing some #.### chars. It's clean signal, so some is wrong, but it doesn't break generating long run reports (like exception in Oracle) and doesn't broke formating like 3 exponent digits. Pavel >> >> The problem is in numeric_to_char() and float8_to_char(). You could fix it >> with the following code. Besides that I think you should comment why '5' or >> '6' in the other *_to_char() functions. >> >> + /* 6 means '.' (decimal point), 'E', '+', and 3 exponent digits */ > > Agreed about the comment; I'll add it. > > Cheers, > BJ >
Brendan Jurd escreveu: > Filling unused characters in the string with "#" may be strange, but > changing it would require a much broader patch that covers all of the > numeric formatting styles, not just EEEE. A TODO is probably the way > to go. > That was my suggestion: a TODO. > So if you put the test inside the switch, it would need to appear in > every single branch of the switch except for the NUM_E one. I'm > confused about why you think this needs a comment. Perhaps I > misunderstood you? > Yes, I know you need to modify every 'case' clause to test if EEEE was previously used (that was one suggestion) but I said if you don't want to go that way, add a comment explaining why you're using that 'if' above the 'switch' and not inside it. -- Euler Taveira de Oliveira http://www.timbira.com/
2009/7/30 Pavel Stehule <pavel.stehule@gmail.com>: > 2009/7/29 Brendan Jurd <direvus@gmail.com>: >> I don't see any problem with extending this to allow up to 3 exponent >> digits ... Pavel, any comment? > > I am not sure - this function should be used in reports witl fixed > line's width. And I am thinking, so it's should be problem - I prefer > showing some #.### chars. It's clean signal, so some is wrong, but it > doesn't break generating long run reports (like exception in Oracle) > and doesn't broke formating like 3 exponent digits. Hmm. For what it's worth, I think Pavel makes a good point about the number of exponent digits -- a large chunk of the use case for numeric formatting would be fixed-width reporting. Limiting to two exponent digits also has the nice property that the output always matches the length of the format pattern: 9.99EEEE 1.23E+02 I don't know whether being able to represent 3-digit exponents outweighs the value of reliable fixed-width output. Would anyone else care to throw in their opinion? However we end up handling it, we will probably need to flesh out the docs regarding this. Cheers, BJ
2009/7/30 Euler Taveira de Oliveira <euler@timbira.com>: >> So if you put the test inside the switch, it would need to appear in >> every single branch of the switch except for the NUM_E one. I'm >> confused about why you think this needs a comment. Perhaps I >> misunderstood you? >> > Yes, I know you need to modify every 'case' clause to test if EEEE was > previously used (that was one suggestion) but I said if you don't want to go > that way, add a comment explaining why you're using that 'if' above the > 'switch' and not inside it. > I think we've pretty much reached an impasse on this one. I can't imagine anyone reading the code getting confused about this, and don't know how to go about writing a comment explaining something that is intuitively obvious to me. I don't understand what aspect of it requires explanation. The test is not in the switch because it doesn't belong there. Perhaps someone else could weigh in and help us to resolve this? Cheers, BJ
Brendan Jurd escreveu: > Hmm. For what it's worth, I think Pavel makes a good point about the > number of exponent digits -- a large chunk of the use case for numeric > formatting would be fixed-width reporting. > But it doesn't cover all numbers in the interval. And in this case, all numbers can be easily represented. > Limiting to two exponent digits also has the nice property that the > output always matches the length of the format pattern: > > 9.99EEEE > 1.23E+02 > I don't think neglecting to represent a valid number is a "nice property". What does the length of format pattern have to do with output format? -- Euler Taveira de Oliveira http://www.timbira.com/
Brendan Jurd escreveu: > I can't imagine anyone reading the code getting confused about this, > and don't know how to go about writing a comment explaining something > that is intuitively obvious to me. I don't understand what aspect of > it requires explanation. The test is not in the switch because it > doesn't belong there. > It was just a suggestion. If you think it is self-explanatory, so it is. But I (as the first time reading that piece of code) take some time to figure out why that test is outside the switch. -- Euler Taveira de Oliveira http://www.timbira.com/
On Thu, Jul 30, 2009 at 12:17 PM, Euler Taveira de Oliveira<euler@timbira.com> wrote: > Brendan Jurd escreveu: >> I can't imagine anyone reading the code getting confused about this, >> and don't know how to go about writing a comment explaining something >> that is intuitively obvious to me. I don't understand what aspect of >> it requires explanation. The test is not in the switch because it >> doesn't belong there. >> > It was just a suggestion. If you think it is self-explanatory, so it is. But I > (as the first time reading that piece of code) take some time to figure out > why that test is outside the switch. I don't think we should worry about this particular issue too much. If the biggest problem this patch has is whether or not this deserves a comment, we're doing well; the committer can (and doubtless will) adjust that as they may see fit. ...Robert
On Thu, Jul 30, 2009 at 10:35 AM, Brendan Jurd<direvus@gmail.com> wrote: > 2009/7/30 Pavel Stehule <pavel.stehule@gmail.com>: >> 2009/7/29 Brendan Jurd <direvus@gmail.com>: >>> I don't see any problem with extending this to allow up to 3 exponent >>> digits ... Pavel, any comment? >> >> I am not sure - this function should be used in reports witl fixed >> line's width. And I am thinking, so it's should be problem - I prefer >> showing some #.### chars. It's clean signal, so some is wrong, but it >> doesn't break generating long run reports (like exception in Oracle) >> and doesn't broke formating like 3 exponent digits. > > Hmm. For what it's worth, I think Pavel makes a good point about the > number of exponent digits -- a large chunk of the use case for numeric > formatting would be fixed-width reporting. > > Limiting to two exponent digits also has the nice property that the > output always matches the length of the format pattern: > > 9.99EEEE > 1.23E+02 > > I don't know whether being able to represent 3-digit exponents > outweighs the value of reliable fixed-width output. Would anyone else > care to throw in their opinion? However we end up handling it, we > will probably need to flesh out the docs regarding this. Well, what if my whole database is full of numbers with three and four digit exponents? Do I have an out, or am I just hosed? Apologies if this is a stupid question, I haven't read this patch. ...Robert
2009/7/30 Robert Haas <robertmhaas@gmail.com>: > On Thu, Jul 30, 2009 at 10:35 AM, Brendan Jurd<direvus@gmail.com> wrote: >> 2009/7/30 Pavel Stehule <pavel.stehule@gmail.com>: >>> 2009/7/29 Brendan Jurd <direvus@gmail.com>: >>>> I don't see any problem with extending this to allow up to 3 exponent >>>> digits ... Pavel, any comment? >>> >>> I am not sure - this function should be used in reports witl fixed >>> line's width. And I am thinking, so it's should be problem - I prefer >>> showing some #.### chars. It's clean signal, so some is wrong, but it >>> doesn't break generating long run reports (like exception in Oracle) >>> and doesn't broke formating like 3 exponent digits. >> >> Hmm. For what it's worth, I think Pavel makes a good point about the >> number of exponent digits -- a large chunk of the use case for numeric >> formatting would be fixed-width reporting. >> >> Limiting to two exponent digits also has the nice property that the >> output always matches the length of the format pattern: >> >> 9.99EEEE >> 1.23E+02 >> >> I don't know whether being able to represent 3-digit exponents >> outweighs the value of reliable fixed-width output. Would anyone else >> care to throw in their opinion? However we end up handling it, we >> will probably need to flesh out the docs regarding this. > > Well, what if my whole database is full of numbers with three and four > digit exponents? Do I have an out, or am I just hosed? > Maybe we should to support some modificator like Large EEEE - LEEEE or EEEEE ?? that it use 3-digit exponents Pavel > Apologies if this is a stupid question, I haven't read this patch. > > ...Robert >
2009/7/31 Euler Taveira de Oliveira <euler@timbira.com>: > Brendan Jurd escreveu: >> Limiting to two exponent digits also has the nice property that the >> output always matches the length of the format pattern: >> >> 9.99EEEE >> 1.23E+02 >> > I don't think neglecting to represent a valid number is a "nice property". > What does the length of format pattern have to do with output format? Most of the format patterns in to_char() are chosen to match the length of their expected output. The output of "DD" is two characters, "Mon" is three and so on. That's why the scientific notation pattern is "EEEE" and not just "E". There're certainly exceptions though. "SG" is actually output as one character, and "YYYY" expands to however many digits you happen to have in your year value. In fact "YYYY" is probably our closest analogy in to_char() to the number of digits in the exponent. In almost all cases, it will come out as four characters but for extraordinarily large values it will occupy more characters as necessary. This breaks fixed-width, but is probably a better outcome than failure to represent the value. If we apply the same logic to EEEE then it should expand to admit whatever exponent was given to it. Cheers, BJ
Pavel Stehule <pavel.stehule@gmail.com> writes: > 2009/7/30 Robert Haas <robertmhaas@gmail.com>: >> On Thu, Jul 30, 2009 at 10:35 AM, Brendan Jurd<direvus@gmail.com> wrote: >>> Hmm. For what it's worth, I think Pavel makes a good point about the >>> number of exponent digits -- a large chunk of the use case for numeric >>> formatting would be fixed-width reporting. +1. If you aren't trying to get the format exactly so, it's not clear why you're bothering with to_char() at all. > Maybe we should to support some modificator like Large EEEE - LEEEE or EEEEE Five (or more?) E's seems like a natural extension to me. However, that still leaves us with the question of what to do when the exponent doesn't fit in however many digits we'd like to print. Seems like the options are* print #'s* force the output wider* throw an error None of these are very nice, but the first two could cause problems that you don't find out until it's too late to fix. What about throwing an error? Sorry if this was already covered in the thread, but: anybody know what Oracle does for this? regards, tom lane
On Thu, Jul 30, 2009 at 12:46 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote: > Pavel Stehule <pavel.stehule@gmail.com> writes: >> 2009/7/30 Robert Haas <robertmhaas@gmail.com>: >>> On Thu, Jul 30, 2009 at 10:35 AM, Brendan Jurd<direvus@gmail.com> wrote: >>>> Hmm. For what it's worth, I think Pavel makes a good point about the >>>> number of exponent digits -- a large chunk of the use case for numeric >>>> formatting would be fixed-width reporting. > > +1. If you aren't trying to get the format exactly so, it's not clear > why you're bothering with to_char() at all. > >> Maybe we should to support some modificator like Large EEEE - LEEEE or EEEEE > > Five (or more?) E's seems like a natural extension to me. However, that > still leaves us with the question of what to do when the exponent > doesn't fit in however many digits we'd like to print. Seems like the > options are The tricky thing here is that there are two independent parameters here - the MINIMUM number of exponent digits you want (which might be 1, 2, 3, or more), and the MAXIMUM number of exponent digits you want (which might be the same as minimum, or a larger number, or infinity). ...Robert
Brendan Jurd escreveu: > 2009/7/31 Euler Taveira de Oliveira <euler@timbira.com>: >> Brendan Jurd escreveu: >>> Limiting to two exponent digits also has the nice property that the >>> output always matches the length of the format pattern: >>> >>> 9.99EEEE >>> 1.23E+02 >>> >> I don't think neglecting to represent a valid number is a "nice property". >> What does the length of format pattern have to do with output format? > > Most of the format patterns in to_char() are chosen to match the > length of their expected output. The output of "DD" is two > characters, "Mon" is three and so on. > Brendan, the main point of to_*() functions is Oracle-compatibility. So let's just do it. No matter it seems bizarre (like 999.99EEEE). > That's why the scientific notation pattern is "EEEE" and not just "E". > I don't know. Ask Oracle. ;) We don't need to bother with that. Let's just be compatible. -- Euler Taveira de Oliveira http://www.timbira.com/
2009/7/30 Tom Lane <tgl@sss.pgh.pa.us>: > Pavel Stehule <pavel.stehule@gmail.com> writes: >> 2009/7/30 Robert Haas <robertmhaas@gmail.com>: >>> On Thu, Jul 30, 2009 at 10:35 AM, Brendan Jurd<direvus@gmail.com> wrote: >>>> Hmm. For what it's worth, I think Pavel makes a good point about the >>>> number of exponent digits -- a large chunk of the use case for numeric >>>> formatting would be fixed-width reporting. > > +1. If you aren't trying to get the format exactly so, it's not clear > why you're bothering with to_char() at all. > >> Maybe we should to support some modificator like Large EEEE - LEEEE or EEEEE > > Five (or more?) E's seems like a natural extension to me. However, that > still leaves us with the question of what to do when the exponent > doesn't fit in however many digits we'd like to print. Seems like the > options are > * print #'s > * force the output wider > * throw an error > None of these are very nice, but the first two could cause problems that > you don't find out until it's too late to fix. What about throwing an > error? I thing, so Oracle raise error. But I don't thing, so it is necessary repeat all Oracle the behave - mainly when is maybe not too much practical. * print #s, and force the output wider has one disadvantage - it cannot put clean signal about data problem in development time, maybe we should to add raise warning. * throw an error should to break "bad" written application in production, when is too late too. So anybody should have not complete test data set and there are a problem. I prefer print # with raising an warning. Pavel > > Sorry if this was already covered in the thread, but: anybody know what > Oracle does for this? > > regards, tom lane >
On Thu, Jul 30, 2009 at 1:18 PM, Pavel Stehule<pavel.stehule@gmail.com> wrote: > 2009/7/30 Tom Lane <tgl@sss.pgh.pa.us>: >> Pavel Stehule <pavel.stehule@gmail.com> writes: >>> 2009/7/30 Robert Haas <robertmhaas@gmail.com>: >>>> On Thu, Jul 30, 2009 at 10:35 AM, Brendan Jurd<direvus@gmail.com> wrote: >>>>> Hmm. For what it's worth, I think Pavel makes a good point about the >>>>> number of exponent digits -- a large chunk of the use case for numeric >>>>> formatting would be fixed-width reporting. >> >> +1. If you aren't trying to get the format exactly so, it's not clear >> why you're bothering with to_char() at all. >> >>> Maybe we should to support some modificator like Large EEEE - LEEEE or EEEEE >> >> Five (or more?) E's seems like a natural extension to me. However, that >> still leaves us with the question of what to do when the exponent >> doesn't fit in however many digits we'd like to print. Seems like the >> options are >> * print #'s >> * force the output wider >> * throw an error >> None of these are very nice, but the first two could cause problems that >> you don't find out until it's too late to fix. What about throwing an >> error? > > I thing, so Oracle raise error. But I don't thing, so it is necessary > repeat all Oracle the behave - mainly when is maybe not too much > practical. > > * print #s, and force the output wider has one disadvantage - it > cannot put clean signal about data problem in development time, maybe > we should to add raise warning. > > * throw an error should to break "bad" written application in > production, when is too late too. So anybody should have not complete > test data set and there are a problem. > > I prefer print # with raising an warning. It seems like the discussion here has kind of died. We need to settle on an approach and get a final patch soon, or else defer this until next CommitFest. ...Robert
2009/8/2 Robert Haas <robertmhaas@gmail.com>: > On Thu, Jul 30, 2009 at 1:18 PM, Pavel Stehule<pavel.stehule@gmail.com> wrote: >> 2009/7/30 Tom Lane <tgl@sss.pgh.pa.us>: >>> Pavel Stehule <pavel.stehule@gmail.com> writes: >>>> 2009/7/30 Robert Haas <robertmhaas@gmail.com>: >>>>> On Thu, Jul 30, 2009 at 10:35 AM, Brendan Jurd<direvus@gmail.com> wrote: >>>>>> Hmm. For what it's worth, I think Pavel makes a good point about the >>>>>> number of exponent digits -- a large chunk of the use case for numeric >>>>>> formatting would be fixed-width reporting. >>> >>> +1. If you aren't trying to get the format exactly so, it's not clear >>> why you're bothering with to_char() at all. >>> >>>> Maybe we should to support some modificator like Large EEEE - LEEEE or EEEEE >>> >>> Five (or more?) E's seems like a natural extension to me. However, that >>> still leaves us with the question of what to do when the exponent >>> doesn't fit in however many digits we'd like to print. Seems like the >>> options are >>> * print #'s >>> * force the output wider >>> * throw an error >>> None of these are very nice, but the first two could cause problems that >>> you don't find out until it's too late to fix. What about throwing an >>> error? >> >> I thing, so Oracle raise error. But I don't thing, so it is necessary >> repeat all Oracle the behave - mainly when is maybe not too much >> practical. >> >> * print #s, and force the output wider has one disadvantage - it >> cannot put clean signal about data problem in development time, maybe >> we should to add raise warning. >> >> * throw an error should to break "bad" written application in >> production, when is too late too. So anybody should have not complete >> test data set and there are a problem. >> >> I prefer print # with raising an warning. > > It seems like the discussion here has kind of died. We need to settle > on an approach and get a final patch soon, or else defer this until > next CommitFest. Tom, please, can you write your opinion on my last proposal - print ### with raise warning. regards Pavel > > ...Robert >
Pavel Stehule <pavel.stehule@gmail.com> writes: > Tom, please, can you write your opinion on my last proposal - print > ### with raise warning. The idea of printing a warning seems completely horrid to me. From a logical point of view, either we think it's an error or we don't. From a practical point of view, warnings usually accomplish little except to bloat log files that a human might or might not ever look at. The real bottom line for to_char issues is almost always that we should do what Oracle does. Has anyone checked this behavior on Oracle? regards, tom lane
Tom Lane escreveu: > The real bottom line for to_char issues is almost always that we should > do what Oracle does. Has anyone checked this behavior on Oracle? > That was my point too. See [1]. [1] http://archives.postgresql.org/pgsql-hackers/2009-07/msg01870.php -- Euler Taveira de Oliveira http://www.timbira.com/
2009/8/2 Euler Taveira de Oliveira <euler@timbira.com>: > Tom Lane escreveu: >> The real bottom line for to_char issues is almost always that we should >> do what Oracle does. Has anyone checked this behavior on Oracle? >> > That was my point too. See [1]. > > [1] http://archives.postgresql.org/pgsql-hackers/2009-07/msg01870.php > so, Brendan, please, can you adjust patch to raise exception like Oracle? Thank You Pavel > > -- > Euler Taveira de Oliveira > http://www.timbira.com/ >
2009/8/3 Pavel Stehule <pavel.stehule@gmail.com>: > 2009/8/2 Euler Taveira de Oliveira <euler@timbira.com>: >> Tom Lane escreveu: >>> The real bottom line for to_char issues is almost always that we should >>> do what Oracle does. Has anyone checked this behavior on Oracle? >>> >> That was my point too. See [1]. >> >> [1] http://archives.postgresql.org/pgsql-hackers/2009-07/msg01870.php >> > > so, Brendan, please, can you adjust patch to raise exception like Oracle? > Well, the examples Euler posted in the linked message above were using E+308. If I'm reading the Oracle docs correctly, that would have triggered Oracle's data type overflow error before even getting to to_char(); Oracle's NUMBER type only supports up to E+126. So we still don't really know how Oracle handles a (legal) value with too many exponent digits for EEEE. Euler, could you post results for a number which fits within Oracle's data type but has three exponent digits (like 1E+100)? Cheers, BJ
Brendan Jurd escreveu: > Well, the examples Euler posted in the linked message above were using > E+308. If I'm reading the Oracle docs correctly, that would have > triggered Oracle's data type overflow error before even getting to > to_char(); Oracle's NUMBER type only supports up to E+126. So we > still don't really know how Oracle handles a (legal) value with too > many exponent digits for EEEE. > As I said in a prior e-mail, Oracle has a diferent overflow limit (-84 to 127). In PostgreSQL, the numeric datatype can have up to 1000 digits (ie 1e+999) and the double precision datatype can have up to 309 digits (ie 1e-307 or 1e+308). We should support up to 3 exponent digits so all of our native datatypes are covered by the to_char() function. > Euler, could you post results for a number which fits within Oracle's > data type but has three exponent digits (like 1E+100)? > I don't access to an Oracle Server now but it works fine up to the 127 limit. And differently to what Pavel proposed, the number of E's is not related to the number of characters (at least not anymore). So I would like to see the EEEE being used if we have 2 or 3 exponent digits (that is the same behavior Oracle has). -- Euler Taveira de Oliveira http://www.timbira.com/
Euler Taveira de Oliveira <euler@timbira.com> writes: > As I said in a prior e-mail, Oracle has a diferent overflow limit (-84 to 127). > In PostgreSQL, the numeric datatype can have up to 1000 digits (ie 1e+999) and > the double precision datatype can have up to 309 digits (ie 1e-307 or 1e+308). > We should support up to 3 exponent digits so all of our native datatypes are > covered by the to_char() function. Uh, no, we had better support more. The actual limit of the current numeric format is 1e+131072. As long as we consider that EEEE should emit as many exponent digits as needed, this isn't particularly critical. But it would be if we try to specify an exact number of output digits. regards, tom lane
2009/8/3 Euler Taveira de Oliveira <euler@timbira.com>: > Brendan Jurd escreveu: >> Euler, could you post results for a number which fits within Oracle's >> data type but has three exponent digits (like 1E+100)? >> > I don't access to an Oracle Server now but it works fine up to the 127 limit. > And differently to what Pavel proposed, the number of E's is not related to > the number of characters (at least not anymore). So I would like to see the > EEEE being used if we have 2 or 3 exponent digits (that is the same behavior > Oracle has). > Okay, so Oracle just forces the output wider to accomodate the exponent (i.e., you can't rely on it being fixed-width). I can adjust the patch to imitate this behaviour; should be able to post a new revision within 24 hours. Cheers, BJ
2009/8/3 Brendan Jurd <direvus@gmail.com>: > Okay, so Oracle just forces the output wider to accomodate the > exponent (i.e., you can't rely on it being fixed-width). > > I can adjust the patch to imitate this behaviour; should be able to > post a new revision within 24 hours. > Please find attached version 5 of the patch, which allows for exponents with any number of digits (up to a total string length of MAXDOUBLEWIDTH). Cheers, BJ
Attachment
2009/8/3 Tom Lane <tgl@sss.pgh.pa.us>: > Euler Taveira de Oliveira <euler@timbira.com> writes: >> As I said in a prior e-mail, Oracle has a diferent overflow limit (-84 to 127). >> In PostgreSQL, the numeric datatype can have up to 1000 digits (ie 1e+999) and >> the double precision datatype can have up to 309 digits (ie 1e-307 or 1e+308). >> We should support up to 3 exponent digits so all of our native datatypes are >> covered by the to_char() function. > > Uh, no, we had better support more. The actual limit of the current > numeric format is 1e+131072. > > As long as we consider that EEEE should emit as many exponent digits > as needed, this isn't particularly critical. But it would be if we > try to specify an exact number of output digits. > Well, I tried this and as it turns out the patch casts the value to a float8 in order to pass it on to snprintf for sci-notation formatting.See the following chunk in numeric_to_char(): else if (IS_EEEE(&Num)){ float8 val; val = DatumGetFloat8(DirectFunctionCall1(numeric_float8, NumericGetDatum(value))); numstr = orgnum = (char *) palloc(MAXDOUBLEWIDTH + 1); len = snprintf(orgnum, MAXDOUBLEWIDTH + 1, "%.*e", Num.post,val);} This leads to the following behaviour: =# select to_char(1e+1000::numeric, '9.99EEEE'); ERROR: "10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000" is out of range for type double precision If we want to support the full range of numeric values with EEEE, I guess we would need to write our own implementation of scientific notation rather than depend on sprintf()'s? Cheers, BJ
Brendan Jurd <direvus@gmail.com> writes: > Well, I tried this and as it turns out the patch casts the value to a > float8 in order to pass it on to snprintf for sci-notation formatting. Well, that's pretty dumb. Quite aside from the range problem, that would mean that you lose everything past the sixteenth or so digit. I think that's sufficient grounds for bouncing the patch back for rework right there. What I'd consider instead is calling numeric_out and then working with the result of that. It would always be f-format, so you'd have to do your own conversion to e-format, but you could do it without any risk of precision or range loss. regards, tom lane
2009/8/4 Tom Lane <tgl@sss.pgh.pa.us>: > Brendan Jurd <direvus@gmail.com> writes: >> Well, I tried this and as it turns out the patch casts the value to a >> float8 in order to pass it on to snprintf for sci-notation formatting. > > Well, that's pretty dumb. Quite aside from the range problem, that > would mean that you lose everything past the sixteenth or so digit. > I think that's sufficient grounds for bouncing the patch back for > rework right there. > I agree. > What I'd consider instead is calling numeric_out and then working > with the result of that. It would always be f-format, so you'd > have to do your own conversion to e-format, but you could do it > without any risk of precision or range loss. > Yeah, I figured as much. I'll see what I can do about reworking the numeric case. I should be able to post a new revision in the next day or so, but I certainly won't cry foul if this gets punted to September. Cheers, BJ
Brendan Jurd <direvus@gmail.com> writes: > 2009/8/4 Tom Lane <tgl@sss.pgh.pa.us>: >> What I'd consider instead is calling numeric_out and then working >> with the result of that. �It would always be f-format, so you'd >> have to do your own conversion to e-format, but you could do it >> without any risk of precision or range loss. > Yeah, I figured as much. I'll see what I can do about reworking the > numeric case. I should be able to post a new revision in the next day > or so, but I certainly won't cry foul if this gets punted to > September. BTW, there's no rule saying you have to fix that strictly within to_char(). It might make more sense to have numeric.c export a function that is like numeric_out but produces e-style output. Your choice as the patch writer, but keep it in mind ... regards, tom lane
2009/8/4 Tom Lane <tgl@sss.pgh.pa.us>: > BTW, there's no rule saying you have to fix that strictly within > to_char(). It might make more sense to have numeric.c export a > function that is like numeric_out but produces e-style output. > Your choice as the patch writer, but keep it in mind ... > Ah, thanks for the suggestion. I think you're right, numeric.c would be a nice place for this code. Going directly from the guts of a NumericVar to a scientific notation seems more elegant than performing cut-and-paste style surgery on the string from numeric_out(). It does require a bit more effort and headspace than I was expecting. I'm still keen to do it, but my "next day or so" timeframe may not be so realistic. It's looking more like "by the end of the week". Cheers, BJ
2009/8/3 Tom Lane <tgl@sss.pgh.pa.us>: > Uh, no, we had better support more. The actual limit of the current > numeric format is 1e+131072. > Given your comment above I'm thinking it reasonable to use an int32 to store the exponent -- will that be safe? That would allow for a maximum of 10 exponent digits. As an aside, I note that int4out() hardcodes the maximum number of digits rather than exposing a constant (c.f. MAXINT8LEN in int8.c). I'm considering adding MAXINT2LEN and MAXINT4LEN to int.c in passing. Excessive tinkering, or worthy improvement? Cheers, BJ
Brendan Jurd <direvus@gmail.com> writes: > 2009/8/3 Tom Lane <tgl@sss.pgh.pa.us>: >> Uh, no, we had better support more. �The actual limit of the current >> numeric format is 1e+131072. > Given your comment above I'm thinking it reasonable to use an int32 to > store the exponent -- will that be safe? Seems reasonable to me. > That would allow for a maximum of 10 exponent digits. As an aside, I > note that int4out() hardcodes the maximum number of digits rather than > exposing a constant (c.f. MAXINT8LEN in int8.c). I'm considering > adding MAXINT2LEN and MAXINT4LEN to int.c in passing. Excessive > tinkering, or worthy improvement? Don't really care. short and int are the same sizes on all platforms of interest, and are likely to remain so --- if they don't, we'll have way more places to fix than this one. INT8 has historically been more platform-dependent. regards, tom lane
2009/8/9 Tom Lane <tgl@sss.pgh.pa.us>: > Brendan Jurd <direvus@gmail.com> writes: >> That would allow for a maximum of 10 exponent digits. As an aside, I >> note that int4out() hardcodes the maximum number of digits rather than >> exposing a constant (c.f. MAXINT8LEN in int8.c). I'm considering >> adding MAXINT2LEN and MAXINT4LEN to int.c in passing. Excessive >> tinkering, or worthy improvement? > > Don't really care. short and int are the same sizes on all platforms of > interest, and are likely to remain so --- if they don't, we'll have way > more places to fix than this one. INT8 has historically been more > platform-dependent. > "Excessive tinkering" it is, then. I went ahead with hardcoding the expected 10 digits. Here's version 6 of the EEEE patch, now with an all-new implementation of (normalised) scientific notation in numeric.c, via the functions numeric_out_sci() and get_str_from_var_sci(). So EEEE should now be able to represent the full gamut of the numeric type. regression=# select to_char(-1e-1000, '9.99EEEE'); to_char ------------- -1.00e-1000 (1 row) I also noticed that EEEE wasn't outputting a leading space for positive numbers like the rest of to_char does. This meant that positive and negative numbers weren't aligned. The logic to do this for ordinary decimal representations is tied up in NUM_processor() and NUM_numpart_to_char(), so unfortunately I wasn't able to reuse it. Instead I just frobbed the code in the *_to_char() functions to make this work. Noting that Robert has dropped the patch from the July CF (and having no objection to that) I'm going to submit this for September. Cheers, BJ
Attachment
Brendan Jurd escribió: > Here's version 6 of the EEEE patch, now with an all-new implementation > of (normalised) scientific notation in numeric.c, via the functions > numeric_out_sci() and get_str_from_var_sci(). So EEEE should now be > able to represent the full gamut of the numeric type. I noticed an ugly pattern in NUMDesc_prepare calling a cleanup function before every ereport(ERROR). I think it's cleaner to replace that with a PG_TRY block; see attached. I didn't go over the patch in much more detail. (But the numeric_out_sci business got me thinking.) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Attachment
2009/8/9 Alvaro Herrera <alvherre@commandprompt.com>: > Brendan Jurd escribió: > >> Here's version 6 of the EEEE patch, now with an all-new implementation >> of (normalised) scientific notation in numeric.c, via the functions >> numeric_out_sci() and get_str_from_var_sci(). So EEEE should now be >> able to represent the full gamut of the numeric type. > > I noticed an ugly pattern in NUMDesc_prepare calling a cleanup function > before every ereport(ERROR). I think it's cleaner to replace that with > a PG_TRY block; see attached. Looks nice -- although doesn't have anything to do with the EEEE patch so perhaps deserves its own thread? > > I didn't go over the patch in much more detail. (But the > numeric_out_sci business got me thinking.) Got you thinking about what? I'd welcome any comments you have. Cheers, BJ
Brendan Jurd escribió: > 2009/8/9 Alvaro Herrera <alvherre@commandprompt.com>: > > I noticed an ugly pattern in NUMDesc_prepare calling a cleanup function > > before every ereport(ERROR). I think it's cleaner to replace that with > > a PG_TRY block; see attached. > > Looks nice -- although doesn't have anything to do with the EEEE patch > so perhaps deserves its own thread? Yes, it just popped up while skimming the patch. > > I didn't go over the patch in much more detail. (But the > > numeric_out_sci business got me thinking.) > > Got you thinking about what? I'd welcome any comments you have. I wondered if it should just return char *. If we want to have this functionality as its own fmgr-blessed function, shouldn't it return text instead of cstring? -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: >> Got you thinking about what? I'd welcome any comments you have. > I wondered if it should just return char *. If we want to have this > functionality as its own fmgr-blessed function, shouldn't it return > text instead of cstring? If we expose it at fmgr level it should definitely not return cstring. However, I wasn't foreseeing doing that --- does the submitted patch expose it? regards, tom lane
2009/8/11 Tom Lane <tgl@sss.pgh.pa.us>: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> I wondered if it should just return char *. If we want to have this >> functionality as its own fmgr-blessed function, shouldn't it return >> text instead of cstring? > > If we expose it at fmgr level it should definitely not return cstring. > However, I wasn't foreseeing doing that --- does the submitted patch > expose it? > Sorry, I'm a little hazy on the terminology here. If by "expose it at fmgr level" you mean did I add it to pg_proc, then no, I didn't. The function is declared in builtins.h as "extern Datum numeric_out_sci(PG_FUNCTION_ARGS);", and called from formatting.c using the DirectFunctionCall arrangement. numeric_out_sci() returns using PG_RETURN_CSTRING, same as numeric_out does. If this is the wrong way to expose the function, please let me know and I'll happily fix it. Cheers, BJ
Brendan Jurd <direvus@gmail.com> writes: > 2009/8/11 Tom Lane <tgl@sss.pgh.pa.us>: >> If we expose it at fmgr level it should definitely not return cstring. >> However, I wasn't foreseeing doing that --- does the submitted patch >> expose it? > Sorry, I'm a little hazy on the terminology here. If by "expose it at > fmgr level" you mean did I add it to pg_proc, then no, I didn't. OK. > The function is declared in builtins.h as "extern Datum > numeric_out_sci(PG_FUNCTION_ARGS);", and called from formatting.c > using the DirectFunctionCall arrangement. If it's not meant to be in pg_proc, I wouldn't bother with using the V1 call protocol for it. "extern char *numeric_out_sci(Numeric x)" would be sufficient, and less notation on both caller and callee sides. regards, tom lane
2009/8/11 Tom Lane <tgl@sss.pgh.pa.us>: > If it's not meant to be in pg_proc, I wouldn't bother with using the V1 > call protocol for it. "extern char *numeric_out_sci(Numeric x)" would > be sufficient, and less notation on both caller and callee sides. > Thanks Tom. I have removed the V1 stuff as you suggest, and placed the declaration in numeric.h. Here's version 7. Cheers, BJ
Attachment
Brendan Jurd <direvus@gmail.com> writes: > Thanks Tom. I have removed the V1 stuff as you suggest, and placed > the declaration in numeric.h. > Here's version 7. Working through this now, and I noticed that the example added to the manual seems to be wrong: <entry><literal>to_char(0.000485, '9.99EEEE')</literal></entry> <entry><literal>' 4.850e-04'</literal></entry> With 9.99 as the pattern, I'd expect (and indeed I get) 4.85e-04 not 4.850e-04. This is correct behavior, no? Also, I'm wondering what should happen with regression=# select to_char(0.000485, '99.99EEEE'); to_char ----------- 4.85e-04 (1 row) Doesn't seem quite right. Should we throw error if the number of 9's before the decimal point isn't 1? regards, tom lane
Tom Lane escribió: > Also, I'm wondering what should happen with > > regression=# select to_char(0.000485, '99.99EEEE'); > to_char > ----------- > 4.85e-04 > (1 row) > > Doesn't seem quite right. Should we throw error if the number of 9's > before the decimal point isn't 1? No, see http://archives.postgresql.org/message-id/4A68FAE4.50206@timbira.com -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane escribi�: >> Doesn't seem quite right. Should we throw error if the number of 9's >> before the decimal point isn't 1? > No, see > http://archives.postgresql.org/message-id/4A68FAE4.50206@timbira.com Ah, nothing like being bug-compatible with a bad implementation. But I agree, if Oracle ignores the number of 9's there then we should too. BTW, this patch adds more NUM_cache_remove() calls. I'm planning to commit it that way, unless you're just about to commit your PG_TRY change? I agree with doing that, but figured it should be a separate commit. regards, tom lane
Tom Lane escribió: > BTW, this patch adds more NUM_cache_remove() calls. I'm planning > to commit it that way, unless you're just about to commit your PG_TRY > change? I agree with doing that, but figured it should be a separate > commit. No, go ahead, I will commit that separately. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
2009/8/11 Tom Lane <tgl@sss.pgh.pa.us>: > Working through this now, and I noticed that the example added to the > manual seems to be wrong: > > <entry><literal>to_char(0.000485, '9.99EEEE')</literal></entry> > <entry><literal>' 4.850e-04'</literal></entry> > > With 9.99 as the pattern, I'd expect (and indeed I get) 4.85e-04 > not 4.850e-04. This is correct behavior, no? Correct. I apologise for the oversight. The example output should lose the trailing zero, or else the example query needs an extra '9' after the decimal point. I don't think it makes much difference which. Cheers, BJ
Brendan Jurd <direvus@gmail.com> writes: > Here's version 7. Applied with a couple of corrections: the numeric case wasn't dealing with NaNs in the same way as the float cases, and the int8 case was converting to float8 which would lose precision. I made it go through numeric instead, which is pretty expensive but I doubt this is worth expending extra code on. regards, tom lane
2009/8/10 Tom Lane <tgl@sss.pgh.pa.us>: > Brendan Jurd <direvus@gmail.com> writes: >> Here's version 7. > > Applied with a couple of corrections: the numeric case wasn't dealing > with NaNs in the same way as the float cases, and the int8 case was > converting to float8 which would lose precision. I made it go through > numeric instead, which is pretty expensive but I doubt this is worth > expending extra code on. > > regards, tom lane > It's nice. I am playing with it, and now I found some potential issue. The parser is maybe too tolerant: postgres=# select to_char(3.14323,'9.9(aaaaaEEEE');to_char ---------- 3.1e+00 (1 row) regards Pavel Stehule
2009/8/11 Pavel Stehule <pavel.stehule@gmail.com>: > It's nice. I am playing with it, and now I found some potential issue. > The parser is maybe too tolerant: > > postgres=# select to_char(3.14323,'9.9(aaaaaEEEE'); > to_char > ---------- > 3.1e+00 > (1 row) > I guess we *could* add code to throw an error where the 9's aren't placed immediately before the E's in the format pattern, but to be honest I can't work up any enthusiasm about it. Cheers, BJ
2009/8/11 Tom Lane <tgl@sss.pgh.pa.us>: > Brendan Jurd <direvus@gmail.com> writes: >> Here's version 7. > > Applied with a couple of corrections: the numeric case wasn't dealing > with NaNs in the same way as the float cases, Thanks for that. I do think that the whole business of printing "#.#####" is highly bogus, and I might look at getting it to throw errors instead as a separate patch. I've added a TODO for same. Cheers, BJ
Brendan Jurd wrote: > 2009/8/11 Tom Lane <tgl@sss.pgh.pa.us>: > > Brendan Jurd <direvus@gmail.com> writes: > >> Here's version 7. > > > > Applied with a couple of corrections: the numeric case wasn't dealing > > with NaNs in the same way as the float cases, > > Thanks for that. > > I do think that the whole business of printing "#.#####" is highly > bogus, and I might look at getting it to throw errors instead as a > separate patch. > > I've added a TODO for same. Everyone, please, when you update the TODO list, put some text in the "Summary" box at the bottom before you commit the change so people know what has been changed. I usually just copy the first sentence of my new text and add a tag at the beginning like "Added" or "Done". -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +