Thread: proposal: minscale, rtrim, btrim functions for numeric
Hi,
four years ago Marko Tiikkaja send a patch for numeric_trim functions. This functions removed ending zeroes from numeric value. This is useful feature, but there was not any progress on this patch. I think so this feature can be interesting, so I would to revitalize this patch.
Original discussion https://www.postgresql-archive.org/Add-numeric-trim-numeric-td5874444.html
Based on this discussion I would to implement three functions - prototype implementation is in plpsql and sql - final implementation will be in C.
-- returns minimal scale when the rounding the value to this scale doesn't
-- lost any informations.
CREATE OR REPLACE FUNCTION pg_catalog.minscale(numeric)
RETURNS integer
LANGUAGE plpgsql
AS $function$
begin
for i in 0..256
loop
if round($1, i) = $1 then
return i;
end if;
end loop;
end;
$function$
RETURNS integer
LANGUAGE plpgsql
AS $function$
begin
for i in 0..256
loop
if round($1, i) = $1 then
return i;
end if;
end loop;
end;
$function$
-- trailing zeroes from end
-- trimming only zero for numeric type has sense
CREATE OR REPLACE FUNCTION pg_catalog.rtrim(numeric)
RETURNS numeric AS $$
SELECT round($1, pg_catalog.minscale($1))
$$ LANGUAGE sql;
-- this is due support trim function
CREATE OR REPLACE FUNCTION pg_catalog.btrim(numeric)
RETURNS numeric AS $$
SELECT pg_catalog.rtrim($1)
$$ LANGUAGE sql;
postgres=# select trim(10.22000);
┌───────┐
│ btrim │
╞═══════╡
│ 10.22 │
└───────┘
(1 row)
postgres=# select rtrim(10.34900);
┌────────┐
│ rtrim │
╞════════╡
│ 10.349 │
└────────┘
(1 row)
┌───────┐
│ btrim │
╞═══════╡
│ 10.22 │
└───────┘
(1 row)
postgres=# select rtrim(10.34900);
┌────────┐
│ rtrim │
╞════════╡
│ 10.349 │
└────────┘
(1 row)
What do you think about it?
Regards
Pavel
Pavel Stehule <pavel.stehule@gmail.com> writes: > four years ago Marko Tiikkaja send a patch for numeric_trim functions. This > functions removed ending zeroes from numeric value. This is useful feature, > but there was not any progress on this patch. I think so this feature can > be interesting, so I would to revitalize this patch. > Original discussion > https://www.postgresql-archive.org/Add-numeric-trim-numeric-td5874444.html A more useful link is https://www.postgresql.org/message-id/flat/564D3ADB.7000808%40joh.to and the earlier discussion is at https://www.postgresql.org/message-id/flat/5643125E.1030605%40joh.to Re-reading that thread, I don't really think there's much support for anything beyond the minscale() function. The rest are just inviting confusion with string-related functions. And I really don't like establishing a precedent that btrim() and rtrim() are the same. regards, tom lane
so 9. 11. 2019 v 21:34 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> four years ago Marko Tiikkaja send a patch for numeric_trim functions. This
> functions removed ending zeroes from numeric value. This is useful feature,
> but there was not any progress on this patch. I think so this feature can
> be interesting, so I would to revitalize this patch.
> Original discussion
> https://www.postgresql-archive.org/Add-numeric-trim-numeric-td5874444.html
A more useful link is
https://www.postgresql.org/message-id/flat/564D3ADB.7000808%40joh.to
and the earlier discussion is at
https://www.postgresql.org/message-id/flat/5643125E.1030605%40joh.to
Re-reading that thread, I don't really think there's much support for
anything beyond the minscale() function. The rest are just inviting
confusion with string-related functions. And I really don't like
establishing a precedent that btrim() and rtrim() are the same.
I have to agree, so using trim, rtrim names is not best. On second hand, probably to most often usage of minscale function will be inside expression round(x, minscale(x)), so this functionality can be in core. A question is a name.
maybe to_minscale(numeric) ?
Regards
Pavel
regards, tom lane
ne 10. 11. 2019 v 7:35 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
so 9. 11. 2019 v 21:34 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:Pavel Stehule <pavel.stehule@gmail.com> writes:
> four years ago Marko Tiikkaja send a patch for numeric_trim functions. This
> functions removed ending zeroes from numeric value. This is useful feature,
> but there was not any progress on this patch. I think so this feature can
> be interesting, so I would to revitalize this patch.
> Original discussion
> https://www.postgresql-archive.org/Add-numeric-trim-numeric-td5874444.html
A more useful link is
https://www.postgresql.org/message-id/flat/564D3ADB.7000808%40joh.to
and the earlier discussion is at
https://www.postgresql.org/message-id/flat/5643125E.1030605%40joh.to
Re-reading that thread, I don't really think there's much support for
anything beyond the minscale() function. The rest are just inviting
confusion with string-related functions. And I really don't like
establishing a precedent that btrim() and rtrim() are the same.I have to agree, so using trim, rtrim names is not best. On second hand, probably to most often usage of minscale function will be inside expression round(x, minscale(x)), so this functionality can be in core. A question is a name.maybe to_minscale(numeric) ?
Here is a patch. It's based on Dean's suggestions.
I implemented two functions - first minscale, second trim_scale. The overhead of second is minimal - so I think it can be good to have it. I started design with the name "trim_scale", but the name can be any other.
Regards
Pavel
RegardsPavel
regards, tom lane
Attachment
Hello Pavel, On Mon, 11 Nov 2019 15:47:37 +0100 Pavel Stehule <pavel.stehule@gmail.com> wrote: > Here is a patch. It's based on Dean's suggestions. > > I implemented two functions - first minscale, second trim_scale. The > overhead of second is minimal - so I think it can be good to have it. > I started design with the name "trim_scale", but the name can be any > other. Here are my thoughts on your patch. My one substantial criticism is that I believe that trim_scale('NaN'::numeric) should return NaN. So the test output should look like: template1=# select trim_scale('nan'::numeric) = 'nan'::numeric; ?column? ---------- t (1 row) FWIW, I bumped around the Internet and looked at Oracle docs to see if there's any reason why minscale() might not be a good function name. I couldn't find any problems. I also couldn't think of a better name than trim_scale() and don't have any problems with the name. My other suggestions mostly have to do with documentation. Your code looks pretty good to me, looks like the existing code, you name variables and function names as in existing code, etc. I comment on various hunks in line below: diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 28eb322f3f..6f142cd679 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -918,6 +918,19 @@ <entry><literal>6.0000000000</literal></entry> </row> + <row> + <entry> + <indexterm> + <primary>minscale</primary> + </indexterm> + <literal><function>minscale(<type>numeric</type>)</function></literal> + </entry> + <entry><type>integer</type></entry> + <entry>returns minimal scale of the argument (the number of decimal digits in the fractional part)</entry> + <entry><literal>scale(8.4100)</literal></entry> + <entry><literal>2</literal></entry> + </row> + <row> <entry> <indexterm> ***** Your description does not say what the minimal scale is. How about: minimal scale (number of decimal digits in the fractional part) needed to store the supplied value without data loss ***** @@ -1041,6 +1054,19 @@ <entry><literal>1.4142135623731</literal></entry> </row> + <row> + <entry> + <indexterm> + <primary>trim_scale</primary> + </indexterm> + <literal><function>trim_scale(<type>numeric</type>)</function></literal> + </entry> + <entry><type>numeric</type></entry> + <entry>reduce scale of the argument (the number of decimal digits in the fractional part)</entry> + <entry><literal>scale(8.4100)</literal></entry> + <entry><literal>8.41</literal></entry> + </row> + <row> <entry> <indexterm> **** How about: reduce the scale (the number of decimal digits in the fractional part) to the minimum needed to store the supplied value without data loss ***** diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c index a00db3ce7a..35234aee4c 100644 --- a/src/backend/utils/adt/numeric.c +++ b/src/backend/utils/adt/numeric.c **** I believe the hunks in this file should start at about line# 3181. This is right after numeric_scale(). Seems like all the scale related functions should be together. There's no hard standard but I don't see why lines (comment lines in your case) should be longer than 78 characters without good reason. Please reformat. **** @@ -5620,6 +5620,88 @@ int2int4_sum(PG_FUNCTION_ARGS) PG_RETURN_DATUM(Int64GetDatumFast(transdata->sum)); } +/* + * Calculate minimal display scale. The var should be stripped already. **** I think you can get rid of the word "display" in the comment. **** + */ +static int +get_min_scale(NumericVar *var) +{ + int minscale = 0; + + if (var->ndigits > 0) + { + NumericDigit last_digit; + + /* maximal size of minscale, can be lower */ + minscale = (var->ndigits - var->weight - 1) * DEC_DIGITS; + + /* + * When there are not digits after decimal point, the previous expression **** s/not/no/ **** + * can be negative. In this case, the minscale must be zero. + */ **** s/can be/is/ **** + if (minscale > 0) + { + /* reduce minscale if trailing digits in last numeric digits are zero */ + last_digit = var->digits[var->ndigits - 1]; + + while (last_digit % 10 == 0) + { + minscale--; + last_digit /= 10; + } + } + else + minscale = 0; + } + + return minscale; +} + +/* + * Returns minimal scale of numeric value when value is not changed **** Improve comment, something like: minimal scale required to represent supplied value without loss **** + */ +Datum +numeric_minscale(PG_FUNCTION_ARGS) +{ + Numeric num = PG_GETARG_NUMERIC(0); + NumericVar arg; + int minscale; + + if (NUMERIC_IS_NAN(num)) + PG_RETURN_NULL(); + + init_var_from_num(num, &arg); + strip_var(&arg); + + minscale = get_min_scale(&arg); + free_var(&arg); + + PG_RETURN_INT32(minscale); +} + +/* + * Reduce scale of numeric value so value is not changed **** Likewise, comment text could be improved **** + */ +Datum +numeric_trim_scale(PG_FUNCTION_ARGS) +{ + Numeric num = PG_GETARG_NUMERIC(0); + Numeric res; + NumericVar result; + + if (NUMERIC_IS_NAN(num)) + PG_RETURN_NULL(); + + init_var_from_num(num, &result); + strip_var(&result); + + result.dscale = get_min_scale(&result); + + res = make_result(&result); + free_var(&result); + + PG_RETURN_NUMERIC(res); +} /* ---------------------------------------------------------------------- * diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 58ea5b982b..e603a5d8dd 100644 **** How about moving these new lines to right after line# 4255, the scale() function? **** --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -4288,6 +4288,12 @@ proname => 'width_bucket', prorettype => 'int4', proargtypes => 'numeric numeric numeric int4', prosrc => 'width_bucket_numeric' }, +{ oid => '3434', descr => 'returns minimal scale of numeric value', **** How about a descr of?: minimal scale needed to store the supplied value without data loss **** + proname => 'minscale', prorettype => 'int4', proargtypes => 'numeric', + prosrc => 'numeric_minscale' }, +{ oid => '3435', descr => 'returns numeric value with minimal scale', **** And likewise a descr of?: numeric with minimal scale needed to represent the given value **** + proname => 'trim_scale', prorettype => 'numeric', proargtypes => 'numeric', + prosrc => 'numeric_trim_scale' }, { oid => '1747', proname => 'time_pl_interval', prorettype => 'time', diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out index 1cb3c3bfab..778c204b13 100644 **** I have suggestions: Give the 2 functions separate comments (-- Tests for minscale() and -- Tests for trim_scale()) Put () after the function names in the comments because that's what scale() does. Move the lines so the tests are right after the tests of scale(). Be explicit when testing for NULL or NaN. I don't know that this is consistent with the rest of the regression tests but I don't see how being explicit could be wrong. Otherwise NULL and NaN are output the same ("") and you're not really testing. So test with expressions like "foo(NULL) IS NULL" or "foo('NaN'::NUMERIC) = 'NaN::NUMERIC" and look for t (or f) results. **** Regards, Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
Hi
ne 8. 12. 2019 v 2:23 odesílatel Karl O. Pinc <kop@meme.com> napsal:
Hello Pavel,
On Mon, 11 Nov 2019 15:47:37 +0100
Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Here is a patch. It's based on Dean's suggestions.
>
> I implemented two functions - first minscale, second trim_scale. The
> overhead of second is minimal - so I think it can be good to have it.
> I started design with the name "trim_scale", but the name can be any
> other.
Here are my thoughts on your patch.
My one substantial criticism is that I believe that
trim_scale('NaN'::numeric) should return NaN.
So the test output should look like:
template1=# select trim_scale('nan'::numeric) = 'nan'::numeric;
?column?
----------
t
(1 row)
fixed
FWIW, I bumped around the Internet and looked at Oracle docs to see if
there's any reason why minscale() might not be a good function name.
I couldn't find any problems.
I also couldn't think of a better name than trim_scale() and don't
have any problems with the name.
My other suggestions mostly have to do with documentation. Your code
looks pretty good to me, looks like the existing code, you name
variables and function names as in existing code, etc.
I comment on various hunks in line below:
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 28eb322f3f..6f142cd679 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -918,6 +918,19 @@
<entry><literal>6.0000000000</literal></entry>
</row>
+ <row>
+ <entry>
+ <indexterm>
+ <primary>minscale</primary>
+ </indexterm>
+
<literal><function>minscale(<type>numeric</type>)</function></literal>
+ </entry>
+ <entry><type>integer</type></entry>
+ <entry>returns minimal scale of the argument (the number of
decimal digits in the fractional part)</entry>
+ <entry><literal>scale(8.4100)</literal></entry>
+ <entry><literal>2</literal></entry>
+ </row>
+
<row>
<entry>
<indexterm>
*****
Your description does not say what the minimal scale is. How about:
minimal scale (number of decimal digits in the fractional part) needed
to store the supplied value without data loss
*****
sounds better, updated
@@ -1041,6 +1054,19 @@
<entry><literal>1.4142135623731</literal></entry>
</row>
+ <row>
+ <entry>
+ <indexterm>
+ <primary>trim_scale</primary>
+ </indexterm>
+
<literal><function>trim_scale(<type>numeric</type>)</function></literal>
+ </entry>
+ <entry><type>numeric</type></entry>
+ <entry>reduce scale of the argument (the number of decimal
digits in the fractional part)</entry>
+ <entry><literal>scale(8.4100)</literal></entry>
+ <entry><literal>8.41</literal></entry>
+ </row>
+
<row>
<entry>
<indexterm>
****
How about:
reduce the scale (the number of decimal digits in the fractional part)
to the minimum needed to store the supplied value without data loss
*****
ok, changed
diff --git a/src/backend/utils/adt/numeric.c
b/src/backend/utils/adt/numeric.c index a00db3ce7a..35234aee4c 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
****
I believe the hunks in this file should start at about line# 3181.
This is right after numeric_scale(). Seems like all the scale
related functions should be together.
There's no hard standard but I don't see why lines (comment lines in
your case) should be longer than 78 characters without good reason.
Please reformat.
****
@@ -5620,6 +5620,88 @@ int2int4_sum(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(Int64GetDatumFast(transdata->sum));
}
+/*
+ * Calculate minimal display scale. The var should be stripped already.
****
I think you can get rid of the word "display" in the comment.
****
done
+ */
+static int
+get_min_scale(NumericVar *var)
+{
+ int minscale = 0;
+
+ if (var->ndigits > 0)
+ {
+ NumericDigit last_digit;
+
+ /* maximal size of minscale, can be lower */
+ minscale = (var->ndigits - var->weight - 1) *
DEC_DIGITS; +
+ /*
+ * When there are not digits after decimal point, the
previous expression
****
s/not/no/
****
+ * can be negative. In this case, the minscale must be
zero.
+ */
****
s/can be/is/
****
+ if (minscale > 0)
+ {
+ /* reduce minscale if trailing digits in last
numeric digits are zero */
+ last_digit = var->digits[var->ndigits - 1];
+
+ while (last_digit % 10 == 0)
+ {
+ minscale--;
+ last_digit /= 10;
+ }
+ }
+ else
+ minscale = 0;
+ }
+
+ return minscale;
+}
+
+/*
+ * Returns minimal scale of numeric value when value is not changed
****
Improve comment, something like:
minimal scale required to represent supplied value without loss
ok
****
+ */
+Datum
+numeric_minscale(PG_FUNCTION_ARGS)
+{
+ Numeric num = PG_GETARG_NUMERIC(0);
+ NumericVar arg;
+ int minscale;
+
+ if (NUMERIC_IS_NAN(num))
+ PG_RETURN_NULL();
+
+ init_var_from_num(num, &arg);
+ strip_var(&arg);
+
+ minscale = get_min_scale(&arg);
+ free_var(&arg);
+
+ PG_RETURN_INT32(minscale);
+}
+
+/*
+ * Reduce scale of numeric value so value is not changed
****
Likewise, comment text could be improved
****
+ */
+Datum
+numeric_trim_scale(PG_FUNCTION_ARGS)
+{
+ Numeric num = PG_GETARG_NUMERIC(0);
+ Numeric res;
+ NumericVar result;
+
+ if (NUMERIC_IS_NAN(num))
+ PG_RETURN_NULL();
+
+ init_var_from_num(num, &result);
+ strip_var(&result);
+
+ result.dscale = get_min_scale(&result);
+
+ res = make_result(&result);
+ free_var(&result);
+
+ PG_RETURN_NUMERIC(res);
+}
/*
----------------------------------------------------------------------
* diff --git a/src/include/catalog/pg_proc.dat
b/src/include/catalog/pg_proc.dat index 58ea5b982b..e603a5d8dd 100644
****
How about moving these new lines to right after line# 4255, the
scale() function?
****
has sense, moved
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -4288,6 +4288,12 @@
proname => 'width_bucket', prorettype => 'int4',
proargtypes => 'numeric numeric numeric int4',
prosrc => 'width_bucket_numeric' },
+{ oid => '3434', descr => 'returns minimal scale of numeric value',
****
How about a descr of?:
minimal scale needed to store the supplied value without data loss
****
done
+ proname => 'minscale', prorettype => 'int4', proargtypes =>
'numeric',
+ prosrc => 'numeric_minscale' },
+{ oid => '3435', descr => 'returns numeric value with minimal scale',
****
And likewise a descr of?:
numeric with minimal scale needed to represent the given value
****
+ proname => 'trim_scale', prorettype => 'numeric', proargtypes =>
'numeric',
+ prosrc => 'numeric_trim_scale' },
done
{ oid => '1747',
proname => 'time_pl_interval', prorettype => 'time',
diff --git a/src/test/regress/expected/numeric.out
b/src/test/regress/expected/numeric.out index 1cb3c3bfab..778c204b13
100644
****
I have suggestions:
Give the 2 functions separate comments (-- Tests for minscale() and
-- Tests for trim_scale())
Put () after the function names in the comments
because that's what scale() does.
Move the lines so the tests are right after the tests of scale().
Be explicit when testing for NULL or NaN. I don't know that this is
consistent with the rest of the regression tests but I don't see how
being explicit could be wrong. Otherwise NULL and NaN are output the
same ("") and you're not really testing.
So test with expressions like "foo(NULL) IS NULL" or
"foo('NaN'::NUMERIC) = 'NaN::NUMERIC" and look for t (or f) results.
ok fixed
Thank you for review - I am sending updated rebased patch. Please, update comments freely - my language skills (about English lang) are basic.
Regards
Pavel
****
Regards,
Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein
Attachment
"Karl O. Pinc" <kop@meme.com> writes: > FWIW, I bumped around the Internet and looked at Oracle docs to see if > there's any reason why minscale() might not be a good function name. > I couldn't find any problems. > I also couldn't think of a better name than trim_scale() and don't > have any problems with the name. I'd just comment that it seems weird that the same patch is introducing two functions with inconsistently chosen names. Why does one have an underscore separating the words and the other not? I haven't got a large investment in either naming convention specifically, but it'd be nice if we could at least pretend to be considering consistency. regards, tom lane
On Sun, 08 Dec 2019 13:57:00 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Karl O. Pinc" <kop@meme.com> writes: > > FWIW, I bumped around the Internet and looked at Oracle docs to see > > if there's any reason why minscale() might not be a good function > > name. I couldn't find any problems. > > > I also couldn't think of a better name than trim_scale() and don't > > have any problems with the name. > > I'd just comment that it seems weird that the same patch is > introducing two functions with inconsistently chosen names. Why does > one have an underscore separating the words and the other not? I > haven't got a large investment in either naming convention > specifically, but it'd be nice if we could at least pretend to be > considering consistency. Consistency would be lovely. I don't feel qualified to make the decision but here's what I came up with: I re-read the back-threads and don't see any discussion of the naming of minscale(). My thoughts run toward asking the "what is a word?" question, along with "what is the policy for separating a word?". Is "min" different from the prefix "sub"? "Trim" seems to clearly count as a word and trim_scale() seems mostly consistent with existing function names. (E.g. width_bucket(), convert_to(). But there's no true consistency. Plenty of functions don't separate words with "_". E.g. setseed().) As far as "min" goes, glancing through function names [1] does not help much. The index indicates that when PG puts "min" in a configuration parameter it separates it with "_". (Looking at "min" in the index.) Looking at the function names containing "min" [2] yields: aclitemin brin_minmax_add_value brin_minmax_consistent brin_minmax_opcinfo brin_minmax_union min numeric_uminus pg_terminate_backend range_minus txid_snapshot_xmin Not especially helpful. I'm inclined to want min_scale() instead of "minscale()" based on how config parameters are named and for consistency with trim_scale(). Pavel, if you agree then let's just change minscale() to min_scale() and let people object if they don't like it. Regards. Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein [1] select pg_proc.proname from pg_proc group by pg_proc.proname order by pg_proc.proname; [2] select pg_proc.proname from pg_proc where pg_proc.proname like '%min%' group by pg_proc.proname order by pg_proc.proname;
Hi Pavel, Thanks for your changes. More inline below: On Sun, 8 Dec 2019 08:38:38 +0100 Pavel Stehule <pavel.stehule@gmail.com> wrote: > ne 8. 12. 2019 v 2:23 odesílatel Karl O. Pinc <kop@meme.com> napsal: > > On Mon, 11 Nov 2019 15:47:37 +0100 > > Pavel Stehule <pavel.stehule@gmail.com> wrote: > > > I implemented two functions - first minscale, second trim_scale. > > > The overhead of second is minimal - so I think it can be good to > > > have it. I started design with the name "trim_scale", but the > > > name can be any other. > > I comment on various hunks in line below: > > > diff --git a/src/backend/utils/adt/numeric.c > > b/src/backend/utils/adt/numeric.c index a00db3ce7a..35234aee4c > > 100644 --- a/src/backend/utils/adt/numeric.c > > +++ b/src/backend/utils/adt/numeric.c > > > > **** > > I believe the hunks in this file should start at about line# 3181. > > This is right after numeric_scale(). Seems like all the scale > > related functions should be together. > > > > There's no hard standard but I don't see why lines (comment lines in > > your case) should be longer than 78 characters without good reason. > > Please reformat. > > **** I don't see any response from you regarding the above two suggestions. > > > + */ > > +static int > > +get_min_scale(NumericVar *var) > > +{ > > + int minscale = 0; > > + > > + if (var->ndigits > 0) > > + { > > + NumericDigit last_digit; > > + > > + /* maximal size of minscale, can be lower */ > > + minscale = (var->ndigits - var->weight - 1) * > > DEC_DIGITS; + > > + /* > > + * When there are not digits after decimal point, > > the previous expression > > > > **** > > s/not/no/ > > **** > > > > + * can be negative. In this case, the minscale must > > be zero. > > + */ > > > > **** > > s/can be/is/ > > **** By the above, I intended the comment be changed (after line wrapping) to: /* * When there are no digits after decimal point, * the previous expression is negative. In this * case the minscale must be zero. */ (Oh yes, on re-reading I think the comma is unnecessary so I removed it too.) > > > > + if (minscale > 0) > > + { > > + /* reduce minscale if trailing digits in > > last numeric digits are zero */ And the above comment should either be wrapped (as requested above) or eliminated. I like comments but I'm not sure this one contributes anything. > > --- a/src/include/catalog/pg_proc.dat > > +++ b/src/include/catalog/pg_proc.dat > > @@ -4288,6 +4288,12 @@ > > proname => 'width_bucket', prorettype => 'int4', > > proargtypes => 'numeric numeric numeric int4', > > prosrc => 'width_bucket_numeric' }, > > +{ oid => '3434', descr => 'returns minimal scale of numeric value', > > > > **** > > How about a descr of?: > > > > minimal scale needed to store the supplied value without data loss > > **** > > > > done > > > > > + proname => 'minscale', prorettype => 'int4', proargtypes => > > 'numeric', > > + prosrc => 'numeric_minscale' }, > > +{ oid => '3435', descr => 'returns numeric value with minimal > > scale', > > > > **** > > And likewise a descr of?: > > > > numeric with minimal scale needed to represent the given value > > **** > > > > + proname => 'trim_scale', prorettype => 'numeric', proargtypes => > > 'numeric', > > + prosrc => 'numeric_trim_scale' }, > > > > done Thanks for these changes. Looking at pg_proc.dat there seems to be an effort made to keep the lines to a maximum of 78 or 80 characters. This means starting "descr => '..." on new lines when the description is long. Please reformat, doing this or, if you like, something even more clever to keep the lines short. Looking good. We're making progress. Regards, Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
Hi Pavel, I've had some thoughts about the regression tests. It wouldn't hurt to move them to right after the scale() tests in numeric.sql. I believe your tests are covering all the code paths but it is not clear just what test does what. I don't see a lot of comments in the tests so I don't know that it'd be appropriate to put them in to describe just what's tested. But in any case it could be nice to choose values where it is at least sort of apparent what part of the codebase is tested. FWIW, although the code paths are covered, the possible data permutations are not. E.g. I don't see a case where scale > 0 and the NDIGITS of the last digit is full. There are also some tests (the 0 and 0.00 tests) that duplicates the execution path. In the 0 case I don't see a problem but as a rule there's not a lot of point. Better test values would (mostly) eliminate these. So, my thoughts run along these lines: select minscale(numeric 'NaN') is NULL; -- should be true select minscale(NULL::numeric) is NULL; -- should be true select minscale(0); -- no digits select minscale(0.00); -- no digits again select minscale(1.0); -- no scale select minscale(1.1); -- scale 1 select minscale(1.12); -- scale 2 select minscale(1.123); -- scale 3 select minscale(1.1234); -- scale 4, filled digit select minscale(1.12345); -- scale 5, 2 NDIGITS select minscale(1.1000); -- 1 pos in NDIGITS select minscale(1.1200); -- 2 pos in NDIGITS select minscale(1.1230); -- 3 pos in NDIGITS select minscale(1.1234); -- all pos in NDIGITS select minscale(1.12345000); -- 2 NDIGITS select minscale(1.123400000000); -- strip() required/done select minscale(12345.123456789012345); -- "big" number select minscale(-12345.12345); -- negative number select minscale(1e100); -- very big number select minscale(1e100::numeric + 0.1); -- big number with scale I don't know why you chose some of your values so if there's something you were testing for that the above does not cover please include it. So, a combination of white and black box testing. Having written it out it seems like a lot of testing for such a simple function. On the other hand I don't see a lot of cost in having all these tests. Opinions welcome. Regards, Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
On Mon, 9 Dec 2019 12:15:22 -0600 "Karl O. Pinc" <kop@meme.com> wrote: > I've had some thoughts about the regression tests. > Having written > it out it seems like a lot of testing for such a simple function. FYI. I don't see trim_scale() needing such exhaustive testing because you'll have already tested a lot with the min_scale() tests. Regards, Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
po 9. 12. 2019 v 19:15 odesílatel Karl O. Pinc <kop@meme.com> napsal:
Hi Pavel,
I've had some thoughts about the regression tests.
It wouldn't hurt to move them to right after the
scale() tests in numeric.sql.
I believe your tests are covering all the code paths
but it is not clear just what test does what.
I don't see a lot of comments in the tests so I don't
know that it'd be appropriate to put them in to
describe just what's tested. But in any case it
could be nice to choose values where it is at least
sort of apparent what part of the codebase is tested.
FWIW, although the code paths are covered, the possible
data permutations are not. E.g. I don't see a case
where scale > 0 and the NDIGITS of the last digit is full.
There are also some tests (the 0 and 0.00 tests) that duplicates
the execution path. In the 0 case I don't see a problem
but as a rule there's not a lot of point. Better test
values would (mostly) eliminate these.
So, my thoughts run along these lines:
select minscale(numeric 'NaN') is NULL; -- should be true
select minscale(NULL::numeric) is NULL; -- should be true
select minscale(0); -- no digits
select minscale(0.00); -- no digits again
select minscale(1.0); -- no scale
select minscale(1.1); -- scale 1
select minscale(1.12); -- scale 2
select minscale(1.123); -- scale 3
select minscale(1.1234); -- scale 4, filled digit
select minscale(1.12345); -- scale 5, 2 NDIGITS
select minscale(1.1000); -- 1 pos in NDIGITS
select minscale(1.1200); -- 2 pos in NDIGITS
select minscale(1.1230); -- 3 pos in NDIGITS
select minscale(1.1234); -- all pos in NDIGITS
select minscale(1.12345000); -- 2 NDIGITS
select minscale(1.123400000000); -- strip() required/done
select minscale(12345.123456789012345); -- "big" number
select minscale(-12345.12345); -- negative number
select minscale(1e100); -- very big number
select minscale(1e100::numeric + 0.1); -- big number with scale
I don't know why you chose some of your values so if there's
something you were testing for that the above does not cover
please include it.
some values was proposed in discussion, others are from tests of scale function.
I used proposed tests by you.
Regards
Pavel
So, a combination of white and black box testing. Having written
it out it seems like a lot of testing for such a simple function.
On the other hand I don't see a lot of cost in having all
these tests. Opinions welcome.
Regards,
Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein
po 9. 12. 2019 v 3:51 odesílatel Karl O. Pinc <kop@meme.com> napsal:
Hi Pavel,
Thanks for your changes. More inline below:
On Sun, 8 Dec 2019 08:38:38 +0100
Pavel Stehule <pavel.stehule@gmail.com> wrote:
> ne 8. 12. 2019 v 2:23 odesílatel Karl O. Pinc <kop@meme.com> napsal:
> > On Mon, 11 Nov 2019 15:47:37 +0100
> > Pavel Stehule <pavel.stehule@gmail.com> wrote:
> > > I implemented two functions - first minscale, second trim_scale.
> > > The overhead of second is minimal - so I think it can be good to
> > > have it. I started design with the name "trim_scale", but the
> > > name can be any other.
> > I comment on various hunks in line below:
>
> > diff --git a/src/backend/utils/adt/numeric.c
> > b/src/backend/utils/adt/numeric.c index a00db3ce7a..35234aee4c
> > 100644 --- a/src/backend/utils/adt/numeric.c
> > +++ b/src/backend/utils/adt/numeric.c
> >
> > ****
> > I believe the hunks in this file should start at about line# 3181.
> > This is right after numeric_scale(). Seems like all the scale
> > related functions should be together.
> >
> > There's no hard standard but I don't see why lines (comment lines in
> > your case) should be longer than 78 characters without good reason.
> > Please reformat.
> > ****
I don't see any response from you regarding the above two suggestions.
>
> > + */
> > +static int
> > +get_min_scale(NumericVar *var)
> > +{
> > + int minscale = 0;
> > +
> > + if (var->ndigits > 0)
> > + {
> > + NumericDigit last_digit;
> > +
> > + /* maximal size of minscale, can be lower */
> > + minscale = (var->ndigits - var->weight - 1) *
> > DEC_DIGITS; +
> > + /*
> > + * When there are not digits after decimal point,
> > the previous expression
> >
> > ****
> > s/not/no/
> > ****
> >
> > + * can be negative. In this case, the minscale must
> > be zero.
> > + */
> >
> > ****
> > s/can be/is/
> > ****
By the above, I intended the comment be changed (after line wrapping)
to:
/*
* When there are no digits after decimal point,
* the previous expression is negative. In this
* case the minscale must be zero.
*/
(Oh yes, on re-reading I think the comma is unnecessary so I removed it too.)
> >
> > + if (minscale > 0)
> > + {
> > + /* reduce minscale if trailing digits in
> > last numeric digits are zero */
And the above comment should either be wrapped (as requested above)
or eliminated. I like comments but I'm not sure this one contributes
anything.
> > --- a/src/include/catalog/pg_proc.dat
> > +++ b/src/include/catalog/pg_proc.dat
> > @@ -4288,6 +4288,12 @@
> > proname => 'width_bucket', prorettype => 'int4',
> > proargtypes => 'numeric numeric numeric int4',
> > prosrc => 'width_bucket_numeric' },
> > +{ oid => '3434', descr => 'returns minimal scale of numeric value',
> >
> > ****
> > How about a descr of?:
> >
> > minimal scale needed to store the supplied value without data loss
> > ****
> >
>
> done
>
> >
> > + proname => 'minscale', prorettype => 'int4', proargtypes =>
> > 'numeric',
> > + prosrc => 'numeric_minscale' },
> > +{ oid => '3435', descr => 'returns numeric value with minimal
> > scale',
> >
> > ****
> > And likewise a descr of?:
> >
> > numeric with minimal scale needed to represent the given value
> > ****
> >
> > + proname => 'trim_scale', prorettype => 'numeric', proargtypes =>
> > 'numeric',
> > + prosrc => 'numeric_trim_scale' },
> >
>
> done
Thanks for these changes. Looking at pg_proc.dat there seems to
be an effort made to keep the lines to a maximum of 78 or 80
characters. This means starting "descr => '..." on new lines
when the description is long. Please reformat, doing this or,
if you like, something even more clever to keep the lines short.
Looking good. We're making progress.
I fixed almost all mentioned issues (that I understand)
I am sending updated patch
Regards
Pavel
Regards,
Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein
Attachment
On Mon, 9 Dec 2019 21:04:21 +0100 Pavel Stehule <pavel.stehule@gmail.com> wrote: > I fixed almost all mentioned issues (that I understand) If you don't understand you might ask, or at least say. That way I know you've noticed my remarks and I don't have to repeat them. I have 2 remaining suggestions. 1) As previously suggested: Consider moving all the code you added to numeric.c to right after the scale() related code. This is equivalent to what was done in pg_proc.dat and regression tests where all the scale related stuff is in one place in the file. 2) Now that the function is called min_scale() it might be nice if your "minscale" variable in numeric.c was named "min_scale". I don't feel particularly strongly about either of the above but think them a slight improvement. I also wonder whether all the trim_scale() tests are now necessary, but not enough to make any suggestions. Especially because, well, tests are good. Regards, Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
út 10. 12. 2019 v 0:03 odesílatel Karl O. Pinc <kop@meme.com> napsal:
On Mon, 9 Dec 2019 21:04:21 +0100
Pavel Stehule <pavel.stehule@gmail.com> wrote:
> I fixed almost all mentioned issues (that I understand)
If you don't understand you might ask, or at least say.
That way I know you've noticed my remarks and I don't
have to repeat them.
I have 2 remaining suggestions.
1) As previously suggested: Consider moving
all the code you added to numeric.c to right after
the scale() related code. This is equivalent to
what was done in pg_proc.dat and regression tests
where all the scale related stuff is in one
place in the file.
2) Now that the function is called min_scale()
it might be nice if your "minscale" variable
in numeric.c was named "min_scale".
I don't feel particularly strongly about either
of the above but think them a slight improvement.
done
I also wonder whether all the trim_scale() tests
are now necessary, but not enough to make any suggestions.
Especially because, well, tests are good.
I don't think so tests should be minimalistic - there can be some redundancy to coverage some less probable size effects of some future changes. More - there is a small symmetry with min_scale tests - and third argument - some times I use tests (result part) as "documentation". But I have not any problem to reduce tests if there will be requirement to do it.
Regards
Pavel
Regards,
Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein
Attachment
On Tue, 10 Dec 2019 07:11:59 +0100 Pavel Stehule <pavel.stehule@gmail.com> wrote: > út 10. 12. 2019 v 0:03 odesílatel Karl O. Pinc <kop@meme.com> napsal: > > I also wonder whether all the trim_scale() tests > > are now necessary, but not enough to make any suggestions. > I don't think so tests should be minimalistic - there can be some > redundancy to coverage some less probable size effects of some future > changes. More - there is a small symmetry with min_scale tests - and > third argument - some times I use tests (result part) as > "documentation". Fine with me. Tests pass against HEAD. Docs build and look good. Patch looks good to me. I'm marking it ready for a committer. Thanks for the work. Regards, Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
út 10. 12. 2019 v 13:56 odesílatel Karl O. Pinc <kop@meme.com> napsal:
On Tue, 10 Dec 2019 07:11:59 +0100
Pavel Stehule <pavel.stehule@gmail.com> wrote:
> út 10. 12. 2019 v 0:03 odesílatel Karl O. Pinc <kop@meme.com> napsal:
> > I also wonder whether all the trim_scale() tests
> > are now necessary, but not enough to make any suggestions.
> I don't think so tests should be minimalistic - there can be some
> redundancy to coverage some less probable size effects of some future
> changes. More - there is a small symmetry with min_scale tests - and
> third argument - some times I use tests (result part) as
> "documentation".
Fine with me.
Tests pass against HEAD. Docs build and look good.
Patch looks good to me.
I'm marking it ready for a committer.
Thanks for the work.
Thank you for review
Pavel
Regards,
Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein
Pavel Stehule <pavel.stehule@gmail.com> writes: > út 10. 12. 2019 v 13:56 odesílatel Karl O. Pinc <kop@meme.com> napsal: >> I'm marking it ready for a committer. > Thank you for review Pushed with minor adjustments. Notably, I didn't like having get_min_scale() depend on its callers having stripped trailing zeroes to avoid getting into a tight infinite loop. That's just trouble waiting to happen, especially since non-stripped numerics are seldom seen in practice (ones coming into the SQL-level functions should never look like that, ie the strip_var calls you had are almost certainly dead code). If we did have a code path where the situation could occur, and somebody forgot the strip_var call, the omission could easily escape notice. So I got rid of the strip_var calls and made get_min_scale() defend itself against the case. It's hardly any more code, and it should be a shade faster than strip_var anyway. regards, tom lane
po 6. 1. 2020 v 18:22 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> út 10. 12. 2019 v 13:56 odesílatel Karl O. Pinc <kop@meme.com> napsal:
>> I'm marking it ready for a committer.
> Thank you for review
Pushed with minor adjustments. Notably, I didn't like having
get_min_scale() depend on its callers having stripped trailing zeroes
to avoid getting into a tight infinite loop. That's just trouble
waiting to happen, especially since non-stripped numerics are seldom
seen in practice (ones coming into the SQL-level functions should
never look like that, ie the strip_var calls you had are almost
certainly dead code). If we did have a code path where the situation
could occur, and somebody forgot the strip_var call, the omission
could easily escape notice. So I got rid of the strip_var calls and
made get_min_scale() defend itself against the case. It's hardly
any more code, and it should be a shade faster than strip_var anyway.
Thank you very much
Maybe this issue was part of ToDo list
Pavel
regards, tom lane