Thread: proposal: minscale, rtrim, btrim functions for numeric

proposal: minscale, rtrim, btrim functions for numeric

From

Pavel Stehule

Date:

09 November 2019, 19:48:11

Hi,

four years ago Marko Tiikkaja send a patch for numeric_trim functions. This functions removed ending zeroes from numeric value. This is useful feature, but there was not any progress on this patch. I think so this feature can be interesting, so I would to revitalize this patch.

Original discussion https://www.postgresql-archive.org/Add-numeric-trim-numeric-td5874444.html

Based on this discussion I would to implement three functions - prototype implementation is in plpsql and sql - final implementation will be in C.

-- returns minimal scale when the rounding the value to this scale doesn't

-- lost any informations.

CREATE OR REPLACE FUNCTION pg_catalog.minscale(numeric)
RETURNS integer
LANGUAGE plpgsql
AS $function$
begin
for i in 0..256
loop
if round($1, i) = $1 then
return i;
end if;
end loop;
end;
$function$

-- trailing zeroes from end

-- trimming only zero for numeric type has sense

CREATE OR REPLACE FUNCTION pg_catalog.rtrim(numeric)

RETURNS numeric AS $$

SELECT round($1, pg_catalog.minscale($1))

$$ LANGUAGE sql;

-- this is due support trim function

CREATE OR REPLACE FUNCTION pg_catalog.btrim(numeric)

RETURNS numeric AS $$

SELECT pg_catalog.rtrim($1)

$$ LANGUAGE sql;

postgres=# select trim(10.22000);
┌───────┐
│ btrim │
╞═══════╡
│ 10.22 │
└───────┘
(1 row)

postgres=# select rtrim(10.34900);
┌────────┐
│ rtrim │
╞════════╡
│ 10.349 │
└────────┘
(1 row)

What do you think about it?

Regards

Pavel

Re: proposal: minscale, rtrim, btrim functions for numeric

From

Tom Lane

Date:

09 November 2019, 20:34:46

Pavel Stehule <pavel.stehule@gmail.com> writes:
> four years ago Marko Tiikkaja send a patch for numeric_trim functions. This
> functions removed ending zeroes from numeric value. This is useful feature,
> but there was not any progress on this patch. I think so this feature can
> be interesting, so I would to revitalize this patch.

> Original discussion
> https://www.postgresql-archive.org/Add-numeric-trim-numeric-td5874444.html

A more useful link is
https://www.postgresql.org/message-id/flat/564D3ADB.7000808%40joh.to
and the earlier discussion is at
https://www.postgresql.org/message-id/flat/5643125E.1030605%40joh.to

Re-reading that thread, I don't really think there's much support for
anything beyond the minscale() function.  The rest are just inviting
confusion with string-related functions.  And I really don't like
establishing a precedent that btrim() and rtrim() are the same.

            regards, tom lane

Re: proposal: minscale, rtrim, btrim functions for numeric

From

Pavel Stehule

Date:

10 November 2019, 06:35:40

so 9. 11. 2019 v 21:34 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

Pavel Stehule <pavel.stehule@gmail.com> writes:
> four years ago Marko Tiikkaja send a patch for numeric_trim functions. This
> functions removed ending zeroes from numeric value. This is useful feature,
> but there was not any progress on this patch. I think so this feature can
> be interesting, so I would to revitalize this patch.

> Original discussion
> https://www.postgresql-archive.org/Add-numeric-trim-numeric-td5874444.html

A more useful link is
https://www.postgresql.org/message-id/flat/564D3ADB.7000808%40joh.to
and the earlier discussion is at
https://www.postgresql.org/message-id/flat/5643125E.1030605%40joh.to

Re-reading that thread, I don't really think there's much support for
anything beyond the minscale() function. The rest are just inviting
confusion with string-related functions. And I really don't like
establishing a precedent that btrim() and rtrim() are the same.

I have to agree, so using trim, rtrim names is not best. On second hand, probably to most often usage of minscale function will be inside expression round(x, minscale(x)), so this functionality can be in core. A question is a name.

maybe to_minscale(numeric) ?

Regards

Pavel

regards, tom lane

Re: proposal: minscale, rtrim, btrim functions for numeric

From

Pavel Stehule

Date:

11 November 2019, 14:47:37

ne 10. 11. 2019 v 7:35 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:

so 9. 11. 2019 v 21:34 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> four years ago Marko Tiikkaja send a patch for numeric_trim functions. This
> functions removed ending zeroes from numeric value. This is useful feature,
> but there was not any progress on this patch. I think so this feature can
> be interesting, so I would to revitalize this patch.

> Original discussion
> https://www.postgresql-archive.org/Add-numeric-trim-numeric-td5874444.html

A more useful link is
https://www.postgresql.org/message-id/flat/564D3ADB.7000808%40joh.to
and the earlier discussion is at
https://www.postgresql.org/message-id/flat/5643125E.1030605%40joh.to

Re-reading that thread, I don't really think there's much support for
anything beyond the minscale() function. The rest are just inviting
confusion with string-related functions. And I really don't like
establishing a precedent that btrim() and rtrim() are the same.

I have to agree, so using trim, rtrim names is not best. On second hand, probably to most often usage of minscale function will be inside expression round(x, minscale(x)), so this functionality can be in core. A question is a name.

maybe to_minscale(numeric) ?

Here is a patch. It's based on Dean's suggestions.

I implemented two functions - first minscale, second trim_scale. The overhead of second is minimal - so I think it can be good to have it. I started design with the name "trim_scale", but the name can be any other.

Regards

Pavel

Regards

Pavel

regards, tom lane

Attachment

minscale-initial.patch

Re: proposal: minscale, rtrim, btrim functions for numeric

From

"Karl O. Pinc"

Date:

08 December 2019, 01:23:25

Hello Pavel,

On Mon, 11 Nov 2019 15:47:37 +0100
Pavel Stehule <pavel.stehule@gmail.com> wrote:

> Here is a patch. It's based on Dean's suggestions.
> 
> I implemented two functions - first minscale, second trim_scale. The
> overhead of second is minimal - so I think it can be good to have it.
> I started design with the name "trim_scale", but the name can be any
> other.

Here are my thoughts on your patch.

My one substantial criticism is that I believe that
trim_scale('NaN'::numeric) should return NaN.
So the test output should look like:

template1=# select trim_scale('nan'::numeric) = 'nan'::numeric;
 ?column? 
----------
 t 
(1 row)


FWIW, I bumped around the Internet and looked at Oracle docs to see if
there's any reason why minscale() might not be a good function name.
I couldn't find any problems.

I also couldn't think of a better name than trim_scale() and don't
have any problems with the name.

My other suggestions mostly have to do with documentation.  Your code
looks pretty good to me, looks like the existing code, you name
variables and function names as in existing code, etc.

I comment on various hunks in line below:

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 28eb322f3f..6f142cd679 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -918,6 +918,19 @@
        <entry><literal>6.0000000000</literal></entry>
       </row>
 
+      <row>
+       <entry>
+        <indexterm>
+         <primary>minscale</primary>
+        </indexterm>
+
<literal><function>minscale(<type>numeric</type>)</function></literal>
+       </entry>
+       <entry><type>integer</type></entry>
+       <entry>returns minimal scale of the argument (the number of
decimal digits in the fractional part)</entry>
+       <entry><literal>scale(8.4100)</literal></entry>
+       <entry><literal>2</literal></entry>
+      </row>
+
       <row>
        <entry>
         <indexterm>

*****
Your description does not say what the minimal scale is.  How about:

minimal scale (number of decimal digits in the fractional part) needed
to store the supplied value without data loss
*****

@@ -1041,6 +1054,19 @@
        <entry><literal>1.4142135623731</literal></entry>
       </row>
 
+      <row>
+       <entry>
+        <indexterm>
+         <primary>trim_scale</primary>
+        </indexterm>
+
<literal><function>trim_scale(<type>numeric</type>)</function></literal>
+       </entry>
+       <entry><type>numeric</type></entry>
+       <entry>reduce scale of the argument (the number of decimal
digits in the fractional part)</entry>
+       <entry><literal>scale(8.4100)</literal></entry>
+       <entry><literal>8.41</literal></entry>
+      </row>
+
       <row>
        <entry>
         <indexterm>

****
How about:

reduce the scale (the number of decimal digits in the fractional part)
to the minimum needed to store the supplied value without data loss
*****

diff --git a/src/backend/utils/adt/numeric.c
b/src/backend/utils/adt/numeric.c index a00db3ce7a..35234aee4c 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c

****
I believe the hunks in this file should start at about line# 3181.
This is right after numeric_scale().  Seems like all the scale
related functions should be together.

There's no hard standard but I don't see why lines (comment lines in
your case) should be longer than 78 characters without good reason.
Please reformat.
****

@@ -5620,6 +5620,88 @@ int2int4_sum(PG_FUNCTION_ARGS)
     PG_RETURN_DATUM(Int64GetDatumFast(transdata->sum));
 }
 
+/*
+ * Calculate minimal display scale. The var should be stripped already.

****
I think you can get rid of the word "display" in the comment.
****

+ */
+static int
+get_min_scale(NumericVar *var)
+{
+    int        minscale = 0;
+
+    if (var->ndigits > 0)
+    {
+        NumericDigit last_digit;
+
+        /* maximal size of minscale, can be lower */
+        minscale = (var->ndigits - var->weight - 1) *
  DEC_DIGITS; +
+        /*
+         * When there are not digits after decimal point, the
  previous expression

****
s/not/no/
****

+         * can be negative. In this case, the minscale must be
  zero.
+         */

****
s/can be/is/
****

+        if (minscale > 0)
+        {
+            /* reduce minscale if trailing digits in last
  numeric digits are zero */
+            last_digit = var->digits[var->ndigits - 1];
+
+            while (last_digit % 10 ==  0)
+            {
+                minscale--;
+                last_digit /= 10;
+            }
+        }
+        else
+            minscale = 0;
+    }
+
+    return minscale;
+}
+
+/*
+ * Returns minimal scale of numeric value when value is not changed

****
Improve comment, something like:
  minimal scale required to represent supplied value without loss
****

+ */
+Datum
+numeric_minscale(PG_FUNCTION_ARGS)
+{
+    Numeric        num = PG_GETARG_NUMERIC(0);
+    NumericVar    arg;
+    int            minscale;
+
+    if (NUMERIC_IS_NAN(num))
+        PG_RETURN_NULL();
+
+    init_var_from_num(num, &arg);
+    strip_var(&arg);
+
+    minscale = get_min_scale(&arg);
+    free_var(&arg);
+
+    PG_RETURN_INT32(minscale);
+}
+
+/*
+ * Reduce scale of numeric value so value is not changed

****
Likewise, comment text could be improved
****

+ */
+Datum
+numeric_trim_scale(PG_FUNCTION_ARGS)
+{
+    Numeric        num = PG_GETARG_NUMERIC(0);
+    Numeric        res;
+    NumericVar    result;
+
+    if (NUMERIC_IS_NAN(num))
+        PG_RETURN_NULL();
+
+    init_var_from_num(num, &result);
+    strip_var(&result);
+
+    result.dscale = get_min_scale(&result);
+
+    res = make_result(&result);
+    free_var(&result);
+
+    PG_RETURN_NUMERIC(res);
+}
 
 /*
  ----------------------------------------------------------------------
  * diff --git a/src/include/catalog/pg_proc.dat
  b/src/include/catalog/pg_proc.dat index 58ea5b982b..e603a5d8dd 100644

****
How about moving these new lines to right after line# 4255, the
scale() function?
****

--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -4288,6 +4288,12 @@
   proname => 'width_bucket', prorettype => 'int4',
   proargtypes => 'numeric numeric numeric int4',
   prosrc => 'width_bucket_numeric' },
+{ oid => '3434', descr => 'returns minimal scale of numeric value',

****
How about a descr of?:

  minimal scale needed to store the supplied value without data loss
****

+  proname => 'minscale', prorettype => 'int4', proargtypes =>
  'numeric',
+  prosrc => 'numeric_minscale' },
+{ oid => '3435', descr => 'returns numeric value with minimal scale',

****
And likewise a descr of?:

  numeric with minimal scale needed to represent the given value
****

+  proname => 'trim_scale', prorettype => 'numeric', proargtypes =>
  'numeric',
+  prosrc => 'numeric_trim_scale' },
 
 { oid => '1747',
   proname => 'time_pl_interval', prorettype => 'time',
diff --git a/src/test/regress/expected/numeric.out
  b/src/test/regress/expected/numeric.out index 1cb3c3bfab..778c204b13
  100644

****
I have suggestions:

Give the 2 functions separate comments (-- Tests for minscale() and
-- Tests for trim_scale())

Put () after the function names in the comments
because that's what scale() does.

Move the lines so the tests are right after the tests of scale().

Be explicit when testing for NULL or NaN.  I don't know that this is
consistent with the rest of the regression tests but I don't see how
being explicit could be wrong.  Otherwise NULL and NaN are output the
same ("") and you're not really testing.

So test with expressions like "foo(NULL) IS NULL" or
"foo('NaN'::NUMERIC) = 'NaN::NUMERIC" and look for t (or f) results.

****

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Re: proposal: minscale, rtrim, btrim functions for numeric

From

Pavel Stehule

Date:

08 December 2019, 07:38:38

ne 8. 12. 2019 v 2:23 odesílatel Karl O. Pinc <kop@meme.com> napsal:

Hello Pavel,

On Mon, 11 Nov 2019 15:47:37 +0100
Pavel Stehule <pavel.stehule@gmail.com> wrote:

> Here is a patch. It's based on Dean's suggestions.
>
> I implemented two functions - first minscale, second trim_scale. The
> overhead of second is minimal - so I think it can be good to have it.
> I started design with the name "trim_scale", but the name can be any
> other.

Here are my thoughts on your patch.

My one substantial criticism is that I believe that
trim_scale('NaN'::numeric) should return NaN.
So the test output should look like:

template1=# select trim_scale('nan'::numeric) = 'nan'::numeric;
?column?
----------
t
(1 row)

fixed

FWIW, I bumped around the Internet and looked at Oracle docs to see if
there's any reason why minscale() might not be a good function name.
I couldn't find any problems.

I also couldn't think of a better name than trim_scale() and don't
have any problems with the name.

My other suggestions mostly have to do with documentation. Your code
looks pretty good to me, looks like the existing code, you name
variables and function names as in existing code, etc.

I comment on various hunks in line below:

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 28eb322f3f..6f142cd679 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -918,6 +918,19 @@
<entry><literal>6.0000000000</literal></entry>
</row>

+ <row>
+ <entry>
+ <indexterm>
+ <primary>minscale</primary>
+ </indexterm>
+
<literal><function>minscale(<type>numeric</type>)</function></literal>
+ </entry>
+ <entry><type>integer</type></entry>
+ <entry>returns minimal scale of the argument (the number of
decimal digits in the fractional part)</entry>
+ <entry><literal>scale(8.4100)</literal></entry>
+ <entry><literal>2</literal></entry>
+ </row>
+
<row>
<entry>
<indexterm>

*****
Your description does not say what the minimal scale is. How about:

minimal scale (number of decimal digits in the fractional part) needed
to store the supplied value without data loss
*****

sounds better, updated

@@ -1041,6 +1054,19 @@
<entry><literal>1.4142135623731</literal></entry>
</row>

+ <row>
+ <entry>
+ <indexterm>
+ <primary>trim_scale</primary>
+ </indexterm>
+
<literal><function>trim_scale(<type>numeric</type>)</function></literal>
+ </entry>
+ <entry><type>numeric</type></entry>
+ <entry>reduce scale of the argument (the number of decimal
digits in the fractional part)</entry>
+ <entry><literal>scale(8.4100)</literal></entry>
+ <entry><literal>8.41</literal></entry>
+ </row>
+
<row>
<entry>
<indexterm>

****
How about:

reduce the scale (the number of decimal digits in the fractional part)
to the minimum needed to store the supplied value without data loss
*****

ok, changed

diff --git a/src/backend/utils/adt/numeric.c
b/src/backend/utils/adt/numeric.c index a00db3ce7a..35234aee4c 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c

****
I believe the hunks in this file should start at about line# 3181.
This is right after numeric_scale(). Seems like all the scale
related functions should be together.

There's no hard standard but I don't see why lines (comment lines in
your case) should be longer than 78 characters without good reason.
Please reformat.
****

@@ -5620,6 +5620,88 @@ int2int4_sum(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(Int64GetDatumFast(transdata->sum));
}

+/*
+ * Calculate minimal display scale. The var should be stripped already.

****
I think you can get rid of the word "display" in the comment.
****

done

+ */
+static int
+get_min_scale(NumericVar *var)
+{
+ int minscale = 0;
+
+ if (var->ndigits > 0)
+ {
+ NumericDigit last_digit;
+
+ /* maximal size of minscale, can be lower */
+ minscale = (var->ndigits - var->weight - 1) *
DEC_DIGITS; +
+ /*
+ * When there are not digits after decimal point, the
previous expression

****
s/not/no/
****

+ * can be negative. In this case, the minscale must be
zero.
+ */

****
s/can be/is/
****

+ if (minscale > 0)
+ {
+ /* reduce minscale if trailing digits in last
numeric digits are zero */
+ last_digit = var->digits[var->ndigits - 1];
+
+ while (last_digit % 10 == 0)
+ {
+ minscale--;
+ last_digit /= 10;
+ }
+ }
+ else
+ minscale = 0;
+ }
+
+ return minscale;
+}
+
+/*
+ * Returns minimal scale of numeric value when value is not changed

****
Improve comment, something like:
minimal scale required to represent supplied value without loss

****

+ */
+Datum
+numeric_minscale(PG_FUNCTION_ARGS)
+{
+ Numeric num = PG_GETARG_NUMERIC(0);
+ NumericVar arg;
+ int minscale;
+
+ if (NUMERIC_IS_NAN(num))
+ PG_RETURN_NULL();
+
+ init_var_from_num(num, &arg);
+ strip_var(&arg);
+
+ minscale = get_min_scale(&arg);
+ free_var(&arg);
+
+ PG_RETURN_INT32(minscale);
+}
+
+/*
+ * Reduce scale of numeric value so value is not changed

****
Likewise, comment text could be improved
****

+ */
+Datum
+numeric_trim_scale(PG_FUNCTION_ARGS)
+{
+ Numeric num = PG_GETARG_NUMERIC(0);
+ Numeric res;
+ NumericVar result;
+
+ if (NUMERIC_IS_NAN(num))
+ PG_RETURN_NULL();
+
+ init_var_from_num(num, &result);
+ strip_var(&result);
+
+ result.dscale = get_min_scale(&result);
+
+ res = make_result(&result);
+ free_var(&result);
+
+ PG_RETURN_NUMERIC(res);
+}

/*
----------------------------------------------------------------------
* diff --git a/src/include/catalog/pg_proc.dat
b/src/include/catalog/pg_proc.dat index 58ea5b982b..e603a5d8dd 100644

****
How about moving these new lines to right after line# 4255, the
scale() function?
****

has sense, moved

--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -4288,6 +4288,12 @@
proname => 'width_bucket', prorettype => 'int4',
proargtypes => 'numeric numeric numeric int4',
prosrc => 'width_bucket_numeric' },
+{ oid => '3434', descr => 'returns minimal scale of numeric value',

****
How about a descr of?:

minimal scale needed to store the supplied value without data loss
****

done

+ proname => 'minscale', prorettype => 'int4', proargtypes =>
'numeric',
+ prosrc => 'numeric_minscale' },
+{ oid => '3435', descr => 'returns numeric value with minimal scale',

****
And likewise a descr of?:

numeric with minimal scale needed to represent the given value
****

+ proname => 'trim_scale', prorettype => 'numeric', proargtypes =>
'numeric',
+ prosrc => 'numeric_trim_scale' },

done

{ oid => '1747',
proname => 'time_pl_interval', prorettype => 'time',
diff --git a/src/test/regress/expected/numeric.out
b/src/test/regress/expected/numeric.out index 1cb3c3bfab..778c204b13
100644

****
I have suggestions:

Give the 2 functions separate comments (-- Tests for minscale() and
-- Tests for trim_scale())

Put () after the function names in the comments
because that's what scale() does.

Move the lines so the tests are right after the tests of scale().

Be explicit when testing for NULL or NaN. I don't know that this is
consistent with the rest of the regression tests but I don't see how
being explicit could be wrong. Otherwise NULL and NaN are output the
same ("") and you're not really testing.

So test with expressions like "foo(NULL) IS NULL" or
"foo('NaN'::NUMERIC) = 'NaN::NUMERIC" and look for t (or f) results.

ok fixed

Thank you for review - I am sending updated rebased patch. Please, update comments freely - my language skills (about English lang) are basic.

Regards

Pavel

****

Regards,

Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein

Attachment

minscale-20191208.patch

Re: proposal: minscale, rtrim, btrim functions for numeric

From

Tom Lane

Date:

08 December 2019, 18:57:00

"Karl O. Pinc" <kop@meme.com> writes:
> FWIW, I bumped around the Internet and looked at Oracle docs to see if
> there's any reason why minscale() might not be a good function name.
> I couldn't find any problems.

> I also couldn't think of a better name than trim_scale() and don't
> have any problems with the name.

I'd just comment that it seems weird that the same patch is introducing
two functions with inconsistently chosen names.  Why does one have
an underscore separating the words and the other not?  I haven't got
a large investment in either naming convention specifically, but it'd
be nice if we could at least pretend to be considering consistency.

            regards, tom lane

Re: proposal: minscale, rtrim, btrim functions for numeric

From

"Karl O. Pinc"

Date:

09 December 2019, 02:22:39

On Sun, 08 Dec 2019 13:57:00 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> "Karl O. Pinc" <kop@meme.com> writes:
> > FWIW, I bumped around the Internet and looked at Oracle docs to see
> > if there's any reason why minscale() might not be a good function
> > name. I couldn't find any problems.  
> 
> > I also couldn't think of a better name than trim_scale() and don't
> > have any problems with the name.  
> 
> I'd just comment that it seems weird that the same patch is
> introducing two functions with inconsistently chosen names.  Why does
> one have an underscore separating the words and the other not?  I
> haven't got a large investment in either naming convention
> specifically, but it'd be nice if we could at least pretend to be
> considering consistency.

Consistency would be lovely.  I don't feel qualified
to make the decision but here's what I came up with:

I re-read the back-threads and don't see any discussion
of the naming of minscale().

My thoughts run toward asking
the "what is a word?" question, along with "what is the
policy for separating a word?".  Is "min" different
from the prefix "sub"?

"Trim" seems to clearly count as a word and trim_scale()
seems mostly consistent with existing function names.
(E.g. width_bucket(), convert_to().  But there's no
true consistency.  Plenty of functions don't separate
words with "_".  E.g. setseed().)

As far as "min" goes, glancing through function names [1]
does not help much.  The index indicates that when PG puts "min"
in a configuration parameter it separates it with "_".
(Looking at "min" in the index.)
Looking at the function names containing "min" [2] yields:

 aclitemin
 brin_minmax_add_value
 brin_minmax_consistent
 brin_minmax_opcinfo
 brin_minmax_union
 min
 numeric_uminus
 pg_terminate_backend
 range_minus
 txid_snapshot_xmin

Not especially helpful.   

I'm inclined to want
min_scale() instead of "minscale()" based on
how config parameters are named and for consistency
with trim_scale().  Pavel, if you agree then
let's just change minscale() to min_scale() and
let people object if they don't like it.

Regards.

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

[1] 
select pg_proc.proname
  from pg_proc
  group by pg_proc.proname
  order by pg_proc.proname;

[2]
select pg_proc.proname
  from pg_proc
  where pg_proc.proname like '%min%'
  group by pg_proc.proname
  order by pg_proc.proname;

Re: proposal: minscale, rtrim, btrim functions for numeric

From

"Karl O. Pinc"

Date:

09 December 2019, 02:51:18

Hi Pavel,

Thanks for your changes.  More inline below:

On Sun, 8 Dec 2019 08:38:38 +0100
Pavel Stehule <pavel.stehule@gmail.com> wrote:

> ne 8. 12. 2019 v 2:23 odesílatel Karl O. Pinc <kop@meme.com> napsal:

> > On Mon, 11 Nov 2019 15:47:37 +0100
> > Pavel Stehule <pavel.stehule@gmail.com> wrote:

> > > I implemented two functions - first minscale, second trim_scale.
> > > The overhead of second is minimal - so I think it can be good to
> > > have it. I started design with the name "trim_scale", but the
> > > name can be any other.


> > I comment on various hunks in line below:

>
> > diff --git a/src/backend/utils/adt/numeric.c
> > b/src/backend/utils/adt/numeric.c index a00db3ce7a..35234aee4c
> > 100644 --- a/src/backend/utils/adt/numeric.c
> > +++ b/src/backend/utils/adt/numeric.c
> >
> > ****
> > I believe the hunks in this file should start at about line# 3181.
> > This is right after numeric_scale().  Seems like all the scale
> > related functions should be together.
> >
> > There's no hard standard but I don't see why lines (comment lines in
> > your case) should be longer than 78 characters without good reason.
> > Please reformat.
> > ****

I don't see any response from you regarding the above two suggestions.


>
> > + */
> > +static int
> > +get_min_scale(NumericVar *var)
> > +{
> > +       int             minscale = 0;
> > +
> > +       if (var->ndigits > 0)
> > +       {
> > +               NumericDigit last_digit;
> > +
> > +               /* maximal size of minscale, can be lower */
> > +               minscale = (var->ndigits - var->weight - 1) *
> >   DEC_DIGITS; +
> > +               /*
> > +                * When there are not digits after decimal point,
> > the previous expression
> >
> > ****
> > s/not/no/
> > ****
> >
> > +                * can be negative. In this case, the minscale must
> > be zero.
> > +                */
> >
> > ****
> > s/can be/is/
> > ****

By the above, I intended the comment be changed (after line wrapping)
to:
               /*
                 * When there are no digits after decimal point,
                 * the previous expression is negative. In this
                 * case the minscale must be zero.
                 */

(Oh yes, on re-reading I think the comma is unnecessary so I removed it too.)



> >
> > +               if (minscale > 0)
> > +               {
> > +                       /* reduce minscale if trailing digits in
> > last numeric digits are zero */

And the above comment should either be wrapped (as requested above)
or eliminated.  I like comments but I'm not sure this one contributes
anything.


> > --- a/src/include/catalog/pg_proc.dat
> > +++ b/src/include/catalog/pg_proc.dat
> > @@ -4288,6 +4288,12 @@
> >    proname => 'width_bucket', prorettype => 'int4',
> >    proargtypes => 'numeric numeric numeric int4',
> >    prosrc => 'width_bucket_numeric' },
> > +{ oid => '3434', descr => 'returns minimal scale of numeric value',
> >
> > ****
> > How about a descr of?:
> >
> >   minimal scale needed to store the supplied value without data loss
> > ****
> >
>
> done
>
> >
> > +  proname => 'minscale', prorettype => 'int4', proargtypes =>
> >   'numeric',
> > +  prosrc => 'numeric_minscale' },
> > +{ oid => '3435', descr => 'returns numeric value with minimal
> > scale',
> >
> > ****
> > And likewise a descr of?:
> >
> >   numeric with minimal scale needed to represent the given value
> > ****
> >
> > +  proname => 'trim_scale', prorettype => 'numeric', proargtypes =>
> >   'numeric',
> > +  prosrc => 'numeric_trim_scale' },
> >
>
> done

Thanks for these changes.  Looking at pg_proc.dat there seems to
be an effort made to keep the lines to a maximum of 78 or 80
characters.  This means starting "descr => '..." on new lines
when the description is long.  Please reformat, doing this or,
if you like, something even more clever to keep the lines short.

Looking good.  We're making progress.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Re: proposal: minscale, rtrim, btrim functions for numeric

From

"Karl O. Pinc"

Date:

09 December 2019, 18:15:22

Hi Pavel,

I've had some thoughts about the regression tests.

It wouldn't hurt to move them to right after the
scale() tests in numeric.sql.

I believe your tests are covering all the code paths
but it is not clear just what test does what.
I don't see a lot of comments in the tests so I don't
know that it'd be appropriate to put them in to
describe just what's tested.  But in any case it
could be nice to choose values where it is at least
sort of apparent what part of the codebase is tested.

FWIW, although the code paths are covered, the possible
data permutations are not.  E.g.  I don't see a case
where scale > 0 and the NDIGITS of the last digit is full.

There are also some tests (the 0 and 0.00 tests) that duplicates
the execution path.  In the 0 case I don't see a problem
but as a rule there's not a lot of point.  Better test
values would (mostly) eliminate these.

So, my thoughts run along these lines:

select minscale(numeric 'NaN') is NULL; -- should be true
select minscale(NULL::numeric) is NULL; -- should be true
select minscale(0);                     -- no digits
select minscale(0.00);                  -- no digits again
select minscale(1.0);                   -- no scale
select minscale(1.1);                   -- scale 1
select minscale(1.12);                  -- scale 2
select minscale(1.123);                 -- scale 3
select minscale(1.1234);                -- scale 4, filled digit
select minscale(1.12345);               -- scale 5, 2 NDIGITS
select minscale(1.1000);                -- 1 pos in NDIGITS
select minscale(1.1200);                -- 2 pos in NDIGITS
select minscale(1.1230);                -- 3 pos in NDIGITS
select minscale(1.1234);                -- all pos in NDIGITS
select minscale(1.12345000);            -- 2 NDIGITS
select minscale(1.123400000000);        -- strip() required/done
select minscale(12345.123456789012345); -- "big" number
select minscale(-12345.12345);          -- negative number
select minscale(1e100);                 -- very big number
select minscale(1e100::numeric + 0.1);  -- big number with scale

I don't know why you chose some of your values so if there's
something you were testing for that the above does not cover
please include it.

So, a combination of white and black box testing.  Having written
it out it seems like a lot of testing for such a simple function.
On the other hand I don't see a lot of cost in having all
these tests.  Opinions welcome.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Re: proposal: minscale, rtrim, btrim functions for numeric

From

"Karl O. Pinc"

Date:

09 December 2019, 18:25:23

On Mon, 9 Dec 2019 12:15:22 -0600
"Karl O. Pinc" <kop@meme.com> wrote:

> I've had some thoughts about the regression tests.

> Having written
> it out it seems like a lot of testing for such a simple function.

FYI.

I don't see trim_scale() needing such exhaustive testing because you'll
have already tested a lot with the min_scale() tests.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Re: proposal: minscale, rtrim, btrim functions for numeric

From

Pavel Stehule

Date:

09 December 2019, 19:51:21

po 9. 12. 2019 v 19:15 odesílatel Karl O. Pinc <kop@meme.com> napsal:

Hi Pavel,

I've had some thoughts about the regression tests.

It wouldn't hurt to move them to right after the
scale() tests in numeric.sql.

I believe your tests are covering all the code paths
but it is not clear just what test does what.
I don't see a lot of comments in the tests so I don't
know that it'd be appropriate to put them in to
describe just what's tested. But in any case it
could be nice to choose values where it is at least
sort of apparent what part of the codebase is tested.

FWIW, although the code paths are covered, the possible
data permutations are not. E.g. I don't see a case
where scale > 0 and the NDIGITS of the last digit is full.

There are also some tests (the 0 and 0.00 tests) that duplicates
the execution path. In the 0 case I don't see a problem
but as a rule there's not a lot of point. Better test
values would (mostly) eliminate these.

So, my thoughts run along these lines:

select minscale(numeric 'NaN') is NULL; -- should be true
select minscale(NULL::numeric) is NULL; -- should be true
select minscale(0); -- no digits
select minscale(0.00); -- no digits again
select minscale(1.0); -- no scale
select minscale(1.1); -- scale 1
select minscale(1.12); -- scale 2
select minscale(1.123); -- scale 3
select minscale(1.1234); -- scale 4, filled digit
select minscale(1.12345); -- scale 5, 2 NDIGITS
select minscale(1.1000); -- 1 pos in NDIGITS
select minscale(1.1200); -- 2 pos in NDIGITS
select minscale(1.1230); -- 3 pos in NDIGITS
select minscale(1.1234); -- all pos in NDIGITS
select minscale(1.12345000); -- 2 NDIGITS
select minscale(1.123400000000); -- strip() required/done
select minscale(12345.123456789012345); -- "big" number
select minscale(-12345.12345); -- negative number
select minscale(1e100); -- very big number
select minscale(1e100::numeric + 0.1); -- big number with scale

I don't know why you chose some of your values so if there's
something you were testing for that the above does not cover
please include it.

some values was proposed in discussion, others are from tests of scale function.

I used proposed tests by you.

Regards

Pavel

So, a combination of white and black box testing. Having written
it out it seems like a lot of testing for such a simple function.
On the other hand I don't see a lot of cost in having all
these tests. Opinions welcome.

Regards,

Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein

Re: proposal: minscale, rtrim, btrim functions for numeric

From

Pavel Stehule

Date:

09 December 2019, 20:04:21

po 9. 12. 2019 v 3:51 odesílatel Karl O. Pinc <kop@meme.com> napsal:

Hi Pavel,

Thanks for your changes. More inline below:

On Sun, 8 Dec 2019 08:38:38 +0100
Pavel Stehule <pavel.stehule@gmail.com> wrote:

> ne 8. 12. 2019 v 2:23 odesílatel Karl O. Pinc <kop@meme.com> napsal:

> > On Mon, 11 Nov 2019 15:47:37 +0100
> > Pavel Stehule <pavel.stehule@gmail.com> wrote:

> > > I implemented two functions - first minscale, second trim_scale.
> > > The overhead of second is minimal - so I think it can be good to
> > > have it. I started design with the name "trim_scale", but the
> > > name can be any other.

> > I comment on various hunks in line below:

>
> > diff --git a/src/backend/utils/adt/numeric.c
> > b/src/backend/utils/adt/numeric.c index a00db3ce7a..35234aee4c
> > 100644 --- a/src/backend/utils/adt/numeric.c
> > +++ b/src/backend/utils/adt/numeric.c
> >
> > ****
> > I believe the hunks in this file should start at about line# 3181.
> > This is right after numeric_scale(). Seems like all the scale
> > related functions should be together.
> >
> > There's no hard standard but I don't see why lines (comment lines in
> > your case) should be longer than 78 characters without good reason.
> > Please reformat.
> > ****

I don't see any response from you regarding the above two suggestions.

>
> > + */
> > +static int
> > +get_min_scale(NumericVar *var)
> > +{
> > + int minscale = 0;
> > +
> > + if (var->ndigits > 0)
> > + {
> > + NumericDigit last_digit;
> > +
> > + /* maximal size of minscale, can be lower */
> > + minscale = (var->ndigits - var->weight - 1) *
> > DEC_DIGITS; +
> > + /*
> > + * When there are not digits after decimal point,
> > the previous expression
> >
> > ****
> > s/not/no/
> > ****
> >
> > + * can be negative. In this case, the minscale must
> > be zero.
> > + */
> >
> > ****
> > s/can be/is/
> > ****

By the above, I intended the comment be changed (after line wrapping)
to:
/*
* When there are no digits after decimal point,
* the previous expression is negative. In this
* case the minscale must be zero.
*/

(Oh yes, on re-reading I think the comma is unnecessary so I removed it too.)

> >
> > + if (minscale > 0)
> > + {
> > + /* reduce minscale if trailing digits in
> > last numeric digits are zero */

And the above comment should either be wrapped (as requested above)
or eliminated. I like comments but I'm not sure this one contributes
anything.

> > --- a/src/include/catalog/pg_proc.dat
> > +++ b/src/include/catalog/pg_proc.dat
> > @@ -4288,6 +4288,12 @@
> > proname => 'width_bucket', prorettype => 'int4',
> > proargtypes => 'numeric numeric numeric int4',
> > prosrc => 'width_bucket_numeric' },
> > +{ oid => '3434', descr => 'returns minimal scale of numeric value',
> >
> > ****
> > How about a descr of?:
> >
> > minimal scale needed to store the supplied value without data loss
> > ****
> >
>
> done
>
> >
> > + proname => 'minscale', prorettype => 'int4', proargtypes =>
> > 'numeric',
> > + prosrc => 'numeric_minscale' },
> > +{ oid => '3435', descr => 'returns numeric value with minimal
> > scale',
> >
> > ****
> > And likewise a descr of?:
> >
> > numeric with minimal scale needed to represent the given value
> > ****
> >
> > + proname => 'trim_scale', prorettype => 'numeric', proargtypes =>
> > 'numeric',
> > + prosrc => 'numeric_trim_scale' },
> >
>
> done

Thanks for these changes. Looking at pg_proc.dat there seems to
be an effort made to keep the lines to a maximum of 78 or 80
characters. This means starting "descr => '..." on new lines
when the description is long. Please reformat, doing this or,
if you like, something even more clever to keep the lines short.

Looking good. We're making progress.

I fixed almost all mentioned issues (that I understand)

I am sending updated patch

Regards

Pavel

Regards,

Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein

Attachment

min_scale.patch

Re: proposal: minscale, rtrim, btrim functions for numeric

From

"Karl O. Pinc"

Date:

09 December 2019, 23:03:43

On Mon, 9 Dec 2019 21:04:21 +0100
Pavel Stehule <pavel.stehule@gmail.com> wrote:

> I fixed almost all mentioned issues (that I understand)

If you don't understand you might ask, or at least say.
That way I know you've noticed my remarks and I don't
have to repeat them.

I have 2 remaining suggestions.

1) As previously suggested: Consider moving
all the code you added to numeric.c to right after
the scale() related code.  This is equivalent to
what was done in pg_proc.dat and regression tests
where all the scale related stuff is in one
place in the file.

2) Now that the function is called min_scale()
it might be nice if your "minscale" variable
in numeric.c was named "min_scale".

I don't feel particularly strongly about either
of the above but think them a slight improvement.

I also wonder whether all the trim_scale() tests
are now necessary, but not enough to make any suggestions.
Especially because, well, tests are good.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Re: proposal: minscale, rtrim, btrim functions for numeric

From

Pavel Stehule

Date:

10 December 2019, 06:11:59

út 10. 12. 2019 v 0:03 odesílatel Karl O. Pinc <kop@meme.com> napsal:

On Mon, 9 Dec 2019 21:04:21 +0100
Pavel Stehule <pavel.stehule@gmail.com> wrote:

> I fixed almost all mentioned issues (that I understand)

If you don't understand you might ask, or at least say.
That way I know you've noticed my remarks and I don't
have to repeat them.

I have 2 remaining suggestions.

1) As previously suggested: Consider moving
all the code you added to numeric.c to right after
the scale() related code. This is equivalent to
what was done in pg_proc.dat and regression tests
where all the scale related stuff is in one
place in the file.

2) Now that the function is called min_scale()
it might be nice if your "minscale" variable
in numeric.c was named "min_scale".

I don't feel particularly strongly about either
of the above but think them a slight improvement.

done

I also wonder whether all the trim_scale() tests
are now necessary, but not enough to make any suggestions.
Especially because, well, tests are good.

I don't think so tests should be minimalistic - there can be some redundancy to coverage some less probable size effects of some future changes. More - there is a small symmetry with min_scale tests - and third argument - some times I use tests (result part) as "documentation". But I have not any problem to reduce tests if there will be requirement to do it.

Regards

Pavel

Regards,

Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein

Attachment

min_scale-20201210.patch

Re: proposal: minscale, rtrim, btrim functions for numeric

From

"Karl O. Pinc"

Date:

10 December 2019, 12:56:33

On Tue, 10 Dec 2019 07:11:59 +0100
Pavel Stehule <pavel.stehule@gmail.com> wrote:

> út 10. 12. 2019 v 0:03 odesílatel Karl O. Pinc <kop@meme.com> napsal:
> > I also wonder whether all the trim_scale() tests
> > are now necessary, but not enough to make any suggestions.

> I don't think so tests should be minimalistic - there can be some
> redundancy to coverage some less probable size effects of some future
> changes. More - there is a small symmetry with min_scale tests - and
> third argument - some times I use tests (result part) as
> "documentation".

Fine with me.

Tests pass against HEAD.  Docs build and look good.
Patch looks good to me.

I'm marking it ready for a committer.

Thanks for the work.

Regards,

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Re: proposal: minscale, rtrim, btrim functions for numeric

From

Pavel Stehule

Date:

10 December 2019, 13:47:03

út 10. 12. 2019 v 13:56 odesílatel Karl O. Pinc <kop@meme.com> napsal:

On Tue, 10 Dec 2019 07:11:59 +0100
Pavel Stehule <pavel.stehule@gmail.com> wrote:

> út 10. 12. 2019 v 0:03 odesílatel Karl O. Pinc <kop@meme.com> napsal:
> > I also wonder whether all the trim_scale() tests
> > are now necessary, but not enough to make any suggestions.

> I don't think so tests should be minimalistic - there can be some
> redundancy to coverage some less probable size effects of some future
> changes. More - there is a small symmetry with min_scale tests - and
> third argument - some times I use tests (result part) as
> "documentation".

Fine with me.

Tests pass against HEAD. Docs build and look good.
Patch looks good to me.

I'm marking it ready for a committer.

Thanks for the work.

Thank you for review

Pavel

Regards,

Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein

Re: proposal: minscale, rtrim, btrim functions for numeric

From

Tom Lane

Date:

06 January 2020, 17:22:50

Pavel Stehule <pavel.stehule@gmail.com> writes:
> út 10. 12. 2019 v 13:56 odesílatel Karl O. Pinc <kop@meme.com> napsal:
>> I'm marking it ready for a committer.

> Thank you for review

Pushed with minor adjustments.  Notably, I didn't like having
get_min_scale() depend on its callers having stripped trailing zeroes
to avoid getting into a tight infinite loop.  That's just trouble
waiting to happen, especially since non-stripped numerics are seldom
seen in practice (ones coming into the SQL-level functions should
never look like that, ie the strip_var calls you had are almost
certainly dead code).  If we did have a code path where the situation
could occur, and somebody forgot the strip_var call, the omission
could easily escape notice.  So I got rid of the strip_var calls and
made get_min_scale() defend itself against the case.  It's hardly
any more code, and it should be a shade faster than strip_var anyway.

            regards, tom lane

Re: proposal: minscale, rtrim, btrim functions for numeric

From

Pavel Stehule

Date:

06 January 2020, 18:08:23

po 6. 1. 2020 v 18:22 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

Pavel Stehule <pavel.stehule@gmail.com> writes:
> út 10. 12. 2019 v 13:56 odesílatel Karl O. Pinc <kop@meme.com> napsal:
>> I'm marking it ready for a committer.

> Thank you for review

Pushed with minor adjustments. Notably, I didn't like having
get_min_scale() depend on its callers having stripped trailing zeroes
to avoid getting into a tight infinite loop. That's just trouble
waiting to happen, especially since non-stripped numerics are seldom
seen in practice (ones coming into the SQL-level functions should
never look like that, ie the strip_var calls you had are almost
certainly dead code). If we did have a code path where the situation
could occur, and somebody forgot the strip_var call, the omission
could easily escape notice. So I got rid of the strip_var calls and
made get_min_scale() defend itself against the case. It's hardly
any more code, and it should be a shade faster than strip_var anyway.

Thank you very much

Maybe this issue was part of ToDo list

Pavel

regards, tom lane