Thread: WIP: Relaxing the constraints on numeric scale

WIP: Relaxing the constraints on numeric scale

From
Dean Rasheed
Date:
When specifying NUMERIC(precision, scale) the scale is constrained to
the range [0, precision], which is per SQL spec. However, at least one
other major database vendor intentionally does not impose this
restriction, since allowing scales outside this range can be useful.

A negative scale implies rounding before the decimal point. For
example, a column declared as NUMERIC(3,-3) rounds values to the
nearest thousand, and can hold values up to 999000.

(Note that the display scale remains non-negative, so all digits
before the decimal point are displayed, and none of the internals of
numeric.c need to worry about negative dscale values. Only the scale
in the typemod is negative.)

A scale greater than the precision constrains the value to be less
than 0.1. For example, a column declared as NUMERIC(3,6) can hold
"micro" quantities up to 0.000999.

Attached is a WIP patch supporting this.

Regards,
Dean

Attachment

Re: WIP: Relaxing the constraints on numeric scale

From
Robert Haas
Date:
On Tue, Jun 29, 2021 at 3:58 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> When specifying NUMERIC(precision, scale) the scale is constrained to
> the range [0, precision], which is per SQL spec. However, at least one
> other major database vendor intentionally does not impose this
> restriction, since allowing scales outside this range can be useful.

I thought about this too, but
http://postgr.es/m/774767.1591985683@sss.pgh.pa.us made me think that
it would be an on-disk format break. Maybe it's not, though?

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: WIP: Relaxing the constraints on numeric scale

From
Dean Rasheed
Date:
On Tue, 29 Jun 2021 at 21:34, Robert Haas <robertmhaas@gmail.com> wrote:
>
> I thought about this too, but
> http://postgr.es/m/774767.1591985683@sss.pgh.pa.us made me think that
> it would be an on-disk format break. Maybe it's not, though?
>

No, because the numeric dscale remains non-negative, so there's no
change to the way numeric values are stored. The only change is to
extend the allowed scale in the numeric typemod.

Regards,
Dean



Re: WIP: Relaxing the constraints on numeric scale

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Jun 29, 2021 at 3:58 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>> When specifying NUMERIC(precision, scale) the scale is constrained to
>> the range [0, precision], which is per SQL spec. However, at least one
>> other major database vendor intentionally does not impose this
>> restriction, since allowing scales outside this range can be useful.

> I thought about this too, but
> http://postgr.es/m/774767.1591985683@sss.pgh.pa.us made me think that
> it would be an on-disk format break. Maybe it's not, though?

See further down in that thread --- I don't think there's actually
a need for negative dscale on-disk.  However, there remains the question
of whether any external code knows enough about numeric typmods to become
confused by a negative scale field within those.

After reflecting for a bit, I suspect the answer is "probably", but
it seems like it wouldn't be much worse of an update than any number
of other catalog changes we make every release.

            regards, tom lane



Re: WIP: Relaxing the constraints on numeric scale

From
Robert Haas
Date:
On Tue, Jun 29, 2021 at 4:46 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> On Tue, 29 Jun 2021 at 21:34, Robert Haas <robertmhaas@gmail.com> wrote:
> > I thought about this too, but
> > http://postgr.es/m/774767.1591985683@sss.pgh.pa.us made me think that
> > it would be an on-disk format break. Maybe it's not, though?
>
> No, because the numeric dscale remains non-negative, so there's no
> change to the way numeric values are stored. The only change is to
> extend the allowed scale in the numeric typemod.

Ah! Well, in that case, this sounds great.

(I haven't looked at the patch, so this is just an endorsement of the concept.)

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: WIP: Relaxing the constraints on numeric scale

From
Dean Rasheed
Date:
Attached is a more complete patch, with updated docs and tests.

I chose to allow the scale to be in the range -1000 to 1000, which, to
some extent, is quite arbitrary. The upper limit of 1000 makes sense,
because nearly all numeric computations (other than multiply, add and
subtract) have that as their upper scale limit (that's the maximum
display scale). It also has to be at least 1000 for SQL compliance,
since the precision can be up to 1000.

The lower limit, on the other hand, really is quite arbitrary. -1000
is a nice round number, giving it a certain symmetry, and is almost
certainly sufficient for any realistic use case (-1000 means numbers
are rounded to the nearest multiple of 10^1000).

Also, keeping some spare bits in the typemod might come in handy one
day for something else (e.g., rounding mode choice).

Regards,
Dean

Attachment

Re: WIP: Relaxing the constraints on numeric scale

From
Tom Lane
Date:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> Attached is a more complete patch, with updated docs and tests.

I took a brief look at this and have a couple of quick suggestions:

* As you mention, keeping some spare bits in the typmod might come
in handy some day, but as given this patch isn't really doing so.
I think it might be advisable to mask the scale off at 11 bits,
preserving the high 5 bits of the low-order half of the word for future
use.  The main objection to that I guess is that it would complicate
doing sign extension in TYPMOD_SCALE().  But it doesn't seem like we
use that logic in any really hot code paths, so another instruction
or three probably is not much of a cost.

* I agree with wrapping the typmod construction/extraction into macros
(or maybe they should be inline functions?) but the names you chose
seem generic enough to possibly confuse onlookers.  I'd suggest
changing TYPMOD to NUMERIC_TYPMOD or NUM_TYPMOD.  The comment for them
should probably also explicitly explain "For purely historical reasons,
VARHDRSZ is added to the typmod value after these fields are combined",
or words to that effect.

* It might be advisable to write NUMERIC_MIN_SCALE with parens:

#define NUMERIC_MIN_SCALE            (-1000)

to avoid any precedence gotchas.

* I'd be inclined to leave the num_typemod_test table in place,
rather than dropping it, so that it serves to exercise pg_dump
for these cases during the pg_upgrade test.

Haven't read the code in detail yet.

            regards, tom lane



Re: WIP: Relaxing the constraints on numeric scale

From
Dean Rasheed
Date:
On Wed, 21 Jul 2021 at 22:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I took a brief look at this and have a couple of quick suggestions:
>

Thanks for looking at this!

> * As you mention, keeping some spare bits in the typmod might come
> in handy some day, but as given this patch isn't really doing so.
> I think it might be advisable to mask the scale off at 11 bits,
> preserving the high 5 bits of the low-order half of the word for future
> use.  The main objection to that I guess is that it would complicate
> doing sign extension in TYPMOD_SCALE().  But it doesn't seem like we
> use that logic in any really hot code paths, so another instruction
> or three probably is not much of a cost.
>

Yeah, that makes sense, and it's worth documenting where the spare bits are.

Interestingly, gcc recognised the bit hack I used for sign extension
and turned it into (x << 21) >> 21 using x86 shl and sar instructions,
though I didn't write it that way because apparently that's not
portable.

> * I agree with wrapping the typmod construction/extraction into macros
> (or maybe they should be inline functions?) but the names you chose
> seem generic enough to possibly confuse onlookers.  I'd suggest
> changing TYPMOD to NUMERIC_TYPMOD or NUM_TYPMOD.  The comment for them
> should probably also explicitly explain "For purely historical reasons,
> VARHDRSZ is added to the typmod value after these fields are combined",
> or words to that effect.
>

I've turned them into inline functions, since that makes them easier
to read, and debug if necessary.

All your other suggestions make sense too. Attached is a new version.

Regards,
Dean

Attachment

Re: WIP: Relaxing the constraints on numeric scale

From
Tom Lane
Date:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> All your other suggestions make sense too. Attached is a new version.

OK, I've now studied this more closely, and have some additional
nitpicks:

* I felt the way you did the documentation was confusing.  It seems
better to explain the normal case first, and then describe the two
extended cases.

* As long as we're encapsulating typmod construction/extraction, let's
also encapsulate the checks for valid typmods.

* Other places are fairly careful to declare typmod values as "int32",
so I think this code should too.

Attached is a proposed delta patch making those changes.

(I made the docs mention that the extension cases are allowed as of v15.
While useful in the short run, that will look like noise in ten years;
so I could go either way on whether to do that.)

If you're good with these, then I think it's ready to go.
I'll mark it RfC in the commitfest.

            regards, tom lane

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 6abda2f1d2..d3c70667a3 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -545,8 +545,8 @@
 <programlisting>
 NUMERIC(<replaceable>precision</replaceable>, <replaceable>scale</replaceable>)
 </programlisting>
-     The precision must be positive, the scale may be positive or negative
-     (see below).  Alternatively:
+     The precision must be positive, while the scale may be positive or
+     negative (see below).  Alternatively:
 <programlisting>
 NUMERIC(<replaceable>precision</replaceable>)
 </programlisting>
@@ -569,8 +569,8 @@ NUMERIC
     <note>
      <para>
       The maximum precision that can be explicitly specified in
-      a <type>NUMERIC</type> type declaration is 1000.  An
-      unconstrained <type>NUMERIC</type> column is subject to the limits
+      a <type>numeric</type> type declaration is 1000.  An
+      unconstrained <type>numeric</type> column is subject to the limits
       described in <xref linkend="datatype-numeric-table"/>.
      </para>
     </note>
@@ -578,38 +578,48 @@ NUMERIC
     <para>
      If the scale of a value to be stored is greater than the declared
      scale of the column, the system will round the value to the specified
-     number of fractional digits.  If the declared scale of the column is
-     negative, the value will be rounded to the left of the decimal point.
-     If, after rounding, the number of digits to the left of the decimal point
-     exceeds the declared precision minus the declared scale, an error is
-     raised.  Similarly, if the declared scale exceeds the declared precision
-     and the number of zero digits to the right of the decimal point is less
-     than the declared scale minus the declared precision, an error is raised.
+     number of fractional digits.  Then, if the number of digits to the
+     left of the decimal point exceeds the declared precision minus the
+     declared scale, an error is raised.
      For example, a column declared as
 <programlisting>
 NUMERIC(3, 1)
 </programlisting>
-     will round values to 1 decimal place and be able to store values between
-     -99.9 and 99.9, inclusive.  A column declared as
+     will round values to 1 decimal place and can store values between
+     -99.9 and 99.9, inclusive.
+    </para>
+
+    <para>
+     Beginning in <productname>PostgreSQL</productname> 15, it is allowed
+     to declare a <type>numeric</type> column with a negative scale.  Then
+     values will be rounded to the left of the decimal point.  The
+     precision still represents the maximum number of non-rounded digits.
+     Thus, a column declared as
 <programlisting>
 NUMERIC(2, -3)
 </programlisting>
-     will round values to the nearest thousand and be able to store values
-     between -99000 and 99000, inclusive.  A column declared as
+     will round values to the nearest thousand and can store values
+     between -99000 and 99000, inclusive.
+     It is also allowed to declare a scale larger than the declared
+     precision.  Such a column can only hold fractional values, and it
+     requires the number of zero digits just to the right of the decimal
+     point to be at least the declared scale minus the declared precision.
+     For example, a column declared as
 <programlisting>
 NUMERIC(3, 5)
 </programlisting>
-     will round values to 5 decimal places and be able to store values between
+     will round values to 5 decimal places and can store values between
      -0.00999 and 0.00999, inclusive.
     </para>

     <note>
      <para>
-      The scale in a <type>NUMERIC</type> type declaration may be any value in
-      the range -1000 to 1000.  (The <acronym>SQL</acronym> standard requires
-      the scale to be in the range 0 to <replaceable>precision</replaceable>.
-      Using values outside this range may not be portable to other database
-      systems.)
+      <productname>PostgreSQL</productname> permits the scale in
+      a <type>numeric</type> type declaration to be any value in the range
+      -1000 to 1000.  However, the <acronym>SQL</acronym> standard requires
+      the scale to be in the range 0
+      to <replaceable>precision</replaceable>.  Using scales outside that
+      range may not be portable to other database systems.
      </para>
     </note>

diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 46cb37cea1..faff09f5d5 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -827,21 +827,31 @@ numeric_is_integral(Numeric num)
  *
  *    For purely historical reasons VARHDRSZ is then added to the result, thus
  *    the unused space in the upper 16 bits is not all as freely available as it
- *    might seem.
+ *    might seem.  (We can't let the result overflow to a negative int32, as
+ *    other parts of the system would interpret that as not-a-valid-typmod.)
  */
-static inline int
+static inline int32
 make_numeric_typmod(int precision, int scale)
 {
     return ((precision << 16) | (scale & 0x7ff)) + VARHDRSZ;
 }

+/*
+ * Because of the offset, valid numeric typmods are at least VARHDRSZ
+ */
+static inline bool
+is_valid_numeric_typmod(int32 typmod)
+{
+    return typmod >= (int32) VARHDRSZ;
+}
+
 /*
  * numeric_typmod_precision() -
  *
  *    Extract the precision from a numeric typmod --- see make_numeric_typmod().
  */
 static inline int
-numeric_typmod_precision(int typmod)
+numeric_typmod_precision(int32 typmod)
 {
     return ((typmod - VARHDRSZ) >> 16) & 0xffff;
 }
@@ -856,7 +866,7 @@ numeric_typmod_precision(int typmod)
  *    extends an 11-bit two's complement number x.
  */
 static inline int
-numeric_typmod_scale(int typmod)
+numeric_typmod_scale(int32 typmod)
 {
     return (((typmod - VARHDRSZ) & 0x7ff) ^ 1024) - 1024;
 }
@@ -872,7 +882,7 @@ numeric_maximum_size(int32 typmod)
     int            precision;
     int            numeric_digits;

-    if (typmod < (int32) (VARHDRSZ))
+    if (!is_valid_numeric_typmod(typmod))
         return -1;

     /* precision (ie, max # of digits) is in upper bits of typmod */
@@ -1136,14 +1146,14 @@ numeric_support(PG_FUNCTION_ARGS)
             int32        new_precision = numeric_typmod_precision(new_typmod);

             /*
-             * If new_typmod < VARHDRSZ, the destination is unconstrained;
-             * that's always OK.  If old_typmod >= VARHDRSZ, the source is
+             * If new_typmod is invalid, the destination is unconstrained;
+             * that's always OK.  If old_typmod is valid, the source is
              * constrained, and we're OK if the scale is unchanged and the
              * precision is not decreasing.  See further notes in function
              * header comment.
              */
-            if (new_typmod < (int32) VARHDRSZ ||
-                (old_typmod >= (int32) VARHDRSZ &&
+            if (!is_valid_numeric_typmod(new_typmod) ||
+                (is_valid_numeric_typmod(old_typmod) &&
                  new_scale == old_scale && new_precision >= old_precision))
                 ret = relabel_to_typmod(source, new_typmod);
         }
@@ -1186,7 +1196,7 @@ numeric        (PG_FUNCTION_ARGS)
      * If the value isn't a valid type modifier, simply return a copy of the
      * input value
      */
-    if (typmod < (int32) (VARHDRSZ))
+    if (!is_valid_numeric_typmod(typmod))
         PG_RETURN_NUMERIC(duplicate_numeric(num));

     /*
@@ -1288,7 +1298,7 @@ numerictypmodout(PG_FUNCTION_ARGS)
     int32        typmod = PG_GETARG_INT32(0);
     char       *res = (char *) palloc(64);

-    if (typmod >= 0)
+    if (is_valid_numeric_typmod(typmod))
         snprintf(res, 64, "(%d,%d)",
                  numeric_typmod_precision(typmod),
                  numeric_typmod_scale(typmod));
@@ -7476,8 +7486,8 @@ apply_typmod(NumericVar *var, int32 typmod)
     int            ddigits;
     int            i;

-    /* Do nothing if we have a default typmod (-1) */
-    if (typmod < (int32) (VARHDRSZ))
+    /* Do nothing if we have an invalid typmod */
+    if (!is_valid_numeric_typmod(typmod))
         return;

     precision = numeric_typmod_precision(typmod);
@@ -7565,7 +7575,7 @@ apply_typmod_special(Numeric num, int32 typmod)
         return;

     /* Do nothing if we have a default typmod (-1) */
-    if (typmod < (int32) (VARHDRSZ))
+    if (!is_valid_numeric_typmod(typmod))
         return;

     precision = numeric_typmod_precision(typmod);

Re: WIP: Relaxing the constraints on numeric scale

From
Dean Rasheed
Date:
On Fri, 23 Jul 2021 at 16:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> OK, I've now studied this more closely, and have some additional
> nitpicks:
>
> * I felt the way you did the documentation was confusing.  It seems
> better to explain the normal case first, and then describe the two
> extended cases.

OK, that looks much better. Re-reading the entire section, I think
it's much clearer now.

> * As long as we're encapsulating typmod construction/extraction, let's
> also encapsulate the checks for valid typmods.

Good idea.

> * Other places are fairly careful to declare typmod values as "int32",
> so I think this code should too.

OK, that seems sensible.

> Attached is a proposed delta patch making those changes.
>
> (I made the docs mention that the extension cases are allowed as of v15.
> While useful in the short run, that will look like noise in ten years;
> so I could go either way on whether to do that.)

Hmm, yeah. In general,I find such things in the documentation useful
for quite a few years. I'm regularly looking to see when a particular
feature was added, to see if I can use it in a particular situation.
But eventually, it'll become irrelevant, and I don't know if anyone
will go around tidying these things up. I have left it in, but perhaps
there is a wider discussion to be had about whether we should be doing
that more (or less) often. FWIW, I like the way some docs include an
"available since" tag (e.g,, Java's @since tag).

> If you're good with these, then I think it's ready to go.
> I'll mark it RfC in the commitfest.

Thanks. That all looked good, so I have pushed it.

Regards,
Dean