Thread: WIP: Relaxing the constraints on numeric scale
When specifying NUMERIC(precision, scale) the scale is constrained to the range [0, precision], which is per SQL spec. However, at least one other major database vendor intentionally does not impose this restriction, since allowing scales outside this range can be useful. A negative scale implies rounding before the decimal point. For example, a column declared as NUMERIC(3,-3) rounds values to the nearest thousand, and can hold values up to 999000. (Note that the display scale remains non-negative, so all digits before the decimal point are displayed, and none of the internals of numeric.c need to worry about negative dscale values. Only the scale in the typmod is negative.) A scale greater than the precision constrains the value to be less than 0.1. For example, a column declared as NUMERIC(3,6) can hold "micro" quantities up to 0.000999. Attached is a WIP patch supporting this. Regards, Dean
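To make the negative-scale rule concrete, here is a minimal C sketch (not code from the patch) of how an integer value would be treated by a column whose scale is zero or negative: round to the nearest multiple of 10^(-scale), then check that no more than `precision` digits remain in front of the rounded-off positions. The `sketch_` names are hypothetical, only integer inputs small enough for `int64_t` are handled, and ties are assumed to round away from zero, as the numeric type does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 10^n for 0 <= n <= 18, the range that fits in int64_t. */
static int64_t
sketch_pow10(int n)
{
    int64_t result = 1;

    assert(n >= 0 && n <= 18);
    while (n-- > 0)
        result *= 10;
    return result;
}

/*
 * Round an integer to the nearest multiple of 10^(-scale), for scale <= 0.
 * Ties go away from zero (an assumption of this sketch).
 */
static int64_t
sketch_round_to_scale(int64_t value, int scale)
{
    int64_t unit = sketch_pow10(-scale);    /* asserts that scale <= 0 */
    int64_t half = unit / 2;
    int64_t magnitude = (value < 0) ? -value : value;

    magnitude = ((magnitude + half) / unit) * unit;
    return (value < 0) ? -magnitude : magnitude;
}

/*
 * After rounding, would the value fit a column declared with this
 * precision and (non-positive) scale?
 */
static bool
sketch_fits(int64_t value, int precision, int scale)
{
    int64_t unit = sketch_pow10(-scale);
    int64_t digits = sketch_round_to_scale(value, scale) / unit;

    if (digits < 0)
        digits = -digits;
    return digits < sketch_pow10(precision);
}
```

For precision 3 and scale -3, 999000 and 999499 both fit (the latter rounds down to 999000), while 999500 does not, because it first rounds up to 1000000 and then needs four significant digits.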
On Tue, Jun 29, 2021 at 3:58 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > When specifying NUMERIC(precision, scale) the scale is constrained to > the range [0, precision], which is per SQL spec. However, at least one > other major database vendor intentionally does not impose this > restriction, since allowing scales outside this range can be useful. I thought about this too, but http://postgr.es/m/774767.1591985683@sss.pgh.pa.us made me think that it would be an on-disk format break. Maybe it's not, though? -- Robert Haas EDB: http://www.enterprisedb.com
On Tue, 29 Jun 2021 at 21:34, Robert Haas <robertmhaas@gmail.com> wrote: > > I thought about this too, but > http://postgr.es/m/774767.1591985683@sss.pgh.pa.us made me think that > it would be an on-disk format break. Maybe it's not, though? > No, because the numeric dscale remains non-negative, so there's no change to the way numeric values are stored. The only change is to extend the allowed scale in the numeric typmod. Regards, Dean
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Jun 29, 2021 at 3:58 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >> When specifying NUMERIC(precision, scale) the scale is constrained to >> the range [0, precision], which is per SQL spec. However, at least one >> other major database vendor intentionally does not impose this >> restriction, since allowing scales outside this range can be useful. > I thought about this too, but > http://postgr.es/m/774767.1591985683@sss.pgh.pa.us made me think that > it would be an on-disk format break. Maybe it's not, though? See further down in that thread --- I don't think there's actually a need for negative dscale on-disk. However, there remains the question of whether any external code knows enough about numeric typmods to become confused by a negative scale field within those. After reflecting for a bit, I suspect the answer is "probably", but it seems like it wouldn't be much worse of an update than any number of other catalog changes we make every release. regards, tom lane
On Tue, Jun 29, 2021 at 4:46 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > On Tue, 29 Jun 2021 at 21:34, Robert Haas <robertmhaas@gmail.com> wrote: > > I thought about this too, but > > http://postgr.es/m/774767.1591985683@sss.pgh.pa.us made me think that > > it would be an on-disk format break. Maybe it's not, though? > > No, because the numeric dscale remains non-negative, so there's no > change to the way numeric values are stored. The only change is to > extend the allowed scale in the numeric typmod. Ah! Well, in that case, this sounds great. (I haven't looked at the patch, so this is just an endorsement of the concept.) -- Robert Haas EDB: http://www.enterprisedb.com
Attached is a more complete patch, with updated docs and tests. I chose to allow the scale to be in the range -1000 to 1000, which, to some extent, is quite arbitrary. The upper limit of 1000 makes sense, because nearly all numeric computations (other than multiply, add and subtract) have that as their upper scale limit (that's the maximum display scale). It also has to be at least 1000 for SQL compliance, since the precision can be up to 1000. The lower limit, on the other hand, really is quite arbitrary. -1000 is a nice round number, giving it a certain symmetry, and is almost certainly sufficient for any realistic use case (-1000 means numbers are rounded to the nearest multiple of 10^1000). Also, keeping some spare bits in the typmod might come in handy one day for something else (e.g., rounding mode choice). Regards, Dean
Dean Rasheed <dean.a.rasheed@gmail.com> writes: > Attached is a more complete patch, with updated docs and tests. I took a brief look at this and have a couple of quick suggestions: * As you mention, keeping some spare bits in the typmod might come in handy some day, but as given this patch isn't really doing so. I think it might be advisable to mask the scale off at 11 bits, preserving the high 5 bits of the low-order half of the word for future use. The main objection to that I guess is that it would complicate doing sign extension in TYPMOD_SCALE(). But it doesn't seem like we use that logic in any really hot code paths, so another instruction or three probably is not much of a cost. * I agree with wrapping the typmod construction/extraction into macros (or maybe they should be inline functions?) but the names you chose seem generic enough to possibly confuse onlookers. I'd suggest changing TYPMOD to NUMERIC_TYPMOD or NUM_TYPMOD. The comment for them should probably also explicitly explain "For purely historical reasons, VARHDRSZ is added to the typmod value after these fields are combined", or words to that effect. * It might be advisable to write NUMERIC_MIN_SCALE with parens: #define NUMERIC_MIN_SCALE (-1000) to avoid any precedence gotchas. * I'd be inclined to leave the num_typemod_test table in place, rather than dropping it, so that it serves to exercise pg_dump for these cases during the pg_upgrade test. Haven't read the code in detail yet. regards, tom lane
On Wed, 21 Jul 2021 at 22:33, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I took a brief look at this and have a couple of quick suggestions: > Thanks for looking at this! > * As you mention, keeping some spare bits in the typmod might come > in handy some day, but as given this patch isn't really doing so. > I think it might be advisable to mask the scale off at 11 bits, > preserving the high 5 bits of the low-order half of the word for future > use. The main objection to that I guess is that it would complicate > doing sign extension in TYPMOD_SCALE(). But it doesn't seem like we > use that logic in any really hot code paths, so another instruction > or three probably is not much of a cost. > Yeah, that makes sense, and it's worth documenting where the spare bits are. Interestingly, gcc recognised the bit hack I used for sign extension and turned it into (x << 21) >> 21 using x86 shl and sar instructions, though I didn't write it that way because apparently that's not portable. > * I agree with wrapping the typmod construction/extraction into macros > (or maybe they should be inline functions?) but the names you chose > seem generic enough to possibly confuse onlookers. I'd suggest > changing TYPMOD to NUMERIC_TYPMOD or NUM_TYPMOD. The comment for them > should probably also explicitly explain "For purely historical reasons, > VARHDRSZ is added to the typmod value after these fields are combined", > or words to that effect. > I've turned them into inline functions, since that makes them easier to read, and debug if necessary. All your other suggestions make sense too. Attached is a new version. Regards, Dean
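The typmod layout discussed in this exchange can be sketched in standalone C as follows. This is an illustration rather than the patch itself: the `sketch_` names are hypothetical, and `SKETCH_VARHDRSZ` is assumed to be 4 as a stand-in for PostgreSQL's VARHDRSZ. The scale occupies the low 11 bits, whose two's complement range of -1024 to 1023 covers the permitted -1000 to 1000, and `(x ^ 1024) - 1024` sign-extends that field without shifting negative values.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for PostgreSQL's VARHDRSZ. */
#define SKETCH_VARHDRSZ 4

/*
 * Pack the precision into the upper 16 bits and the scale into the low
 * 11 bits, then add the historical VARHDRSZ offset.
 */
static inline int32_t
sketch_make_typmod(int precision, int scale)
{
    assert(precision >= 1 && precision <= 1000);
    assert(scale >= -1000 && scale <= 1000);
    return ((precision << 16) | (scale & 0x7ff)) + SKETCH_VARHDRSZ;
}

/* Undo the offset, then read the upper 16 bits. */
static inline int
sketch_typmod_precision(int32_t typmod)
{
    return ((typmod - SKETCH_VARHDRSZ) >> 16) & 0xffff;
}

/*
 * Undo the offset, isolate the 11-bit field x, and sign-extend it:
 * flipping bit 10 and subtracting 1024 maps 1024..2047 to -1024..-1
 * and leaves 0..1023 unchanged.
 */
static inline int
sketch_typmod_scale(int32_t typmod)
{
    return (((typmod - SKETCH_VARHDRSZ) & 0x7ff) ^ 1024) - 1024;
}
```

Packing (3, -3) and unpacking it again returns precision 3 and scale -3, and the round trip also holds at the extremes of -1000 and 1000.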
Dean Rasheed <dean.a.rasheed@gmail.com> writes: > All your other suggestions make sense too. Attached is a new version. OK, I've now studied this more closely, and have some additional nitpicks: * I felt the way you did the documentation was confusing. It seems better to explain the normal case first, and then describe the two extended cases. * As long as we're encapsulating typmod construction/extraction, let's also encapsulate the checks for valid typmods. * Other places are fairly careful to declare typmod values as "int32", so I think this code should too. Attached is a proposed delta patch making those changes. (I made the docs mention that the extension cases are allowed as of v15. While useful in the short run, that will look like noise in ten years; so I could go either way on whether to do that.) If you're good with these, then I think it's ready to go. I'll mark it RfC in the commitfest. regards, tom lane
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 6abda2f1d2..d3c70667a3 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -545,8 +545,8 @@
 <programlisting>
 NUMERIC(<replaceable>precision</replaceable>, <replaceable>scale</replaceable>)
 </programlisting>
-     The precision must be positive, the scale may be positive or negative
-     (see below). Alternatively:
+     The precision must be positive, while the scale may be positive or
+     negative (see below). Alternatively:
 <programlisting>
 NUMERIC(<replaceable>precision</replaceable>)
 </programlisting>
@@ -569,8 +569,8 @@ NUMERIC
     <note>
      <para>
       The maximum precision that can be explicitly specified in
-      a <type>NUMERIC</type> type declaration is 1000. An
-      unconstrained <type>NUMERIC</type> column is subject to the limits
+      a <type>numeric</type> type declaration is 1000. An
+      unconstrained <type>numeric</type> column is subject to the limits
       described in <xref linkend="datatype-numeric-table"/>.
      </para>
     </note>
@@ -578,38 +578,48 @@ NUMERIC
     <para>
      If the scale of a value to be stored is greater than the declared scale of
      the column, the system will round the value to the specified
-     number of fractional digits. If the declared scale of the column is
-     negative, the value will be rounded to the left of the decimal point.
-     If, after rounding, the number of digits to the left of the decimal point
-     exceeds the declared precision minus the declared scale, an error is
-     raised. Similarly, if the declared scale exceeds the declared precision
-     and the number of zero digits to the right of the decimal point is less
-     than the declared scale minus the declared precision, an error is raised.
+     number of fractional digits. Then, if the number of digits to the
+     left of the decimal point exceeds the declared precision minus the
+     declared scale, an error is raised.
      For example, a column declared as
 <programlisting>
 NUMERIC(3, 1)
 </programlisting>
-     will round values to 1 decimal place and be able to store values between
-     -99.9 and 99.9, inclusive. A column declared as
+     will round values to 1 decimal place and can store values between
+     -99.9 and 99.9, inclusive.
+    </para>
+
+    <para>
+     Beginning in <productname>PostgreSQL</productname> 15, it is allowed
+     to declare a <type>numeric</type> column with a negative scale. Then
+     values will be rounded to the left of the decimal point. The
+     precision still represents the maximum number of non-rounded digits.
+     Thus, a column declared as
 <programlisting>
 NUMERIC(2, -3)
 </programlisting>
-     will round values to the nearest thousand and be able to store values
-     between -99000 and 99000, inclusive. A column declared as
+     will round values to the nearest thousand and can store values
+     between -99000 and 99000, inclusive.
+     It is also allowed to declare a scale larger than the declared
+     precision. Such a column can only hold fractional values, and it
+     requires the number of zero digits just to the right of the decimal
+     point to be at least the declared scale minus the declared precision.
+     For example, a column declared as
 <programlisting>
 NUMERIC(3, 5)
 </programlisting>
-     will round values to 5 decimal places and be able to store values between
+     will round values to 5 decimal places and can store values between
      -0.00999 and 0.00999, inclusive.
     </para>
 
     <note>
      <para>
-      The scale in a <type>NUMERIC</type> type declaration may be any value in
-      the range -1000 to 1000. (The <acronym>SQL</acronym> standard requires
-      the scale to be in the range 0 to <replaceable>precision</replaceable>.
-      Using values outside this range may not be portable to other database
-      systems.)
+      <productname>PostgreSQL</productname> permits the scale in
+      a <type>numeric</type> type declaration to be any value in the range
+      -1000 to 1000. However, the <acronym>SQL</acronym> standard requires
+      the scale to be in the range 0
+      to <replaceable>precision</replaceable>. Using scales outside that
+      range may not be portable to other database systems.
      </para>
     </note>
 
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 46cb37cea1..faff09f5d5 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -827,21 +827,31 @@ numeric_is_integral(Numeric num)
  *
  * For purely historical reasons VARHDRSZ is then added to the result, thus
  * the unused space in the upper 16 bits is not all as freely available as it
- * might seem.
+ * might seem. (We can't let the result overflow to a negative int32, as
+ * other parts of the system would interpret that as not-a-valid-typmod.)
  */
-static inline int
+static inline int32
 make_numeric_typmod(int precision, int scale)
 {
     return ((precision << 16) | (scale & 0x7ff)) + VARHDRSZ;
 }
 
+/*
+ * Because of the offset, valid numeric typmods are at least VARHDRSZ
+ */
+static inline bool
+is_valid_numeric_typmod(int32 typmod)
+{
+    return typmod >= (int32) VARHDRSZ;
+}
+
 /*
  * numeric_typmod_precision() -
  *
  * Extract the precision from a numeric typmod --- see make_numeric_typmod().
  */
 static inline int
-numeric_typmod_precision(int typmod)
+numeric_typmod_precision(int32 typmod)
 {
     return ((typmod - VARHDRSZ) >> 16) & 0xffff;
 }
@@ -856,7 +866,7 @@ numeric_typmod_precision(int typmod)
  * extends an 11-bit two's complement number x.
  */
 static inline int
-numeric_typmod_scale(int typmod)
+numeric_typmod_scale(int32 typmod)
 {
     return (((typmod - VARHDRSZ) & 0x7ff) ^ 1024) - 1024;
 }
@@ -872,7 +882,7 @@ numeric_maximum_size(int32 typmod)
     int precision;
     int numeric_digits;
 
-    if (typmod < (int32) (VARHDRSZ))
+    if (!is_valid_numeric_typmod(typmod))
         return -1;
 
     /* precision (ie, max # of digits) is in upper bits of typmod */
@@ -1136,14 +1146,14 @@ numeric_support(PG_FUNCTION_ARGS)
             int32 new_precision = numeric_typmod_precision(new_typmod);
 
             /*
-             * If new_typmod < VARHDRSZ, the destination is unconstrained;
-             * that's always OK. If old_typmod >= VARHDRSZ, the source is
+             * If new_typmod is invalid, the destination is unconstrained;
+             * that's always OK. If old_typmod is valid, the source is
              * constrained, and we're OK if the scale is unchanged and the
              * precision is not decreasing. See further notes in function
              * header comment.
              */
-            if (new_typmod < (int32) VARHDRSZ ||
-                (old_typmod >= (int32) VARHDRSZ &&
+            if (!is_valid_numeric_typmod(new_typmod) ||
+                (is_valid_numeric_typmod(old_typmod) &&
                  new_scale == old_scale && new_precision >= old_precision))
                 ret = relabel_to_typmod(source, new_typmod);
         }
@@ -1186,7 +1196,7 @@ numeric (PG_FUNCTION_ARGS)
      * If the value isn't a valid type modifier, simply return a copy of the
      * input value
      */
-    if (typmod < (int32) (VARHDRSZ))
+    if (!is_valid_numeric_typmod(typmod))
         PG_RETURN_NUMERIC(duplicate_numeric(num));
 
     /*
@@ -1288,7 +1298,7 @@ numerictypmodout(PG_FUNCTION_ARGS)
     int32 typmod = PG_GETARG_INT32(0);
     char *res = (char *) palloc(64);
 
-    if (typmod >= 0)
+    if (is_valid_numeric_typmod(typmod))
         snprintf(res, 64, "(%d,%d)",
                  numeric_typmod_precision(typmod),
                  numeric_typmod_scale(typmod));
@@ -7476,8 +7486,8 @@ apply_typmod(NumericVar *var, int32 typmod)
     int ddigits;
     int i;
 
-    /* Do nothing if we have a default typmod (-1) */
-    if (typmod < (int32) (VARHDRSZ))
+    /* Do nothing if we have an invalid typmod */
+    if (!is_valid_numeric_typmod(typmod))
         return;
 
     precision = numeric_typmod_precision(typmod);
@@ -7565,7 +7575,7 @@ apply_typmod_special(Numeric num, int32 typmod)
         return;
 
     /* Do nothing if we have a default typmod (-1) */
-    if (typmod < (int32) (VARHDRSZ))
+    if (!is_valid_numeric_typmod(typmod))
         return;
 
     precision = numeric_typmod_precision(typmod);
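The relabeling condition in the numeric_support() hunk may be easier to follow as a standalone predicate. The following is a minimal sketch, not PostgreSQL code: the `sketch_` names are hypothetical, `SKETCH_VARHDRSZ` is assumed to be 4 in place of VARHDRSZ, and the packing helpers simply mirror the layout shown in the diff so that the example compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stand-in for PostgreSQL's VARHDRSZ. */
#define SKETCH_VARHDRSZ 4

static inline int32_t
sketch_make_typmod(int precision, int scale)
{
    assert(precision >= 1 && precision <= 1000);
    assert(scale >= -1000 && scale <= 1000);
    return ((precision << 16) | (scale & 0x7ff)) + SKETCH_VARHDRSZ;
}

/* Because of the offset, every packed typmod is at least the header size. */
static inline bool
sketch_is_valid_typmod(int32_t typmod)
{
    return typmod >= (int32_t) SKETCH_VARHDRSZ;
}

static inline int
sketch_typmod_precision(int32_t typmod)
{
    return ((typmod - SKETCH_VARHDRSZ) >> 16) & 0xffff;
}

static inline int
sketch_typmod_scale(int32_t typmod)
{
    return (((typmod - SKETCH_VARHDRSZ) & 0x7ff) ^ 1024) - 1024;
}

/*
 * True when a value constrained by old_typmod can be relabeled to
 * new_typmod without rechecking it: either the destination is
 * unconstrained, or the source is constrained, the scale is unchanged,
 * and the precision does not decrease.
 */
static bool
sketch_can_relabel(int32_t old_typmod, int32_t new_typmod)
{
    if (!sketch_is_valid_typmod(new_typmod))
        return true;
    if (!sketch_is_valid_typmod(old_typmod))
        return false;
    return sketch_typmod_scale(new_typmod) == sketch_typmod_scale(old_typmod) &&
        sketch_typmod_precision(new_typmod) >= sketch_typmod_precision(old_typmod);
}
```

Under these assumptions, widening NUMERIC(5,2) to NUMERIC(7,2) or dropping the constraint entirely (typmod -1) passes the check, whereas narrowing the precision, changing the scale, or starting from an unconstrained source does not.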
On Fri, 23 Jul 2021 at 16:50, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > OK, I've now studied this more closely, and have some additional > nitpicks: > > * I felt the way you did the documentation was confusing. It seems > better to explain the normal case first, and then describe the two > extended cases. OK, that looks much better. Re-reading the entire section, I think it's much clearer now. > * As long as we're encapsulating typmod construction/extraction, let's > also encapsulate the checks for valid typmods. Good idea. > * Other places are fairly careful to declare typmod values as "int32", > so I think this code should too. OK, that seems sensible. > Attached is a proposed delta patch making those changes. > > (I made the docs mention that the extension cases are allowed as of v15. > While useful in the short run, that will look like noise in ten years; > so I could go either way on whether to do that.) Hmm, yeah. In general, I find such things in the documentation useful for quite a few years. I'm regularly looking to see when a particular feature was added, to see if I can use it in a particular situation. But eventually, it'll become irrelevant, and I don't know if anyone will go around tidying these things up. I have left it in, but perhaps there is a wider discussion to be had about whether we should be doing that more (or less) often. FWIW, I like the way some docs include an "available since" tag (e.g., Java's @since tag). > If you're good with these, then I think it's ready to go. > I'll mark it RfC in the commitfest. Thanks. That all looked good, so I have pushed it. Regards, Dean