Re: Reduce palloc's in numeric operations. - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Reduce palloc's in numeric operations.
Date
Msg-id 20121112.184515.247859873.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: Reduce palloc's in numeric operations.  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
Thanks for comments,

> Have to be careful to really not modify the
> operands. numeric_out() and numeric_out_sci() are wrong; they
> call get_str_from_var(), which modifies the argument. Same with
> numeric_expr(): it passes the argument to
> numericvar_to_double_no_overflow(), which passes it to
> get_str_from_var(). numericvar_to_int8() also modifies its
> argument, so all the functions that use that, directly or
> indirectly, must make a copy.

mmm. My carefulness was a bit short to pick up them...

I overlooked that get_str_from_var() and numeric_to_int8() calls
round_var() which rewrites the operand. I reverted numeric_out()
and numeric_int8(), numeric_int2().

Altough, I couldn't find in get_str_from_var_sci() where the
operand would be modified.

The lines where var showing in get_str_from_var_sci() is listed
below.

|  2:get_str_from_var_sci(NumericVar *var, int rscale)
| 21:  if (var->ndigits > 0)
| 23:    exponent = (var->weight + 1) * DEC_DIGITS;
| 29:    exponent -= DEC_DIGITS - (int) log10(var->digits[0]);
| 59:  div_var(var, &denominator, &significand, rscale, true);

The only suspect is div_var at this level, and do the same thing
for var1 in div_var.

|   2:div_var(NumericVar *var1, NumericVar *var2, NumericVar *result,
|  20:  int      var1ndigits = var1->ndigits;
|  35:  if (var1ndigits == 0)
|  47:  if (var1->sign == var2->sign)
|  51:  res_weight = var1->weight - var2->weight;
|  68:  div_ndigits = Max(div_ndigits, var1ndigits);
|  83:  memcpy(dividend + 1, var1->digits, var1ndigits * sizeof(NumericDigit));
| 132:      for (i = var1ndigits; i >= 0; i--)

No line seems to modify var1 as far as I see so I've left
numeric_out_sci() modified in this patch.


Well, I found some other bugs in numeric_stddev_internal.
vN was errorniously freed and vsumX2 in is used as work.
They are fixed in this new patch.

> Perhaps get_str_from_var(), and the other functions that
> currently scribble on the arguments, should be modified to not
> do so. They could easily make a copy of the argument within the
> function. Then the callers could safely use
> set_var_from_num_nocopy(). The performance would be the same,
> you would have the same number of pallocs, but you would get
> rid of the surprising argument-modifying behavior of those
> functions.

I agree with that. const qualifiers on parameters would rule this
mechanically. I try this for the next version of this patch.


> SELECT SUM(col) FROM numtest;
> 
> The execution time of that query fell from about 5300 ms to 4300 ms, ie. about 20%.

Wow, it seems more promising than I expected. Thanks.

regards,

-- 
堀口恭太郎

日本電信電話株式会社 NTTオープンソフトウェアセンタ
Phone: 03-5860-5115 / Fax: 03-5463-5490
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 68c1f1d..fcff325 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -367,6 +367,7 @@ static void zero_var(NumericVar *var);static const char *set_var_from_str(const char *str, const
char*cp,                 NumericVar *dest);static void set_var_from_num(Numeric value, NumericVar *dest);
 
+static void set_var_from_num_nocopy(Numeric num, NumericVar *dest);static void set_var_from_var(NumericVar *value,
NumericVar*dest);static char *get_str_from_var(NumericVar *var, int dscale);static char
*get_str_from_var_sci(NumericVar*var, int rscale);
 
@@ -617,11 +618,10 @@ numeric_out_sci(Numeric num, int scale)        return pstrdup("NaN");    init_var(&x);
-    set_var_from_num(num, &x);
+    set_var_from_num_nocopy(num, &x);    str = get_str_from_var_sci(&x, scale);
-    free_var(&x);    return str;}
@@ -696,7 +696,7 @@ numeric_send(PG_FUNCTION_ARGS)    int            i;    init_var(&x);
-    set_var_from_num(num, &x);
+    set_var_from_num_nocopy(num, &x);    pq_begintypsend(&buf);
@@ -707,8 +707,6 @@ numeric_send(PG_FUNCTION_ARGS)    for (i = 0; i < x.ndigits; i++)        pq_sendint(&buf,
x.digits[i],sizeof(NumericDigit));
 
-    free_var(&x);
-    PG_RETURN_BYTEA_P(pq_endtypsend(&buf));}
@@ -1577,15 +1575,13 @@ numeric_add(PG_FUNCTION_ARGS)    init_var(&arg2);    init_var(&result);
-    set_var_from_num(num1, &arg1);
-    set_var_from_num(num2, &arg2);
+    set_var_from_num_nocopy(num1, &arg1);
+    set_var_from_num_nocopy(num2, &arg2);    add_var(&arg1, &arg2, &result);    res = make_result(&result);
-    free_var(&arg1);
-    free_var(&arg2);    free_var(&result);    PG_RETURN_NUMERIC(res);
@@ -1620,15 +1616,13 @@ numeric_sub(PG_FUNCTION_ARGS)    init_var(&arg2);    init_var(&result);
-    set_var_from_num(num1, &arg1);
-    set_var_from_num(num2, &arg2);
+    set_var_from_num_nocopy(num1, &arg1);
+    set_var_from_num_nocopy(num2, &arg2);    sub_var(&arg1, &arg2, &result);    res = make_result(&result);
-    free_var(&arg1);
-    free_var(&arg2);    free_var(&result);    PG_RETURN_NUMERIC(res);
@@ -1667,15 +1661,13 @@ numeric_mul(PG_FUNCTION_ARGS)    init_var(&arg2);    init_var(&result);
-    set_var_from_num(num1, &arg1);
-    set_var_from_num(num2, &arg2);
+    set_var_from_num_nocopy(num1, &arg1);
+    set_var_from_num_nocopy(num2, &arg2);    mul_var(&arg1, &arg2, &result, arg1.dscale + arg2.dscale);    res =
make_result(&result);
-    free_var(&arg1);
-    free_var(&arg2);    free_var(&result);    PG_RETURN_NUMERIC(res);
@@ -1711,8 +1703,8 @@ numeric_div(PG_FUNCTION_ARGS)    init_var(&arg2);    init_var(&result);
-    set_var_from_num(num1, &arg1);
-    set_var_from_num(num2, &arg2);
+    set_var_from_num_nocopy(num1, &arg1);
+    set_var_from_num_nocopy(num2, &arg2);    /*     * Select scale for division result
@@ -1726,8 +1718,6 @@ numeric_div(PG_FUNCTION_ARGS)    res = make_result(&result);
-    free_var(&arg1);
-    free_var(&arg2);    free_var(&result);    PG_RETURN_NUMERIC(res);
@@ -1762,8 +1752,8 @@ numeric_div_trunc(PG_FUNCTION_ARGS)    init_var(&arg2);    init_var(&result);
-    set_var_from_num(num1, &arg1);
-    set_var_from_num(num2, &arg2);
+    set_var_from_num_nocopy(num1, &arg1);
+    set_var_from_num_nocopy(num2, &arg2);    /*     * Do the divide and return the result
@@ -1772,8 +1762,6 @@ numeric_div_trunc(PG_FUNCTION_ARGS)    res = make_result(&result);
-    free_var(&arg1);
-    free_var(&arg2);    free_var(&result);    PG_RETURN_NUMERIC(res);
@@ -1802,15 +1790,13 @@ numeric_mod(PG_FUNCTION_ARGS)    init_var(&arg2);    init_var(&result);
-    set_var_from_num(num1, &arg1);
-    set_var_from_num(num2, &arg2);
+    set_var_from_num_nocopy(num1, &arg1);
+    set_var_from_num_nocopy(num2, &arg2);    mod_var(&arg1, &arg2, &result);    res = make_result(&result);
-    free_var(&result);
-    free_var(&arg2);    free_var(&arg1);    PG_RETURN_NUMERIC(res);
@@ -1980,7 +1966,7 @@ numeric_sqrt(PG_FUNCTION_ARGS)    init_var(&arg);    init_var(&result);
-    set_var_from_num(num, &arg);
+    set_var_from_num_nocopy(num, &arg);    /* Assume the input was normalized, so arg.weight is accurate */    sweight
=(arg.weight + 1) * DEC_DIGITS / 2 - 1;
 
@@ -1998,7 +1984,6 @@ numeric_sqrt(PG_FUNCTION_ARGS)    res = make_result(&result);    free_var(&result);
-    free_var(&arg);    PG_RETURN_NUMERIC(res);}
@@ -2033,7 +2018,7 @@ numeric_exp(PG_FUNCTION_ARGS)    init_var(&arg);    init_var(&result);
-    set_var_from_num(num, &arg);
+    set_var_from_num_nocopy(num, &arg);    /* convert input to float8, ignoring overflow */    val =
numericvar_to_double_no_overflow(&arg);
@@ -2061,7 +2046,6 @@ numeric_exp(PG_FUNCTION_ARGS)    res = make_result(&result);    free_var(&result);
-    free_var(&arg);    PG_RETURN_NUMERIC(res);}
@@ -2091,7 +2075,7 @@ numeric_ln(PG_FUNCTION_ARGS)    init_var(&arg);    init_var(&result);
-    set_var_from_num(num, &arg);
+    set_var_from_num_nocopy(num, &arg);    /* Approx decimal digits before decimal point */    dec_digits =
(arg.weight+ 1) * DEC_DIGITS;
 
@@ -2112,7 +2096,6 @@ numeric_ln(PG_FUNCTION_ARGS)    res = make_result(&result);    free_var(&result);
-    free_var(&arg);    PG_RETURN_NUMERIC(res);}
@@ -2146,8 +2129,8 @@ numeric_log(PG_FUNCTION_ARGS)    init_var(&arg2);    init_var(&result);
-    set_var_from_num(num1, &arg1);
-    set_var_from_num(num2, &arg2);
+    set_var_from_num_nocopy(num1, &arg1);
+    set_var_from_num_nocopy(num2, &arg2);    /*     * Call log_var() to compute and return the result; note it handles
scale
@@ -2158,8 +2141,6 @@ numeric_log(PG_FUNCTION_ARGS)    res = make_result(&result);    free_var(&result);
-    free_var(&arg2);
-    free_var(&arg1);    PG_RETURN_NUMERIC(res);}
@@ -2195,8 +2176,8 @@ numeric_power(PG_FUNCTION_ARGS)    init_var(&arg2_trunc);    init_var(&result);
-    set_var_from_num(num1, &arg1);
-    set_var_from_num(num2, &arg2);
+    set_var_from_num_nocopy(num1, &arg1);
+    set_var_from_num_nocopy(num2, &arg2);    set_var_from_var(&arg2, &arg2_trunc);    trunc_var(&arg2_trunc, 0);
@@ -2227,9 +2208,7 @@ numeric_power(PG_FUNCTION_ARGS)    res = make_result(&result);    free_var(&result);
-    free_var(&arg2);    free_var(&arg2_trunc);
-    free_var(&arg1);    PG_RETURN_NUMERIC(res);}
@@ -2277,9 +2256,8 @@ numeric_int4(PG_FUNCTION_ARGS)    /* Convert to variable format, then convert to int4 */
init_var(&x);
-    set_var_from_num(num, &x);
+    set_var_from_num_nocopy(num, &x);    result = numericvar_to_int4(&x);
-    free_var(&x);    PG_RETURN_INT32(result);}
@@ -2400,8 +2378,6 @@ numeric_int2(PG_FUNCTION_ARGS)                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
         errmsg("smallint out of range")));
 
-    free_var(&x);
-    /* Down-convert to int2 */    result = (int16) val;
@@ -2411,6 +2387,8 @@ numeric_int2(PG_FUNCTION_ARGS)                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
         errmsg("smallint out of range")));
 
+    free_var(&x);
+    PG_RETURN_INT16(result);}
@@ -2764,7 +2742,7 @@ numeric_stddev_internal(ArrayType *transarray,        return make_result(&const_nan);
init_var(&vN);
-    set_var_from_num(N, &vN);
+    set_var_from_num_nocopy(N, &vN);    /*     * Sample stddev and variance are undefined when N <= 1; population
stddev
@@ -2777,7 +2755,6 @@ numeric_stddev_internal(ArrayType *transarray,    if (cmp_var(&vN, comp) <= 0)    {
-        free_var(&vN);        *is_null = true;        return NULL;    }
@@ -2786,7 +2763,7 @@ numeric_stddev_internal(ArrayType *transarray,    sub_var(&vN, &const_one, &vNminus1);
init_var(&vsumX);
-    set_var_from_num(sumX, &vsumX);
+    set_var_from_num_nocopy(sumX, &vsumX);    init_var(&vsumX2);    set_var_from_num(sumX2, &vsumX2);
@@ -2816,9 +2793,7 @@ numeric_stddev_internal(ArrayType *transarray,        res = make_result(&vsumX);    }
-    free_var(&vN);    free_var(&vNminus1);
-    free_var(&vsumX);    free_var(&vsumX2);    return res;
@@ -3448,6 +3423,21 @@ set_var_from_num(Numeric num, NumericVar *dest)    memcpy(dest->digits, NUMERIC_DIGITS(num),
ndigits* sizeof(NumericDigit));}
 
+/*
+ * set_var_from_num_nocopy() -
+ *
+ *    Convert the packed db format into a variable - without copying digits
+ */
+static void
+set_var_from_num_nocopy(Numeric num, NumericVar *dest)
+{
+    dest->ndigits = NUMERIC_NDIGITS(num);
+    dest->weight = NUMERIC_WEIGHT(num);
+    dest->sign = NUMERIC_SIGN(num);
+    dest->dscale = NUMERIC_DSCALE(num);
+    dest->digits = NUMERIC_DIGITS(num);
+}
+/* * set_var_from_var() -

pgsql-hackers by date:

Previous
From: Craig Ringer
Date:
Subject: Re: Enabling Checksums
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Identity projection