Thread: review: Reduce palloc's in numeric operations

review: Reduce palloc's in numeric operations

From

Pavel Stehule

Date:

19 November 2012, 16:18:16

Hello all

I tested this patch and I can confirm, so this patch can increase
speed about 16-22% (depends on IO waits, load type).

I tested real query (anonymyzed)

SELECT SUM(COALESCE((a * b * c * (d + ((1 -(d)) * (1 -(e))))),0)) m1 FROM  tab;

-- patched
4493     26.5591  postgres                 slot_deform_tuple
1327      7.8442  postgres                 AllocSetAlloc
1313      7.7614  postgres                 ExecMakeFunctionResultNoSets
1105      6.5319  postgres                 set_var_from_num_nocopy
924       5.4620  postgres                 make_result
637       3.7654  postgres                 mul_var
635       3.7536  postgres                 numeric_mul
560       3.3103  postgres                 MemoryContextAlloc
405       2.3940  postgres                 AllocSetFree
389       2.2995  postgres                 ExecEvalScalarVarFast
332       1.9625  postgres                 slot_getsomeattrs
322       1.9034  postgres                 numeric_add
313       1.8502  postgres                 add_abs
304       1.7970  postgres                 pfree
238       1.4069  postgres                 slot_getattr
216       1.2768  postgres                 numeric_sub
200       1.1822  postgres                 heap_tuple_untoast_attr
183       1.0818  postgres                 strip_var
180       1.0640  postgres                 ExecEvalConst
173       1.0226  postgres                 alloc_var
172       1.0167  postgres                 check_stack_depth

-- origin
4419     22.8325  postgres                 slot_deform_tuple
2041     10.5456  postgres                 AllocSetAlloc
1248      6.4483  postgres                 set_var_from_num
1186      6.1279  postgres                 ExecMakeFunctionResultNoSets
886       4.5779  postgres                 MemoryContextAlloc
856       4.4229  postgres                 make_result
757       3.9113  postgres                 numeric_mul
731       3.7770  postgres                 AllocSetFree
625       3.2293  postgres                 mul_var
601       3.1053  postgres                 alloc_var
545       2.8160  postgres                 pfree
503       2.5989  postgres                 free_var
458       2.3664  postgres                 slot_getsomeattrs
378       1.9531  postgres                 numeric_add
360       1.8601  postgres                 add_abs
336       1.7361  postgres                 ExecEvalScalarVarFast
266       1.3744  postgres                 slot_getattr
221       1.1419  postgres                 numeric_sub

Review:

1) this patch was clearly applied and compilation was without warning

2) regress tests -  All 133 tests passed.

4) don't see any memory leaks there (verified by following code)

CREATE OR REPLACE FUNCTION public.fx(_m integer)RETURNS voidLANGUAGE plpgsql
AS $function$
declare m numeric = 10;       n numeric = 20022.222; r numeric;
begin for i in 1.._m loop   r := m * n; end loop;
end;
$function$


postgres=# select fx(10000000);fx
----

(1 row)

Time: 5312.623 ms  --- original ( 4798.103 ms -- patched  10% speedup)

5) it respect PostgreSQL's coding standards

6) we would to accept this patch - it can carry 10% speedup
calculations with numerics.

Regards

Pavel Stehule

Re: review: Reduce palloc's in numeric operations

From

Heikki Linnakangas

Date:

20 November 2012, 20:00:53

On 19.11.2012 15:17, Pavel Stehule wrote:
> I tested this patch and I can confirm, so this patch can increase
> speed about 16-22% (depends on IO waits, load type).

Thanks for the review.

I spent some more time on this, continuing with the thought that perhaps
it would be better if get_str_from_var() didn't scribble on its input. I
ended up with the attached patch, which contains a bunch of small tweaks:

* Add init_var_from_num() function. This is the same as
set_var_from_num_nocopy in the original patch, but it doesn't require
the caller to have called init_var() first. IMHO this makes the calling
code slightly more readable. Also, it's now more evident what these vars
are: the digits array points to original array in the original Datum,
but 'buf' is NULL. This is the same arrangement that's used in the
constant NumericVars like const_zero.

* get_str_from_var() no longer scribbles on its input. I noticed that
it's always called with a dscale that comes from the input var itself.
In other words, the rounding was unnecessary to begin with. I simply
removed the dscale argument and the round_var() call from
get_str_from_var(). If a someone wants to display a string with
different dscale in the future, he can simply call round_var() before
get_str_from_var().

* numericvar_to_int8() no long scribbles on its input either. It creates
a temporary copy to avoid that. To compensate, the callers no longer
need to create a temporary copy, so the net # of pallocs is the same,
but this is nicer. (there's room for a micro-optimization to avoid
making the temporary copy numericvar_to_int8() when the argument is
already suitably rounded - I left that our for now, dunno if it would
make any difference in practice)

* use a constant for the number 10 in get_str_from_var_sci(), when
calculating 10^exponent. Saves a palloc() and some cycles to convert
integer 10 to numeric.

Comments? Assuming no-one sees some fatal flaw in this, I'll commit this
tomorrow.

- Heikki

Attachment

fasternumeric_v3-heikki.patch

Re: review: Reduce palloc's in numeric operations

From

Pavel Stehule

Date:

20 November 2012, 20:33:44

2012/11/20 Heikki Linnakangas <hlinnakangas@vmware.com>:
> On 19.11.2012 15:17, Pavel Stehule wrote:
>>
>> I tested this patch and I can confirm, so this patch can increase
>> speed about 16-22% (depends on IO waits, load type).
>
>
> Thanks for the review.
>
> I spent some more time on this, continuing with the thought that perhaps it
> would be better if get_str_from_var() didn't scribble on its input. I ended
> up with the attached patch, which contains a bunch of small tweaks:
>
> * Add init_var_from_num() function. This is the same as
> set_var_from_num_nocopy in the original patch, but it doesn't require the
> caller to have called init_var() first. IMHO this makes the calling code
> slightly more readable. Also, it's now more evident what these vars are: the
> digits array points to original array in the original Datum, but 'buf' is
> NULL. This is the same arrangement that's used in the constant NumericVars
> like const_zero.
>
> * get_str_from_var() no longer scribbles on its input. I noticed that it's
> always called with a dscale that comes from the input var itself. In other
> words, the rounding was unnecessary to begin with. I simply removed the
> dscale argument and the round_var() call from get_str_from_var(). If a
> someone wants to display a string with different dscale in the future, he
> can simply call round_var() before get_str_from_var().
>
> * numericvar_to_int8() no long scribbles on its input either. It creates a
> temporary copy to avoid that. To compensate, the callers no longer need to
> create a temporary copy, so the net # of pallocs is the same, but this is
> nicer. (there's room for a micro-optimization to avoid making the temporary
> copy numericvar_to_int8() when the argument is already suitably rounded - I
> left that our for now, dunno if it would make any difference in practice)
>
> * use a constant for the number 10 in get_str_from_var_sci(), when
> calculating 10^exponent. Saves a palloc() and some cycles to convert integer
> 10 to numeric.
>
> Comments? Assuming no-one sees some fatal flaw in this, I'll commit this
> tomorrow.

I have no objections

all regression tests passed, no warnings - has a sense

Regards

Pavel

>
> - Heikki

Re: review: Reduce palloc's in numeric operations

From

Tom Lane

Date:

20 November 2012, 22:44:45

Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> * Add init_var_from_num() function. This is the same as 
> set_var_from_num_nocopy in the original patch, but it doesn't require 
> the caller to have called init_var() first. IMHO this makes the calling 
> code slightly more readable. Also, it's now more evident what these vars 
> are: the digits array points to original array in the original Datum, 
> but 'buf' is NULL. This is the same arrangement that's used in the 
> constant NumericVars like const_zero.

init_var_from_num's header comment could use some more work.  The
statement that one "must not modify the returned var" is false in some
sense, since for instance numeric_ceil() does that.  The key is that you
have to replace the digit buffer not modify it in-place, but most
computational routines do that anyway.  Also it might be worth pointing
out explicitly that free_var() is not required unless the var is
modified subsequently.  (There are instances of both cases, and it might
not be clear to the reader why.)

> * get_str_from_var() no longer scribbles on its input. I noticed that 
> it's always called with a dscale that comes from the input var itself. 
> In other words, the rounding was unnecessary to begin with.

Hmm, it may have been necessary once upon a time, but I agree getting
rid of the rounding step is a win now.  Suggest adding a comment though
that the var is displayed to the number of digits indicated by its dscale.

Looks good otherwise.
        regards, tom lane

Re: review: Reduce palloc's in numeric operations

From

Heikki Linnakangas

Date:

21 November 2012, 17:17:57

On 20.11.2012 21:44, Tom Lane wrote:
> init_var_from_num's header comment could use some more work.  The
> statement that one "must not modify the returned var" is false in some
> sense, since for instance numeric_ceil() does that.  The key is that you
> have to replace the digit buffer not modify it in-place, but most
> computational routines do that anyway.  Also it might be worth pointing
> out explicitly that free_var() is not required unless the var is
> modified subsequently.  (There are instances of both cases, and it might
> not be clear to the reader why.)

Added those points to the comment.

>> * get_str_from_var() no longer scribbles on its input. I noticed that
>> it's always called with a dscale that comes from the input var itself.
>> In other words, the rounding was unnecessary to begin with.
>
> Hmm, it may have been necessary once upon a time, but I agree getting
> rid of the rounding step is a win now.  Suggest adding a comment though
> that the var is displayed to the number of digits indicated by its dscale.
>
> Looks good otherwise.

Committed, thanks to everyone involved.

- Heikki