Thread: review: Reduce palloc's in numeric operations
Hello all I tested this patch and I can confirm, so this patch can increase speed about 16-22% (depends on IO waits, load type). I tested real query (anonymyzed) SELECT SUM(COALESCE((a * b * c * (d + ((1 -(d)) * (1 -(e))))),0)) m1 FROM tab; -- patched 4493 26.5591 postgres slot_deform_tuple 1327 7.8442 postgres AllocSetAlloc 1313 7.7614 postgres ExecMakeFunctionResultNoSets 1105 6.5319 postgres set_var_from_num_nocopy 924 5.4620 postgres make_result 637 3.7654 postgres mul_var 635 3.7536 postgres numeric_mul 560 3.3103 postgres MemoryContextAlloc 405 2.3940 postgres AllocSetFree 389 2.2995 postgres ExecEvalScalarVarFast 332 1.9625 postgres slot_getsomeattrs 322 1.9034 postgres numeric_add 313 1.8502 postgres add_abs 304 1.7970 postgres pfree 238 1.4069 postgres slot_getattr 216 1.2768 postgres numeric_sub 200 1.1822 postgres heap_tuple_untoast_attr 183 1.0818 postgres strip_var 180 1.0640 postgres ExecEvalConst 173 1.0226 postgres alloc_var 172 1.0167 postgres check_stack_depth -- origin 4419 22.8325 postgres slot_deform_tuple 2041 10.5456 postgres AllocSetAlloc 1248 6.4483 postgres set_var_from_num 1186 6.1279 postgres ExecMakeFunctionResultNoSets 886 4.5779 postgres MemoryContextAlloc 856 4.4229 postgres make_result 757 3.9113 postgres numeric_mul 731 3.7770 postgres AllocSetFree 625 3.2293 postgres mul_var 601 3.1053 postgres alloc_var 545 2.8160 postgres pfree 503 2.5989 postgres free_var 458 2.3664 postgres slot_getsomeattrs 378 1.9531 postgres numeric_add 360 1.8601 postgres add_abs 336 1.7361 postgres ExecEvalScalarVarFast 266 1.3744 postgres slot_getattr 221 1.1419 postgres numeric_sub Review: 1) this patch was clearly applied and compilation was without warning 2) regress tests - All 133 tests passed. 4) don't see any memory leaks there (verified by following code) CREATE OR REPLACE FUNCTION public.fx(_m integer)RETURNS voidLANGUAGE plpgsql AS $function$ declare m numeric = 10; n numeric = 20022.222; r numeric; begin for i in 1.._m loop r := m * n; end loop; end; $function$ postgres=# select fx(10000000);fx ---- (1 row) Time: 5312.623 ms --- original ( 4798.103 ms -- patched 10% speedup) 5) it respect PostgreSQL's coding standards 6) we would to accept this patch - it can carry 10% speedup calculations with numerics. Regards Pavel Stehule
On 19.11.2012 15:17, Pavel Stehule wrote: > I tested this patch and I can confirm, so this patch can increase > speed about 16-22% (depends on IO waits, load type). Thanks for the review. I spent some more time on this, continuing with the thought that perhaps it would be better if get_str_from_var() didn't scribble on its input. I ended up with the attached patch, which contains a bunch of small tweaks: * Add init_var_from_num() function. This is the same as set_var_from_num_nocopy in the original patch, but it doesn't require the caller to have called init_var() first. IMHO this makes the calling code slightly more readable. Also, it's now more evident what these vars are: the digits array points to original array in the original Datum, but 'buf' is NULL. This is the same arrangement that's used in the constant NumericVars like const_zero. * get_str_from_var() no longer scribbles on its input. I noticed that it's always called with a dscale that comes from the input var itself. In other words, the rounding was unnecessary to begin with. I simply removed the dscale argument and the round_var() call from get_str_from_var(). If a someone wants to display a string with different dscale in the future, he can simply call round_var() before get_str_from_var(). * numericvar_to_int8() no long scribbles on its input either. It creates a temporary copy to avoid that. To compensate, the callers no longer need to create a temporary copy, so the net # of pallocs is the same, but this is nicer. (there's room for a micro-optimization to avoid making the temporary copy numericvar_to_int8() when the argument is already suitably rounded - I left that our for now, dunno if it would make any difference in practice) * use a constant for the number 10 in get_str_from_var_sci(), when calculating 10^exponent. Saves a palloc() and some cycles to convert integer 10 to numeric. Comments? Assuming no-one sees some fatal flaw in this, I'll commit this tomorrow. - Heikki
Attachment
2012/11/20 Heikki Linnakangas <hlinnakangas@vmware.com>: > On 19.11.2012 15:17, Pavel Stehule wrote: >> >> I tested this patch and I can confirm, so this patch can increase >> speed about 16-22% (depends on IO waits, load type). > > > Thanks for the review. > > I spent some more time on this, continuing with the thought that perhaps it > would be better if get_str_from_var() didn't scribble on its input. I ended > up with the attached patch, which contains a bunch of small tweaks: > > * Add init_var_from_num() function. This is the same as > set_var_from_num_nocopy in the original patch, but it doesn't require the > caller to have called init_var() first. IMHO this makes the calling code > slightly more readable. Also, it's now more evident what these vars are: the > digits array points to original array in the original Datum, but 'buf' is > NULL. This is the same arrangement that's used in the constant NumericVars > like const_zero. > > * get_str_from_var() no longer scribbles on its input. I noticed that it's > always called with a dscale that comes from the input var itself. In other > words, the rounding was unnecessary to begin with. I simply removed the > dscale argument and the round_var() call from get_str_from_var(). If a > someone wants to display a string with different dscale in the future, he > can simply call round_var() before get_str_from_var(). > > * numericvar_to_int8() no long scribbles on its input either. It creates a > temporary copy to avoid that. To compensate, the callers no longer need to > create a temporary copy, so the net # of pallocs is the same, but this is > nicer. (there's room for a micro-optimization to avoid making the temporary > copy numericvar_to_int8() when the argument is already suitably rounded - I > left that our for now, dunno if it would make any difference in practice) > > * use a constant for the number 10 in get_str_from_var_sci(), when > calculating 10^exponent. Saves a palloc() and some cycles to convert integer > 10 to numeric. > > Comments? Assuming no-one sees some fatal flaw in this, I'll commit this > tomorrow. I have no objections all regression tests passed, no warnings - has a sense Regards Pavel > > - Heikki
Heikki Linnakangas <hlinnakangas@vmware.com> writes: > * Add init_var_from_num() function. This is the same as > set_var_from_num_nocopy in the original patch, but it doesn't require > the caller to have called init_var() first. IMHO this makes the calling > code slightly more readable. Also, it's now more evident what these vars > are: the digits array points to original array in the original Datum, > but 'buf' is NULL. This is the same arrangement that's used in the > constant NumericVars like const_zero. init_var_from_num's header comment could use some more work. The statement that one "must not modify the returned var" is false in some sense, since for instance numeric_ceil() does that. The key is that you have to replace the digit buffer not modify it in-place, but most computational routines do that anyway. Also it might be worth pointing out explicitly that free_var() is not required unless the var is modified subsequently. (There are instances of both cases, and it might not be clear to the reader why.) > * get_str_from_var() no longer scribbles on its input. I noticed that > it's always called with a dscale that comes from the input var itself. > In other words, the rounding was unnecessary to begin with. Hmm, it may have been necessary once upon a time, but I agree getting rid of the rounding step is a win now. Suggest adding a comment though that the var is displayed to the number of digits indicated by its dscale. Looks good otherwise. regards, tom lane
On 20.11.2012 21:44, Tom Lane wrote: > init_var_from_num's header comment could use some more work. The > statement that one "must not modify the returned var" is false in some > sense, since for instance numeric_ceil() does that. The key is that you > have to replace the digit buffer not modify it in-place, but most > computational routines do that anyway. Also it might be worth pointing > out explicitly that free_var() is not required unless the var is > modified subsequently. (There are instances of both cases, and it might > not be clear to the reader why.) Added those points to the comment. >> * get_str_from_var() no longer scribbles on its input. I noticed that >> it's always called with a dscale that comes from the input var itself. >> In other words, the rounding was unnecessary to begin with. > > Hmm, it may have been necessary once upon a time, but I agree getting > rid of the rounding step is a win now. Suggest adding a comment though > that the var is displayed to the number of digits indicated by its dscale. > > Looks good otherwise. Committed, thanks to everyone involved. - Heikki