review: Reduce palloc's in numeric operations
Hello all
I tested this patch and I can confirm, so this patch can increase
speed about 16-22% (depends on IO waits, load type).
I tested real query (anonymyzed)
SELECT SUM(COALESCE((a * b * c * (d + ((1 -(d)) * (1 -(e))))),0)) m1 FROM tab;
-- patched
4493 26.5591 postgres slot_deform_tuple
1327 7.8442 postgres AllocSetAlloc
1313 7.7614 postgres ExecMakeFunctionResultNoSets
1105 6.5319 postgres set_var_from_num_nocopy
924 5.4620 postgres make_result
637 3.7654 postgres mul_var
635 3.7536 postgres numeric_mul
560 3.3103 postgres MemoryContextAlloc
405 2.3940 postgres AllocSetFree
389 2.2995 postgres ExecEvalScalarVarFast
332 1.9625 postgres slot_getsomeattrs
322 1.9034 postgres numeric_add
313 1.8502 postgres add_abs
304 1.7970 postgres pfree
238 1.4069 postgres slot_getattr
216 1.2768 postgres numeric_sub
200 1.1822 postgres heap_tuple_untoast_attr
183 1.0818 postgres strip_var
180 1.0640 postgres ExecEvalConst
173 1.0226 postgres alloc_var
172 1.0167 postgres check_stack_depth
-- origin
4419 22.8325 postgres slot_deform_tuple
2041 10.5456 postgres AllocSetAlloc
1248 6.4483 postgres set_var_from_num
1186 6.1279 postgres ExecMakeFunctionResultNoSets
886 4.5779 postgres MemoryContextAlloc
856 4.4229 postgres make_result
757 3.9113 postgres numeric_mul
731 3.7770 postgres AllocSetFree
625 3.2293 postgres mul_var
601 3.1053 postgres alloc_var
545 2.8160 postgres pfree
503 2.5989 postgres free_var
458 2.3664 postgres slot_getsomeattrs
378 1.9531 postgres numeric_add
360 1.8601 postgres add_abs
336 1.7361 postgres ExecEvalScalarVarFast
266 1.3744 postgres slot_getattr
221 1.1419 postgres numeric_sub
Review:
1) this patch was clearly applied and compilation was without warning
2) regress tests - All 133 tests passed.
4) don't see any memory leaks there (verified by following code)
CREATE OR REPLACE FUNCTION public.fx(_m integer)
RETURNS void
LANGUAGE plpgsql
AS $function$
declare m numeric = 10;
n numeric = 20022.222;
r numeric;
begin
for i in 1.._m
loop
r := m * n;
end loop;
end;
$function$
postgres=# select fx(10000000);
fx
----
(1 row)
Time: 5312.623 ms --- original ( 4798.103 ms -- patched 10% speedup)
5) it respect PostgreSQL's coding standards
6) we would to accept this patch - it can carry 10% speedup
calculations with numerics.
Regards
Pavel Stehule
On 19.11.2012 15:17, Pavel Stehule wrote:
I tested this patch and I can confirm, so this patch can increase
speed about 16-22% (depends on IO waits, load type).
Thanks for the review.
I spent some more time on this, continuing with the thought that perhaps
it would be better if get_str_from_var() didn't scribble on its input. I
ended up with the attached patch, which contains a bunch of small tweaks:
* Add init_var_from_num() function. This is the same as
set_var_from_num_nocopy in the original patch, but it doesn't require
the caller to have called init_var() first. IMHO this makes the calling
code slightly more readable. Also, it's now more evident what these vars
are: the digits array points to original array in the original Datum,
but 'buf' is NULL. This is the same arrangement that's used in the
constant NumericVars like const_zero.
* get_str_from_var() no longer scribbles on its input. I noticed that
it's always called with a dscale that comes from the input var itself.
In other words, the rounding was unnecessary to begin with. I simply
removed the dscale argument and the round_var() call from
get_str_from_var(). If a someone wants to display a string with
different dscale in the future, he can simply call round_var() before
get_str_from_var().
* numericvar_to_int8() no long scribbles on its input either. It creates
a temporary copy to avoid that. To compensate, the callers no longer
need to create a temporary copy, so the net # of pallocs is the same,
but this is nicer. (there's room for a micro-optimization to avoid
making the temporary copy numericvar_to_int8() when the argument is
already suitably rounded - I left that our for now, dunno if it would
make any difference in practice)
* use a constant for the number 10 in get_str_from_var_sci(), when
calculating 10^exponent. Saves a palloc() and some cycles to convert
integer 10 to numeric.
Comments? Assuming no-one sees some fatal flaw in this, I'll commit this
tomorrow.
- Heikki
Attachments:
fasternumeric_v3-heikki.patchtext/x-diff; name=fasternumeric_v3-heikki.patchDownload+97-155
2012/11/20 Heikki Linnakangas <hlinnakangas@vmware.com>:
On 19.11.2012 15:17, Pavel Stehule wrote:
I tested this patch and I can confirm, so this patch can increase
speed about 16-22% (depends on IO waits, load type).Thanks for the review.
I spent some more time on this, continuing with the thought that perhaps it
would be better if get_str_from_var() didn't scribble on its input. I ended
up with the attached patch, which contains a bunch of small tweaks:* Add init_var_from_num() function. This is the same as
set_var_from_num_nocopy in the original patch, but it doesn't require the
caller to have called init_var() first. IMHO this makes the calling code
slightly more readable. Also, it's now more evident what these vars are: the
digits array points to original array in the original Datum, but 'buf' is
NULL. This is the same arrangement that's used in the constant NumericVars
like const_zero.* get_str_from_var() no longer scribbles on its input. I noticed that it's
always called with a dscale that comes from the input var itself. In other
words, the rounding was unnecessary to begin with. I simply removed the
dscale argument and the round_var() call from get_str_from_var(). If a
someone wants to display a string with different dscale in the future, he
can simply call round_var() before get_str_from_var().* numericvar_to_int8() no long scribbles on its input either. It creates a
temporary copy to avoid that. To compensate, the callers no longer need to
create a temporary copy, so the net # of pallocs is the same, but this is
nicer. (there's room for a micro-optimization to avoid making the temporary
copy numericvar_to_int8() when the argument is already suitably rounded - I
left that our for now, dunno if it would make any difference in practice)* use a constant for the number 10 in get_str_from_var_sci(), when
calculating 10^exponent. Saves a palloc() and some cycles to convert integer
10 to numeric.Comments? Assuming no-one sees some fatal flaw in this, I'll commit this
tomorrow.
I have no objections
all regression tests passed, no warnings - has a sense
Regards
Pavel
Show quoted text
- Heikki
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
* Add init_var_from_num() function. This is the same as
set_var_from_num_nocopy in the original patch, but it doesn't require
the caller to have called init_var() first. IMHO this makes the calling
code slightly more readable. Also, it's now more evident what these vars
are: the digits array points to original array in the original Datum,
but 'buf' is NULL. This is the same arrangement that's used in the
constant NumericVars like const_zero.
init_var_from_num's header comment could use some more work. The
statement that one "must not modify the returned var" is false in some
sense, since for instance numeric_ceil() does that. The key is that you
have to replace the digit buffer not modify it in-place, but most
computational routines do that anyway. Also it might be worth pointing
out explicitly that free_var() is not required unless the var is
modified subsequently. (There are instances of both cases, and it might
not be clear to the reader why.)
* get_str_from_var() no longer scribbles on its input. I noticed that
it's always called with a dscale that comes from the input var itself.
In other words, the rounding was unnecessary to begin with.
Hmm, it may have been necessary once upon a time, but I agree getting
rid of the rounding step is a win now. Suggest adding a comment though
that the var is displayed to the number of digits indicated by its dscale.
Looks good otherwise.
regards, tom lane
On 20.11.2012 21:44, Tom Lane wrote:
init_var_from_num's header comment could use some more work. The
statement that one "must not modify the returned var" is false in some
sense, since for instance numeric_ceil() does that. The key is that you
have to replace the digit buffer not modify it in-place, but most
computational routines do that anyway. Also it might be worth pointing
out explicitly that free_var() is not required unless the var is
modified subsequently. (There are instances of both cases, and it might
not be clear to the reader why.)
Added those points to the comment.
* get_str_from_var() no longer scribbles on its input. I noticed that
it's always called with a dscale that comes from the input var itself.
In other words, the rounding was unnecessary to begin with.Hmm, it may have been necessary once upon a time, but I agree getting
rid of the rounding step is a win now. Suggest adding a comment though
that the var is displayed to the number of digits indicated by its dscale.Looks good otherwise.
Committed, thanks to everyone involved.
- Heikki