In PG12, query with float calculations is slower than PG11

Started by keisuke kurodaalmost 6 years ago37 messages

keisuke kuroda

keisuke.kuroda.3862@gmail.com

almost 6 years ago

1 attachment(s)

Hi,

I am testing performance both PG12 and PG11.
I found the case of performance degradation in PG12.

Amit Langote help me to analyze and to create patch.
Thanks!

* environment

CentOS Linux release 7.6.1810 (Core)
postgresql 12.1
postgresql 11.6

* postgresql.conf

shared_buffers = 2048MB
max_parallel_workers_per_gather = 0
work_mem = '64MB'
jit = off

* test case

CREATE TABLE realtest(a real, b real, c real, d real, e real);
INSERT INTO realtest SELECT i,i,i,i,i FROM generate_series(0,10000000) AS i;

EXPLAIN (ANALYZE on, VERBOSE on, BUFFERS on)
select (2 * a) , (2 * b) , (2 * c), (2 * d), (2 * e)
from realtest;

* result

PG12.1 5878.389 ms
PG11.6 4533.554 ms

** PostgreSQL 12.1

pgbench=# EXPLAIN (ANALYZE on, VERBOSE on, BUFFERS on)
select (2 * a) , (2 * b) , (2 * c), (2 * d), (2 * e)
from realtest;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.realtest (cost=0.00..288697.59 rows=10000115 width=40)
(actual time=0.040..5195.328 rows=10000001 loops=1)
Output: ('2'::double precision * a), ('2'::double precision * b),
('2'::double precision * c), ('2'::double precision * d), ('2'::double
precision * e)
Buffers: shared hit=63695
Planning Time: 0.051 ms
Execution Time: 5878.389 ms
(5 行)

Samples: 6K of event 'cpu-clock', Event count (approx.): 1577750000
Overhead Command Shared Object Symbol
25.48% postgres postgres [.] ExecInterpExpr
★18.65% postgres libc-2.17.so [.] __isinf
14.36% postgres postgres [.] float84mul
8.54% postgres [vdso] [.] __vdso_clock_gettime
4.02% postgres postgres [.] ExecScan
3.69% postgres postgres [.] tts_buffer_heap_getsomeattrs
2.63% postgres libc-2.17.so [.] __clock_gettime
2.55% postgres postgres [.] HeapTupleSatisfiesVisibility
2.00% postgres postgres [.] heapgettup_pagemode

** PostgreSQL 11.6

pgbench=# EXPLAIN (ANALYZE on, VERBOSE on, BUFFERS on)
select (2 * a) , (2 * b) , (2 * c), (2 * d), (2 * e)
from realtest;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.realtest (cost=0.00..288697.59 rows=10000115 width=40)
(actual time=0.012..3845.480 rows=10000001 loops=1)
Output: ('2'::double precision * a), ('2'::double precision * b),
('2'::double precision * c), ('2'::double precision * d), ('2'::double
precision * e)
Buffers: shared hit=63695
Planning Time: 0.033 ms
Execution Time: 4533.554 ms
(5 行)

Samples: 4K of event 'cpu-clock', Event count (approx.): 1192000000
Overhead Command Shared Object Symbol
32.30% postgres postgres [.] ExecInterpExpr
14.95% postgres postgres [.] float84mul
10.57% postgres [vdso] [.] __vdso_clock_gettime
★ 6.84% postgres libc-2.17.so [.] __isinf
3.96% postgres postgres [.] ExecScan
3.50% postgres libc-2.17.so [.] __clock_gettime
3.31% postgres postgres [.] heap_getnext
3.08% postgres postgres [.] HeapTupleSatisfiesMVCC
2.77% postgres postgres [.] slot_deform_tuple
2.37% postgres postgres [.] ExecProcNodeInstr
2.08% postgres postgres [.] standard_ExecutorRun

* cause

Obviously, even in common cases where no overflow occurs,
you can tell that PG 12 is performing isinf() 3 times on every call of
float8_mul() once for each of val1, val2, result where as PG 11
is performing only once for result.

That's because check_float8_val() (in PG 12) is a function
whose arguments must be evaluated before
it is called (it is inline, but that's irrelevant),
whereas CHECKFLOATVAL() (in PG11) is a macro
whose arguments are only substituted into its body.

By the way, this change of float8mul() implementation is
mostly due to the following commit in PG 12 development cycle:
commit 6bf0bc842bd75877e31727eb559c6a69e237f831

Especially, the following diff:

@@ -894,13 +746,8 @@ float8mul(PG_FUNCTION_ARGS)  {
    float8      arg1 = PG_GETARG_FLOAT8(0);
    float8      arg2 = PG_GETARG_FLOAT8(1);
-   float8      result;
-
-   result = arg1 * arg2;

-   CHECKFLOATVAL(result, isinf(arg1) || isinf(arg2),
-                 arg1 == 0 || arg2 == 0);
-   PG_RETURN_FLOAT8(result);
+   PG_RETURN_FLOAT8(float8_mul(arg1, arg2));
 }

* patch

This patch uses MACRO which was used by PG11.
I tried attached patch, which can be applied to PG 12 source and performed
a benchmark:

PG12.1 5878.389 ms
PG11.6 4533.554 ms

PG12.1 + Patch 4679.162 ms

** PostgreSQL 12.1 + Patch

postgres=# EXPLAIN (ANALYZE on, VERBOSE on, BUFFERS on)
select (2 * a) , (2 * b) , (2 * c), (2 * d), (2 * e)
from realtest;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.realtest (cost=0.00..307328.38 rows=10828150 width=40)
(actual time=0.012..4009.012 rows=10000001 loops=1)
Output: ('2'::double precision * a), ('2'::double precision * b),
('2'::double precision * c), ('2'::double precision * d), ('2'::double
precision * e)
Buffers: shared hit=63695
Planning Time: 0.038 ms
Execution Time: 4679.162 ms
(5 rows)

Samples: 5K of event 'cpu-clock', Event count (approx.): 1376750000
Overhead Command Shared Object Symbol
31.43% postgres postgres [.] ExecInterpExpr
14.24% postgres postgres [.] float84mul
10.40% postgres [vdso] [.] __vdso_clock_gettime
★ 5.41% postgres libc-2.17.so [.] __isinf
4.63% postgres postgres [.] tts_buffer_heap_getsomeattrs
4.03% postgres postgres [.] ExecScan
3.54% postgres libc-2.17.so [.] __clock_gettime
3.12% postgres postgres [.] HeapTupleSatisfiesVisibility
2.36% postgres postgres [.] heap_getnextslot
2.16% postgres postgres [.] heapgettup_pagemode
2.09% postgres postgres [.] standard_ExecutorRun
2.07% postgres postgres [.] SeqNext
2.03% postgres postgres [.] ExecProcNodeInstr
2.03% postgres postgres [.] tts_virtual_clear

PG 12 is still slower compared to PG 11, but the __isinf() situation is
better with the patch.

Best Regards,
Keisuke Kuroda

Attachments:

check-float-val.patchapplication/octet-stream; name=check-float-val.patchDownload

diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
index a90d4db215..5885719850 100644
--- a/src/backend/utils/adt/float.c
+++ b/src/backend/utils/adt/float.c
@@ -1191,7 +1191,7 @@ dtof(PG_FUNCTION_ARGS)
 {
 	float8		num = PG_GETARG_FLOAT8(0);
 
-	check_float4_val((float4) num, isinf(num), num == 0);
+	CHECKFLOATVAL((float4) num, isinf(num), num == 0);
 
 	PG_RETURN_FLOAT4((float4) num);
 }
@@ -1445,7 +1445,7 @@ dsqrt(PG_FUNCTION_ARGS)
 
 	result = sqrt(arg1);
 
-	check_float8_val(result, isinf(arg1), arg1 == 0);
+	CHECKFLOATVAL(result, isinf(arg1), arg1 == 0);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -1460,7 +1460,7 @@ dcbrt(PG_FUNCTION_ARGS)
 	float8		result;
 
 	result = cbrt(arg1);
-	check_float8_val(result, isinf(arg1), arg1 == 0);
+	CHECKFLOATVAL(result, isinf(arg1), arg1 == 0);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -1532,7 +1532,7 @@ dpow(PG_FUNCTION_ARGS)
 	else if (errno == ERANGE && result != 0 && !isinf(result))
 		result = get_float8_infinity();
 
-	check_float8_val(result, isinf(arg1) || isinf(arg2), arg1 == 0);
+	CHECKFLOATVAL(result, isinf(arg1) || isinf(arg2), arg1 == 0);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -1551,7 +1551,7 @@ dexp(PG_FUNCTION_ARGS)
 	if (errno == ERANGE && result != 0 && !isinf(result))
 		result = get_float8_infinity();
 
-	check_float8_val(result, isinf(arg1), false);
+	CHECKFLOATVAL(result, isinf(arg1), false);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -1580,7 +1580,7 @@ dlog1(PG_FUNCTION_ARGS)
 
 	result = log(arg1);
 
-	check_float8_val(result, isinf(arg1), arg1 == 1);
+	CHECKFLOATVAL(result, isinf(arg1), arg1 == 1);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -1610,7 +1610,7 @@ dlog10(PG_FUNCTION_ARGS)
 
 	result = log10(arg1);
 
-	check_float8_val(result, isinf(arg1), arg1 == 1);
+	CHECKFLOATVAL(result, isinf(arg1), arg1 == 1);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -1640,7 +1640,7 @@ dacos(PG_FUNCTION_ARGS)
 
 	result = acos(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -1670,7 +1670,7 @@ dasin(PG_FUNCTION_ARGS)
 
 	result = asin(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -1695,7 +1695,7 @@ datan(PG_FUNCTION_ARGS)
 	 */
 	result = atan(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -1720,7 +1720,7 @@ datan2(PG_FUNCTION_ARGS)
 	 */
 	result = atan2(arg1, arg2);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -1760,7 +1760,7 @@ dcos(PG_FUNCTION_ARGS)
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -1787,7 +1787,7 @@ dcot(PG_FUNCTION_ARGS)
 				 errmsg("input is out of range")));
 
 	result = 1.0 / result;
-	check_float8_val(result, true /* cot(0) == Inf */ , true);
+	CHECKFLOATVAL(result, true /* cot(0) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -1813,7 +1813,7 @@ dsin(PG_FUNCTION_ARGS)
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -1839,7 +1839,7 @@ dtan(PG_FUNCTION_ARGS)
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
-	check_float8_val(result, true /* tan(pi/2) == Inf */ , true);
+	CHECKFLOATVAL(result, true /* tan(pi/2) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -1991,7 +1991,7 @@ dacosd(PG_FUNCTION_ARGS)
 	else
 		result = 90.0 + asind_q1(-arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -2026,7 +2026,7 @@ dasind(PG_FUNCTION_ARGS)
 	else
 		result = -asind_q1(-arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -2056,7 +2056,7 @@ datand(PG_FUNCTION_ARGS)
 	atan_arg1 = atan(arg1);
 	result = (atan_arg1 / atan_1_0) * 45.0;
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -2090,7 +2090,7 @@ datan2d(PG_FUNCTION_ARGS)
 	atan2_arg1_arg2 = atan2(arg1, arg2);
 	result = (atan2_arg1_arg2 / atan_1_0) * 45.0;
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -2211,7 +2211,7 @@ dcosd(PG_FUNCTION_ARGS)
 
 	result = sign * cosd_q1(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -2276,7 +2276,7 @@ dcotd(PG_FUNCTION_ARGS)
 	if (result == 0.0)
 		result = 0.0;
 
-	check_float8_val(result, true /* cotd(0) == Inf */ , true);
+	CHECKFLOATVAL(result, true /* cotd(0) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -2330,7 +2330,7 @@ dsind(PG_FUNCTION_ARGS)
 
 	result = sign * sind_q1(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -2395,7 +2395,7 @@ dtand(PG_FUNCTION_ARGS)
 	if (result == 0.0)
 		result = 0.0;
 
-	check_float8_val(result, true /* tand(90) == Inf */ , true);
+	CHECKFLOATVAL(result, true /* tand(90) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -2462,7 +2462,7 @@ dsinh(PG_FUNCTION_ARGS)
 			result = get_float8_infinity();
 	}
 
-	check_float8_val(result, true, true);
+	CHECKFLOATVAL(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -2486,7 +2486,7 @@ dcosh(PG_FUNCTION_ARGS)
 	if (errno == ERANGE)
 		result = get_float8_infinity();
 
-	check_float8_val(result, true, false);
+	CHECKFLOATVAL(result, true, false);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -2504,7 +2504,7 @@ dtanh(PG_FUNCTION_ARGS)
 	 */
 	result = tanh(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -2522,7 +2522,7 @@ dasinh(PG_FUNCTION_ARGS)
 	 */
 	result = asinh(arg1);
 
-	check_float8_val(result, true, true);
+	CHECKFLOATVAL(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -2548,7 +2548,7 @@ dacosh(PG_FUNCTION_ARGS)
 
 	result = acosh(arg1);
 
-	check_float8_val(result, true, true);
+	CHECKFLOATVAL(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -2583,7 +2583,7 @@ datanh(PG_FUNCTION_ARGS)
 	else
 		result = atanh(arg1);
 
-	check_float8_val(result, true, true);
+	CHECKFLOATVAL(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
@@ -2784,7 +2784,7 @@ float8_combine(PG_FUNCTION_ARGS)
 		Sx = float8_pl(Sx1, Sx2);
 		tmp = Sx1 / N1 - Sx2 / N2;
 		Sxx = Sxx1 + Sxx2 + N1 * N2 * tmp * tmp / N;
-		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+		CHECKFLOATVAL(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
 	}
 
 	/*
@@ -3294,13 +3294,13 @@ float8_regr_combine(PG_FUNCTION_ARGS)
 		Sx = float8_pl(Sx1, Sx2);
 		tmp1 = Sx1 / N1 - Sx2 / N2;
 		Sxx = Sxx1 + Sxx2 + N1 * N2 * tmp1 * tmp1 / N;
-		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+		CHECKFLOATVAL(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
 		Sy = float8_pl(Sy1, Sy2);
 		tmp2 = Sy1 / N1 - Sy2 / N2;
 		Syy = Syy1 + Syy2 + N1 * N2 * tmp2 * tmp2 / N;
-		check_float8_val(Syy, isinf(Syy1) || isinf(Syy2), true);
+		CHECKFLOATVAL(Syy, isinf(Syy1) || isinf(Syy2), true);
 		Sxy = Sxy1 + Sxy2 + N1 * N2 * tmp1 * tmp2 / N;
-		check_float8_val(Sxy, isinf(Sxy1) || isinf(Sxy2), true);
+		CHECKFLOATVAL(Sxy, isinf(Sxy1) || isinf(Sxy2), true);
 	}
 
 	/*
diff --git a/src/include/utils/float.h b/src/include/utils/float.h
index e2c5dc0f57..f692abab5b 100644
--- a/src/include/utils/float.h
+++ b/src/include/utils/float.h
@@ -32,6 +32,22 @@ static const uint32 nan[2] = {0xffffffff, 0x7fffffff};
 #define NAN (*(const float8 *) nan)
 #endif
 
+/*
+ * check to see if a float4/8 val has underflowed or overflowed
+ */
+#define CHECKFLOATVAL(val, inf_is_valid, zero_is_valid)			\
+do {															\
+	if (isinf(val) && !(inf_is_valid))							\
+		ereport(ERROR,											\
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),	\
+		  errmsg("value out of range: overflow")));				\
+																\
+	if ((val) == 0.0 && !(zero_is_valid))						\
+		ereport(ERROR,											\
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),	\
+		 errmsg("value out of range: underflow")));				\
+} while(0)
+
 extern PGDLLIMPORT int extra_float_digits;
 
 /*
@@ -130,6 +146,7 @@ get_float8_nan(void)
 
 /*
  * Checks to see if a float4/8 val has underflowed or overflowed
+ * Note: Deprecated; use CHECKFLOATVAL() macro instead!
  */
 
 static inline void
@@ -178,7 +195,7 @@ float4_pl(const float4 val1, const float4 val2)
 	float4		result;
 
 	result = val1 + val2;
-	check_float4_val(result, isinf(val1) || isinf(val2), true);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), true);
 
 	return result;
 }
@@ -189,7 +206,7 @@ float8_pl(const float8 val1, const float8 val2)
 	float8		result;
 
 	result = val1 + val2;
-	check_float8_val(result, isinf(val1) || isinf(val2), true);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), true);
 
 	return result;
 }
@@ -200,7 +217,7 @@ float4_mi(const float4 val1, const float4 val2)
 	float4		result;
 
 	result = val1 - val2;
-	check_float4_val(result, isinf(val1) || isinf(val2), true);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), true);
 
 	return result;
 }
@@ -211,7 +228,7 @@ float8_mi(const float8 val1, const float8 val2)
 	float8		result;
 
 	result = val1 - val2;
-	check_float8_val(result, isinf(val1) || isinf(val2), true);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), true);
 
 	return result;
 }
@@ -222,7 +239,7 @@ float4_mul(const float4 val1, const float4 val2)
 	float4		result;
 
 	result = val1 * val2;
-	check_float4_val(result, isinf(val1) || isinf(val2),
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2),
 					 val1 == 0.0f || val2 == 0.0f);
 
 	return result;
@@ -234,7 +251,7 @@ float8_mul(const float8 val1, const float8 val2)
 	float8		result;
 
 	result = val1 * val2;
-	check_float8_val(result, isinf(val1) || isinf(val2),
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2),
 					 val1 == 0.0 || val2 == 0.0);
 
 	return result;
@@ -251,7 +268,7 @@ float4_div(const float4 val1, const float4 val2)
 				 errmsg("division by zero")));
 
 	result = val1 / val2;
-	check_float4_val(result, isinf(val1) || isinf(val2), val1 == 0.0f);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), val1 == 0.0f);
 
 	return result;
 }
@@ -267,7 +284,7 @@ float8_div(const float8 val1, const float8 val2)
 				 errmsg("division by zero")));
 
 	result = val1 / val2;
-	check_float8_val(result, isinf(val1) || isinf(val2), val1 == 0.0);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), val1 == 0.0);
 
 	return result;
 }

Andres Freund

andres@anarazel.de

almost 6 years ago

In reply to: keisuke kuroda (#1)

Re: In PG12, query with float calculations is slower than PG11

Hi,

On 2020-02-06 14:25:03 +0900, keisuke kuroda wrote:

That's because check_float8_val() (in PG 12) is a function
whose arguments must be evaluated before
it is called (it is inline, but that's irrelevant),
whereas CHECKFLOATVAL() (in PG11) is a macro
whose arguments are only substituted into its body.

Hm - it's not that clear to me that it is irrelevant that the function
gets inlined. The compiler should know that isinf is side-effect free,
and that it doesn't have to evaluate before necessary.

Normally isinf is implemented by a compiler intrisic within the system
headers. But not in your profile:

★ 5.41% postgres libc-2.17.so [.] __isinf

I checked, and I don't see any references to isinf from within float.c
(looking at the disassembly - there's some debug strings containing the
word, but that's it).

What compiler & compiler version on what kind of architecture is this?

Greetings,

Andres Freund

Amit Langote

amitlangote09@gmail.com

almost 6 years ago

In reply to: Andres Freund (#2)

Re: In PG12, query with float calculations is slower than PG11

Hi,

On Thu, Feb 6, 2020 at 2:55 PM Andres Freund <andres@anarazel.de> wrote:

On 2020-02-06 14:25:03 +0900, keisuke kuroda wrote:

That's because check_float8_val() (in PG 12) is a function
whose arguments must be evaluated before
it is called (it is inline, but that's irrelevant),
whereas CHECKFLOATVAL() (in PG11) is a macro
whose arguments are only substituted into its body.

Hm - it's not that clear to me that it is irrelevant that the function
gets inlined. The compiler should know that isinf is side-effect free,
and that it doesn't have to evaluate before necessary.

Normally isinf is implemented by a compiler intrisic within the system
headers. But not in your profile:

★ 5.41% postgres libc-2.17.so [.] __isinf

I checked, and I don't see any references to isinf from within float.c
(looking at the disassembly - there's some debug strings containing the
word, but that's it).

What compiler & compiler version on what kind of architecture is this?

As Kuroda-san mentioned, I also checked the behavior that he reports.
The compiler I used is an ancient one (CentOS 7 default):

$ gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)

Compiler dependent behavior of inlining might be relevant here, but
there is one more thing to consider. The if () condition in
check_float8_val (PG 12) and CHECKFLOATVAL (PG 11) is calculated
differently, causing isinf() to be called more times in PG 12:

static inline void
check_float8_val(const float8 val, const bool inf_is_valid,
const bool zero_is_valid)
{
if (!inf_is_valid && unlikely(isinf(val)))
ereport(ERROR,
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
errmsg("value out of range: overflow")));

#define CHECKFLOATVAL(val, inf_is_valid, zero_is_valid) \
do { \
if (isinf(val) && !(inf_is_valid)) \
ereport(ERROR, \
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), \
errmsg("value out of range: overflow"))); \

called thusly:

check_float8_val(result, isinf(val1) || isinf(val2),
val1 == 0.0 || val2 == 0.0);

and

CHECKFLOATVAL(result, isinf(arg1) || isinf(arg2),
arg1 == 0 || arg2 == 0);

from float8_mul() and float8mul() in PG 12 and PG 11, respectively.

You may notice that the if () condition is reversed, so while PG 12
calculates isinf(arg1) || isinf(arg2) first and isinf(result) only if
the first is false, which it is in most cases, PG 11 calculates
isinf(result) first, followed by isinf(arg1) || isinf(arg2) if the
former is true. I don't understand why such reversal was necessary,
but it appears to be the main factor behind this slowdown. So, even
if PG 12's check_float8_val() is perfectly inlined, this slowdown
couldn't be helped.

Thanks,
Amit

Tom Lane

tgl@sss.pgh.pa.us

almost 6 years ago

In reply to: Amit Langote (#3)

Re: In PG12, query with float calculations is slower than PG11

So it appears to me that what commit 6bf0bc842 did in this area was
not just wrong, but disastrously so. Before that, we had a macro that
evaluated isinf(val) before it evaluated the inf_is_valid condition.
Now we have check_float[48]_val which do it the other way around.
That would be okay if the inf_is_valid condition were cheap to
evaluate, but in common code paths it's actually twice as expensive
as isinf().

Andres seems to be of the opinion that the compiler should be willing
to ignore the semantic requirements of the C standard in order
to rearrange the code back into the cheaper order. That sounds like
wishful thinking to me ... even if it actually works on his compiler,
it certainly isn't going to work for everyone.

The patch looks unduly invasive to me, but I think that it might be
right that we should go back to a macro-based implementation, because
otherwise we don't have a good way to be certain that the function
parameter won't get evaluated first. (Another reason to do so is
so that the file/line numbers generated for the error reports go back
to being at least a little bit useful.) We could use local variables
within the macro to avoid double evals, if anyone thinks that's
actually important --- I don't.

I think the current code is probably also misusing unlikely(),
and that the right way would be more like

if (unlikely(isinf(val) && !(inf_is_valid)))

regards, tom lane

Robert Haas

robertmhaas@gmail.com

almost 6 years ago

In reply to: Tom Lane (#4)

Re: In PG12, query with float calculations is slower than PG11

On Thu, Feb 6, 2020 at 11:04 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

So it appears to me that what commit 6bf0bc842 did in this area was
not just wrong, but disastrously so. Before that, we had a macro that
evaluated isinf(val) before it evaluated the inf_is_valid condition.
Now we have check_float[48]_val which do it the other way around.
That would be okay if the inf_is_valid condition were cheap to
evaluate, but in common code paths it's actually twice as expensive
as isinf().

Well, if the previous coding was a deliberate attempt to dodge this
performance issue, the evidence seems to be well-concealed. Neither
the comments for that macro nor the related commit messages make any
mention of it. When subtle things like this are performance-critical,
good comments are pretty critical, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Andres Freund

andres@anarazel.de

almost 6 years ago

In reply to: Tom Lane (#4)

Re: In PG12, query with float calculations is slower than PG11

Hi,

On 2020-02-06 11:03:51 -0500, Tom Lane wrote:

Andres seems to be of the opinion that the compiler should be willing
to ignore the semantic requirements of the C standard in order
to rearrange the code back into the cheaper order. That sounds like
wishful thinking to me ... even if it actually works on his compiler,
it certainly isn't going to work for everyone.

Sorry, but, uh, what are you talking about? Please tell me which single
standards violation I'm advocating for?

I was asking about the inlining bit because the first email of the topic
explained that as the problem, which I don't believe can be the full
explanation - and it turns out it isn't. As Amit Langote's followup
email explained, there's the whole issue of the order of checks being
inverted - which is clearly bad. And wholly unrelated to inlining.

And I asked about __isinf() being used because there are issues with
accidentally ending up with the non-intrinsic version of isinf() when
not using gcc, due to badly written standard library headers.

The patch looks unduly invasive to me, but I think that it might be
right that we should go back to a macro-based implementation, because
otherwise we don't have a good way to be certain that the function
parameter won't get evaluated first.

I'd first like to see some actual evidence of this being a problem,
rather than just the order of the checks.

(Another reason to do so is so that the file/line numbers generated
for the error reports go back to being at least a little bit useful.)
We could use local variables within the macro to avoid double evals,
if anyone thinks that's actually important --- I don't.

I don't think that's necessarily a good idea. In fact, I think we should
probably do the exact opposite, and move the error messages further out
of line. All these otherwise very small functions having their own
ereports makes them much bigger. Our low code density, and the resulting
rate of itlb misses, is pretty significant cost (cf [1]https://twitter.com/AndresFreundTec/status/1214305610172289024).

master:
text data bss dec hex filename
36124 44 65 36233 8d89 float.o
error messages moved out of line:
text data bss dec hex filename
32883 44 65 32992 80e0 float.o

Taking int4pl as an example - solely because it is simpler assembly to
look at - we get:

master:
0x00000000004ac190 <+0>: mov 0x30(%rdi),%rax
0x00000000004ac194 <+4>: add 0x20(%rdi),%eax
0x00000000004ac197 <+7>: jo 0x4ac19c <int4pl+12>
0x00000000004ac199 <+9>: cltq
0x00000000004ac19b <+11>: retq
0x00000000004ac19c <+12>: push %rbp
0x00000000004ac19d <+13>: lea 0x1a02c4(%rip),%rsi # 0x64c468
0x00000000004ac1a4 <+20>: xor %r8d,%r8d
0x00000000004ac1a7 <+23>: lea 0x265da1(%rip),%rcx # 0x711f4f <__func__.26823>
0x00000000004ac1ae <+30>: mov $0x30b,%edx
0x00000000004ac1b3 <+35>: mov $0x14,%edi
0x00000000004ac1b8 <+40>: callq 0x586060 <errstart>
0x00000000004ac1bd <+45>: lea 0x147e0e(%rip),%rdi # 0x5f3fd2
0x00000000004ac1c4 <+52>: xor %eax,%eax
0x00000000004ac1c6 <+54>: callq 0x5896a0 <errmsg>
0x00000000004ac1cb <+59>: mov $0x3000082,%edi
0x00000000004ac1d0 <+64>: mov %eax,%ebp
0x00000000004ac1d2 <+66>: callq 0x589540 <errcode>
0x00000000004ac1d7 <+71>: mov %eax,%edi
0x00000000004ac1d9 <+73>: mov %ebp,%esi
0x00000000004ac1db <+75>: xor %eax,%eax
0x00000000004ac1dd <+77>: callq 0x588fb0 <errfinish>

out-of-line error:
0x00000000004b04e0 <+0>: mov 0x30(%rdi),%rax
0x00000000004b04e4 <+4>: add 0x20(%rdi),%eax
0x00000000004b04e7 <+7>: jo 0x4b04ec <int4pl+12>
0x00000000004b04e9 <+9>: cltq
0x00000000004b04eb <+11>: retq
0x00000000004b04ec <+12>: push %rax
0x00000000004b04ed <+13>: callq 0x115e17 <out_of_range_err>

With the out-of-line error, we can fit multiple of these functions into one
cache line. With the inline error, not even one.

Greetings,

Andres Freund

[1]: https://twitter.com/AndresFreundTec/status/1214305610172289024

keisuke kuroda

keisuke.kuroda.3862@gmail.com

almost 6 years ago

In reply to: Andres Freund (#6)

Re: In PG12, query with float calculations is slower than PG11

Hi,

I have been testing with newer compiler (clang-7)
and result is a bit different at least with clang-7.
Compiling PG 12.1 (even without patch) with clang-7
results in __isinf() no longer being a bottleneck,
that is, you don't see it in profiler at all.

So, there is no issue for people who use the modern clang toolchain,
but maybe that's not everyone.
So there would still be some interest in doing something about this.

* clang

bash-4.2$ which clang
/opt/rh/llvm-toolset-7.0/root/usr/bin/clang

bash-4.2$ clang -v
clang version 7.0.1 (tags/RELEASE_701/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rh/llvm-toolset-7.0/root/usr/bin
Found candidate GCC installation:
/opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7
Found candidate GCC installation:
/opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8
Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/4.8.2
Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/4.8.5
Selected GCC installation:
/opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64

** pg_config

---
CONFIGURE = '--prefix=/var/lib/pgsql/pgsql/12.1'
'CC=/opt/rh/llvm-toolset-7.0/root/usr/bin/clang'
'PKG_CONFIG_PATH=/opt/rh/llvm-toolset-7.0/root/usr/lib64/pkgconfig'
CC = /opt/rh/llvm-toolset-7.0/root/usr/bin/clang
---

* result(PostgreSQL 12.1 (even without patch))

postgres=# EXPLAIN (ANALYZE on, VERBOSE on, BUFFERS on)
select (2 * a) , (2 * b) , (2 * c), (2 * d), (2 * e)
from realtest;

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Seq Scan on public.realtest (cost=0.00..288697.59 rows=10000115 width=40)
(actual time=0.012..3878.284 rows=10000001 loops=1)
Output: ('2'::double precision * a), ('2'::double precision * b),
('2'::double precision * c), ('2'::double precision * d), ('2'::double
precision * e)
Buffers: shared hit=63695
Planning Time: 0.038 ms
Execution Time: 4533.767 ms
(5 rows)

Samples: 5K of event 'cpu-clock', Event count (approx.): 1275000000
Overhead Command Shared Object Symbol
33.92% postgres postgres [.] ExecInterpExpr
13.27% postgres postgres [.] float84mul
10.86% postgres [vdso] [.] __vdso_clock_gettime
5.49% postgres postgres [.] tts_buffer_heap_getsomeattrs
3.96% postgres postgres [.] ExecScan
3.25% postgres libc-2.17.so [.] __clock_gettime
3.16% postgres postgres [.] heap_getnextslot
2.41% postgres postgres [.] tts_virtual_clear
2.39% postgres postgres [.] SeqNext
2.22% postgres postgres [.] InstrStopNode

Best Regards,
Keisuke Kuroda

2020年2月7日(金) 3:48 Andres Freund <andres@anarazel.de>:

Show quoted text

Hi,

On 2020-02-06 11:03:51 -0500, Tom Lane wrote:

Andres seems to be of the opinion that the compiler should be willing
to ignore the semantic requirements of the C standard in order
to rearrange the code back into the cheaper order. That sounds like
wishful thinking to me ... even if it actually works on his compiler,
it certainly isn't going to work for everyone.

Sorry, but, uh, what are you talking about? Please tell me which single
standards violation I'm advocating for?

I was asking about the inlining bit because the first email of the topic
explained that as the problem, which I don't believe can be the full
explanation - and it turns out it isn't. As Amit Langote's followup
email explained, there's the whole issue of the order of checks being
inverted - which is clearly bad. And wholly unrelated to inlining.

And I asked about __isinf() being used because there are issues with
accidentally ending up with the non-intrinsic version of isinf() when
not using gcc, due to badly written standard library headers.

The patch looks unduly invasive to me, but I think that it might be
right that we should go back to a macro-based implementation, because
otherwise we don't have a good way to be certain that the function
parameter won't get evaluated first.

I'd first like to see some actual evidence of this being a problem,
rather than just the order of the checks.

(Another reason to do so is so that the file/line numbers generated
for the error reports go back to being at least a little bit useful.)
We could use local variables within the macro to avoid double evals,
if anyone thinks that's actually important --- I don't.

I don't think that's necessarily a good idea. In fact, I think we should
probably do the exact opposite, and move the error messages further out
of line. All these otherwise very small functions having their own
ereports makes them much bigger. Our low code density, and the resulting
rate of itlb misses, is pretty significant cost (cf [1]).

master:
text data bss dec hex filename
36124 44 65 36233 8d89 float.o
error messages moved out of line:
text data bss dec hex filename
32883 44 65 32992 80e0 float.o

Taking int4pl as an example - solely because it is simpler assembly to
look at - we get:

master:
0x00000000004ac190 <+0>: mov 0x30(%rdi),%rax
0x00000000004ac194 <+4>: add 0x20(%rdi),%eax
0x00000000004ac197 <+7>: jo 0x4ac19c <int4pl+12>
0x00000000004ac199 <+9>: cltq
0x00000000004ac19b <+11>: retq
0x00000000004ac19c <+12>: push %rbp
0x00000000004ac19d <+13>: lea 0x1a02c4(%rip),%rsi #
0x64c468
0x00000000004ac1a4 <+20>: xor %r8d,%r8d
0x00000000004ac1a7 <+23>: lea 0x265da1(%rip),%rcx #
0x711f4f <__func__.26823>
0x00000000004ac1ae <+30>: mov $0x30b,%edx
0x00000000004ac1b3 <+35>: mov $0x14,%edi
0x00000000004ac1b8 <+40>: callq 0x586060 <errstart>
0x00000000004ac1bd <+45>: lea 0x147e0e(%rip),%rdi #
0x5f3fd2
0x00000000004ac1c4 <+52>: xor %eax,%eax
0x00000000004ac1c6 <+54>: callq 0x5896a0 <errmsg>
0x00000000004ac1cb <+59>: mov $0x3000082,%edi
0x00000000004ac1d0 <+64>: mov %eax,%ebp
0x00000000004ac1d2 <+66>: callq 0x589540 <errcode>
0x00000000004ac1d7 <+71>: mov %eax,%edi
0x00000000004ac1d9 <+73>: mov %ebp,%esi
0x00000000004ac1db <+75>: xor %eax,%eax
0x00000000004ac1dd <+77>: callq 0x588fb0 <errfinish>

out-of-line error:
0x00000000004b04e0 <+0>: mov 0x30(%rdi),%rax
0x00000000004b04e4 <+4>: add 0x20(%rdi),%eax
0x00000000004b04e7 <+7>: jo 0x4b04ec <int4pl+12>
0x00000000004b04e9 <+9>: cltq
0x00000000004b04eb <+11>: retq
0x00000000004b04ec <+12>: push %rax
0x00000000004b04ed <+13>: callq 0x115e17 <out_of_range_err>

With the out-of-line error, we can fit multiple of these functions into one
cache line. With the inline error, not even one.

Greetings,

Andres Freund

[1] https://twitter.com/AndresFreundTec/status/1214305610172289024

Andres Freund

andres@anarazel.de

almost 6 years ago

In reply to: keisuke kuroda (#7)

Re: In PG12, query with float calculations is slower than PG11

Hi,

On February 6, 2020 11:42:30 PM PST, keisuke kuroda <keisuke.kuroda.3862@gmail.com> wrote:

Hi,

I have been testing with newer compiler (clang-7)
and result is a bit different at least with clang-7.
Compiling PG 12.1 (even without patch) with clang-7
results in __isinf() no longer being a bottleneck,
that is, you don't see it in profiler at all.

I don't think that's necessarily the right conclusion. What's quite possibly happening is that you do not see the external isinf function anymore, because it is implemented as an intrinsic, but that there still are more computations being done. Due to the changed order of the isinf checks. You'd have to compare with 11 using the same compiler.

Andres

* result(PostgreSQL 12.1 (even without patch))

postgres=# EXPLAIN (ANALYZE on, VERBOSE on, BUFFERS on)
select (2 * a) , (2 * b) , (2 * c), (2 * d), (2 * e)
from realtest;

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Seq Scan on public.realtest (cost=0.00..288697.59 rows=10000115
width=40)
(actual time=0.012..3878.284 rows=10000001 loops=1)
Output: ('2'::double precision * a), ('2'::double precision * b),
('2'::double precision * c), ('2'::double precision * d), ('2'::double
precision * e)
Buffers: shared hit=63695
Planning Time: 0.038 ms
Execution Time: 4533.767 ms
(5 rows)

Samples: 5K of event 'cpu-clock', Event count (approx.): 1275000000
Overhead Command Shared Object Symbol
33.92% postgres postgres [.] ExecInterpExpr
13.27% postgres postgres [.] float84mul
10.86% postgres [vdso] [.] __vdso_clock_gettime
5.49% postgres postgres [.] tts_buffer_heap_getsomeattrs
3.96% postgres postgres [.] ExecScan
3.25% postgres libc-2.17.so [.] __clock_gettime
3.16% postgres postgres [.] heap_getnextslot
2.41% postgres postgres [.] tts_virtual_clear
2.39% postgres postgres [.] SeqNext
2.22% postgres postgres [.] InstrStopNode

Best Regards,
Keisuke Kuroda

2020年2月7日(金) 3:48 Andres Freund <andres@anarazel.de>:

Hi,

On 2020-02-06 11:03:51 -0500, Tom Lane wrote:

Andres seems to be of the opinion that the compiler should be

willing

to ignore the semantic requirements of the C standard in order
to rearrange the code back into the cheaper order. That sounds

like

wishful thinking to me ... even if it actually works on his

compiler,

it certainly isn't going to work for everyone.

Sorry, but, uh, what are you talking about? Please tell me which

single

standards violation I'm advocating for?

I was asking about the inlining bit because the first email of the

topic

explained that as the problem, which I don't believe can be the full
explanation - and it turns out it isn't. As Amit Langote's followup
email explained, there's the whole issue of the order of checks being
inverted - which is clearly bad. And wholly unrelated to inlining.

And I asked about __isinf() being used because there are issues with
accidentally ending up with the non-intrinsic version of isinf() when
not using gcc, due to badly written standard library headers.

The patch looks unduly invasive to me, but I think that it might be
right that we should go back to a macro-based implementation,

because

otherwise we don't have a good way to be certain that the function
parameter won't get evaluated first.

I'd first like to see some actual evidence of this being a problem,
rather than just the order of the checks.

(Another reason to do so is so that the file/line numbers generated
for the error reports go back to being at least a little bit

useful.)

We could use local variables within the macro to avoid double

evals,

if anyone thinks that's actually important --- I don't.

I don't think that's necessarily a good idea. In fact, I think we

should

probably do the exact opposite, and move the error messages further

out

of line. All these otherwise very small functions having their own
ereports makes them much bigger. Our low code density, and the

resulting

rate of itlb misses, is pretty significant cost (cf [1]).

master:
text data bss dec hex filename
36124 44 65 36233 8d89 float.o
error messages moved out of line:
text data bss dec hex filename
32883 44 65 32992 80e0 float.o

Taking int4pl as an example - solely because it is simpler assembly

to

look at - we get:

master:
0x00000000004ac190 <+0>: mov 0x30(%rdi),%rax
0x00000000004ac194 <+4>: add 0x20(%rdi),%eax
0x00000000004ac197 <+7>: jo 0x4ac19c <int4pl+12>
0x00000000004ac199 <+9>: cltq
0x00000000004ac19b <+11>: retq
0x00000000004ac19c <+12>: push %rbp
0x00000000004ac19d <+13>: lea 0x1a02c4(%rip),%rsi #
0x64c468
0x00000000004ac1a4 <+20>: xor %r8d,%r8d
0x00000000004ac1a7 <+23>: lea 0x265da1(%rip),%rcx #
0x711f4f <__func__.26823>
0x00000000004ac1ae <+30>: mov $0x30b,%edx
0x00000000004ac1b3 <+35>: mov $0x14,%edi
0x00000000004ac1b8 <+40>: callq 0x586060 <errstart>
0x00000000004ac1bd <+45>: lea 0x147e0e(%rip),%rdi #
0x5f3fd2
0x00000000004ac1c4 <+52>: xor %eax,%eax
0x00000000004ac1c6 <+54>: callq 0x5896a0 <errmsg>
0x00000000004ac1cb <+59>: mov $0x3000082,%edi
0x00000000004ac1d0 <+64>: mov %eax,%ebp
0x00000000004ac1d2 <+66>: callq 0x589540 <errcode>
0x00000000004ac1d7 <+71>: mov %eax,%edi
0x00000000004ac1d9 <+73>: mov %ebp,%esi
0x00000000004ac1db <+75>: xor %eax,%eax
0x00000000004ac1dd <+77>: callq 0x588fb0 <errfinish>

out-of-line error:
0x00000000004b04e0 <+0>: mov 0x30(%rdi),%rax
0x00000000004b04e4 <+4>: add 0x20(%rdi),%eax
0x00000000004b04e7 <+7>: jo 0x4b04ec <int4pl+12>
0x00000000004b04e9 <+9>: cltq
0x00000000004b04eb <+11>: retq
0x00000000004b04ec <+12>: push %rax
0x00000000004b04ed <+13>: callq 0x115e17 <out_of_range_err>

With the out-of-line error, we can fit multiple of these functions

into one

cache line. With the inline error, not even one.

Greetings,

Andres Freund

[1] https://twitter.com/AndresFreundTec/status/1214305610172289024

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Amit Langote

amitlangote09@gmail.com

almost 6 years ago

In reply to: Andres Freund (#8)

Re: In PG12, query with float calculations is slower than PG11

On Fri, Feb 7, 2020 at 4:54 PM Andres Freund <andres@anarazel.de> wrote:

On February 6, 2020 11:42:30 PM PST, keisuke kuroda <keisuke.kuroda.3862@gmail.com> wrote:

Hi,

I have been testing with newer compiler (clang-7)
and result is a bit different at least with clang-7.
Compiling PG 12.1 (even without patch) with clang-7
results in __isinf() no longer being a bottleneck,
that is, you don't see it in profiler at all.

I don't think that's necessarily the right conclusion. What's quite possibly happening is that you do not see the external isinf function anymore, because it is implemented as an intrinsic, but that there still are more computations being done. Due to the changed order of the isinf checks. You'd have to compare with 11 using the same compiler.

I did some tests using two relatively recent compilers: gcc 8 and
clang-7 and here are the results:

Setup:

create table realtest (a real, b real, c real, d real, e real);
insert into realtest select i, i, i, i, i from generate_series(1, 1000000) i;

Test query:

/tmp/query.sql
select avg(2*dsqrt(a)), avg(2*dsqrt(b)), avg(2*dsqrt(c)),
avg(2*dsqrt(d)), avg(2*dsqrt(e)) from realtest;

pgbench -n -T 60 -f /tmp/query.sql

Latency and profiling results:

gcc 8 (gcc (GCC) 8.3.1 20190311 (Red Hat 8.3.1-3))
====

11.6

latency average = 463.968 ms

40.62% postgres postgres [.] ExecInterpExpr
9.74% postgres postgres [.] float8_accum
6.12% postgres libc-2.17.so [.] __isinf
5.96% postgres postgres [.] float8mul
5.33% postgres postgres [.] dsqrt
3.90% postgres postgres [.] ftod
3.53% postgres postgres [.] Float8GetDatum
2.34% postgres postgres [.] DatumGetFloat8
2.15% postgres postgres [.] AggCheckCallContext
2.03% postgres postgres [.] slot_deform_tuple
1.95% postgres libm-2.17.so [.] __sqrt
1.19% postgres postgres [.] check_float8_array

HEAD

latency average = 549.071 ms

31.74% postgres postgres [.] ExecInterpExpr
11.02% postgres libc-2.17.so [.] __isinf
10.58% postgres postgres [.] float8_accum
4.84% postgres postgres [.] check_float8_val
4.66% postgres postgres [.] dsqrt
3.91% postgres postgres [.] float8mul
3.56% postgres postgres [.] ftod
3.26% postgres postgres [.] Float8GetDatum
2.91% postgres postgres [.] float8_mul
2.30% postgres postgres [.] DatumGetFloat8
2.19% postgres postgres [.] slot_deform_heap_tuple
1.81% postgres postgres [.] AggCheckCallContext
1.31% postgres libm-2.17.so [.] __sqrt
1.25% postgres postgres [.] check_float8_array

HEAD + patch

latency average = 546.624 ms

33.51% postgres postgres [.] ExecInterpExpr
10.35% postgres postgres [.] float8_accum
10.06% postgres libc-2.17.so [.] __isinf
4.58% postgres postgres [.] dsqrt
4.14% postgres postgres [.] check_float8_val
4.03% postgres postgres [.] ftod
3.54% postgres postgres [.] float8mul
2.96% postgres postgres [.] Float8GetDatum
2.38% postgres postgres [.] slot_deform_heap_tuple
2.23% postgres postgres [.] DatumGetFloat8
2.09% postgres postgres [.] float8_mul
1.88% postgres postgres [.] AggCheckCallContext
1.65% postgres libm-2.17.so [.] __sqrt
1.22% postgres postgres [.] check_float8_array

clang-7 (clang version 7.0.1 (tags/RELEASE_701/final))
=====

11.6

latency average = 419.014 ms

47.57% postgres postgres [.] ExecInterpExpr
7.99% postgres postgres [.] float8_accum
5.96% postgres postgres [.] dsqrt
4.88% postgres postgres [.] float8mul
4.23% postgres postgres [.] ftod
3.30% postgres postgres [.] slot_deform_tuple
3.19% postgres postgres [.] DatumGetFloat8
1.92% postgres libm-2.17.so [.] __sqrt
1.72% postgres postgres [.] check_float8_array

HEAD

latency average = 452.958 ms

40.55% postgres postgres [.] ExecInterpExpr
10.61% postgres postgres [.] float8_accum
4.58% postgres postgres [.] dsqrt
3.59% postgres postgres [.] slot_deform_heap_tuple
3.54% postgres postgres [.] check_float8_val
3.48% postgres postgres [.] ftod
3.42% postgres postgres [.] float8mul
3.22% postgres postgres [.] DatumGetFloat8
2.69% postgres postgres [.] Float8GetDatum
2.46% postgres postgres [.] float8_mul
2.29% postgres libm-2.17.so [.] __sqrt
1.47% postgres postgres [.] check_float8_array

HEAD + patch

latency average = 452.533 ms

41.05% postgres postgres [.] ExecInterpExpr
10.15% postgres postgres [.] float8_accum
5.62% postgres postgres [.] dsqrt
3.86% postgres postgres [.] check_float8_val
3.27% postgres postgres [.] float8mul
3.09% postgres postgres [.] slot_deform_heap_tuple
2.91% postgres postgres [.] ftod
2.88% postgres postgres [.] DatumGetFloat8
2.62% postgres postgres [.] float8_mul
2.03% postgres libm-2.17.so [.] __sqrt
2.00% postgres postgres [.] check_float8_array

The patch mentioned above is this:

diff --git a/src/include/utils/float.h b/src/include/utils/float.h
index e2c5dc0f57..dc97d19293 100644
--- a/src/include/utils/float.h
+++ b/src/include/utils/float.h
@@ -136,12 +136,12 @@ static inline void
 check_float4_val(const float4 val, const bool inf_is_valid,
                  const bool zero_is_valid)
 {
-    if (!inf_is_valid && unlikely(isinf(val)))
+    if (unlikely(isinf(val)) && !inf_is_valid)
         ereport(ERROR,
                 (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                  errmsg("value out of range: overflow")));

-    if (!zero_is_valid && unlikely(val == 0.0))
+    if (unlikely(val == 0.0) && !zero_is_valid)
         ereport(ERROR,
                 (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                  errmsg("value out of range: underflow")));
@@ -151,12 +151,12 @@ static inline void
 check_float8_val(const float8 val, const bool inf_is_valid,
                  const bool zero_is_valid)
 {
-    if (!inf_is_valid && unlikely(isinf(val)))
+    if (unlikely(isinf(val)) && !inf_is_valid)
         ereport(ERROR,
                 (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                  errmsg("value out of range: overflow")));

-    if (!zero_is_valid && unlikely(val == 0.0))
+    if (unlikely(val == 0.0) && !zero_is_valid)
         ereport(ERROR,
                 (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                  errmsg("value out of range: underflow")));

So, the patch appears to do very little here. I can only conclude that
the check_float{8|4}_val() (PG 12) is slower than CHECKFLOATVAL() (PG
11) due to arguments being evaluated first. It's entirely possible
though that the patch shown above is not enough.

Thanks,
Amit

#10

Amit Langote

amitlangote09@gmail.com

almost 6 years ago

In reply to: Amit Langote (#9)

Re: In PG12, query with float calculations is slower than PG11

Fwiw, also tried the patch that Kuroda-san had posted yesterday.

On Fri, Feb 7, 2020 at 5:17 PM Amit Langote <amitlangote09@gmail.com> wrote:

Latency and profiling results:

gcc 8 (gcc (GCC) 8.3.1 20190311 (Red Hat 8.3.1-3))
====

11.6

latency average = 463.968 ms

40.62% postgres postgres [.] ExecInterpExpr
9.74% postgres postgres [.] float8_accum
6.12% postgres libc-2.17.so [.] __isinf
5.96% postgres postgres [.] float8mul
5.33% postgres postgres [.] dsqrt
3.90% postgres postgres [.] ftod
3.53% postgres postgres [.] Float8GetDatum
2.34% postgres postgres [.] DatumGetFloat8
2.15% postgres postgres [.] AggCheckCallContext
2.03% postgres postgres [.] slot_deform_tuple
1.95% postgres libm-2.17.so [.] __sqrt
1.19% postgres postgres [.] check_float8_array

HEAD

latency average = 549.071 ms

31.74% postgres postgres [.] ExecInterpExpr
11.02% postgres libc-2.17.so [.] __isinf
10.58% postgres postgres [.] float8_accum
4.84% postgres postgres [.] check_float8_val
4.66% postgres postgres [.] dsqrt
3.91% postgres postgres [.] float8mul
3.56% postgres postgres [.] ftod
3.26% postgres postgres [.] Float8GetDatum
2.91% postgres postgres [.] float8_mul
2.30% postgres postgres [.] DatumGetFloat8
2.19% postgres postgres [.] slot_deform_heap_tuple
1.81% postgres postgres [.] AggCheckCallContext
1.31% postgres libm-2.17.so [.] __sqrt
1.25% postgres postgres [.] check_float8_array

HEAD + patch

latency average = 546.624 ms

33.51% postgres postgres [.] ExecInterpExpr
10.35% postgres postgres [.] float8_accum
10.06% postgres libc-2.17.so [.] __isinf
4.58% postgres postgres [.] dsqrt
4.14% postgres postgres [.] check_float8_val
4.03% postgres postgres [.] ftod
3.54% postgres postgres [.] float8mul
2.96% postgres postgres [.] Float8GetDatum
2.38% postgres postgres [.] slot_deform_heap_tuple
2.23% postgres postgres [.] DatumGetFloat8
2.09% postgres postgres [.] float8_mul
1.88% postgres postgres [.] AggCheckCallContext
1.65% postgres libm-2.17.so [.] __sqrt
1.22% postgres postgres [.] check_float8_array

HEAD + Kuroda-san's patch (compiled with gcc 8)

latency average = 484.604 ms

37.41% postgres postgres [.] ExecInterpExpr
10.83% postgres postgres [.] float8_accum
5.62% postgres postgres [.] dsqrt
4.23% postgres libc-2.17.so [.] __isinf
4.05% postgres postgres [.] float8mul
3.85% postgres postgres [.] ftod
3.18% postgres postgres [.] Float8GetDatum
2.81% postgres postgres [.] slot_deform_heap_tuple
2.63% postgres postgres [.] DatumGetFloat8
2.46% postgres postgres [.] float8_mul
1.91% postgres libm-2.17.so [.] __sqrt

clang-7 (clang version 7.0.1 (tags/RELEASE_701/final))
=====

11.6

latency average = 419.014 ms

47.57% postgres postgres [.] ExecInterpExpr
7.99% postgres postgres [.] float8_accum
5.96% postgres postgres [.] dsqrt
4.88% postgres postgres [.] float8mul
4.23% postgres postgres [.] ftod
3.30% postgres postgres [.] slot_deform_tuple
3.19% postgres postgres [.] DatumGetFloat8
1.92% postgres libm-2.17.so [.] __sqrt
1.72% postgres postgres [.] check_float8_array

HEAD

latency average = 452.958 ms

40.55% postgres postgres [.] ExecInterpExpr
10.61% postgres postgres [.] float8_accum
4.58% postgres postgres [.] dsqrt
3.59% postgres postgres [.] slot_deform_heap_tuple
3.54% postgres postgres [.] check_float8_val
3.48% postgres postgres [.] ftod
3.42% postgres postgres [.] float8mul
3.22% postgres postgres [.] DatumGetFloat8
2.69% postgres postgres [.] Float8GetDatum
2.46% postgres postgres [.] float8_mul
2.29% postgres libm-2.17.so [.] __sqrt
1.47% postgres postgres [.] check_float8_array

HEAD + patch

latency average = 452.533 ms

41.05% postgres postgres [.] ExecInterpExpr
10.15% postgres postgres [.] float8_accum
5.62% postgres postgres [.] dsqrt
3.86% postgres postgres [.] check_float8_val
3.27% postgres postgres [.] float8mul
3.09% postgres postgres [.] slot_deform_heap_tuple
2.91% postgres postgres [.] ftod
2.88% postgres postgres [.] DatumGetFloat8
2.62% postgres postgres [.] float8_mul
2.03% postgres libm-2.17.so [.] __sqrt
2.00% postgres postgres [.] check_float8_array

HEAD + Kuroda-san's patch (compiled with clang-7)

latency average = 435.454 ms

43.02% postgres postgres [.] ExecInterpExpr
10.86% postgres postgres [.] float8_accum
3.97% postgres postgres [.] dsqrt
3.97% postgres postgres [.] float8mul
3.51% postgres postgres [.] ftod
3.42% postgres postgres [.] slot_deform_heap_tuple
3.36% postgres postgres [.] DatumGetFloat8
1.97% postgres libm-2.17.so [.] __sqrt
1.97% postgres postgres [.] check_float8_array
1.88% postgres postgres [.] float8_mul

Needless to say, that one makes a visible difference, although still
slower compared to PG 11.

Thanks,
Amit

#11

Emre Hasegeli

emre@hasegeli.com

almost 6 years ago

In reply to: Amit Langote (#10)

Re: In PG12, query with float calculations is slower than PG11

Fwiw, also tried the patch that Kuroda-san had posted yesterday.

I run the same test case too:

clang version 7.0.0:

HEAD 2548.119 ms
with patch 2320.974 ms

clang version 8.0.0:

HEAD 2431.766 ms
with patch 2419.439 ms

clang version 9.0.0:

HEAD 2477.493 ms
with patch 2365.509 ms

gcc version 7.4.0:

HEAD 2451.261 ms
with patch 2343.393 ms

gcc version 8.3.0:

HEAD 2540.626 ms
with patch 2299.653 ms

#12

Emre Hasegeli

emre@hasegeli.com

almost 6 years ago

In reply to: Andres Freund (#6)

1 attachment(s)

Re: In PG12, query with float calculations is slower than PG11

The patch looks unduly invasive to me, but I think that it might be
right that we should go back to a macro-based implementation, because
otherwise we don't have a good way to be certain that the function
parameter won't get evaluated first.

I'd first like to see some actual evidence of this being a problem,
rather than just the order of the checks.

There seem to be enough evidence of this being the problem. We are
better off going back to the macro-based implementation. I polished
Keisuke Kuroda's patch commenting about the performance issue, removed
the check_float*_val() functions completely, and added unlikely() as
Tom Lane suggested. It is attached. I confirmed with different
compilers that the macro, and unlikely() makes this noticeably faster.

Attachments:

0001-Bring-back-CHECKFLOATVAL-macro-v01.patchtext/x-patch; charset=US-ASCII; name=0001-Bring-back-CHECKFLOATVAL-macro-v01.patchDownload

From e869373ad093e668872f08833de2c5c614aab673 Mon Sep 17 00:00:00 2001
From: Emre Hasegeli <emre@hasegeli.com>
Date: Fri, 7 Feb 2020 10:27:25 +0000
Subject: [PATCH] Bring back CHECKFLOATVAL() macro

The inline functions added by 6bf0bc842b caused the conditions of
overflow/underflow checks to be evaluated when no overflow/underflow
happen.  This slowed down floating point operations.  This commit brings
back the macro that was in use before 6bf0bc842b to fix the performace
regression.

Reported-by: Keisuke Kuroda <keisuke.kuroda.3862@gmail.com>
Author: Keisuke Kuroda <keisuke.kuroda.3862@gmail.com>
Discussion: https://www.postgresql.org/message-id/CANDwggLe1Gc1OrRqvPfGE%3DkM9K0FSfia0hbeFCEmwabhLz95AA%40mail.gmail.com
---
 src/backend/utils/adt/float.c   | 66 ++++++++++++++---------------
 src/backend/utils/adt/geo_ops.c |  2 +-
 src/include/utils/float.h       | 75 ++++++++++++++-------------------
 3 files changed, 66 insertions(+), 77 deletions(-)

diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
index a90d4db215..5885719850 100644
--- a/src/backend/utils/adt/float.c
+++ b/src/backend/utils/adt/float.c
@@ -1184,21 +1184,21 @@ ftod(PG_FUNCTION_ARGS)
 
 
 /*
  *		dtof			- converts a float8 number to a float4 number
  */
 Datum
 dtof(PG_FUNCTION_ARGS)
 {
 	float8		num = PG_GETARG_FLOAT8(0);
 
-	check_float4_val((float4) num, isinf(num), num == 0);
+	CHECKFLOATVAL((float4) num, isinf(num), num == 0);
 
 	PG_RETURN_FLOAT4((float4) num);
 }
 
 
 /*
  *		dtoi4			- converts a float8 number to an int4 number
  */
 Datum
 dtoi4(PG_FUNCTION_ARGS)
@@ -1438,36 +1438,36 @@ dsqrt(PG_FUNCTION_ARGS)
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	if (arg1 < 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
 				 errmsg("cannot take square root of a negative number")));
 
 	result = sqrt(arg1);
 
-	check_float8_val(result, isinf(arg1), arg1 == 0);
+	CHECKFLOATVAL(result, isinf(arg1), arg1 == 0);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcbrt			- returns cube root of arg1
  */
 Datum
 dcbrt(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	result = cbrt(arg1);
-	check_float8_val(result, isinf(arg1), arg1 == 0);
+	CHECKFLOATVAL(result, isinf(arg1), arg1 == 0);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dpow			- returns pow(arg1,arg2)
  */
 Datum
 dpow(PG_FUNCTION_ARGS)
 {
@@ -1525,40 +1525,40 @@ dpow(PG_FUNCTION_ARGS)
 			/* The sign of Inf is not significant in this case. */
 			result = get_float8_infinity();
 		else if (fabs(arg1) != 1)
 			result = 0;
 		else
 			result = 1;
 	}
 	else if (errno == ERANGE && result != 0 && !isinf(result))
 		result = get_float8_infinity();
 
-	check_float8_val(result, isinf(arg1) || isinf(arg2), arg1 == 0);
+	CHECKFLOATVAL(result, isinf(arg1) || isinf(arg2), arg1 == 0);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dexp			- returns the exponential function of arg1
  */
 Datum
 dexp(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	errno = 0;
 	result = exp(arg1);
 	if (errno == ERANGE && result != 0 && !isinf(result))
 		result = get_float8_infinity();
 
-	check_float8_val(result, isinf(arg1), false);
+	CHECKFLOATVAL(result, isinf(arg1), false);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dlog1			- returns the natural logarithm of arg1
  */
 Datum
 dlog1(PG_FUNCTION_ARGS)
 {
@@ -1573,21 +1573,21 @@ dlog1(PG_FUNCTION_ARGS)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_LOG),
 				 errmsg("cannot take logarithm of zero")));
 	if (arg1 < 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_LOG),
 				 errmsg("cannot take logarithm of a negative number")));
 
 	result = log(arg1);
 
-	check_float8_val(result, isinf(arg1), arg1 == 1);
+	CHECKFLOATVAL(result, isinf(arg1), arg1 == 1);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dlog10			- returns the base 10 logarithm of arg1
  */
 Datum
 dlog10(PG_FUNCTION_ARGS)
 {
@@ -1603,21 +1603,21 @@ dlog10(PG_FUNCTION_ARGS)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_LOG),
 				 errmsg("cannot take logarithm of zero")));
 	if (arg1 < 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_LOG),
 				 errmsg("cannot take logarithm of a negative number")));
 
 	result = log10(arg1);
 
-	check_float8_val(result, isinf(arg1), arg1 == 1);
+	CHECKFLOATVAL(result, isinf(arg1), arg1 == 1);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dacos			- returns the arccos of arg1 (radians)
  */
 Datum
 dacos(PG_FUNCTION_ARGS)
 {
@@ -1633,21 +1633,21 @@ dacos(PG_FUNCTION_ARGS)
 	 * range [-1, 1] to values in the range [0, Pi], so we should reject any
 	 * inputs outside that range and the result will always be finite.
 	 */
 	if (arg1 < -1.0 || arg1 > 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	result = acos(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dasin			- returns the arcsin of arg1 (radians)
  */
 Datum
 dasin(PG_FUNCTION_ARGS)
 {
@@ -1663,21 +1663,21 @@ dasin(PG_FUNCTION_ARGS)
 	 * range [-1, 1] to values in the range [-Pi/2, Pi/2], so we should reject
 	 * any inputs outside that range and the result will always be finite.
 	 */
 	if (arg1 < -1.0 || arg1 > 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	result = asin(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		datan			- returns the arctan of arg1 (radians)
  */
 Datum
 datan(PG_FUNCTION_ARGS)
 {
@@ -1688,21 +1688,21 @@ datan(PG_FUNCTION_ARGS)
 	if (isnan(arg1))
 		PG_RETURN_FLOAT8(get_float8_nan());
 
 	/*
 	 * The principal branch of the inverse tangent function maps all inputs to
 	 * values in the range [-Pi/2, Pi/2], so the result should always be
 	 * finite, even if the input is infinite.
 	 */
 	result = atan(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		atan2			- returns the arctan of arg1/arg2 (radians)
  */
 Datum
 datan2(PG_FUNCTION_ARGS)
 {
@@ -1713,21 +1713,21 @@ datan2(PG_FUNCTION_ARGS)
 	/* Per the POSIX spec, return NaN if either input is NaN */
 	if (isnan(arg1) || isnan(arg2))
 		PG_RETURN_FLOAT8(get_float8_nan());
 
 	/*
 	 * atan2 maps all inputs to values in the range [-Pi, Pi], so the result
 	 * should always be finite, even if the inputs are infinite.
 	 */
 	result = atan2(arg1, arg2);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcos			- returns the cosine of arg1 (radians)
  */
 Datum
 dcos(PG_FUNCTION_ARGS)
 {
@@ -1753,21 +1753,21 @@ dcos(PG_FUNCTION_ARGS)
 	 * platform reports via errno, so also explicitly test for infinite
 	 * inputs.
 	 */
 	errno = 0;
 	result = cos(arg1);
 	if (errno != 0 || isinf(arg1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcot			- returns the cotangent of arg1 (radians)
  */
 Datum
 dcot(PG_FUNCTION_ARGS)
 {
@@ -1780,21 +1780,21 @@ dcot(PG_FUNCTION_ARGS)
 
 	/* Be sure to throw an error if the input is infinite --- see dcos() */
 	errno = 0;
 	result = tan(arg1);
 	if (errno != 0 || isinf(arg1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	result = 1.0 / result;
-	check_float8_val(result, true /* cot(0) == Inf */ , true);
+	CHECKFLOATVAL(result, true /* cot(0) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dsin			- returns the sine of arg1 (radians)
  */
 Datum
 dsin(PG_FUNCTION_ARGS)
 {
@@ -1806,21 +1806,21 @@ dsin(PG_FUNCTION_ARGS)
 		PG_RETURN_FLOAT8(get_float8_nan());
 
 	/* Be sure to throw an error if the input is infinite --- see dcos() */
 	errno = 0;
 	result = sin(arg1);
 	if (errno != 0 || isinf(arg1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dtan			- returns the tangent of arg1 (radians)
  */
 Datum
 dtan(PG_FUNCTION_ARGS)
 {
@@ -1832,21 +1832,21 @@ dtan(PG_FUNCTION_ARGS)
 		PG_RETURN_FLOAT8(get_float8_nan());
 
 	/* Be sure to throw an error if the input is infinite --- see dcos() */
 	errno = 0;
 	result = tan(arg1);
 	if (errno != 0 || isinf(arg1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
-	check_float8_val(result, true /* tan(pi/2) == Inf */ , true);
+	CHECKFLOATVAL(result, true /* tan(pi/2) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /* ========== DEGREE-BASED TRIGONOMETRIC FUNCTIONS ========== */
 
 
 /*
  * Initialize the cached constants declared at the head of this file
  * (sin_30 etc).  The fact that we need those at all, let alone need this
@@ -1984,21 +1984,21 @@ dacosd(PG_FUNCTION_ARGS)
 	if (arg1 < -1.0 || arg1 > 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	if (arg1 >= 0.0)
 		result = acosd_q1(arg1);
 	else
 		result = 90.0 + asind_q1(-arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dasind			- returns the arcsin of arg1 (degrees)
  */
 Datum
 dasind(PG_FUNCTION_ARGS)
 {
@@ -2019,21 +2019,21 @@ dasind(PG_FUNCTION_ARGS)
 	if (arg1 < -1.0 || arg1 > 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	if (arg1 >= 0.0)
 		result = asind_q1(arg1);
 	else
 		result = -asind_q1(-arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		datand			- returns the arctan of arg1 (degrees)
  */
 Datum
 datand(PG_FUNCTION_ARGS)
 {
@@ -2049,21 +2049,21 @@ datand(PG_FUNCTION_ARGS)
 
 	/*
 	 * The principal branch of the inverse tangent function maps all inputs to
 	 * values in the range [-90, 90], so the result should always be finite,
 	 * even if the input is infinite.  Additionally, we take care to ensure
 	 * than when arg1 is 1, the result is exactly 45.
 	 */
 	atan_arg1 = atan(arg1);
 	result = (atan_arg1 / atan_1_0) * 45.0;
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		atan2d			- returns the arctan of arg1/arg2 (degrees)
  */
 Datum
 datan2d(PG_FUNCTION_ARGS)
 {
@@ -2083,21 +2083,21 @@ datan2d(PG_FUNCTION_ARGS)
 	 * result should always be finite, even if the inputs are infinite.
 	 *
 	 * Note: this coding assumes that atan(1.0) is a suitable scaling constant
 	 * to get an exact result from atan2().  This might well fail on us at
 	 * some point, requiring us to decide exactly what inputs we think we're
 	 * going to guarantee an exact result for.
 	 */
 	atan2_arg1_arg2 = atan2(arg1, arg2);
 	result = (atan2_arg1_arg2 / atan_1_0) * 45.0;
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		sind_0_to_30	- returns the sine of an angle that lies between 0 and
  *						  30 degrees.  This will return exactly 0 when x is 0,
  *						  and exactly 0.5 when x is 30 degrees.
  */
 static double
@@ -2204,21 +2204,21 @@ dcosd(PG_FUNCTION_ARGS)
 
 	if (arg1 > 90.0)
 	{
 		/* cosd(180-x) = -cosd(x) */
 		arg1 = 180.0 - arg1;
 		sign = -sign;
 	}
 
 	result = sign * cosd_q1(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcotd			- returns the cotangent of arg1 (degrees)
  */
 Datum
 dcotd(PG_FUNCTION_ARGS)
 {
@@ -2269,21 +2269,21 @@ dcotd(PG_FUNCTION_ARGS)
 	result = sign * (cot_arg1 / cot_45);
 
 	/*
 	 * On some machines we get cotd(270) = minus zero, but this isn't always
 	 * true.  For portability, and because the user constituency for this
 	 * function probably doesn't want minus zero, force it to plain zero.
 	 */
 	if (result == 0.0)
 		result = 0.0;
 
-	check_float8_val(result, true /* cotd(0) == Inf */ , true);
+	CHECKFLOATVAL(result, true /* cotd(0) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dsind			- returns the sine of arg1 (degrees)
  */
 Datum
 dsind(PG_FUNCTION_ARGS)
 {
@@ -2323,21 +2323,21 @@ dsind(PG_FUNCTION_ARGS)
 	}
 
 	if (arg1 > 90.0)
 	{
 		/* sind(180-x) = sind(x) */
 		arg1 = 180.0 - arg1;
 	}
 
 	result = sign * sind_q1(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dtand			- returns the tangent of arg1 (degrees)
  */
 Datum
 dtand(PG_FUNCTION_ARGS)
 {
@@ -2388,21 +2388,21 @@ dtand(PG_FUNCTION_ARGS)
 	result = sign * (tan_arg1 / tan_45);
 
 	/*
 	 * On some machines we get tand(180) = minus zero, but this isn't always
 	 * true.  For portability, and because the user constituency for this
 	 * function probably doesn't want minus zero, force it to plain zero.
 	 */
 	if (result == 0.0)
 		result = 0.0;
 
-	check_float8_val(result, true /* tand(90) == Inf */ , true);
+	CHECKFLOATVAL(result, true /* tand(90) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		degrees		- returns degrees converted from radians
  */
 Datum
 degrees(PG_FUNCTION_ARGS)
 {
@@ -2455,21 +2455,21 @@ dsinh(PG_FUNCTION_ARGS)
 	 * sign of arg1.
 	 */
 	if (errno == ERANGE)
 	{
 		if (arg1 < 0)
 			result = -get_float8_infinity();
 		else
 			result = get_float8_infinity();
 	}
 
-	check_float8_val(result, true, true);
+	CHECKFLOATVAL(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcosh			- returns the hyperbolic cosine of arg1
  */
 Datum
 dcosh(PG_FUNCTION_ARGS)
 {
@@ -2479,57 +2479,57 @@ dcosh(PG_FUNCTION_ARGS)
 	errno = 0;
 	result = cosh(arg1);
 
 	/*
 	 * if an ERANGE error occurs, it means there is an overflow.  As cosh is
 	 * always positive, it always means the result is positive infinity.
 	 */
 	if (errno == ERANGE)
 		result = get_float8_infinity();
 
-	check_float8_val(result, true, false);
+	CHECKFLOATVAL(result, true, false);
 	PG_RETURN_FLOAT8(result);
 }
 
 /*
  *		dtanh			- returns the hyperbolic tangent of arg1
  */
 Datum
 dtanh(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	/*
 	 * For tanh, we don't need an errno check because it never overflows.
 	 */
 	result = tanh(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 /*
  *		dasinh			- returns the inverse hyperbolic sine of arg1
  */
 Datum
 dasinh(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	/*
 	 * For asinh, we don't need an errno check because it never overflows.
 	 */
 	result = asinh(arg1);
 
-	check_float8_val(result, true, true);
+	CHECKFLOATVAL(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 /*
  *		dacosh			- returns the inverse hyperbolic cosine of arg1
  */
 Datum
 dacosh(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
@@ -2541,21 +2541,21 @@ dacosh(PG_FUNCTION_ARGS)
 	 * thing because some implementations will report that for NaN. Otherwise,
 	 * no error is possible.
 	 */
 	if (arg1 < 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	result = acosh(arg1);
 
-	check_float8_val(result, true, true);
+	CHECKFLOATVAL(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 /*
  *		datanh			- returns the inverse hyperbolic tangent of arg1
  */
 Datum
 datanh(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
@@ -2576,21 +2576,21 @@ datanh(PG_FUNCTION_ARGS)
 	 * glibc versions may produce the wrong errno for this.  All other inputs
 	 * cannot produce an error.
 	 */
 	if (arg1 == -1.0)
 		result = -get_float8_infinity();
 	else if (arg1 == 1.0)
 		result = get_float8_infinity();
 	else
 		result = atanh(arg1);
 
-	check_float8_val(result, true, true);
+	CHECKFLOATVAL(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		drandom		- returns a random number
  */
 Datum
 drandom(PG_FUNCTION_ARGS)
 {
@@ -2777,21 +2777,21 @@ float8_combine(PG_FUNCTION_ARGS)
 		N = N1;
 		Sx = Sx1;
 		Sxx = Sxx1;
 	}
 	else
 	{
 		N = N1 + N2;
 		Sx = float8_pl(Sx1, Sx2);
 		tmp = Sx1 / N1 - Sx2 / N2;
 		Sxx = Sxx1 + Sxx2 + N1 * N2 * tmp * tmp / N;
-		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+		CHECKFLOATVAL(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
 	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
 	 * parameter in-place to reduce palloc overhead. Otherwise we construct a
 	 * new array with the updated transition data and return it.
 	 */
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
 		transvalues1[0] = N;
@@ -3287,27 +3287,27 @@ float8_regr_combine(PG_FUNCTION_ARGS)
 		Sy = Sy1;
 		Syy = Syy1;
 		Sxy = Sxy1;
 	}
 	else
 	{
 		N = N1 + N2;
 		Sx = float8_pl(Sx1, Sx2);
 		tmp1 = Sx1 / N1 - Sx2 / N2;
 		Sxx = Sxx1 + Sxx2 + N1 * N2 * tmp1 * tmp1 / N;
-		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+		CHECKFLOATVAL(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
 		Sy = float8_pl(Sy1, Sy2);
 		tmp2 = Sy1 / N1 - Sy2 / N2;
 		Syy = Syy1 + Syy2 + N1 * N2 * tmp2 * tmp2 / N;
-		check_float8_val(Syy, isinf(Syy1) || isinf(Syy2), true);
+		CHECKFLOATVAL(Syy, isinf(Syy1) || isinf(Syy2), true);
 		Sxy = Sxy1 + Sxy2 + N1 * N2 * tmp1 * tmp2 / N;
-		check_float8_val(Sxy, isinf(Sxy1) || isinf(Sxy2), true);
+		CHECKFLOATVAL(Sxy, isinf(Sxy1) || isinf(Sxy2), true);
 	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
 	 * parameter in-place to reduce palloc overhead. Otherwise we construct a
 	 * new array with the updated transition data and return it.
 	 */
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
 		transvalues1[0] = N;
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
index d5ded471c4..a0d02c6a90 100644
--- a/src/backend/utils/adt/geo_ops.c
+++ b/src/backend/utils/adt/geo_ops.c
@@ -5538,14 +5538,14 @@ pg_hypot(float8 x, float8 y)
 	 * such cases, but more importantly it also protects against
 	 * divide-by-zero errors, since now x >= y.
 	 */
 	if (y == 0.0)
 		return x;
 
 	/* Determine the hypotenuse */
 	yx = y / x;
 	result = x * sqrt(1.0 + (yx * yx));
 
-	check_float8_val(result, false, false);
+	CHECKFLOATVAL(result, false, false);
 
 	return result;
 }
diff --git a/src/include/utils/float.h b/src/include/utils/float.h
index e2c5dc0f57..5c6ce5bdc6 100644
--- a/src/include/utils/float.h
+++ b/src/include/utils/float.h
@@ -25,20 +25,43 @@
 /* Radians per degree, a.k.a. PI / 180 */
 #define RADIANS_PER_DEGREE 0.0174532925199432957692
 
 /* Visual C++ etc lacks NAN, and won't accept 0.0/0.0. */
 #if defined(WIN32) && !defined(NAN)
 static const uint32 nan[2] = {0xffffffff, 0x7fffffff};
 
 #define NAN (*(const float8 *) nan)
 #endif
 
+/*
+ * Checks to see if a float4/8 val has underflowed or overflowed
+ *
+ * The rest of the API uses inline functions, but this has to stay as macro
+ * to prevent the inf_is_valid and zero_is_valid arguments to be evaluated
+ * when the val is not inf or zero.  Evaluating the arguments is more
+ * expensive than checking the value itself.
+ *
+ * Note that this macro double evaluates the val.
+ */
+#define CHECKFLOATVAL(val, inf_is_valid, zero_is_valid)			\
+do {															\
+	if (unlikely(isinf(val) && !(inf_is_valid)))				\
+		ereport(ERROR,											\
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),	\
+				 errmsg("value out of range: overflow")));		\
+																\
+	if (unlikely((val) == 0.0 && !(zero_is_valid)))				\
+		ereport(ERROR,											\
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),	\
+				 errmsg("value out of range: underflow")));		\
+} while(0)
+
 extern PGDLLIMPORT int extra_float_digits;
 
 /*
  * Utility functions in float.c
  */
 extern int	is_infinite(float8 val);
 extern float8 float8in_internal(char *num, char **endptr_p,
 								const char *type_name, const char *orig_string);
 extern float8 float8in_internal_opt_error(char *num, char **endptr_p,
 										  const char *type_name, const char *orig_string,
@@ -122,159 +145,125 @@ get_float8_nan(void)
 #if defined(NAN) && !(defined(__NetBSD__) && defined(__mips__))
 	/* C99 standard way */
 	return (float8) NAN;
 #else
 	/* Assume we can get a NaN via zero divide */
 	return (float8) (0.0 / 0.0);
 #endif
 }
 
 /*
- * Checks to see if a float4/8 val has underflowed or overflowed
- */
-
-static inline void
-check_float4_val(const float4 val, const bool inf_is_valid,
-				 const bool zero_is_valid)
-{
-	if (!inf_is_valid && unlikely(isinf(val)))
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value out of range: overflow")));
-
-	if (!zero_is_valid && unlikely(val == 0.0))
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value out of range: underflow")));
-}
-
-static inline void
-check_float8_val(const float8 val, const bool inf_is_valid,
-				 const bool zero_is_valid)
-{
-	if (!inf_is_valid && unlikely(isinf(val)))
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value out of range: overflow")));
-
-	if (!zero_is_valid && unlikely(val == 0.0))
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value out of range: underflow")));
-}
-
-/*
- * Routines for operations with the checks above
+ * Routines for operations with overflow/underflow checks
  *
  * There isn't any way to check for underflow of addition/subtraction
  * because numbers near the underflow value have already been rounded to
  * the point where we can't detect that the two values were originally
  * different, e.g. on x86, '1e-45'::float4 == '2e-45'::float4 ==
  * 1.4013e-45.
  */
 
 static inline float4
 float4_pl(const float4 val1, const float4 val2)
 {
 	float4		result;
 
 	result = val1 + val2;
-	check_float4_val(result, isinf(val1) || isinf(val2), true);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), true);
 
 	return result;
 }
 
 static inline float8
 float8_pl(const float8 val1, const float8 val2)
 {
 	float8		result;
 
 	result = val1 + val2;
-	check_float8_val(result, isinf(val1) || isinf(val2), true);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), true);
 
 	return result;
 }
 
 static inline float4
 float4_mi(const float4 val1, const float4 val2)
 {
 	float4		result;
 
 	result = val1 - val2;
-	check_float4_val(result, isinf(val1) || isinf(val2), true);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), true);
 
 	return result;
 }
 
 static inline float8
 float8_mi(const float8 val1, const float8 val2)
 {
 	float8		result;
 
 	result = val1 - val2;
-	check_float8_val(result, isinf(val1) || isinf(val2), true);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), true);
 
 	return result;
 }
 
 static inline float4
 float4_mul(const float4 val1, const float4 val2)
 {
 	float4		result;
 
 	result = val1 * val2;
-	check_float4_val(result, isinf(val1) || isinf(val2),
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2),
 					 val1 == 0.0f || val2 == 0.0f);
 
 	return result;
 }
 
 static inline float8
 float8_mul(const float8 val1, const float8 val2)
 {
 	float8		result;
 
 	result = val1 * val2;
-	check_float8_val(result, isinf(val1) || isinf(val2),
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2),
 					 val1 == 0.0 || val2 == 0.0);
 
 	return result;
 }
 
 static inline float4
 float4_div(const float4 val1, const float4 val2)
 {
 	float4		result;
 
 	if (val2 == 0.0f)
 		ereport(ERROR,
 				(errcode(ERRCODE_DIVISION_BY_ZERO),
 				 errmsg("division by zero")));
 
 	result = val1 / val2;
-	check_float4_val(result, isinf(val1) || isinf(val2), val1 == 0.0f);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), val1 == 0.0f);
 
 	return result;
 }
 
 static inline float8
 float8_div(const float8 val1, const float8 val2)
 {
 	float8		result;
 
 	if (val2 == 0.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_DIVISION_BY_ZERO),
 				 errmsg("division by zero")));
 
 	result = val1 / val2;
-	check_float8_val(result, isinf(val1) || isinf(val2), val1 == 0.0);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), val1 == 0.0);
 
 	return result;
 }
 
 /*
  * Routines for NaN-aware comparisons
  *
  * We consider all NaNs to be equal and larger than any non-NaN. This is
  * somewhat arbitrary; the important thing is to have a consistent sort
  * order.
-- 
2.17.1

#13

Tels

nospam-pg-abuse@bloodgate.com

almost 6 years ago

In reply to: Emre Hasegeli (#12)

1 attachment(s)

Re: In PG12, query with float calculations is slower than PG11

Moin,

On 2020-02-07 15:42, Emre Hasegeli wrote:

The patch looks unduly invasive to me, but I think that it might be
right that we should go back to a macro-based implementation, because
otherwise we don't have a good way to be certain that the function
parameter won't get evaluated first.

I'd first like to see some actual evidence of this being a problem,
rather than just the order of the checks.

There seem to be enough evidence of this being the problem. We are
better off going back to the macro-based implementation. I polished
Keisuke Kuroda's patch commenting about the performance issue, removed
the check_float*_val() functions completely, and added unlikely() as
Tom Lane suggested. It is attached. I confirmed with different
compilers that the macro, and unlikely() makes this noticeably faster.

Hm, the diff has the macro tests as:

  +	if (unlikely(isinf(val) && !(inf_is_valid)))
  ...
  +      if (unlikely((val) == 0.0 && !(zero_is_valid)))

But the comment does not explain that this test has to be in that
order, or the compiler will for non-constant arguments evalute
the (now) right-side first. E.g. if I understand this correctly:

+ if (!(zero_is_valid) && unlikely((val) == 0.0)

would have the same problem of evaluating "zero_is_valid" (which
might be an isinf(arg1) || isinf(arg2)) first and so be the same thing
we try to avoid with the macro? Maybe adding this bit of info to the
comment makes it clearer?

Also, a few places use the macro as:

+ CHECKFLOATVAL(result, true, true);

which evaluates to a complete NOP in both cases. IMHO this could be
replaced with a comment like:

+ // No CHECKFLOATVAL() needed, as both inf and 0.0 are valid

(or something along the lines of "no error can occur"), as otherwise
CHECKFLOATVAL() implies to the casual reader that there are some checks
done, while in reality no real checks are done at all (and hopefully
the compiler optimizes everything away, which might not be true for
debug builds).

--
Best regards,

Tels

Attachments:

0001-Bring-back-CHECKFLOATVAL-macro-v01.patchtext/x-patch; charset=US-ASCII; name=0001-Bring-back-CHECKFLOATVAL-macro-v01.patchDownload

From e869373ad093e668872f08833de2c5c614aab673 Mon Sep 17 00:00:00 2001
From: Emre Hasegeli <emre@hasegeli.com>
Date: Fri, 7 Feb 2020 10:27:25 +0000
Subject: [PATCH] Bring back CHECKFLOATVAL() macro

The inline functions added by 6bf0bc842b caused the conditions of
overflow/underflow checks to be evaluated when no overflow/underflow
happen.  This slowed down floating point operations.  This commit brings
back the macro that was in use before 6bf0bc842b to fix the performace
regression.

Reported-by: Keisuke Kuroda <keisuke.kuroda.3862@gmail.com>
Author: Keisuke Kuroda <keisuke.kuroda.3862@gmail.com>
Discussion: https://www.postgresql.org/message-id/CANDwggLe1Gc1OrRqvPfGE%3DkM9K0FSfia0hbeFCEmwabhLz95AA%40mail.gmail.com
---
 src/backend/utils/adt/float.c   | 66 ++++++++++++++---------------
 src/backend/utils/adt/geo_ops.c |  2 +-
 src/include/utils/float.h       | 75 ++++++++++++++-------------------
 3 files changed, 66 insertions(+), 77 deletions(-)

diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
index a90d4db215..5885719850 100644
--- a/src/backend/utils/adt/float.c
+++ b/src/backend/utils/adt/float.c
@@ -1184,21 +1184,21 @@ ftod(PG_FUNCTION_ARGS)
 
 
 /*
  *		dtof			- converts a float8 number to a float4 number
  */
 Datum
 dtof(PG_FUNCTION_ARGS)
 {
 	float8		num = PG_GETARG_FLOAT8(0);
 
-	check_float4_val((float4) num, isinf(num), num == 0);
+	CHECKFLOATVAL((float4) num, isinf(num), num == 0);
 
 	PG_RETURN_FLOAT4((float4) num);
 }
 
 
 /*
  *		dtoi4			- converts a float8 number to an int4 number
  */
 Datum
 dtoi4(PG_FUNCTION_ARGS)
@@ -1438,36 +1438,36 @@ dsqrt(PG_FUNCTION_ARGS)
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	if (arg1 < 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
 				 errmsg("cannot take square root of a negative number")));
 
 	result = sqrt(arg1);
 
-	check_float8_val(result, isinf(arg1), arg1 == 0);
+	CHECKFLOATVAL(result, isinf(arg1), arg1 == 0);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcbrt			- returns cube root of arg1
  */
 Datum
 dcbrt(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	result = cbrt(arg1);
-	check_float8_val(result, isinf(arg1), arg1 == 0);
+	CHECKFLOATVAL(result, isinf(arg1), arg1 == 0);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dpow			- returns pow(arg1,arg2)
  */
 Datum
 dpow(PG_FUNCTION_ARGS)
 {
@@ -1525,40 +1525,40 @@ dpow(PG_FUNCTION_ARGS)
 			/* The sign of Inf is not significant in this case. */
 			result = get_float8_infinity();
 		else if (fabs(arg1) != 1)
 			result = 0;
 		else
 			result = 1;
 	}
 	else if (errno == ERANGE && result != 0 && !isinf(result))
 		result = get_float8_infinity();
 
-	check_float8_val(result, isinf(arg1) || isinf(arg2), arg1 == 0);
+	CHECKFLOATVAL(result, isinf(arg1) || isinf(arg2), arg1 == 0);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dexp			- returns the exponential function of arg1
  */
 Datum
 dexp(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	errno = 0;
 	result = exp(arg1);
 	if (errno == ERANGE && result != 0 && !isinf(result))
 		result = get_float8_infinity();
 
-	check_float8_val(result, isinf(arg1), false);
+	CHECKFLOATVAL(result, isinf(arg1), false);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dlog1			- returns the natural logarithm of arg1
  */
 Datum
 dlog1(PG_FUNCTION_ARGS)
 {
@@ -1573,21 +1573,21 @@ dlog1(PG_FUNCTION_ARGS)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_LOG),
 				 errmsg("cannot take logarithm of zero")));
 	if (arg1 < 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_LOG),
 				 errmsg("cannot take logarithm of a negative number")));
 
 	result = log(arg1);
 
-	check_float8_val(result, isinf(arg1), arg1 == 1);
+	CHECKFLOATVAL(result, isinf(arg1), arg1 == 1);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dlog10			- returns the base 10 logarithm of arg1
  */
 Datum
 dlog10(PG_FUNCTION_ARGS)
 {
@@ -1603,21 +1603,21 @@ dlog10(PG_FUNCTION_ARGS)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_LOG),
 				 errmsg("cannot take logarithm of zero")));
 	if (arg1 < 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_LOG),
 				 errmsg("cannot take logarithm of a negative number")));
 
 	result = log10(arg1);
 
-	check_float8_val(result, isinf(arg1), arg1 == 1);
+	CHECKFLOATVAL(result, isinf(arg1), arg1 == 1);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dacos			- returns the arccos of arg1 (radians)
  */
 Datum
 dacos(PG_FUNCTION_ARGS)
 {
@@ -1633,21 +1633,21 @@ dacos(PG_FUNCTION_ARGS)
 	 * range [-1, 1] to values in the range [0, Pi], so we should reject any
 	 * inputs outside that range and the result will always be finite.
 	 */
 	if (arg1 < -1.0 || arg1 > 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	result = acos(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dasin			- returns the arcsin of arg1 (radians)
  */
 Datum
 dasin(PG_FUNCTION_ARGS)
 {
@@ -1663,21 +1663,21 @@ dasin(PG_FUNCTION_ARGS)
 	 * range [-1, 1] to values in the range [-Pi/2, Pi/2], so we should reject
 	 * any inputs outside that range and the result will always be finite.
 	 */
 	if (arg1 < -1.0 || arg1 > 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	result = asin(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		datan			- returns the arctan of arg1 (radians)
  */
 Datum
 datan(PG_FUNCTION_ARGS)
 {
@@ -1688,21 +1688,21 @@ datan(PG_FUNCTION_ARGS)
 	if (isnan(arg1))
 		PG_RETURN_FLOAT8(get_float8_nan());
 
 	/*
 	 * The principal branch of the inverse tangent function maps all inputs to
 	 * values in the range [-Pi/2, Pi/2], so the result should always be
 	 * finite, even if the input is infinite.
 	 */
 	result = atan(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		atan2			- returns the arctan of arg1/arg2 (radians)
  */
 Datum
 datan2(PG_FUNCTION_ARGS)
 {
@@ -1713,21 +1713,21 @@ datan2(PG_FUNCTION_ARGS)
 	/* Per the POSIX spec, return NaN if either input is NaN */
 	if (isnan(arg1) || isnan(arg2))
 		PG_RETURN_FLOAT8(get_float8_nan());
 
 	/*
 	 * atan2 maps all inputs to values in the range [-Pi, Pi], so the result
 	 * should always be finite, even if the inputs are infinite.
 	 */
 	result = atan2(arg1, arg2);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcos			- returns the cosine of arg1 (radians)
  */
 Datum
 dcos(PG_FUNCTION_ARGS)
 {
@@ -1753,21 +1753,21 @@ dcos(PG_FUNCTION_ARGS)
 	 * platform reports via errno, so also explicitly test for infinite
 	 * inputs.
 	 */
 	errno = 0;
 	result = cos(arg1);
 	if (errno != 0 || isinf(arg1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcot			- returns the cotangent of arg1 (radians)
  */
 Datum
 dcot(PG_FUNCTION_ARGS)
 {
@@ -1780,21 +1780,21 @@ dcot(PG_FUNCTION_ARGS)
 
 	/* Be sure to throw an error if the input is infinite --- see dcos() */
 	errno = 0;
 	result = tan(arg1);
 	if (errno != 0 || isinf(arg1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	result = 1.0 / result;
-	check_float8_val(result, true /* cot(0) == Inf */ , true);
+	CHECKFLOATVAL(result, true /* cot(0) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dsin			- returns the sine of arg1 (radians)
  */
 Datum
 dsin(PG_FUNCTION_ARGS)
 {
@@ -1806,21 +1806,21 @@ dsin(PG_FUNCTION_ARGS)
 		PG_RETURN_FLOAT8(get_float8_nan());
 
 	/* Be sure to throw an error if the input is infinite --- see dcos() */
 	errno = 0;
 	result = sin(arg1);
 	if (errno != 0 || isinf(arg1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dtan			- returns the tangent of arg1 (radians)
  */
 Datum
 dtan(PG_FUNCTION_ARGS)
 {
@@ -1832,21 +1832,21 @@ dtan(PG_FUNCTION_ARGS)
 		PG_RETURN_FLOAT8(get_float8_nan());
 
 	/* Be sure to throw an error if the input is infinite --- see dcos() */
 	errno = 0;
 	result = tan(arg1);
 	if (errno != 0 || isinf(arg1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
-	check_float8_val(result, true /* tan(pi/2) == Inf */ , true);
+	CHECKFLOATVAL(result, true /* tan(pi/2) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /* ========== DEGREE-BASED TRIGONOMETRIC FUNCTIONS ========== */
 
 
 /*
  * Initialize the cached constants declared at the head of this file
  * (sin_30 etc).  The fact that we need those at all, let alone need this
@@ -1984,21 +1984,21 @@ dacosd(PG_FUNCTION_ARGS)
 	if (arg1 < -1.0 || arg1 > 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	if (arg1 >= 0.0)
 		result = acosd_q1(arg1);
 	else
 		result = 90.0 + asind_q1(-arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dasind			- returns the arcsin of arg1 (degrees)
  */
 Datum
 dasind(PG_FUNCTION_ARGS)
 {
@@ -2019,21 +2019,21 @@ dasind(PG_FUNCTION_ARGS)
 	if (arg1 < -1.0 || arg1 > 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	if (arg1 >= 0.0)
 		result = asind_q1(arg1);
 	else
 		result = -asind_q1(-arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		datand			- returns the arctan of arg1 (degrees)
  */
 Datum
 datand(PG_FUNCTION_ARGS)
 {
@@ -2049,21 +2049,21 @@ datand(PG_FUNCTION_ARGS)
 
 	/*
 	 * The principal branch of the inverse tangent function maps all inputs to
 	 * values in the range [-90, 90], so the result should always be finite,
 	 * even if the input is infinite.  Additionally, we take care to ensure
 	 * than when arg1 is 1, the result is exactly 45.
 	 */
 	atan_arg1 = atan(arg1);
 	result = (atan_arg1 / atan_1_0) * 45.0;
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		atan2d			- returns the arctan of arg1/arg2 (degrees)
  */
 Datum
 datan2d(PG_FUNCTION_ARGS)
 {
@@ -2083,21 +2083,21 @@ datan2d(PG_FUNCTION_ARGS)
 	 * result should always be finite, even if the inputs are infinite.
 	 *
 	 * Note: this coding assumes that atan(1.0) is a suitable scaling constant
 	 * to get an exact result from atan2().  This might well fail on us at
 	 * some point, requiring us to decide exactly what inputs we think we're
 	 * going to guarantee an exact result for.
 	 */
 	atan2_arg1_arg2 = atan2(arg1, arg2);
 	result = (atan2_arg1_arg2 / atan_1_0) * 45.0;
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		sind_0_to_30	- returns the sine of an angle that lies between 0 and
  *						  30 degrees.  This will return exactly 0 when x is 0,
  *						  and exactly 0.5 when x is 30 degrees.
  */
 static double
@@ -2204,21 +2204,21 @@ dcosd(PG_FUNCTION_ARGS)
 
 	if (arg1 > 90.0)
 	{
 		/* cosd(180-x) = -cosd(x) */
 		arg1 = 180.0 - arg1;
 		sign = -sign;
 	}
 
 	result = sign * cosd_q1(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcotd			- returns the cotangent of arg1 (degrees)
  */
 Datum
 dcotd(PG_FUNCTION_ARGS)
 {
@@ -2269,21 +2269,21 @@ dcotd(PG_FUNCTION_ARGS)
 	result = sign * (cot_arg1 / cot_45);
 
 	/*
 	 * On some machines we get cotd(270) = minus zero, but this isn't always
 	 * true.  For portability, and because the user constituency for this
 	 * function probably doesn't want minus zero, force it to plain zero.
 	 */
 	if (result == 0.0)
 		result = 0.0;
 
-	check_float8_val(result, true /* cotd(0) == Inf */ , true);
+	CHECKFLOATVAL(result, true /* cotd(0) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dsind			- returns the sine of arg1 (degrees)
  */
 Datum
 dsind(PG_FUNCTION_ARGS)
 {
@@ -2323,21 +2323,21 @@ dsind(PG_FUNCTION_ARGS)
 	}
 
 	if (arg1 > 90.0)
 	{
 		/* sind(180-x) = sind(x) */
 		arg1 = 180.0 - arg1;
 	}
 
 	result = sign * sind_q1(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dtand			- returns the tangent of arg1 (degrees)
  */
 Datum
 dtand(PG_FUNCTION_ARGS)
 {
@@ -2388,21 +2388,21 @@ dtand(PG_FUNCTION_ARGS)
 	result = sign * (tan_arg1 / tan_45);
 
 	/*
 	 * On some machines we get tand(180) = minus zero, but this isn't always
 	 * true.  For portability, and because the user constituency for this
 	 * function probably doesn't want minus zero, force it to plain zero.
 	 */
 	if (result == 0.0)
 		result = 0.0;
 
-	check_float8_val(result, true /* tand(90) == Inf */ , true);
+	CHECKFLOATVAL(result, true /* tand(90) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		degrees		- returns degrees converted from radians
  */
 Datum
 degrees(PG_FUNCTION_ARGS)
 {
@@ -2455,21 +2455,21 @@ dsinh(PG_FUNCTION_ARGS)
 	 * sign of arg1.
 	 */
 	if (errno == ERANGE)
 	{
 		if (arg1 < 0)
 			result = -get_float8_infinity();
 		else
 			result = get_float8_infinity();
 	}
 
-	check_float8_val(result, true, true);
+	CHECKFLOATVAL(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcosh			- returns the hyperbolic cosine of arg1
  */
 Datum
 dcosh(PG_FUNCTION_ARGS)
 {
@@ -2479,57 +2479,57 @@ dcosh(PG_FUNCTION_ARGS)
 	errno = 0;
 	result = cosh(arg1);
 
 	/*
 	 * if an ERANGE error occurs, it means there is an overflow.  As cosh is
 	 * always positive, it always means the result is positive infinity.
 	 */
 	if (errno == ERANGE)
 		result = get_float8_infinity();
 
-	check_float8_val(result, true, false);
+	CHECKFLOATVAL(result, true, false);
 	PG_RETURN_FLOAT8(result);
 }
 
 /*
  *		dtanh			- returns the hyperbolic tangent of arg1
  */
 Datum
 dtanh(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	/*
 	 * For tanh, we don't need an errno check because it never overflows.
 	 */
 	result = tanh(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 /*
  *		dasinh			- returns the inverse hyperbolic sine of arg1
  */
 Datum
 dasinh(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	/*
 	 * For asinh, we don't need an errno check because it never overflows.
 	 */
 	result = asinh(arg1);
 
-	check_float8_val(result, true, true);
+	CHECKFLOATVAL(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 /*
  *		dacosh			- returns the inverse hyperbolic cosine of arg1
  */
 Datum
 dacosh(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
@@ -2541,21 +2541,21 @@ dacosh(PG_FUNCTION_ARGS)
 	 * thing because some implementations will report that for NaN. Otherwise,
 	 * no error is possible.
 	 */
 	if (arg1 < 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	result = acosh(arg1);
 
-	check_float8_val(result, true, true);
+	CHECKFLOATVAL(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 /*
  *		datanh			- returns the inverse hyperbolic tangent of arg1
  */
 Datum
 datanh(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
@@ -2576,21 +2576,21 @@ datanh(PG_FUNCTION_ARGS)
 	 * glibc versions may produce the wrong errno for this.  All other inputs
 	 * cannot produce an error.
 	 */
 	if (arg1 == -1.0)
 		result = -get_float8_infinity();
 	else if (arg1 == 1.0)
 		result = get_float8_infinity();
 	else
 		result = atanh(arg1);
 
-	check_float8_val(result, true, true);
+	CHECKFLOATVAL(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		drandom		- returns a random number
  */
 Datum
 drandom(PG_FUNCTION_ARGS)
 {
@@ -2777,21 +2777,21 @@ float8_combine(PG_FUNCTION_ARGS)
 		N = N1;
 		Sx = Sx1;
 		Sxx = Sxx1;
 	}
 	else
 	{
 		N = N1 + N2;
 		Sx = float8_pl(Sx1, Sx2);
 		tmp = Sx1 / N1 - Sx2 / N2;
 		Sxx = Sxx1 + Sxx2 + N1 * N2 * tmp * tmp / N;
-		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+		CHECKFLOATVAL(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
 	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
 	 * parameter in-place to reduce palloc overhead. Otherwise we construct a
 	 * new array with the updated transition data and return it.
 	 */
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
 		transvalues1[0] = N;
@@ -3287,27 +3287,27 @@ float8_regr_combine(PG_FUNCTION_ARGS)
 		Sy = Sy1;
 		Syy = Syy1;
 		Sxy = Sxy1;
 	}
 	else
 	{
 		N = N1 + N2;
 		Sx = float8_pl(Sx1, Sx2);
 		tmp1 = Sx1 / N1 - Sx2 / N2;
 		Sxx = Sxx1 + Sxx2 + N1 * N2 * tmp1 * tmp1 / N;
-		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+		CHECKFLOATVAL(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
 		Sy = float8_pl(Sy1, Sy2);
 		tmp2 = Sy1 / N1 - Sy2 / N2;
 		Syy = Syy1 + Syy2 + N1 * N2 * tmp2 * tmp2 / N;
-		check_float8_val(Syy, isinf(Syy1) || isinf(Syy2), true);
+		CHECKFLOATVAL(Syy, isinf(Syy1) || isinf(Syy2), true);
 		Sxy = Sxy1 + Sxy2 + N1 * N2 * tmp1 * tmp2 / N;
-		check_float8_val(Sxy, isinf(Sxy1) || isinf(Sxy2), true);
+		CHECKFLOATVAL(Sxy, isinf(Sxy1) || isinf(Sxy2), true);
 	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
 	 * parameter in-place to reduce palloc overhead. Otherwise we construct a
 	 * new array with the updated transition data and return it.
 	 */
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
 		transvalues1[0] = N;
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
index d5ded471c4..a0d02c6a90 100644
--- a/src/backend/utils/adt/geo_ops.c
+++ b/src/backend/utils/adt/geo_ops.c
@@ -5538,14 +5538,14 @@ pg_hypot(float8 x, float8 y)
 	 * such cases, but more importantly it also protects against
 	 * divide-by-zero errors, since now x >= y.
 	 */
 	if (y == 0.0)
 		return x;
 
 	/* Determine the hypotenuse */
 	yx = y / x;
 	result = x * sqrt(1.0 + (yx * yx));
 
-	check_float8_val(result, false, false);
+	CHECKFLOATVAL(result, false, false);
 
 	return result;
 }
diff --git a/src/include/utils/float.h b/src/include/utils/float.h
index e2c5dc0f57..5c6ce5bdc6 100644
--- a/src/include/utils/float.h
+++ b/src/include/utils/float.h
@@ -25,20 +25,43 @@
 /* Radians per degree, a.k.a. PI / 180 */
 #define RADIANS_PER_DEGREE 0.0174532925199432957692
 
 /* Visual C++ etc lacks NAN, and won't accept 0.0/0.0. */
 #if defined(WIN32) && !defined(NAN)
 static const uint32 nan[2] = {0xffffffff, 0x7fffffff};
 
 #define NAN (*(const float8 *) nan)
 #endif
 
+/*
+ * Checks to see if a float4/8 val has underflowed or overflowed
+ *
+ * The rest of the API uses inline functions, but this has to stay as macro
+ * to prevent the inf_is_valid and zero_is_valid arguments to be evaluated
+ * when the val is not inf or zero.  Evaluating the arguments is more
+ * expensive than checking the value itself.
+ *
+ * Note that this macro double evaluates the val.
+ */
+#define CHECKFLOATVAL(val, inf_is_valid, zero_is_valid)			\
+do {															\
+	if (unlikely(isinf(val) && !(inf_is_valid)))				\
+		ereport(ERROR,											\
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),	\
+				 errmsg("value out of range: overflow")));		\
+																\
+	if (unlikely((val) == 0.0 && !(zero_is_valid)))				\
+		ereport(ERROR,											\
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),	\
+				 errmsg("value out of range: underflow")));		\
+} while(0)
+
 extern PGDLLIMPORT int extra_float_digits;
 
 /*
  * Utility functions in float.c
  */
 extern int	is_infinite(float8 val);
 extern float8 float8in_internal(char *num, char **endptr_p,
 								const char *type_name, const char *orig_string);
 extern float8 float8in_internal_opt_error(char *num, char **endptr_p,
 										  const char *type_name, const char *orig_string,
@@ -122,159 +145,125 @@ get_float8_nan(void)
 #if defined(NAN) && !(defined(__NetBSD__) && defined(__mips__))
 	/* C99 standard way */
 	return (float8) NAN;
 #else
 	/* Assume we can get a NaN via zero divide */
 	return (float8) (0.0 / 0.0);
 #endif
 }
 
 /*
- * Checks to see if a float4/8 val has underflowed or overflowed
- */
-
-static inline void
-check_float4_val(const float4 val, const bool inf_is_valid,
-				 const bool zero_is_valid)
-{
-	if (!inf_is_valid && unlikely(isinf(val)))
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value out of range: overflow")));
-
-	if (!zero_is_valid && unlikely(val == 0.0))
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value out of range: underflow")));
-}
-
-static inline void
-check_float8_val(const float8 val, const bool inf_is_valid,
-				 const bool zero_is_valid)
-{
-	if (!inf_is_valid && unlikely(isinf(val)))
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value out of range: overflow")));
-
-	if (!zero_is_valid && unlikely(val == 0.0))
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value out of range: underflow")));
-}
-
-/*
- * Routines for operations with the checks above
+ * Routines for operations with overflow/underflow checks
  *
  * There isn't any way to check for underflow of addition/subtraction
  * because numbers near the underflow value have already been rounded to
  * the point where we can't detect that the two values were originally
  * different, e.g. on x86, '1e-45'::float4 == '2e-45'::float4 ==
  * 1.4013e-45.
  */
 
 static inline float4
 float4_pl(const float4 val1, const float4 val2)
 {
 	float4		result;
 
 	result = val1 + val2;
-	check_float4_val(result, isinf(val1) || isinf(val2), true);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), true);
 
 	return result;
 }
 
 static inline float8
 float8_pl(const float8 val1, const float8 val2)
 {
 	float8		result;
 
 	result = val1 + val2;
-	check_float8_val(result, isinf(val1) || isinf(val2), true);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), true);
 
 	return result;
 }
 
 static inline float4
 float4_mi(const float4 val1, const float4 val2)
 {
 	float4		result;
 
 	result = val1 - val2;
-	check_float4_val(result, isinf(val1) || isinf(val2), true);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), true);
 
 	return result;
 }
 
 static inline float8
 float8_mi(const float8 val1, const float8 val2)
 {
 	float8		result;
 
 	result = val1 - val2;
-	check_float8_val(result, isinf(val1) || isinf(val2), true);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), true);
 
 	return result;
 }
 
 static inline float4
 float4_mul(const float4 val1, const float4 val2)
 {
 	float4		result;
 
 	result = val1 * val2;
-	check_float4_val(result, isinf(val1) || isinf(val2),
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2),
 					 val1 == 0.0f || val2 == 0.0f);
 
 	return result;
 }
 
 static inline float8
 float8_mul(const float8 val1, const float8 val2)
 {
 	float8		result;
 
 	result = val1 * val2;
-	check_float8_val(result, isinf(val1) || isinf(val2),
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2),
 					 val1 == 0.0 || val2 == 0.0);
 
 	return result;
 }
 
 static inline float4
 float4_div(const float4 val1, const float4 val2)
 {
 	float4		result;
 
 	if (val2 == 0.0f)
 		ereport(ERROR,
 				(errcode(ERRCODE_DIVISION_BY_ZERO),
 				 errmsg("division by zero")));
 
 	result = val1 / val2;
-	check_float4_val(result, isinf(val1) || isinf(val2), val1 == 0.0f);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), val1 == 0.0f);
 
 	return result;
 }
 
 static inline float8
 float8_div(const float8 val1, const float8 val2)
 {
 	float8		result;
 
 	if (val2 == 0.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_DIVISION_BY_ZERO),
 				 errmsg("division by zero")));
 
 	result = val1 / val2;
-	check_float8_val(result, isinf(val1) || isinf(val2), val1 == 0.0);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), val1 == 0.0);
 
 	return result;
 }
 
 /*
  * Routines for NaN-aware comparisons
  *
  * We consider all NaNs to be equal and larger than any non-NaN. This is
  * somewhat arbitrary; the important thing is to have a consistent sort
  * order.
-- 
2.17.1

#14

Andres Freund

andres@anarazel.de

almost 6 years ago

In reply to: Amit Langote (#9)

Re: In PG12, query with float calculations is slower than PG11

Hi,

On 2020-02-07 17:17:21 +0900, Amit Langote wrote:

I did some tests using two relatively recent compilers: gcc 8 and
clang-7 and here are the results:

Hm, these very much look like they've been done in an unoptimized build?

40.62% postgres postgres [.] ExecInterpExpr
9.74% postgres postgres [.] float8_accum
6.12% postgres libc-2.17.so [.] __isinf
5.96% postgres postgres [.] float8mul
5.33% postgres postgres [.] dsqrt
3.90% postgres postgres [.] ftod
3.53% postgres postgres [.] Float8GetDatum
2.34% postgres postgres [.] DatumGetFloat8
2.15% postgres postgres [.] AggCheckCallContext
2.03% postgres postgres [.] slot_deform_tuple
1.95% postgres libm-2.17.so [.] __sqrt
1.19% postgres postgres [.] check_float8_array

HEAD

latency average = 549.071 ms

31.74% postgres postgres [.] ExecInterpExpr
11.02% postgres libc-2.17.so [.] __isinf
10.58% postgres postgres [.] float8_accum
4.84% postgres postgres [.] check_float8_val
4.66% postgres postgres [.] dsqrt
3.91% postgres postgres [.] float8mul
3.56% postgres postgres [.] ftod
3.26% postgres postgres [.] Float8GetDatum
2.91% postgres postgres [.] float8_mul
2.30% postgres postgres [.] DatumGetFloat8
2.19% postgres postgres [.] slot_deform_heap_tuple
1.81% postgres postgres [.] AggCheckCallContext
1.31% postgres libm-2.17.so [.] __sqrt
1.25% postgres postgres [.] check_float8_array

Because DatumGetFloat8, Float8GetDatum, etc aren't functions that
normally stay separate.

Greetings,

Andres Freund

#15

Amit Langote

amitlangote09@gmail.com

almost 6 years ago

In reply to: Andres Freund (#14)

Re: In PG12, query with float calculations is slower than PG11

On Sat, Feb 8, 2020 at 3:13 AM Andres Freund <andres@anarazel.de> wrote:

On 2020-02-07 17:17:21 +0900, Amit Langote wrote:

I did some tests using two relatively recent compilers: gcc 8 and
clang-7 and here are the results:

Hm, these very much look like they've been done in an unoptimized build?

40.62% postgres postgres [.] ExecInterpExpr
9.74% postgres postgres [.] float8_accum
6.12% postgres libc-2.17.so [.] __isinf
5.96% postgres postgres [.] float8mul
5.33% postgres postgres [.] dsqrt
3.90% postgres postgres [.] ftod
3.53% postgres postgres [.] Float8GetDatum
2.34% postgres postgres [.] DatumGetFloat8
2.15% postgres postgres [.] AggCheckCallContext
2.03% postgres postgres [.] slot_deform_tuple
1.95% postgres libm-2.17.so [.] __sqrt
1.19% postgres postgres [.] check_float8_array

HEAD

latency average = 549.071 ms

31.74% postgres postgres [.] ExecInterpExpr
11.02% postgres libc-2.17.so [.] __isinf
10.58% postgres postgres [.] float8_accum
4.84% postgres postgres [.] check_float8_val
4.66% postgres postgres [.] dsqrt
3.91% postgres postgres [.] float8mul
3.56% postgres postgres [.] ftod
3.26% postgres postgres [.] Float8GetDatum
2.91% postgres postgres [.] float8_mul
2.30% postgres postgres [.] DatumGetFloat8
2.19% postgres postgres [.] slot_deform_heap_tuple
1.81% postgres postgres [.] AggCheckCallContext
1.31% postgres libm-2.17.so [.] __sqrt
1.25% postgres postgres [.] check_float8_array

Because DatumGetFloat8, Float8GetDatum, etc aren't functions that
normally stay separate.

Okay, fair.

Here are numbers after compiling with -O3:

gcc 8
=====

HEAD

latency average = 350.187 ms

34.67% postgres postgres [.] ExecInterpExpr
20.94% postgres libc-2.17.so [.] __isinf
10.74% postgres postgres [.] float8_accum
8.22% postgres postgres [.] dsqrt
6.63% postgres postgres [.] float8mul
3.45% postgres postgres [.] ftod
2.32% postgres postgres [.] tts_buffer_heap_getsomeattrs

HEAD + reverse-if-condition patch

latency average = 346.710 ms

34.48% postgres postgres [.] ExecInterpExpr
21.00% postgres libc-2.17.so [.] __isinf
12.26% postgres postgres [.] float8_accum
8.31% postgres postgres [.] dsqrt
6.32% postgres postgres [.] float8mul
3.23% postgres postgres [.] ftod
2.25% postgres postgres [.] tts_buffer_heap_getsomeattrs

HEAD + revert-to-macro patch

latency average = 297.493 ms

39.25% postgres postgres [.] ExecInterpExpr
14.44% postgres postgres [.] float8_accum
11.02% postgres libc-2.17.so [.] __isinf
8.21% postgres postgres [.] dsqrt
5.55% postgres postgres [.] float8mul
4.15% postgres postgres [.] ftod
2.78% postgres postgres [.] tts_buffer_heap_getsomeattrs

11.6

latency average = 290.301 ms

42.78% postgres postgres [.] ExecInterpExpr
12.27% postgres postgres [.] float8_accum
12.12% postgres libc-2.17.so [.] __isinf
8.96% postgres postgres [.] dsqrt
5.77% postgres postgres [.] float8mul
3.94% postgres postgres [.] ftod
2.61% postgres postgres [.] AggCheckCallContext

clang-7
=======

HEAD

latency average = 246.278 ms

44.47% postgres postgres [.] ExecInterpExpr
14.56% postgres postgres [.] float8_accum
7.25% postgres postgres [.] float8mul
7.22% postgres postgres [.] dsqrt
5.40% postgres postgres [.] ftod
4.09% postgres postgres [.] tts_buffer_heap_getsomeattrs
2.20% postgres postgres [.] check_float8_val

HEAD + reverse-if-condition patch

latency average = 240.212 ms

45.49% postgres postgres [.] ExecInterpExpr
13.69% postgres postgres [.] float8_accum
8.32% postgres postgres [.] dsqrt
5.28% postgres postgres [.] ftod
5.19% postgres postgres [.] float8mul
3.68% postgres postgres [.] tts_buffer_heap_getsomeattrs
2.90% postgres postgres [.] float8_mul

HEAD + revert-to-macro patch

latency average = 240.620 ms

44.04% postgres postgres [.] ExecInterpExpr
13.72% postgres postgres [.] float8_accum
9.26% postgres postgres [.] dsqrt
5.30% postgres postgres [.] ftod
4.66% postgres postgres [.] float8mul
3.53% postgres postgres [.] tts_buffer_heap_getsomeattrs
3.39% postgres postgres [.] float8_mul

11.6

latency average = 237.045 ms

46.85% postgres postgres [.] ExecInterpExpr
11.39% postgres postgres [.] float8_accum
8.02% postgres postgres [.] dsqrt
7.29% postgres postgres [.] slot_deform_tuple
6.04% postgres postgres [.] float8mul
5.49% postgres postgres [.] ftod

PG 12 is worse than PG 11 when compiled with gcc.

Thanks,
Amit

#16

Amit Langote

amitlangote09@gmail.com

almost 6 years ago

In reply to: Emre Hasegeli (#12)

Re: In PG12, query with float calculations is slower than PG11

On Fri, Feb 7, 2020 at 11:43 PM Emre Hasegeli <emre@hasegeli.com> wrote:

The patch looks unduly invasive to me, but I think that it might be
right that we should go back to a macro-based implementation, because
otherwise we don't have a good way to be certain that the function
parameter won't get evaluated first.

I'd first like to see some actual evidence of this being a problem,
rather than just the order of the checks.

There seem to be enough evidence of this being the problem. We are
better off going back to the macro-based implementation. I polished
Keisuke Kuroda's patch commenting about the performance issue, removed
the check_float*_val() functions completely, and added unlikely() as
Tom Lane suggested. It is attached. I confirmed with different
compilers that the macro, and unlikely() makes this noticeably faster.

Thanks for updating the patch.

Should we update the same macro in contrib/btree_gist/btree_utils_num.h too?

Regards,
Amit

#17

Emre Hasegeli

emre@hasegeli.com

almost 6 years ago

In reply to: Tels (#13)

1 attachment(s)

Re: In PG12, query with float calculations is slower than PG11

But the comment does not explain that this test has to be in that
order, or the compiler will for non-constant arguments evalute
the (now) right-side first. E.g. if I understand this correctly:

+ if (!(zero_is_valid) && unlikely((val) == 0.0)

would have the same problem of evaluating "zero_is_valid" (which
might be an isinf(arg1) || isinf(arg2)) first and so be the same thing
we try to avoid with the macro? Maybe adding this bit of info to the
comment makes it clearer?

Added.

Also, a few places use the macro as:

+ CHECKFLOATVAL(result, true, true);

which evaluates to a complete NOP in both cases. IMHO this could be
replaced with a comment like:

+ // No CHECKFLOATVAL() needed, as both inf and 0.0 are valid

(or something along the lines of "no error can occur"), as otherwise
CHECKFLOATVAL() implies to the casual reader that there are some checks
done, while in reality no real checks are done at all (and hopefully
the compiler optimizes everything away, which might not be true for
debug builds).

I don't know why those trigonometric functions don't check for
overflow/underflow like all the rest of float.c. I'll submit another
patch to make them error when overflow/underflow.

The new version is attached.

Attachments:

0001-Bring-back-CHECKFLOATVAL-macro-v02.patchtext/x-patch; charset=US-ASCII; name=0001-Bring-back-CHECKFLOATVAL-macro-v02.patchDownload

From fb5052b869255ef9465b1de92e84b2fb66dd6eb3 Mon Sep 17 00:00:00 2001
From: Emre Hasegeli <emre@hasegeli.com>
Date: Fri, 7 Feb 2020 10:27:25 +0000
Subject: [PATCH] Bring back CHECKFLOATVAL() macro

The inline functions added by 6bf0bc842b caused the conditions of
overflow/underflow checks to be evaluated when no overflow/underflow
happen.  This slowed down floating point operations.  This commit brings
back the macro that was in use before 6bf0bc842b to fix the performace
regression.

Reported-by: Keisuke Kuroda <keisuke.kuroda.3862@gmail.com>
Author: Keisuke Kuroda <keisuke.kuroda.3862@gmail.com>
Discussion: https://www.postgresql.org/message-id/CANDwggLe1Gc1OrRqvPfGE%3DkM9K0FSfia0hbeFCEmwabhLz95AA%40mail.gmail.com
---
 contrib/btree_gist/btree_utils_num.h |  6 +--
 src/backend/utils/adt/float.c        | 66 ++++++++++++------------
 src/backend/utils/adt/geo_ops.c      |  2 +-
 src/include/utils/float.h            | 76 ++++++++++++----------------
 4 files changed, 70 insertions(+), 80 deletions(-)

diff --git a/contrib/btree_gist/btree_utils_num.h b/contrib/btree_gist/btree_utils_num.h
index 1fedfbe82d..a3227fd758 100644
--- a/contrib/btree_gist/btree_utils_num.h
+++ b/contrib/btree_gist/btree_utils_num.h
@@ -84,30 +84,30 @@ typedef struct
  */
 #define INTERVAL_TO_SEC(ivp) \
 	(((double) (ivp)->time) / ((double) USECS_PER_SEC) + \
 	 (ivp)->day * (24.0 * SECS_PER_HOUR) + \
 	 (ivp)->month * (30.0 * SECS_PER_DAY))
 
 #define GET_FLOAT_DISTANCE(t, arg1, arg2)	Abs( ((float8) *((const t *) (arg1))) - ((float8) *((const t *) (arg2))) )
 
 /*
  * check to see if a float4/8 val has underflowed or overflowed
- * borrowed from src/backend/utils/adt/float.c
+ * borrowed from src/include/utils/float.c
  */
 #define CHECKFLOATVAL(val, inf_is_valid, zero_is_valid)			\
 do {															\
-	if (isinf(val) && !(inf_is_valid))							\
+	if (unlikely(isinf(val) && !(inf_is_valid)))				\
 		ereport(ERROR,											\
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),	\
 		  errmsg("value out of range: overflow")));				\
 																\
-	if ((val) == 0.0 && !(zero_is_valid))						\
+	if (unlikely((val) == 0.0 && !(zero_is_valid)))				\
 		ereport(ERROR,											\
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),	\
 		 errmsg("value out of range: underflow")));				\
 } while(0)
 
 
 extern Interval *abs_interval(Interval *a);
 
 extern bool gbt_num_consistent(const GBT_NUMKEY_R *key, const void *query,
 							   const StrategyNumber *strategy, bool is_leaf,
diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
index a90d4db215..5885719850 100644
--- a/src/backend/utils/adt/float.c
+++ b/src/backend/utils/adt/float.c
@@ -1184,21 +1184,21 @@ ftod(PG_FUNCTION_ARGS)
 
 
 /*
  *		dtof			- converts a float8 number to a float4 number
  */
 Datum
 dtof(PG_FUNCTION_ARGS)
 {
 	float8		num = PG_GETARG_FLOAT8(0);
 
-	check_float4_val((float4) num, isinf(num), num == 0);
+	CHECKFLOATVAL((float4) num, isinf(num), num == 0);
 
 	PG_RETURN_FLOAT4((float4) num);
 }
 
 
 /*
  *		dtoi4			- converts a float8 number to an int4 number
  */
 Datum
 dtoi4(PG_FUNCTION_ARGS)
@@ -1438,36 +1438,36 @@ dsqrt(PG_FUNCTION_ARGS)
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	if (arg1 < 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
 				 errmsg("cannot take square root of a negative number")));
 
 	result = sqrt(arg1);
 
-	check_float8_val(result, isinf(arg1), arg1 == 0);
+	CHECKFLOATVAL(result, isinf(arg1), arg1 == 0);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcbrt			- returns cube root of arg1
  */
 Datum
 dcbrt(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	result = cbrt(arg1);
-	check_float8_val(result, isinf(arg1), arg1 == 0);
+	CHECKFLOATVAL(result, isinf(arg1), arg1 == 0);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dpow			- returns pow(arg1,arg2)
  */
 Datum
 dpow(PG_FUNCTION_ARGS)
 {
@@ -1525,40 +1525,40 @@ dpow(PG_FUNCTION_ARGS)
 			/* The sign of Inf is not significant in this case. */
 			result = get_float8_infinity();
 		else if (fabs(arg1) != 1)
 			result = 0;
 		else
 			result = 1;
 	}
 	else if (errno == ERANGE && result != 0 && !isinf(result))
 		result = get_float8_infinity();
 
-	check_float8_val(result, isinf(arg1) || isinf(arg2), arg1 == 0);
+	CHECKFLOATVAL(result, isinf(arg1) || isinf(arg2), arg1 == 0);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dexp			- returns the exponential function of arg1
  */
 Datum
 dexp(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	errno = 0;
 	result = exp(arg1);
 	if (errno == ERANGE && result != 0 && !isinf(result))
 		result = get_float8_infinity();
 
-	check_float8_val(result, isinf(arg1), false);
+	CHECKFLOATVAL(result, isinf(arg1), false);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dlog1			- returns the natural logarithm of arg1
  */
 Datum
 dlog1(PG_FUNCTION_ARGS)
 {
@@ -1573,21 +1573,21 @@ dlog1(PG_FUNCTION_ARGS)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_LOG),
 				 errmsg("cannot take logarithm of zero")));
 	if (arg1 < 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_LOG),
 				 errmsg("cannot take logarithm of a negative number")));
 
 	result = log(arg1);
 
-	check_float8_val(result, isinf(arg1), arg1 == 1);
+	CHECKFLOATVAL(result, isinf(arg1), arg1 == 1);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dlog10			- returns the base 10 logarithm of arg1
  */
 Datum
 dlog10(PG_FUNCTION_ARGS)
 {
@@ -1603,21 +1603,21 @@ dlog10(PG_FUNCTION_ARGS)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_LOG),
 				 errmsg("cannot take logarithm of zero")));
 	if (arg1 < 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_LOG),
 				 errmsg("cannot take logarithm of a negative number")));
 
 	result = log10(arg1);
 
-	check_float8_val(result, isinf(arg1), arg1 == 1);
+	CHECKFLOATVAL(result, isinf(arg1), arg1 == 1);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dacos			- returns the arccos of arg1 (radians)
  */
 Datum
 dacos(PG_FUNCTION_ARGS)
 {
@@ -1633,21 +1633,21 @@ dacos(PG_FUNCTION_ARGS)
 	 * range [-1, 1] to values in the range [0, Pi], so we should reject any
 	 * inputs outside that range and the result will always be finite.
 	 */
 	if (arg1 < -1.0 || arg1 > 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	result = acos(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dasin			- returns the arcsin of arg1 (radians)
  */
 Datum
 dasin(PG_FUNCTION_ARGS)
 {
@@ -1663,21 +1663,21 @@ dasin(PG_FUNCTION_ARGS)
 	 * range [-1, 1] to values in the range [-Pi/2, Pi/2], so we should reject
 	 * any inputs outside that range and the result will always be finite.
 	 */
 	if (arg1 < -1.0 || arg1 > 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	result = asin(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		datan			- returns the arctan of arg1 (radians)
  */
 Datum
 datan(PG_FUNCTION_ARGS)
 {
@@ -1688,21 +1688,21 @@ datan(PG_FUNCTION_ARGS)
 	if (isnan(arg1))
 		PG_RETURN_FLOAT8(get_float8_nan());
 
 	/*
 	 * The principal branch of the inverse tangent function maps all inputs to
 	 * values in the range [-Pi/2, Pi/2], so the result should always be
 	 * finite, even if the input is infinite.
 	 */
 	result = atan(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		atan2			- returns the arctan of arg1/arg2 (radians)
  */
 Datum
 datan2(PG_FUNCTION_ARGS)
 {
@@ -1713,21 +1713,21 @@ datan2(PG_FUNCTION_ARGS)
 	/* Per the POSIX spec, return NaN if either input is NaN */
 	if (isnan(arg1) || isnan(arg2))
 		PG_RETURN_FLOAT8(get_float8_nan());
 
 	/*
 	 * atan2 maps all inputs to values in the range [-Pi, Pi], so the result
 	 * should always be finite, even if the inputs are infinite.
 	 */
 	result = atan2(arg1, arg2);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcos			- returns the cosine of arg1 (radians)
  */
 Datum
 dcos(PG_FUNCTION_ARGS)
 {
@@ -1753,21 +1753,21 @@ dcos(PG_FUNCTION_ARGS)
 	 * platform reports via errno, so also explicitly test for infinite
 	 * inputs.
 	 */
 	errno = 0;
 	result = cos(arg1);
 	if (errno != 0 || isinf(arg1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcot			- returns the cotangent of arg1 (radians)
  */
 Datum
 dcot(PG_FUNCTION_ARGS)
 {
@@ -1780,21 +1780,21 @@ dcot(PG_FUNCTION_ARGS)
 
 	/* Be sure to throw an error if the input is infinite --- see dcos() */
 	errno = 0;
 	result = tan(arg1);
 	if (errno != 0 || isinf(arg1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	result = 1.0 / result;
-	check_float8_val(result, true /* cot(0) == Inf */ , true);
+	CHECKFLOATVAL(result, true /* cot(0) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dsin			- returns the sine of arg1 (radians)
  */
 Datum
 dsin(PG_FUNCTION_ARGS)
 {
@@ -1806,21 +1806,21 @@ dsin(PG_FUNCTION_ARGS)
 		PG_RETURN_FLOAT8(get_float8_nan());
 
 	/* Be sure to throw an error if the input is infinite --- see dcos() */
 	errno = 0;
 	result = sin(arg1);
 	if (errno != 0 || isinf(arg1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dtan			- returns the tangent of arg1 (radians)
  */
 Datum
 dtan(PG_FUNCTION_ARGS)
 {
@@ -1832,21 +1832,21 @@ dtan(PG_FUNCTION_ARGS)
 		PG_RETURN_FLOAT8(get_float8_nan());
 
 	/* Be sure to throw an error if the input is infinite --- see dcos() */
 	errno = 0;
 	result = tan(arg1);
 	if (errno != 0 || isinf(arg1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
-	check_float8_val(result, true /* tan(pi/2) == Inf */ , true);
+	CHECKFLOATVAL(result, true /* tan(pi/2) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /* ========== DEGREE-BASED TRIGONOMETRIC FUNCTIONS ========== */
 
 
 /*
  * Initialize the cached constants declared at the head of this file
  * (sin_30 etc).  The fact that we need those at all, let alone need this
@@ -1984,21 +1984,21 @@ dacosd(PG_FUNCTION_ARGS)
 	if (arg1 < -1.0 || arg1 > 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	if (arg1 >= 0.0)
 		result = acosd_q1(arg1);
 	else
 		result = 90.0 + asind_q1(-arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dasind			- returns the arcsin of arg1 (degrees)
  */
 Datum
 dasind(PG_FUNCTION_ARGS)
 {
@@ -2019,21 +2019,21 @@ dasind(PG_FUNCTION_ARGS)
 	if (arg1 < -1.0 || arg1 > 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	if (arg1 >= 0.0)
 		result = asind_q1(arg1);
 	else
 		result = -asind_q1(-arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		datand			- returns the arctan of arg1 (degrees)
  */
 Datum
 datand(PG_FUNCTION_ARGS)
 {
@@ -2049,21 +2049,21 @@ datand(PG_FUNCTION_ARGS)
 
 	/*
 	 * The principal branch of the inverse tangent function maps all inputs to
 	 * values in the range [-90, 90], so the result should always be finite,
 	 * even if the input is infinite.  Additionally, we take care to ensure
 	 * than when arg1 is 1, the result is exactly 45.
 	 */
 	atan_arg1 = atan(arg1);
 	result = (atan_arg1 / atan_1_0) * 45.0;
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		atan2d			- returns the arctan of arg1/arg2 (degrees)
  */
 Datum
 datan2d(PG_FUNCTION_ARGS)
 {
@@ -2083,21 +2083,21 @@ datan2d(PG_FUNCTION_ARGS)
 	 * result should always be finite, even if the inputs are infinite.
 	 *
 	 * Note: this coding assumes that atan(1.0) is a suitable scaling constant
 	 * to get an exact result from atan2().  This might well fail on us at
 	 * some point, requiring us to decide exactly what inputs we think we're
 	 * going to guarantee an exact result for.
 	 */
 	atan2_arg1_arg2 = atan2(arg1, arg2);
 	result = (atan2_arg1_arg2 / atan_1_0) * 45.0;
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		sind_0_to_30	- returns the sine of an angle that lies between 0 and
  *						  30 degrees.  This will return exactly 0 when x is 0,
  *						  and exactly 0.5 when x is 30 degrees.
  */
 static double
@@ -2204,21 +2204,21 @@ dcosd(PG_FUNCTION_ARGS)
 
 	if (arg1 > 90.0)
 	{
 		/* cosd(180-x) = -cosd(x) */
 		arg1 = 180.0 - arg1;
 		sign = -sign;
 	}
 
 	result = sign * cosd_q1(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcotd			- returns the cotangent of arg1 (degrees)
  */
 Datum
 dcotd(PG_FUNCTION_ARGS)
 {
@@ -2269,21 +2269,21 @@ dcotd(PG_FUNCTION_ARGS)
 	result = sign * (cot_arg1 / cot_45);
 
 	/*
 	 * On some machines we get cotd(270) = minus zero, but this isn't always
 	 * true.  For portability, and because the user constituency for this
 	 * function probably doesn't want minus zero, force it to plain zero.
 	 */
 	if (result == 0.0)
 		result = 0.0;
 
-	check_float8_val(result, true /* cotd(0) == Inf */ , true);
+	CHECKFLOATVAL(result, true /* cotd(0) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dsind			- returns the sine of arg1 (degrees)
  */
 Datum
 dsind(PG_FUNCTION_ARGS)
 {
@@ -2323,21 +2323,21 @@ dsind(PG_FUNCTION_ARGS)
 	}
 
 	if (arg1 > 90.0)
 	{
 		/* sind(180-x) = sind(x) */
 		arg1 = 180.0 - arg1;
 	}
 
 	result = sign * sind_q1(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dtand			- returns the tangent of arg1 (degrees)
  */
 Datum
 dtand(PG_FUNCTION_ARGS)
 {
@@ -2388,21 +2388,21 @@ dtand(PG_FUNCTION_ARGS)
 	result = sign * (tan_arg1 / tan_45);
 
 	/*
 	 * On some machines we get tand(180) = minus zero, but this isn't always
 	 * true.  For portability, and because the user constituency for this
 	 * function probably doesn't want minus zero, force it to plain zero.
 	 */
 	if (result == 0.0)
 		result = 0.0;
 
-	check_float8_val(result, true /* tand(90) == Inf */ , true);
+	CHECKFLOATVAL(result, true /* tand(90) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		degrees		- returns degrees converted from radians
  */
 Datum
 degrees(PG_FUNCTION_ARGS)
 {
@@ -2455,21 +2455,21 @@ dsinh(PG_FUNCTION_ARGS)
 	 * sign of arg1.
 	 */
 	if (errno == ERANGE)
 	{
 		if (arg1 < 0)
 			result = -get_float8_infinity();
 		else
 			result = get_float8_infinity();
 	}
 
-	check_float8_val(result, true, true);
+	CHECKFLOATVAL(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcosh			- returns the hyperbolic cosine of arg1
  */
 Datum
 dcosh(PG_FUNCTION_ARGS)
 {
@@ -2479,57 +2479,57 @@ dcosh(PG_FUNCTION_ARGS)
 	errno = 0;
 	result = cosh(arg1);
 
 	/*
 	 * if an ERANGE error occurs, it means there is an overflow.  As cosh is
 	 * always positive, it always means the result is positive infinity.
 	 */
 	if (errno == ERANGE)
 		result = get_float8_infinity();
 
-	check_float8_val(result, true, false);
+	CHECKFLOATVAL(result, true, false);
 	PG_RETURN_FLOAT8(result);
 }
 
 /*
  *		dtanh			- returns the hyperbolic tangent of arg1
  */
 Datum
 dtanh(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	/*
 	 * For tanh, we don't need an errno check because it never overflows.
 	 */
 	result = tanh(arg1);
 
-	check_float8_val(result, false, true);
+	CHECKFLOATVAL(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 /*
  *		dasinh			- returns the inverse hyperbolic sine of arg1
  */
 Datum
 dasinh(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	/*
 	 * For asinh, we don't need an errno check because it never overflows.
 	 */
 	result = asinh(arg1);
 
-	check_float8_val(result, true, true);
+	CHECKFLOATVAL(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 /*
  *		dacosh			- returns the inverse hyperbolic cosine of arg1
  */
 Datum
 dacosh(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
@@ -2541,21 +2541,21 @@ dacosh(PG_FUNCTION_ARGS)
 	 * thing because some implementations will report that for NaN. Otherwise,
 	 * no error is possible.
 	 */
 	if (arg1 < 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	result = acosh(arg1);
 
-	check_float8_val(result, true, true);
+	CHECKFLOATVAL(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 /*
  *		datanh			- returns the inverse hyperbolic tangent of arg1
  */
 Datum
 datanh(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
@@ -2576,21 +2576,21 @@ datanh(PG_FUNCTION_ARGS)
 	 * glibc versions may produce the wrong errno for this.  All other inputs
 	 * cannot produce an error.
 	 */
 	if (arg1 == -1.0)
 		result = -get_float8_infinity();
 	else if (arg1 == 1.0)
 		result = get_float8_infinity();
 	else
 		result = atanh(arg1);
 
-	check_float8_val(result, true, true);
+	CHECKFLOATVAL(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		drandom		- returns a random number
  */
 Datum
 drandom(PG_FUNCTION_ARGS)
 {
@@ -2777,21 +2777,21 @@ float8_combine(PG_FUNCTION_ARGS)
 		N = N1;
 		Sx = Sx1;
 		Sxx = Sxx1;
 	}
 	else
 	{
 		N = N1 + N2;
 		Sx = float8_pl(Sx1, Sx2);
 		tmp = Sx1 / N1 - Sx2 / N2;
 		Sxx = Sxx1 + Sxx2 + N1 * N2 * tmp * tmp / N;
-		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+		CHECKFLOATVAL(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
 	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
 	 * parameter in-place to reduce palloc overhead. Otherwise we construct a
 	 * new array with the updated transition data and return it.
 	 */
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
 		transvalues1[0] = N;
@@ -3287,27 +3287,27 @@ float8_regr_combine(PG_FUNCTION_ARGS)
 		Sy = Sy1;
 		Syy = Syy1;
 		Sxy = Sxy1;
 	}
 	else
 	{
 		N = N1 + N2;
 		Sx = float8_pl(Sx1, Sx2);
 		tmp1 = Sx1 / N1 - Sx2 / N2;
 		Sxx = Sxx1 + Sxx2 + N1 * N2 * tmp1 * tmp1 / N;
-		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+		CHECKFLOATVAL(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
 		Sy = float8_pl(Sy1, Sy2);
 		tmp2 = Sy1 / N1 - Sy2 / N2;
 		Syy = Syy1 + Syy2 + N1 * N2 * tmp2 * tmp2 / N;
-		check_float8_val(Syy, isinf(Syy1) || isinf(Syy2), true);
+		CHECKFLOATVAL(Syy, isinf(Syy1) || isinf(Syy2), true);
 		Sxy = Sxy1 + Sxy2 + N1 * N2 * tmp1 * tmp2 / N;
-		check_float8_val(Sxy, isinf(Sxy1) || isinf(Sxy2), true);
+		CHECKFLOATVAL(Sxy, isinf(Sxy1) || isinf(Sxy2), true);
 	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
 	 * parameter in-place to reduce palloc overhead. Otherwise we construct a
 	 * new array with the updated transition data and return it.
 	 */
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
 		transvalues1[0] = N;
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
index d5ded471c4..a0d02c6a90 100644
--- a/src/backend/utils/adt/geo_ops.c
+++ b/src/backend/utils/adt/geo_ops.c
@@ -5538,14 +5538,14 @@ pg_hypot(float8 x, float8 y)
 	 * such cases, but more importantly it also protects against
 	 * divide-by-zero errors, since now x >= y.
 	 */
 	if (y == 0.0)
 		return x;
 
 	/* Determine the hypotenuse */
 	yx = y / x;
 	result = x * sqrt(1.0 + (yx * yx));
 
-	check_float8_val(result, false, false);
+	CHECKFLOATVAL(result, false, false);
 
 	return result;
 }
diff --git a/src/include/utils/float.h b/src/include/utils/float.h
index e2c5dc0f57..f9c0222c56 100644
--- a/src/include/utils/float.h
+++ b/src/include/utils/float.h
@@ -25,20 +25,44 @@
 /* Radians per degree, a.k.a. PI / 180 */
 #define RADIANS_PER_DEGREE 0.0174532925199432957692
 
 /* Visual C++ etc lacks NAN, and won't accept 0.0/0.0. */
 #if defined(WIN32) && !defined(NAN)
 static const uint32 nan[2] = {0xffffffff, 0x7fffffff};
 
 #define NAN (*(const float8 *) nan)
 #endif
 
+/*
+ * Checks to see if a float4/8 val has underflowed or overflowed
+ *
+ * The rest of the API uses inline functions, but this has to stay as macro
+ * to prevent the inf_is_valid and zero_is_valid arguments to be evaluated
+ * when the val is not inf or zero.  We first check the val, and then
+ * the other arguments.  Evaluation the other arguments is twice as expensive
+ * on the common code paths.
+ *
+ * Note that this macro double evaluates the val.
+ */
+#define CHECKFLOATVAL(val, inf_is_valid, zero_is_valid)			\
+do {															\
+	if (unlikely(isinf(val) && !(inf_is_valid)))				\
+		ereport(ERROR,											\
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),	\
+				 errmsg("value out of range: overflow")));		\
+																\
+	if (unlikely((val) == 0.0 && !(zero_is_valid)))				\
+		ereport(ERROR,											\
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),	\
+				 errmsg("value out of range: underflow")));		\
+} while(0)
+
 extern PGDLLIMPORT int extra_float_digits;
 
 /*
  * Utility functions in float.c
  */
 extern int	is_infinite(float8 val);
 extern float8 float8in_internal(char *num, char **endptr_p,
 								const char *type_name, const char *orig_string);
 extern float8 float8in_internal_opt_error(char *num, char **endptr_p,
 										  const char *type_name, const char *orig_string,
@@ -122,159 +146,125 @@ get_float8_nan(void)
 #if defined(NAN) && !(defined(__NetBSD__) && defined(__mips__))
 	/* C99 standard way */
 	return (float8) NAN;
 #else
 	/* Assume we can get a NaN via zero divide */
 	return (float8) (0.0 / 0.0);
 #endif
 }
 
 /*
- * Checks to see if a float4/8 val has underflowed or overflowed
- */
-
-static inline void
-check_float4_val(const float4 val, const bool inf_is_valid,
-				 const bool zero_is_valid)
-{
-	if (!inf_is_valid && unlikely(isinf(val)))
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value out of range: overflow")));
-
-	if (!zero_is_valid && unlikely(val == 0.0))
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value out of range: underflow")));
-}
-
-static inline void
-check_float8_val(const float8 val, const bool inf_is_valid,
-				 const bool zero_is_valid)
-{
-	if (!inf_is_valid && unlikely(isinf(val)))
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value out of range: overflow")));
-
-	if (!zero_is_valid && unlikely(val == 0.0))
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value out of range: underflow")));
-}
-
-/*
- * Routines for operations with the checks above
+ * Routines for operations with overflow/underflow checks
  *
  * There isn't any way to check for underflow of addition/subtraction
  * because numbers near the underflow value have already been rounded to
  * the point where we can't detect that the two values were originally
  * different, e.g. on x86, '1e-45'::float4 == '2e-45'::float4 ==
  * 1.4013e-45.
  */
 
 static inline float4
 float4_pl(const float4 val1, const float4 val2)
 {
 	float4		result;
 
 	result = val1 + val2;
-	check_float4_val(result, isinf(val1) || isinf(val2), true);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), true);
 
 	return result;
 }
 
 static inline float8
 float8_pl(const float8 val1, const float8 val2)
 {
 	float8		result;
 
 	result = val1 + val2;
-	check_float8_val(result, isinf(val1) || isinf(val2), true);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), true);
 
 	return result;
 }
 
 static inline float4
 float4_mi(const float4 val1, const float4 val2)
 {
 	float4		result;
 
 	result = val1 - val2;
-	check_float4_val(result, isinf(val1) || isinf(val2), true);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), true);
 
 	return result;
 }
 
 static inline float8
 float8_mi(const float8 val1, const float8 val2)
 {
 	float8		result;
 
 	result = val1 - val2;
-	check_float8_val(result, isinf(val1) || isinf(val2), true);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), true);
 
 	return result;
 }
 
 static inline float4
 float4_mul(const float4 val1, const float4 val2)
 {
 	float4		result;
 
 	result = val1 * val2;
-	check_float4_val(result, isinf(val1) || isinf(val2),
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2),
 					 val1 == 0.0f || val2 == 0.0f);
 
 	return result;
 }
 
 static inline float8
 float8_mul(const float8 val1, const float8 val2)
 {
 	float8		result;
 
 	result = val1 * val2;
-	check_float8_val(result, isinf(val1) || isinf(val2),
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2),
 					 val1 == 0.0 || val2 == 0.0);
 
 	return result;
 }
 
 static inline float4
 float4_div(const float4 val1, const float4 val2)
 {
 	float4		result;
 
 	if (val2 == 0.0f)
 		ereport(ERROR,
 				(errcode(ERRCODE_DIVISION_BY_ZERO),
 				 errmsg("division by zero")));
 
 	result = val1 / val2;
-	check_float4_val(result, isinf(val1) || isinf(val2), val1 == 0.0f);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), val1 == 0.0f);
 
 	return result;
 }
 
 static inline float8
 float8_div(const float8 val1, const float8 val2)
 {
 	float8		result;
 
 	if (val2 == 0.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_DIVISION_BY_ZERO),
 				 errmsg("division by zero")));
 
 	result = val1 / val2;
-	check_float8_val(result, isinf(val1) || isinf(val2), val1 == 0.0);
+	CHECKFLOATVAL(result, isinf(val1) || isinf(val2), val1 == 0.0);
 
 	return result;
 }
 
 /*
  * Routines for NaN-aware comparisons
  *
  * We consider all NaNs to be equal and larger than any non-NaN. This is
  * somewhat arbitrary; the important thing is to have a consistent sort
  * order.
-- 
2.17.1

#18

Emre Hasegeli

emre@hasegeli.com

almost 6 years ago

In reply to: Amit Langote (#16)

Re: In PG12, query with float calculations is slower than PG11

Should we update the same macro in contrib/btree_gist/btree_utils_num.h too?

I posted another version incorporating this.

#19

Andres Freund

andres@anarazel.de

almost 6 years ago

In reply to: Emre Hasegeli (#17)

Re: In PG12, query with float calculations is slower than PG11

Hi,

On 2020-02-12 11:54:13 +0000, Emre Hasegeli wrote:

From fb5052b869255ef9465b1de92e84b2fb66dd6eb3 Mon Sep 17 00:00:00 2001
From: Emre Hasegeli <emre@hasegeli.com>
Date: Fri, 7 Feb 2020 10:27:25 +0000
Subject: [PATCH] Bring back CHECKFLOATVAL() macro

The inline functions added by 6bf0bc842b caused the conditions of
overflow/underflow checks to be evaluated when no overflow/underflow
happen. This slowed down floating point operations. This commit brings
back the macro that was in use before 6bf0bc842b to fix the performace
regression.

Wait, no. Didn't we get to the point that we figured out that the
primary issue is the reversal of the order of what is checked is the
primary problem, rather than the macro/inline piece?

Nor do I see how it's going to be ok to just rename the function in a
stable branch.

Greetings,

Andres Freund

#20

Emre Hasegeli

emre@hasegeli.com

almost 6 years ago

In reply to: Andres Freund (#19)

Re: In PG12, query with float calculations is slower than PG11

Wait, no. Didn't we get to the point that we figured out that the
primary issue is the reversal of the order of what is checked is the
primary problem, rather than the macro/inline piece?

Reversal of the order makes a small or no difference. The
macro/inline change causes the real slowdown at least on GCC.

Nor do I see how it's going to be ok to just rename the function in a
stable branch.

I'll post another version to keep them around.

#21

Andres Freund

andres@anarazel.de

almost 6 years ago

In reply to: Emre Hasegeli (#20)

Re: In PG12, query with float calculations is slower than PG11

On 2020-02-12 17:49:14 +0000, Emre Hasegeli wrote:

Nor do I see how it's going to be ok to just rename the function in a
stable branch.

I'll post another version to keep them around.

I'd just rename the macro to the name of the inline function. No need to
have a verbose change in all callsites just to update the name imo.

#22

Tom Lane

tgl@sss.pgh.pa.us

almost 6 years ago

In reply to: Andres Freund (#21)

Re: In PG12, query with float calculations is slower than PG11

Andres Freund <andres@anarazel.de> writes:

I'd just rename the macro to the name of the inline function. No need to
have a verbose change in all callsites just to update the name imo.

+1, that's what I had in mind too. That does suggest though that we
ought to make sure the macro has single-eval behavior, so that you
don't need to know it's a macro.

regards, tom lane

#23

Andres Freund

andres@anarazel.de

almost 6 years ago

In reply to: Tom Lane (#22)

Re: In PG12, query with float calculations is slower than PG11

Hi,

On 2020-02-12 13:15:22 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

I'd just rename the macro to the name of the inline function. No need to
have a verbose change in all callsites just to update the name imo.

+1, that's what I had in mind too. That does suggest though that we
ought to make sure the macro has single-eval behavior, so that you
don't need to know it's a macro.

We'd have to store 'val' in a local variable for that I think. Not the
prettiest, but also not a problem.

I do wonder if we're just punching ourselves in the face with the
signature of these checks. Part of the problem here really comes from
using the same function to handle a number of different checks.

I mean something like dtof's
check_float4_val((float4) num, isinf(num), num == 0);
where the num == 0 is solely to satisfy the check function is a bit
stupid.

And the reason we have these isinf(arg1) || isinf(arg2) parameters is
also largely because we force the same function to be used in cases
where we have two inputs, rather than just one.

For most places it'd probably end up being easier to read and to
optimize if we just wrote them as

if (unlikely(isinf(result)) && !isinf(arg))
float_overflow_error();

and when needed added a

else if (unlikely(result == 0) && arg1 != 0.0)
float_underflow_error();

the verbose piece really is the error, not the error check. Sure, there
are more complicated cases like

if (unlikely(isinf(result)) && (!isinf(arg1) || !isinf(arg2)))

but that's still not very complicated.

Greetings,

Andres Freund

#24

Tom Lane

tgl@sss.pgh.pa.us

almost 6 years ago

In reply to: Andres Freund (#23)

Re: In PG12, query with float calculations is slower than PG11

Andres Freund <andres@anarazel.de> writes:

I do wonder if we're just punching ourselves in the face with the
signature of these checks. Part of the problem here really comes from
using the same function to handle a number of different checks.

Yeah, I've thought that too. It's *far* from clear that this thing
is a win at all, other than your point about the number of copies of
the ereport call. It's bulky, it's hard to optimize, and I have
never thought it was more readable than the direct tests it replaced.

For most places it'd probably end up being easier to read and to
optimize if we just wrote them as
if (unlikely(isinf(result)) && !isinf(arg))
float_overflow_error();
and when needed added a
else if (unlikely(result == 0) && arg1 != 0.0)
float_underflow_error();

regards, tom lane

#25

Andres Freund

andres@anarazel.de

almost 6 years ago

In reply to: Tom Lane (#24)

Re: In PG12, query with float calculations is slower than PG11

Hi,

On 2020-02-12 14:18:30 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

I do wonder if we're just punching ourselves in the face with the
signature of these checks. Part of the problem here really comes from
using the same function to handle a number of different checks.

Yeah, I've thought that too. It's *far* from clear that this thing
is a win at all, other than your point about the number of copies of
the ereport call. It's bulky, it's hard to optimize, and I have
never thought it was more readable than the direct tests it replaced.

For most places it'd probably end up being easier to read and to
optimize if we just wrote them as
if (unlikely(isinf(result)) && !isinf(arg))
float_overflow_error();
and when needed added a
else if (unlikely(result == 0) && arg1 != 0.0)
float_underflow_error();

+1

Cool. Emre, any chance you could write a patch along those lines?

I'm inclined that we should backpatch that, and just leave the inline
function (without in core callers) in place in 12?

Greetings,

Andres Freund

#26

Tom Lane

tgl@sss.pgh.pa.us

almost 6 years ago

In reply to: Andres Freund (#25)

Re: In PG12, query with float calculations is slower than PG11

Andres Freund <andres@anarazel.de> writes:

I'm inclined that we should backpatch that, and just leave the inline
function (without in core callers) in place in 12?

Yeah, we can't remove the inline function in 12. But we don't have
to use it.

regards, tom lane

#27

Emre Hasegeli

emre@hasegeli.com

almost 6 years ago

In reply to: Andres Freund (#25)

Re: In PG12, query with float calculations is slower than PG11

For most places it'd probably end up being easier to read and to
optimize if we just wrote them as
if (unlikely(isinf(result)) && !isinf(arg))
float_overflow_error();
and when needed added a
else if (unlikely(result == 0) && arg1 != 0.0)
float_underflow_error();

+1

Cool. Emre, any chance you could write a patch along those lines?

Yes, I am happy to do. It makes more sense to me too.

#28

Emre Hasegeli

emre@hasegeli.com

almost 6 years ago

In reply to: Emre Hasegeli (#27)

1 attachment(s)

Re: In PG12, query with float calculations is slower than PG11

For most places it'd probably end up being easier to read and to
optimize if we just wrote them as
if (unlikely(isinf(result)) && !isinf(arg))
float_overflow_error();
and when needed added a
else if (unlikely(result == 0) && arg1 != 0.0)
float_underflow_error();

+1

Cool. Emre, any chance you could write a patch along those lines?

Yes, I am happy to do. It makes more sense to me too.

How about the one attached?

Attachments:

0001-Optimize-float-overflow-underflow-checks-v3.patchtext/x-patch; charset=US-ASCII; name=0001-Optimize-float-overflow-underflow-checks-v3.patchDownload

From 161384d51f517f1f4d9f403b46e90fbe1c869cbe Mon Sep 17 00:00:00 2001
From: Emre Hasegeli <emre@hasegeli.com>
Date: Fri, 7 Feb 2020 10:27:25 +0000
Subject: [PATCH] Optimize float overflow/underflow checks

The inline functions added by 6bf0bc842b caused the conditions of
overflow/underflow checks to be evaluated when no overflow/underflow
happen.  This slowed down floating point operations.  This commit
inlines the checks to prevent the performance regression.

This also moves the error messages further out of line.  All these
otherwise very small functions having their own ereports() was making
them much bigger.  Our low code density, and the resulting rate of ITLB
misses, is pretty significant cost.

And also this commit is changing the usage of unlikely() to cover
the whole condition.  Using it only for the result is not semantically
correct.  It is more than likely for the result to be infinite when
the input is, or it to be 0 when the input is.

Reported-by: Keisuke Kuroda <keisuke.kuroda.3862@gmail.com>
Discussion: https://postgr.es/m/CANDwggLe1Gc1OrRqvPfGE%3DkM9K0FSfia0hbeFCEmwabhLz95AA%40mail.gmail.com
---
 src/backend/utils/adt/float.c   | 146 ++++++++++++++++++++++----------
 src/backend/utils/adt/geo_ops.c |   5 +-
 src/include/utils/float.h       |  67 ++++++++-------
 3 files changed, 145 insertions(+), 73 deletions(-)

diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
index a90d4db215..30abe06b87 100644
--- a/src/backend/utils/adt/float.c
+++ b/src/backend/utils/adt/float.c
@@ -79,20 +79,35 @@ static void init_degree_constants(void);
  * function, which causes configure to not set HAVE_CBRT.  Furthermore,
  * their compilers spit up at the mismatch between extern declaration
  * and static definition.  We work around that here by the expedient
  * of a #define to make the actual name of the static function different.
  */
 #define cbrt my_cbrt
 static double cbrt(double x);
 #endif							/* HAVE_CBRT */
 
 
+void float_overflow_error()
+{
+	ereport(ERROR,
+			(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+			 errmsg("value out of range: overflow")));
+}
+
+void float_underflow_error()
+{
+	ereport(ERROR,
+			(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+			 errmsg("value out of range: underflow")));
+}
+
+
 /*
  * Returns -1 if 'val' represents negative infinity, 1 if 'val'
  * represents (positive) infinity, and 0 otherwise. On some platforms,
  * this is equivalent to the isinf() macro, but not everywhere: C99
  * does not specify that isinf() needs to distinguish between positive
  * and negative infinity.
  */
 int
 is_infinite(double val)
 {
@@ -1183,24 +1198,29 @@ ftod(PG_FUNCTION_ARGS)
 }
 
 
 /*
  *		dtof			- converts a float8 number to a float4 number
  */
 Datum
 dtof(PG_FUNCTION_ARGS)
 {
 	float8		num = PG_GETARG_FLOAT8(0);
+	float4		result;
 
-	check_float4_val((float4) num, isinf(num), num == 0);
+	result = (float4) num;
+	if (unlikely(isinf(result) && !isinf(num)))
+		float_overflow_error();
+	if (unlikely(result == 0.0f && num != 0.0f))
+		float_underflow_error();
 
-	PG_RETURN_FLOAT4((float4) num);
+	PG_RETURN_FLOAT4(result);
 }
 
 
 /*
  *		dtoi4			- converts a float8 number to an int4 number
  */
 Datum
 dtoi4(PG_FUNCTION_ARGS)
 {
 	float8		num = PG_GETARG_FLOAT8(0);
@@ -1437,37 +1457,44 @@ dsqrt(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	if (arg1 < 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
 				 errmsg("cannot take square root of a negative number")));
 
 	result = sqrt(arg1);
+	if (unlikely(isinf(result) && !isinf(arg1)))
+		float_overflow_error();
+	if (unlikely(result == 0.0 && arg1 != 0.0))
+		float_underflow_error();
 
-	check_float8_val(result, isinf(arg1), arg1 == 0);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcbrt			- returns cube root of arg1
  */
 Datum
 dcbrt(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	result = cbrt(arg1);
-	check_float8_val(result, isinf(arg1), arg1 == 0);
+	if (unlikely(isinf(result) && !isinf(arg1)))
+		float_overflow_error();
+	if (unlikely(result == 0.0 && arg1 != 0.0))
+		float_underflow_error();
+
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dpow			- returns pow(arg1,arg2)
  */
 Datum
 dpow(PG_FUNCTION_ARGS)
 {
@@ -1525,40 +1552,48 @@ dpow(PG_FUNCTION_ARGS)
 			/* The sign of Inf is not significant in this case. */
 			result = get_float8_infinity();
 		else if (fabs(arg1) != 1)
 			result = 0;
 		else
 			result = 1;
 	}
 	else if (errno == ERANGE && result != 0 && !isinf(result))
 		result = get_float8_infinity();
 
-	check_float8_val(result, isinf(arg1) || isinf(arg2), arg1 == 0);
+	if (unlikely(isinf(result) && !isinf(arg1) && !isinf(arg2)))
+		float_overflow_error();
+	if (unlikely(result == 0.0 && arg1 != 0.0))
+		float_underflow_error();
+
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dexp			- returns the exponential function of arg1
  */
 Datum
 dexp(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	errno = 0;
 	result = exp(arg1);
 	if (errno == ERANGE && result != 0 && !isinf(result))
 		result = get_float8_infinity();
 
-	check_float8_val(result, isinf(arg1), false);
+	if (unlikely(isinf(result) && !isinf(arg1)))
+		float_overflow_error();
+	if (unlikely(result == 0.0))
+		float_underflow_error();
+
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dlog1			- returns the natural logarithm of arg1
  */
 Datum
 dlog1(PG_FUNCTION_ARGS)
 {
@@ -1572,22 +1607,25 @@ dlog1(PG_FUNCTION_ARGS)
 	if (arg1 == 0.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_LOG),
 				 errmsg("cannot take logarithm of zero")));
 	if (arg1 < 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_LOG),
 				 errmsg("cannot take logarithm of a negative number")));
 
 	result = log(arg1);
+	if (unlikely(isinf(result) && !isinf(arg1)))
+		float_overflow_error();
+	if (unlikely(result == 0.0 && arg1 != 1.0))
+		float_underflow_error();
 
-	check_float8_val(result, isinf(arg1), arg1 == 1);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dlog10			- returns the base 10 logarithm of arg1
  */
 Datum
 dlog10(PG_FUNCTION_ARGS)
 {
@@ -1602,22 +1640,25 @@ dlog10(PG_FUNCTION_ARGS)
 	if (arg1 == 0.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_LOG),
 				 errmsg("cannot take logarithm of zero")));
 	if (arg1 < 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_LOG),
 				 errmsg("cannot take logarithm of a negative number")));
 
 	result = log10(arg1);
+	if (unlikely(isinf(result) && !isinf(arg1)))
+		float_overflow_error();
+	if (unlikely(result == 0.0 && arg1 != 1.0))
+		float_underflow_error();
 
-	check_float8_val(result, isinf(arg1), arg1 == 1);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dacos			- returns the arccos of arg1 (radians)
  */
 Datum
 dacos(PG_FUNCTION_ARGS)
 {
@@ -1632,22 +1673,23 @@ dacos(PG_FUNCTION_ARGS)
 	 * The principal branch of the inverse cosine function maps values in the
 	 * range [-1, 1] to values in the range [0, Pi], so we should reject any
 	 * inputs outside that range and the result will always be finite.
 	 */
 	if (arg1 < -1.0 || arg1 > 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	result = acos(arg1);
+	if (unlikely(isinf(result)))
+		float_overflow_error();
 
-	check_float8_val(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dasin			- returns the arcsin of arg1 (radians)
  */
 Datum
 dasin(PG_FUNCTION_ARGS)
 {
@@ -1662,22 +1704,23 @@ dasin(PG_FUNCTION_ARGS)
 	 * The principal branch of the inverse sine function maps values in the
 	 * range [-1, 1] to values in the range [-Pi/2, Pi/2], so we should reject
 	 * any inputs outside that range and the result will always be finite.
 	 */
 	if (arg1 < -1.0 || arg1 > 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	result = asin(arg1);
+	if (unlikely(isinf(result)))
+		float_overflow_error();
 
-	check_float8_val(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		datan			- returns the arctan of arg1 (radians)
  */
 Datum
 datan(PG_FUNCTION_ARGS)
 {
@@ -1687,22 +1730,23 @@ datan(PG_FUNCTION_ARGS)
 	/* Per the POSIX spec, return NaN if the input is NaN */
 	if (isnan(arg1))
 		PG_RETURN_FLOAT8(get_float8_nan());
 
 	/*
 	 * The principal branch of the inverse tangent function maps all inputs to
 	 * values in the range [-Pi/2, Pi/2], so the result should always be
 	 * finite, even if the input is infinite.
 	 */
 	result = atan(arg1);
+	if (unlikely(isinf(result)))
+		float_overflow_error();
 
-	check_float8_val(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		atan2			- returns the arctan of arg1/arg2 (radians)
  */
 Datum
 datan2(PG_FUNCTION_ARGS)
 {
@@ -1712,22 +1756,23 @@ datan2(PG_FUNCTION_ARGS)
 
 	/* Per the POSIX spec, return NaN if either input is NaN */
 	if (isnan(arg1) || isnan(arg2))
 		PG_RETURN_FLOAT8(get_float8_nan());
 
 	/*
 	 * atan2 maps all inputs to values in the range [-Pi, Pi], so the result
 	 * should always be finite, even if the inputs are infinite.
 	 */
 	result = atan2(arg1, arg2);
+	if (unlikely(isinf(result)))
+		float_overflow_error();
 
-	check_float8_val(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcos			- returns the cosine of arg1 (radians)
  */
 Datum
 dcos(PG_FUNCTION_ARGS)
 {
@@ -1752,22 +1797,23 @@ dcos(PG_FUNCTION_ARGS)
 	 * should return a domain error; but we won't notice that unless the
 	 * platform reports via errno, so also explicitly test for infinite
 	 * inputs.
 	 */
 	errno = 0;
 	result = cos(arg1);
 	if (errno != 0 || isinf(arg1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
+	if (unlikely(isinf(result)))
+		float_overflow_error();
 
-	check_float8_val(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcot			- returns the cotangent of arg1 (radians)
  */
 Datum
 dcot(PG_FUNCTION_ARGS)
 {
@@ -1780,21 +1826,22 @@ dcot(PG_FUNCTION_ARGS)
 
 	/* Be sure to throw an error if the input is infinite --- see dcos() */
 	errno = 0;
 	result = tan(arg1);
 	if (errno != 0 || isinf(arg1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	result = 1.0 / result;
-	check_float8_val(result, true /* cot(0) == Inf */ , true);
+	/* Not checking for overflow because cot(0) == Inf */
+
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dsin			- returns the sine of arg1 (radians)
  */
 Datum
 dsin(PG_FUNCTION_ARGS)
 {
@@ -1805,22 +1852,23 @@ dsin(PG_FUNCTION_ARGS)
 	if (isnan(arg1))
 		PG_RETURN_FLOAT8(get_float8_nan());
 
 	/* Be sure to throw an error if the input is infinite --- see dcos() */
 	errno = 0;
 	result = sin(arg1);
 	if (errno != 0 || isinf(arg1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
+	if (unlikely(isinf(result)))
+		float_overflow_error();
 
-	check_float8_val(result, false, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dtan			- returns the tangent of arg1 (radians)
  */
 Datum
 dtan(PG_FUNCTION_ARGS)
 {
@@ -1831,22 +1879,22 @@ dtan(PG_FUNCTION_ARGS)
 	if (isnan(arg1))
 		PG_RETURN_FLOAT8(get_float8_nan());
 
 	/* Be sure to throw an error if the input is infinite --- see dcos() */
 	errno = 0;
 	result = tan(arg1);
 	if (errno != 0 || isinf(arg1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
+	/* Not checking for overflow because tan(pi/2) == Inf */
 
-	check_float8_val(result, true /* tan(pi/2) == Inf */ , true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /* ========== DEGREE-BASED TRIGONOMETRIC FUNCTIONS ========== */
 
 
 /*
  * Initialize the cached constants declared at the head of this file
  * (sin_30 etc).  The fact that we need those at all, let alone need this
@@ -1984,21 +2032,23 @@ dacosd(PG_FUNCTION_ARGS)
 	if (arg1 < -1.0 || arg1 > 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	if (arg1 >= 0.0)
 		result = acosd_q1(arg1);
 	else
 		result = 90.0 + asind_q1(-arg1);
 
-	check_float8_val(result, false, true);
+	if (unlikely(isinf(result)))
+		float_overflow_error();
+
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dasind			- returns the arcsin of arg1 (degrees)
  */
 Datum
 dasind(PG_FUNCTION_ARGS)
 {
@@ -2019,21 +2069,23 @@ dasind(PG_FUNCTION_ARGS)
 	if (arg1 < -1.0 || arg1 > 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	if (arg1 >= 0.0)
 		result = asind_q1(arg1);
 	else
 		result = -asind_q1(-arg1);
 
-	check_float8_val(result, false, true);
+	if (unlikely(isinf(result)))
+		float_overflow_error();
+
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		datand			- returns the arctan of arg1 (degrees)
  */
 Datum
 datand(PG_FUNCTION_ARGS)
 {
@@ -2049,21 +2101,23 @@ datand(PG_FUNCTION_ARGS)
 
 	/*
 	 * The principal branch of the inverse tangent function maps all inputs to
 	 * values in the range [-90, 90], so the result should always be finite,
 	 * even if the input is infinite.  Additionally, we take care to ensure
 	 * than when arg1 is 1, the result is exactly 45.
 	 */
 	atan_arg1 = atan(arg1);
 	result = (atan_arg1 / atan_1_0) * 45.0;
 
-	check_float8_val(result, false, true);
+	if (unlikely(isinf(result)))
+		float_overflow_error();
+
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		atan2d			- returns the arctan of arg1/arg2 (degrees)
  */
 Datum
 datan2d(PG_FUNCTION_ARGS)
 {
@@ -2083,21 +2137,23 @@ datan2d(PG_FUNCTION_ARGS)
 	 * result should always be finite, even if the inputs are infinite.
 	 *
 	 * Note: this coding assumes that atan(1.0) is a suitable scaling constant
 	 * to get an exact result from atan2().  This might well fail on us at
 	 * some point, requiring us to decide exactly what inputs we think we're
 	 * going to guarantee an exact result for.
 	 */
 	atan2_arg1_arg2 = atan2(arg1, arg2);
 	result = (atan2_arg1_arg2 / atan_1_0) * 45.0;
 
-	check_float8_val(result, false, true);
+	if (unlikely(isinf(result)))
+		float_overflow_error();
+
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		sind_0_to_30	- returns the sine of an angle that lies between 0 and
  *						  30 degrees.  This will return exactly 0 when x is 0,
  *						  and exactly 0.5 when x is 30 degrees.
  */
 static double
@@ -2204,21 +2260,23 @@ dcosd(PG_FUNCTION_ARGS)
 
 	if (arg1 > 90.0)
 	{
 		/* cosd(180-x) = -cosd(x) */
 		arg1 = 180.0 - arg1;
 		sign = -sign;
 	}
 
 	result = sign * cosd_q1(arg1);
 
-	check_float8_val(result, false, true);
+	if (unlikely(isinf(result)))
+		float_overflow_error();
+
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcotd			- returns the cotangent of arg1 (degrees)
  */
 Datum
 dcotd(PG_FUNCTION_ARGS)
 {
@@ -2269,21 +2327,22 @@ dcotd(PG_FUNCTION_ARGS)
 	result = sign * (cot_arg1 / cot_45);
 
 	/*
 	 * On some machines we get cotd(270) = minus zero, but this isn't always
 	 * true.  For portability, and because the user constituency for this
 	 * function probably doesn't want minus zero, force it to plain zero.
 	 */
 	if (result == 0.0)
 		result = 0.0;
 
-	check_float8_val(result, true /* cotd(0) == Inf */ , true);
+	/* Not checking for overflow because cotd(0) == Inf */
+
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dsind			- returns the sine of arg1 (degrees)
  */
 Datum
 dsind(PG_FUNCTION_ARGS)
 {
@@ -2323,21 +2382,23 @@ dsind(PG_FUNCTION_ARGS)
 	}
 
 	if (arg1 > 90.0)
 	{
 		/* sind(180-x) = sind(x) */
 		arg1 = 180.0 - arg1;
 	}
 
 	result = sign * sind_q1(arg1);
 
-	check_float8_val(result, false, true);
+	if (unlikely(isinf(result)))
+		float_overflow_error();
+
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dtand			- returns the tangent of arg1 (degrees)
  */
 Datum
 dtand(PG_FUNCTION_ARGS)
 {
@@ -2388,21 +2449,22 @@ dtand(PG_FUNCTION_ARGS)
 	result = sign * (tan_arg1 / tan_45);
 
 	/*
 	 * On some machines we get tand(180) = minus zero, but this isn't always
 	 * true.  For portability, and because the user constituency for this
 	 * function probably doesn't want minus zero, force it to plain zero.
 	 */
 	if (result == 0.0)
 		result = 0.0;
 
-	check_float8_val(result, true /* tand(90) == Inf */ , true);
+	/* Not checking for overflow because tand(90) == Inf */
+
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		degrees		- returns degrees converted from radians
  */
 Datum
 degrees(PG_FUNCTION_ARGS)
 {
@@ -2455,21 +2517,20 @@ dsinh(PG_FUNCTION_ARGS)
 	 * sign of arg1.
 	 */
 	if (errno == ERANGE)
 	{
 		if (arg1 < 0)
 			result = -get_float8_infinity();
 		else
 			result = get_float8_infinity();
 	}
 
-	check_float8_val(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		dcosh			- returns the hyperbolic cosine of arg1
  */
 Datum
 dcosh(PG_FUNCTION_ARGS)
 {
@@ -2479,57 +2540,60 @@ dcosh(PG_FUNCTION_ARGS)
 	errno = 0;
 	result = cosh(arg1);
 
 	/*
 	 * if an ERANGE error occurs, it means there is an overflow.  As cosh is
 	 * always positive, it always means the result is positive infinity.
 	 */
 	if (errno == ERANGE)
 		result = get_float8_infinity();
 
-	check_float8_val(result, true, false);
+	if (unlikely(result == 0.0))
+		float_underflow_error();
+
 	PG_RETURN_FLOAT8(result);
 }
 
 /*
  *		dtanh			- returns the hyperbolic tangent of arg1
  */
 Datum
 dtanh(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	/*
 	 * For tanh, we don't need an errno check because it never overflows.
 	 */
 	result = tanh(arg1);
 
-	check_float8_val(result, false, true);
+	if (unlikely(isinf(result)))
+		float_overflow_error();
+
 	PG_RETURN_FLOAT8(result);
 }
 
 /*
  *		dasinh			- returns the inverse hyperbolic sine of arg1
  */
 Datum
 dasinh(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
 	float8		result;
 
 	/*
 	 * For asinh, we don't need an errno check because it never overflows.
 	 */
 	result = asinh(arg1);
 
-	check_float8_val(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 /*
  *		dacosh			- returns the inverse hyperbolic cosine of arg1
  */
 Datum
 dacosh(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
@@ -2541,21 +2605,20 @@ dacosh(PG_FUNCTION_ARGS)
 	 * thing because some implementations will report that for NaN. Otherwise,
 	 * no error is possible.
 	 */
 	if (arg1 < 1.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("input is out of range")));
 
 	result = acosh(arg1);
 
-	check_float8_val(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 /*
  *		datanh			- returns the inverse hyperbolic tangent of arg1
  */
 Datum
 datanh(PG_FUNCTION_ARGS)
 {
 	float8		arg1 = PG_GETARG_FLOAT8(0);
@@ -2576,21 +2639,20 @@ datanh(PG_FUNCTION_ARGS)
 	 * glibc versions may produce the wrong errno for this.  All other inputs
 	 * cannot produce an error.
 	 */
 	if (arg1 == -1.0)
 		result = -get_float8_infinity();
 	else if (arg1 == 1.0)
 		result = get_float8_infinity();
 	else
 		result = atanh(arg1);
 
-	check_float8_val(result, true, true);
 	PG_RETURN_FLOAT8(result);
 }
 
 
 /*
  *		drandom		- returns a random number
  */
 Datum
 drandom(PG_FUNCTION_ARGS)
 {
@@ -2777,21 +2839,22 @@ float8_combine(PG_FUNCTION_ARGS)
 		N = N1;
 		Sx = Sx1;
 		Sxx = Sxx1;
 	}
 	else
 	{
 		N = N1 + N2;
 		Sx = float8_pl(Sx1, Sx2);
 		tmp = Sx1 / N1 - Sx2 / N2;
 		Sxx = Sxx1 + Sxx2 + N1 * N2 * tmp * tmp / N;
-		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+		if (unlikely(isinf(Sxx) && !isinf(Sxx1) && !isinf(Sxx2)))
+			float_overflow_error();
 	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
 	 * parameter in-place to reduce palloc overhead. Otherwise we construct a
 	 * new array with the updated transition data and return it.
 	 */
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
 		transvalues1[0] = N;
@@ -2846,23 +2909,21 @@ float8_accum(PG_FUNCTION_ARGS)
 
 		/*
 		 * Overflow check.  We only report an overflow error when finite
 		 * inputs lead to infinite results.  Note also that Sxx should be NaN
 		 * if any of the inputs are infinite, so we intentionally prevent Sxx
 		 * from becoming infinite.
 		 */
 		if (isinf(Sx) || isinf(Sxx))
 		{
 			if (!isinf(transvalues[1]) && !isinf(newval))
-				ereport(ERROR,
-						(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-						 errmsg("value out of range: overflow")));
+				float_overflow_error();
 
 			Sxx = get_float8_nan();
 		}
 	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
 	 * parameter in-place to reduce palloc overhead. Otherwise we construct a
 	 * new array with the updated transition data and return it.
 	 */
@@ -2922,23 +2983,21 @@ float4_accum(PG_FUNCTION_ARGS)
 
 		/*
 		 * Overflow check.  We only report an overflow error when finite
 		 * inputs lead to infinite results.  Note also that Sxx should be NaN
 		 * if any of the inputs are infinite, so we intentionally prevent Sxx
 		 * from becoming infinite.
 		 */
 		if (isinf(Sx) || isinf(Sxx))
 		{
 			if (!isinf(transvalues[1]) && !isinf(newval))
-				ereport(ERROR,
-						(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-						 errmsg("value out of range: overflow")));
+				float_overflow_error();
 
 			Sxx = get_float8_nan();
 		}
 	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
 	 * parameter in-place to reduce palloc overhead. Otherwise we construct a
 	 * new array with the updated transition data and return it.
 	 */
@@ -3145,23 +3204,21 @@ float8_regr_accum(PG_FUNCTION_ARGS)
 		 */
 		if (isinf(Sx) || isinf(Sxx) || isinf(Sy) || isinf(Syy) || isinf(Sxy))
 		{
 			if (((isinf(Sx) || isinf(Sxx)) &&
 				 !isinf(transvalues[1]) && !isinf(newvalX)) ||
 				((isinf(Sy) || isinf(Syy)) &&
 				 !isinf(transvalues[3]) && !isinf(newvalY)) ||
 				(isinf(Sxy) &&
 				 !isinf(transvalues[1]) && !isinf(newvalX) &&
 				 !isinf(transvalues[3]) && !isinf(newvalY)))
-				ereport(ERROR,
-						(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-						 errmsg("value out of range: overflow")));
+				float_overflow_error();
 
 			if (isinf(Sxx))
 				Sxx = get_float8_nan();
 			if (isinf(Syy))
 				Syy = get_float8_nan();
 			if (isinf(Sxy))
 				Sxy = get_float8_nan();
 		}
 	}
 
@@ -3287,27 +3344,30 @@ float8_regr_combine(PG_FUNCTION_ARGS)
 		Sy = Sy1;
 		Syy = Syy1;
 		Sxy = Sxy1;
 	}
 	else
 	{
 		N = N1 + N2;
 		Sx = float8_pl(Sx1, Sx2);
 		tmp1 = Sx1 / N1 - Sx2 / N2;
 		Sxx = Sxx1 + Sxx2 + N1 * N2 * tmp1 * tmp1 / N;
-		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+		if (unlikely(isinf(Sxx) && !isinf(Sxx1) && !isinf(Sxx2)))
+			float_overflow_error();
 		Sy = float8_pl(Sy1, Sy2);
 		tmp2 = Sy1 / N1 - Sy2 / N2;
 		Syy = Syy1 + Syy2 + N1 * N2 * tmp2 * tmp2 / N;
-		check_float8_val(Syy, isinf(Syy1) || isinf(Syy2), true);
+		if (unlikely(isinf(Syy) && !isinf(Syy1) && !isinf(Syy2)))
+			float_overflow_error();
 		Sxy = Sxy1 + Sxy2 + N1 * N2 * tmp1 * tmp2 / N;
-		check_float8_val(Sxy, isinf(Sxy1) || isinf(Sxy2), true);
+		if (unlikely(isinf(Sxy) && !isinf(Sxy1) && !isinf(Sxy2)))
+			float_overflow_error();
 	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
 	 * parameter in-place to reduce palloc overhead. Otherwise we construct a
 	 * new array with the updated transition data and return it.
 	 */
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
 		transvalues1[0] = N;
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
index d5ded471c4..71ee7f8f08 100644
--- a/src/backend/utils/adt/geo_ops.c
+++ b/src/backend/utils/adt/geo_ops.c
@@ -5538,14 +5538,17 @@ pg_hypot(float8 x, float8 y)
 	 * such cases, but more importantly it also protects against
 	 * divide-by-zero errors, since now x >= y.
 	 */
 	if (y == 0.0)
 		return x;
 
 	/* Determine the hypotenuse */
 	yx = y / x;
 	result = x * sqrt(1.0 + (yx * yx));
 
-	check_float8_val(result, false, false);
+	if (unlikely(isinf(result)))
+		float_overflow_error();
+	if (unlikely(result == 0.0))
+		float_underflow_error();
 
 	return result;
 }
diff --git a/src/include/utils/float.h b/src/include/utils/float.h
index e2c5dc0f57..0018b3b5af 100644
--- a/src/include/utils/float.h
+++ b/src/include/utils/float.h
@@ -30,20 +30,22 @@
 static const uint32 nan[2] = {0xffffffff, 0x7fffffff};
 
 #define NAN (*(const float8 *) nan)
 #endif
 
 extern PGDLLIMPORT int extra_float_digits;
 
 /*
  * Utility functions in float.c
  */
+extern void	float_overflow_error(void);
+extern void	float_underflow_error(void);
 extern int	is_infinite(float8 val);
 extern float8 float8in_internal(char *num, char **endptr_p,
 								const char *type_name, const char *orig_string);
 extern float8 float8in_internal_opt_error(char *num, char **endptr_p,
 										  const char *type_name, const char *orig_string,
 										  bool *have_error);
 extern char *float8out_internal(float8 num);
 extern int	float4_cmp_internal(float4 a, float4 b);
 extern int	float8_cmp_internal(float8 a, float8 b);
 
@@ -123,158 +125,165 @@ get_float8_nan(void)
 	/* C99 standard way */
 	return (float8) NAN;
 #else
 	/* Assume we can get a NaN via zero divide */
 	return (float8) (0.0 / 0.0);
 #endif
 }
 
 /*
  * Checks to see if a float4/8 val has underflowed or overflowed
+ *
+ * They are no longer used because of the performance regression they cause,
+ * but left for backwards compatibility.
  */
 
 static inline void
 check_float4_val(const float4 val, const bool inf_is_valid,
 				 const bool zero_is_valid)
 {
-	if (!inf_is_valid && unlikely(isinf(val)))
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value out of range: overflow")));
-
-	if (!zero_is_valid && unlikely(val == 0.0))
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value out of range: underflow")));
+	if (unlikely(isinf(val) && !inf_is_valid))
+		float_overflow_error();
+	if (unlikely(val == 0.0f && !zero_is_valid))
+		float_underflow_error();
 }
 
 static inline void
 check_float8_val(const float8 val, const bool inf_is_valid,
 				 const bool zero_is_valid)
 {
-	if (!inf_is_valid && unlikely(isinf(val)))
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value out of range: overflow")));
-
-	if (!zero_is_valid && unlikely(val == 0.0))
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value out of range: underflow")));
+	if (unlikely(isinf(val) && !inf_is_valid))
+		float_overflow_error();
+	if (unlikely(val == 0.0 && !zero_is_valid))
+		float_underflow_error();
 }
 
 /*
- * Routines for operations with the checks above
+ * Routines for operations with overflow/underflow checks
  *
  * There isn't any way to check for underflow of addition/subtraction
  * because numbers near the underflow value have already been rounded to
  * the point where we can't detect that the two values were originally
  * different, e.g. on x86, '1e-45'::float4 == '2e-45'::float4 ==
  * 1.4013e-45.
  */
 
 static inline float4
 float4_pl(const float4 val1, const float4 val2)
 {
 	float4		result;
 
 	result = val1 + val2;
-	check_float4_val(result, isinf(val1) || isinf(val2), true);
+	if (unlikely(isinf(result) && !isinf(val1) && !isinf(val2)))
+		float_overflow_error();
 
 	return result;
 }
 
 static inline float8
 float8_pl(const float8 val1, const float8 val2)
 {
 	float8		result;
 
 	result = val1 + val2;
-	check_float8_val(result, isinf(val1) || isinf(val2), true);
+	if (unlikely(isinf(result) && !isinf(val1) && !isinf(val2)))
+		float_overflow_error();
 
 	return result;
 }
 
 static inline float4
 float4_mi(const float4 val1, const float4 val2)
 {
 	float4		result;
 
 	result = val1 - val2;
-	check_float4_val(result, isinf(val1) || isinf(val2), true);
+	if (unlikely(isinf(result) && !isinf(val1) && !isinf(val2)))
+		float_overflow_error();
 
 	return result;
 }
 
 static inline float8
 float8_mi(const float8 val1, const float8 val2)
 {
 	float8		result;
 
 	result = val1 - val2;
-	check_float8_val(result, isinf(val1) || isinf(val2), true);
+	if (unlikely(isinf(result) && !isinf(val1) && !isinf(val2)))
+		float_overflow_error();
 
 	return result;
 }
 
 static inline float4
 float4_mul(const float4 val1, const float4 val2)
 {
 	float4		result;
 
 	result = val1 * val2;
-	check_float4_val(result, isinf(val1) || isinf(val2),
-					 val1 == 0.0f || val2 == 0.0f);
+	if (unlikely(isinf(result) && !isinf(val1) && !isinf(val2)))
+		float_overflow_error();
+	if (unlikely(result == 0.0f && val1 != 0.0f && val2 != 0.0f))
+		float_underflow_error();
 
 	return result;
 }
 
 static inline float8
 float8_mul(const float8 val1, const float8 val2)
 {
 	float8		result;
 
 	result = val1 * val2;
-	check_float8_val(result, isinf(val1) || isinf(val2),
-					 val1 == 0.0 || val2 == 0.0);
+	if (unlikely(isinf(result) && !isinf(val1) && !isinf(val2)))
+		float_overflow_error();
+	if (unlikely(result == 0.0 && val1 != 0.0 && val2 != 0.0))
+		float_underflow_error();
 
 	return result;
 }
 
 static inline float4
 float4_div(const float4 val1, const float4 val2)
 {
 	float4		result;
 
 	if (val2 == 0.0f)
 		ereport(ERROR,
 				(errcode(ERRCODE_DIVISION_BY_ZERO),
 				 errmsg("division by zero")));
 
 	result = val1 / val2;
-	check_float4_val(result, isinf(val1) || isinf(val2), val1 == 0.0f);
+	if (unlikely(isinf(result) && !isinf(val1) && !isinf(val2)))
+		float_overflow_error();
+	if (unlikely(result == 0.0f && val1 != 0.0f))
+		float_underflow_error();
 
 	return result;
 }
 
 static inline float8
 float8_div(const float8 val1, const float8 val2)
 {
 	float8		result;
 
 	if (val2 == 0.0)
 		ereport(ERROR,
 				(errcode(ERRCODE_DIVISION_BY_ZERO),
 				 errmsg("division by zero")));
 
 	result = val1 / val2;
-	check_float8_val(result, isinf(val1) || isinf(val2), val1 == 0.0);
+	if (unlikely(isinf(result) && !isinf(val1) && !isinf(val2)))
+		float_overflow_error();
+	if (unlikely(result == 0.0 && val1 != 0.0))
+		float_underflow_error();
 
 	return result;
 }
 
 /*
  * Routines for NaN-aware comparisons
  *
  * We consider all NaNs to be equal and larger than any non-NaN. This is
  * somewhat arbitrary; the important thing is to have a consistent sort
  * order.
-- 
2.17.1

#29

Tom Lane

tgl@sss.pgh.pa.us

almost 6 years ago

In reply to: Emre Hasegeli (#28)

Re: In PG12, query with float calculations is slower than PG11

Emre Hasegeli <emre@hasegeli.com> writes:

Cool. Emre, any chance you could write a patch along those lines?

How about the one attached?

I see some minor things I don't like here, eg float_*flow_error()
need some documentation as to why they exist. But I'll review,
fix those things up and then push.

regards, tom lane

#30

Andres Freund

andres@anarazel.de

almost 6 years ago

In reply to: Tom Lane (#29)

Re: In PG12, query with float calculations is slower than PG11

Hi,

On February 13, 2020 8:30:45 AM PST, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Emre Hasegeli <emre@hasegeli.com> writes:

Cool. Emre, any chance you could write a patch along those lines?

How about the one attached?

I see some minor things I don't like here, eg float_*flow_error()
need some documentation as to why they exist. But I'll review,
fix those things up and then push.

Would be good to mark them noreturn too.

Wonder if it's useful to add the"cold" marker to pg. Not as part of this patch, but for functions like these.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

#31

Andres Freund

andres@anarazel.de

almost 6 years ago

In reply to: Emre Hasegeli (#28)

Re: In PG12, query with float calculations is slower than PG11

Hi,

On 2020-02-13 16:25:25 +0000, Emre Hasegeli wrote:

And also this commit is changing the usage of unlikely() to cover
the whole condition. Using it only for the result is not semantically
correct. It is more than likely for the result to be infinite when
the input is, or it to be 0 when the input is.

I'm not really convinced by this fwiw.

Comparing

if (unlikely(isinf(result) && !isinf(num)))
float_overflow_error();

with

if (unlikely(isinf(result)) && !isinf(num))
float_overflow_error();

I don't think it's clear that we want the former. What we want to
express is that it's unlikely that the result is infinite, and that the
compiler should optimize for that. Since there's a jump involved between
the check for isinf(result) and the one for !isinf(num), we want the
compiler to implement this so the non-overflow path follows the first
check, and the rest of the check is later.

+void float_overflow_error()
+{

Tom's probably on this, but it should be (void).

@@ -2846,23 +2909,21 @@ float8_accum(PG_FUNCTION_ARGS)

/*
* Overflow check.  We only report an overflow error when finite
* inputs lead to infinite results.  Note also that Sxx should be NaN
* if any of the inputs are infinite, so we intentionally prevent Sxx
* from becoming infinite.
*/
if (isinf(Sx) || isinf(Sxx))
{
if (!isinf(transvalues[1]) && !isinf(newval))
-				ereport(ERROR,
-						(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-						 errmsg("value out of range: overflow")));
+				float_overflow_error();

Sxx = get_float8_nan();
}
}

Probably worth unifiying the use of unlikely around isinf here and in
the follow functions.

Greetings,

Andres Freund

#32

Tom Lane

tgl@sss.pgh.pa.us

almost 6 years ago

In reply to: Andres Freund (#30)

Re: In PG12, query with float calculations is slower than PG11

Andres Freund <andres@anarazel.de> writes:

On February 13, 2020 8:30:45 AM PST, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I see some minor things I don't like here, eg float_*flow_error()
need some documentation as to why they exist. But I'll review,
fix those things up and then push.

Would be good to mark them noreturn too.

Yeah, that was one of the things I didn't like ;-). Also the lack
of pg_noinline.

Wonder if it's useful to add the"cold" marker to pg. Not as part of this patch, but for functions like these.

I'm only seeing about a 1.5kB reduction in the backend size from
this patch, which kinda surprises me, but it says that we're
not winning all that much from just having one copy of the ereport
calls. So I don't think that "cold" is going to add much.

regards, tom lane

#33

Tom Lane

tgl@sss.pgh.pa.us

almost 6 years ago

In reply to: Andres Freund (#31)

Re: In PG12, query with float calculations is slower than PG11

Andres Freund <andres@anarazel.de> writes:

On 2020-02-13 16:25:25 +0000, Emre Hasegeli wrote:

And also this commit is changing the usage of unlikely() to cover
the whole condition. Using it only for the result is not semantically
correct. It is more than likely for the result to be infinite when
the input is, or it to be 0 when the input is.

I'm not really convinced by this fwiw.

Comparing

if (unlikely(isinf(result) && !isinf(num)))
float_overflow_error();

with

if (unlikely(isinf(result)) && !isinf(num))
float_overflow_error();

I don't think it's clear that we want the former. What we want to
express is that it's unlikely that the result is infinite, and that the
compiler should optimize for that. Since there's a jump involved between
the check for isinf(result) and the one for !isinf(num), we want the
compiler to implement this so the non-overflow path follows the first
check, and the rest of the check is later.

Yeah, I was wondering about that. I'll change it as you suggest.

regards, tom lane

#34

Tom Lane

tgl@sss.pgh.pa.us

almost 6 years ago

In reply to: Tom Lane (#33)

Re: In PG12, query with float calculations is slower than PG11

... and pushed. One other change I made beyond those suggested
was to push the zero-divide ereport's out-of-line as well.

I did not do anything about adding unlikely() calls around the
unrelated isinf tests in float.c. That seemed to me to be a separate
matter, and I'm not quite convinced it'd be a win anyway.

regards, tom lane

#35

Andres Freund

andres@anarazel.de

almost 6 years ago

In reply to: Tom Lane (#34)

Re: In PG12, query with float calculations is slower than PG11

Hi,

On 2020-02-13 13:40:43 -0500, Tom Lane wrote:

... and pushed. One other change I made beyond those suggested
was to push the zero-divide ereport's out-of-line as well.

Thanks!

I did not do anything about adding unlikely() calls around the
unrelated isinf tests in float.c. That seemed to me to be a separate
matter, and I'm not quite convinced it'd be a win anyway.

I was mostly going for consistency...

Greetings,

Andres Freund

#36

Amit Langote

amitlangote09@gmail.com

almost 6 years ago

In reply to: Andres Freund (#35)

Re: In PG12, query with float calculations is slower than PG11

On Fri, Feb 14, 2020 at 3:47 AM Andres Freund <andres@anarazel.de> wrote:

On 2020-02-13 13:40:43 -0500, Tom Lane wrote:

... and pushed. One other change I made beyond those suggested
was to push the zero-divide ereport's out-of-line as well.

Thanks!

Thank you all.

I repeated some of the tests I did earlier and things look good.

gcc-8
=====

HEAD

latency average = 296.842 ms

42.05% postgres postgres [.] ExecInterpExpr
15.14% postgres postgres [.] float8_accum
9.32% postgres libc-2.17.so [.] __isinf
7.32% postgres postgres [.] dsqrt
5.67% postgres postgres [.] float8mul
4.20% postgres postgres [.] ftod

11.7

latency average = 289.439 ms

41.52% postgres postgres [.] ExecInterpExpr
13.59% postgres libc-2.17.so [.] __isinf
10.98% postgres postgres [.] float8_accum
8.26% postgres postgres [.] dsqrt
6.17% postgres postgres [.] float8mul
3.65% postgres postgres [.] ftod

clang-7
=======

HEAD

latency average = 233.735 ms

43.84% postgres postgres [.] ExecInterpExpr
15.17% postgres postgres [.] float8_accum
8.25% postgres postgres [.] dsqrt
7.35% postgres postgres [.] float8mul
5.84% postgres postgres [.] ftod
3.78% postgres postgres [.] tts_buffer_heap_getsomeattrs

11.7

latency average = 221.009 ms

49.55% postgres postgres [.] ExecInterpExpr
12.05% postgres postgres [.] float8_accum
8.97% postgres postgres [.] dsqrt
6.72% postgres postgres [.] float8mul
5.62% postgres postgres [.] ftod
2.18% postgres postgres [.] slot_deform_tuple

HEAD and PG 11 are now comparable even when built with gcc.

Regards,
Amit

#37

keisuke kuroda

keisuke.kuroda.3862@gmail.com

almost 6 years ago

In reply to: Amit Langote (#36)

Re: In PG12, query with float calculations is slower than PG11

Thank you very much everyone.

Improvement was confirmed even if PG12_STABLE was built with gcc 4.8.5.

* PG_12_STABLE
* gcc 4.8.5

postgres=# EXPLAIN (ANALYZE on, VERBOSE on, BUFFERS on)
select (2 * a) , (2 * b) , (2 * c), (2 * d), (2 * e)
from realtest;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.realtest (cost=0.00..288692.14 rows=9999873 width=40)
(actual time=0.012..4118.432 rows=10000001 loops=1)
Output: ('2'::double precision * a), ('2'::double precision * b),
('2'::double precision * c), ('2'::double precision * d), ('2'::double
precision * e)
Buffers: shared hit=63695
Planning Time: 0.034 ms
Execution Time: 4811.957 ms
(5 rows)

32.03% postgres postgres [.] ExecInterpExpr
12.28% postgres postgres [.] float84mul
9.62% postgres [vdso] [.] __vdso_clock_gettime
6.45% postgres libc-2.17.so [.] __isinf
5.15% postgres postgres [.] tts_buffer_heap_getsomeattrs
3.83% postgres postgres [.] ExecScan

Best Regards,
Keisuke Kuroda

2020年2月14日(金) 13:29 Amit Langote <amitlangote09@gmail.com>:

Show quoted text

On Fri, Feb 14, 2020 at 3:47 AM Andres Freund <andres@anarazel.de> wrote:

On 2020-02-13 13:40:43 -0500, Tom Lane wrote:

... and pushed. One other change I made beyond those suggested
was to push the zero-divide ereport's out-of-line as well.

Thanks!

Thank you all.

I repeated some of the tests I did earlier and things look good.

gcc-8
=====

HEAD

latency average = 296.842 ms

42.05% postgres postgres [.] ExecInterpExpr
15.14% postgres postgres [.] float8_accum
9.32% postgres libc-2.17.so [.] __isinf
7.32% postgres postgres [.] dsqrt
5.67% postgres postgres [.] float8mul
4.20% postgres postgres [.] ftod

11.7

latency average = 289.439 ms

41.52% postgres postgres [.] ExecInterpExpr
13.59% postgres libc-2.17.so [.] __isinf
10.98% postgres postgres [.] float8_accum
8.26% postgres postgres [.] dsqrt
6.17% postgres postgres [.] float8mul
3.65% postgres postgres [.] ftod

clang-7
=======

HEAD

latency average = 233.735 ms

43.84% postgres postgres [.] ExecInterpExpr
15.17% postgres postgres [.] float8_accum
8.25% postgres postgres [.] dsqrt
7.35% postgres postgres [.] float8mul
5.84% postgres postgres [.] ftod
3.78% postgres postgres [.] tts_buffer_heap_getsomeattrs

11.7

latency average = 221.009 ms

49.55% postgres postgres [.] ExecInterpExpr
12.05% postgres postgres [.] float8_accum
8.97% postgres postgres [.] dsqrt
6.72% postgres postgres [.] float8mul
5.62% postgres postgres [.] ftod
2.18% postgres postgres [.] slot_deform_tuple

HEAD and PG 11 are now comparable even when built with gcc.

Regards,
Amit