Incorrect results from numeric round() and trunc()

Started by Dean Rasheedover 1 year ago6 messages
#1Dean Rasheed
dean.a.rasheed@gmail.com
1 attachment(s)

The numeric round() and trunc() functions clamp the scale argument to
the range between +/- NUMERIC_MAX_RESULT_SCALE, which is +/- 2000.
That's a long way short of the actual allowed range of type numeric,
so they produce incorrect results when rounding/truncating more than
2000 digits before or after the decimal point. For example,
round(1e-5000, 5000) returns 0 instead of 1e-5000.

Attached is a patch fixing that, using the actual documented range of
type numeric.

I've also tidied up a bit by replacing all instances of SHRT_MAX with
a new constant NUMERIC_WEIGHT_MAX, whose name more accurately
describes the limit, as used in various other overflow checks.

In doing so, I also noticed a comment in power_var() which claimed
that ln_dweight could be as low as -SHRT_MAX (-32767), which is wrong.
It can only be as small as -NUMERIC_DSCALE_MAX (-16383), though that
doesn't affect the point being made in that comment.

I'd like to treat this as a bug-fix and back-patch it, since the
current behaviour is clearly broken.

Regards,
Dean

Attachments:

fix-numeric-round-and-trunc-limits.patchtext/x-patch; charset=US-ASCII; name=fix-numeric-round-and-trunc-limits.patchDownload
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
new file mode 100644
index 5510a20..57386aa
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -249,6 +249,13 @@ struct NumericData
 	 | ((n)->choice.n_short.n_header & NUMERIC_SHORT_WEIGHT_MASK)) \
 	: ((n)->choice.n_long.n_weight))
 
+/*
+ * Maximum weight of a stored Numeric value (based on the use of int16 for the
+ * weight in NumericLong).  Note that intermediate values held in NumericVar
+ * and NumericSumAccum variables may have much larger weights.
+ */
+#define NUMERIC_WEIGHT_MAX			PG_INT16_MAX
+
 /* ----------
  * NumericVar is the format we use for arithmetic.  The digit-array part
  * is the same as the NumericData storage format, but the header is more
@@ -1545,10 +1552,15 @@ numeric_round(PG_FUNCTION_ARGS)
 		PG_RETURN_NUMERIC(duplicate_numeric(num));
 
 	/*
-	 * Limit the scale value to avoid possible overflow in calculations
+	 * Limit the scale value to avoid possible overflow in calculations.
+	 *
+	 * These limits are based on the maximum number of digits a Numeric value
+	 * can have before and after the decimal point, but we must allow for one
+	 * extra digit before the decimal point, in case the most significant
+	 * digit rounds up; we must check if that causes Numeric overflow.
 	 */
-	scale = Max(scale, -NUMERIC_MAX_RESULT_SCALE);
-	scale = Min(scale, NUMERIC_MAX_RESULT_SCALE);
+	scale = Max(scale, -(NUMERIC_WEIGHT_MAX + 1) * DEC_DIGITS - 1);
+	scale = Min(scale, NUMERIC_DSCALE_MAX);
 
 	/*
 	 * Unpack the argument and round it at the proper digit position
@@ -1594,10 +1606,13 @@ numeric_trunc(PG_FUNCTION_ARGS)
 		PG_RETURN_NUMERIC(duplicate_numeric(num));
 
 	/*
-	 * Limit the scale value to avoid possible overflow in calculations
+	 * Limit the scale value to avoid possible overflow in calculations.
+	 *
+	 * These limits are based on the maximum number of digits a Numeric value
+	 * can have before and after the decimal point.
 	 */
-	scale = Max(scale, -NUMERIC_MAX_RESULT_SCALE);
-	scale = Min(scale, NUMERIC_MAX_RESULT_SCALE);
+	scale = Max(scale, -(NUMERIC_WEIGHT_MAX + 1) * DEC_DIGITS);
+	scale = Min(scale, NUMERIC_DSCALE_MAX);
 
 	/*
 	 * Unpack the argument and truncate it at the proper digit position
@@ -7276,7 +7291,7 @@ set_var_from_non_decimal_integer_str(con
 					add_var(dest, &tmp_var, dest);
 
 					/* Result will overflow if weight overflows int16 */
-					if (dest->weight > SHRT_MAX)
+					if (dest->weight > NUMERIC_WEIGHT_MAX)
 						goto out_of_range;
 
 					/* Begin a new group */
@@ -7313,7 +7328,7 @@ set_var_from_non_decimal_integer_str(con
 					add_var(dest, &tmp_var, dest);
 
 					/* Result will overflow if weight overflows int16 */
-					if (dest->weight > SHRT_MAX)
+					if (dest->weight > NUMERIC_WEIGHT_MAX)
 						goto out_of_range;
 
 					/* Begin a new group */
@@ -7350,7 +7365,7 @@ set_var_from_non_decimal_integer_str(con
 					add_var(dest, &tmp_var, dest);
 
 					/* Result will overflow if weight overflows int16 */
-					if (dest->weight > SHRT_MAX)
+					if (dest->weight > NUMERIC_WEIGHT_MAX)
 						goto out_of_range;
 
 					/* Begin a new group */
@@ -7386,7 +7401,7 @@ set_var_from_non_decimal_integer_str(con
 	int64_to_numericvar(tmp, &tmp_var);
 	add_var(dest, &tmp_var, dest);
 
-	if (dest->weight > SHRT_MAX)
+	if (dest->weight > NUMERIC_WEIGHT_MAX)
 		goto out_of_range;
 
 	dest->sign = sign;
@@ -11025,7 +11040,8 @@ power_var(const NumericVar *base, const
 	/*
 	 * Set the scale for the low-precision calculation, computing ln(base) to
 	 * around 8 significant digits.  Note that ln_dweight may be as small as
-	 * -SHRT_MAX, so the scale may exceed NUMERIC_MAX_DISPLAY_SCALE here.
+	 * -NUMERIC_DSCALE_MAX, so the scale may exceed NUMERIC_MAX_DISPLAY_SCALE
+	 * here.
 	 */
 	local_rscale = 8 - ln_dweight;
 	local_rscale = Max(local_rscale, NUMERIC_MIN_DISPLAY_SCALE);
@@ -11133,7 +11149,7 @@ power_var_int(const NumericVar *base, in
 		f = 0;					/* result is 0 or 1 (weight 0), or error */
 
 	/* overflow/underflow tests with fuzz factors */
-	if (f > (SHRT_MAX + 1) * DEC_DIGITS)
+	if (f > (NUMERIC_WEIGHT_MAX + 1) * DEC_DIGITS)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("value overflows numeric format")));
@@ -11264,7 +11280,8 @@ power_var_int(const NumericVar *base, in
 		 * int16, the final result is guaranteed to overflow (or underflow, if
 		 * exp < 0), so we can give up before wasting too many cycles.
 		 */
-		if (base_prod.weight > SHRT_MAX || result->weight > SHRT_MAX)
+		if (base_prod.weight > NUMERIC_WEIGHT_MAX ||
+			result->weight > NUMERIC_WEIGHT_MAX)
 		{
 			/* overflow, unless neg, in which case result should be 0 */
 			if (!neg)
diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out
new file mode 100644
index 72f03c8..f30ac23
--- a/src/test/regress/expected/numeric.out
+++ b/src/test/regress/expected/numeric.out
@@ -1346,6 +1346,108 @@ FROM generate_series(-5,5) AS t(i);
    5 |  -300000 |  -200000 |  -100000 |  100000 |  200000 |  300000
 (11 rows)
 
+-- Check limits of rounding before the decimal point
+SELECT round(4.4e131071, -131071) = 4e131071;
+ ?column? 
+----------
+ t
+(1 row)
+
+SELECT round(4.5e131071, -131071) = 5e131071;
+ ?column? 
+----------
+ t
+(1 row)
+
+SELECT round(4.5e131071, -131072); -- loses all digits
+ round 
+-------
+     0
+(1 row)
+
+SELECT round(5.5e131071, -131072); -- rounds up and overflows
+ERROR:  value overflows numeric format
+SELECT round(5.5e131071, -131073); -- loses all digits
+ round 
+-------
+     0
+(1 row)
+
+SELECT round(5.5e131071, -1000000); -- loses all digits
+ round 
+-------
+     0
+(1 row)
+
+-- Check limits of rounding after the decimal point
+SELECT round(5e-16383, 1000000) = 5e-16383;
+ ?column? 
+----------
+ t
+(1 row)
+
+SELECT round(5e-16383, 16383) = 5e-16383;
+ ?column? 
+----------
+ t
+(1 row)
+
+SELECT round(5e-16383, 16382) = 1e-16382;
+ ?column? 
+----------
+ t
+(1 row)
+
+SELECT round(5e-16383, 16381) = 0;
+ ?column? 
+----------
+ t
+(1 row)
+
+-- Check limits of trunc() before the decimal point
+SELECT trunc(9.9e131071, -131071) = 9e131071;
+ ?column? 
+----------
+ t
+(1 row)
+
+SELECT trunc(9.9e131071, -131072); -- loses all digits
+ trunc 
+-------
+     0
+(1 row)
+
+SELECT trunc(9.9e131071, -131073);  -- loses all digits
+ trunc 
+-------
+     0
+(1 row)
+
+SELECT trunc(9.9e131071, -1000000);  -- loses all digits
+ trunc 
+-------
+     0
+(1 row)
+
+-- Check limits of trunc() after the decimal point
+SELECT trunc(5e-16383, 1000000) = 5e-16383;
+ ?column? 
+----------
+ t
+(1 row)
+
+SELECT trunc(5e-16383, 16383) = 5e-16383;
+ ?column? 
+----------
+ t
+(1 row)
+
+SELECT trunc(5e-16383, 16382) = 0;
+ ?column? 
+----------
+ t
+(1 row)
+
 -- Testing for width_bucket(). For convenience, we test both the
 -- numeric and float8 versions of the function in this file.
 -- errors
diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql
new file mode 100644
index 83fc386..c863952
--- a/src/test/regress/sql/numeric.sql
+++ b/src/test/regress/sql/numeric.sql
@@ -833,6 +833,31 @@ SELECT i as pow,
 	round((2.5 * 10 ^ i)::numeric, -i)
 FROM generate_series(-5,5) AS t(i);
 
+-- Check limits of rounding before the decimal point
+SELECT round(4.4e131071, -131071) = 4e131071;
+SELECT round(4.5e131071, -131071) = 5e131071;
+SELECT round(4.5e131071, -131072); -- loses all digits
+SELECT round(5.5e131071, -131072); -- rounds up and overflows
+SELECT round(5.5e131071, -131073); -- loses all digits
+SELECT round(5.5e131071, -1000000); -- loses all digits
+
+-- Check limits of rounding after the decimal point
+SELECT round(5e-16383, 1000000) = 5e-16383;
+SELECT round(5e-16383, 16383) = 5e-16383;
+SELECT round(5e-16383, 16382) = 1e-16382;
+SELECT round(5e-16383, 16381) = 0;
+
+-- Check limits of trunc() before the decimal point
+SELECT trunc(9.9e131071, -131071) = 9e131071;
+SELECT trunc(9.9e131071, -131072); -- loses all digits
+SELECT trunc(9.9e131071, -131073);  -- loses all digits
+SELECT trunc(9.9e131071, -1000000);  -- loses all digits
+
+-- Check limits of trunc() after the decimal point
+SELECT trunc(5e-16383, 1000000) = 5e-16383;
+SELECT trunc(5e-16383, 16383) = 5e-16383;
+SELECT trunc(5e-16383, 16382) = 0;
+
 -- Testing for width_bucket(). For convenience, we test both the
 -- numeric and float8 versions of the function in this file.
 
#2Joel Jacobson
joel@compiler.org
In reply to: Dean Rasheed (#1)
Re: Incorrect results from numeric round() and trunc()

On Sun, Jul 7, 2024, at 13:28, Dean Rasheed wrote:

The numeric round() and trunc() functions clamp the scale argument to
the range between +/- NUMERIC_MAX_RESULT_SCALE, which is +/- 2000.
That's a long way short of the actual allowed range of type numeric,
so they produce incorrect results when rounding/truncating more than
2000 digits before or after the decimal point. For example,
round(1e-5000, 5000) returns 0 instead of 1e-5000.

Attached is a patch fixing that, using the actual documented range of
type numeric.

I've also tidied up a bit by replacing all instances of SHRT_MAX with
a new constant NUMERIC_WEIGHT_MAX, whose name more accurately
describes the limit, as used in various other overflow checks.

In doing so, I also noticed a comment in power_var() which claimed
that ln_dweight could be as low as -SHRT_MAX (-32767), which is wrong.
It can only be as small as -NUMERIC_DSCALE_MAX (-16383), though that
doesn't affect the point being made in that comment.

I'd like to treat this as a bug-fix and back-patch it, since the
current behaviour is clearly broken.

Fix seems straightforward to me.
I agree it should be back-patched.

Regards,
Joel

#3Joel Jacobson
joel@compiler.org
In reply to: Dean Rasheed (#1)
Re: Incorrect results from numeric round() and trunc()

On Sun, Jul 7, 2024, at 13:28, Dean Rasheed wrote:

I've also tidied up a bit by replacing all instances of SHRT_MAX with
a new constant NUMERIC_WEIGHT_MAX, whose name more accurately
describes the limit, as used in various other overflow checks.

Having thought a bit more on this, I think we probably need a
DEC_DIGITS sensitive definition of NUMERIC_WEIGHT_MAX,
since per spec the max range for numeric is 0x20000 (131072)
decimal digits.

Therefore, I think perhaps what we want is:

+#define NUMERIC_DSCALE_MIN                     0
+#define NUMERIC_WEIGHT_MAX                     ((0x20000/DEC_DIGITS)-1)
+#define NUMERIC_WEIGHT_MIN                     (-(NUMERIC_DSCALE_MAX+1)/DEC_DIGITS)

Maybe also 0x20000 (131072) should be a defined constant.

Regards,
Joel

#4Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Joel Jacobson (#3)
Re: Incorrect results from numeric round() and trunc()

On Mon, 8 Jul 2024 at 00:40, Joel Jacobson <joel@compiler.org> wrote:

On Sun, Jul 7, 2024, at 13:28, Dean Rasheed wrote:

I've also tidied up a bit by replacing all instances of SHRT_MAX with
a new constant NUMERIC_WEIGHT_MAX, whose name more accurately
describes the limit, as used in various other overflow checks.

Having thought a bit more on this, I think we probably need a
DEC_DIGITS sensitive definition of NUMERIC_WEIGHT_MAX,
since per spec the max range for numeric is 0x20000 (131072)
decimal digits.

No, the maximum weight is determined by the use of int16 to store the
weight. Therefore if you did reduce DEC_DIGITS to 1 or 2, the number
of decimal digits allowed before the decimal point would be reduced
too.

Regards,
Dean

#5Joel Jacobson
joel@compiler.org
In reply to: Dean Rasheed (#4)
Re: Incorrect results from numeric round() and trunc()

On Mon, Jul 8, 2024, at 11:45, Dean Rasheed wrote:

On Mon, 8 Jul 2024 at 00:40, Joel Jacobson <joel@compiler.org> wrote:

On Sun, Jul 7, 2024, at 13:28, Dean Rasheed wrote:

I've also tidied up a bit by replacing all instances of SHRT_MAX with
a new constant NUMERIC_WEIGHT_MAX, whose name more accurately
describes the limit, as used in various other overflow checks.

Having thought a bit more on this, I think we probably need a
DEC_DIGITS sensitive definition of NUMERIC_WEIGHT_MAX,
since per spec the max range for numeric is 0x20000 (131072)
decimal digits.

No, the maximum weight is determined by the use of int16 to store the
weight. Therefore if you did reduce DEC_DIGITS to 1 or 2, the number
of decimal digits allowed before the decimal point would be reduced
too.

OK, that can actually be seen as a feature, especially since it's
of course more likely DEC_DIGITS could increase in the future
than decrease.

For example, let's say we would double it to 8,
then if NUMERIC_WEIGHT_MAX would still be 0x7FFF (32767),
then the maximum range for numeric would increase from 131072 to 262144
decimal digits allowed before the decimal point.

LGTM.

Regards,
Joel

#6Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Joel Jacobson (#5)
Re: Incorrect results from numeric round() and trunc()

On Mon, 8 Jul 2024 at 11:08, Joel Jacobson <joel@compiler.org> wrote:

LGTM.

Thanks for the review. I have pushed and back-patched this.

Regards,
Dean