Precision loss casting float to numeric

Started by Chapman Flackalmost 8 years ago11 messages

chap@anastigmatix.net

almost 8 years ago

Back in

/messages/by-id/4e384467-f28a-69ce-
75aa-4bc01125a39d%40anastigmatix.net

I got intrigued about casting float values to numeric. Two queries below
(one for float4, one for float8) show what happens for float values with
bits of precision from one up to the limit of the data type. Each value
generated has two 1 bits in the mantissa, one always at 2^0 and the other
at 2^(-p) for increasing p; in other words, each value is the sum of 1 and
2^(-p), and the last such value that compares unequal to 1 is the value
1 plus (FLT_EPSILON or DBL_EPSILON, respectively, for the machine). The
next value generated that way would be indistinguishable from 1.

I'm testing on Intel, IEEE-754 hardware, with 24 bits of precision for
float4 and 53 bits for float8.

TL;DR: casting these values to numeric loses their distinguishability
from 1, six bits early for float4 and five bits early for float8. Casting
to numeric and back again is even worse: you get only six of your 24 bits
back for float4, only 15 of 53 for float8.

Everyone expects the floats to be approximate types and that they
won't be able to exactly represent arbitrary values. However, that's
not how numeric is advertised, and the idea that a numeric can't safely
keep the precision of a float4 or float8, and give it back when you ask,
at best violates least astonishment and arguably is a bug.

I see where at least the first problem (5 or 6 bits lost casting to
numeric) comes in:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/adt/numeric.c;h=6f400729713bfd942cc196b81d50bf25e4757315;hb=c4ba1bee68abe217e441fb81343e5f9e9e2a5353#l3298

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/adt/numeric.c;h=6f400729713bfd942cc196b81d50bf25e4757315;hb=c4ba1bee68abe217e441fb81343e5f9e9e2a5353#l3227

These conversions work by sprintf of the float value to its text
representation with FLT_DIG or DBL_DIG decimal digits of precision.

The mistake is subtle and understandable, because even the C standards
didn't catch it until recently, when they introduced the new constants
FLT_DECIMAL_DIG and DBL_DECIMAL_DIG, because they realized the constant
you need depends on what you're doing.

The old FLT_DIG and DBL_DIG tell you how many digits of a decimal
number you can count on recovering intact after conversion to float
or double and back to decimal. That's inside-out from what's needed
here, where the constant of interest is the number of decimal digits
necessary to reliably represent any float or double so you can get
those original bits back, and that's what the FLT_DECIMAL_DIG and
DBL_DECIMAL_DIG constants are, and they're larger (by 3 and 2 decimal
digits, respectively, on IEEE-754 hardware) than their FLT_DIG and
DBL_DIG counterparts.

So, a trivial fix for float4_numeric and float8_numeric would be to
change the constant used in the sprintf, and that would (I think) solve
at least the first precision-loss problem. But it would only compile
where the compiler is new enough to define the newer constants, and
my preferred fix is to just open-code the conversion using frexp
rather than going through the text representation at all. I'm working
on that patch.

The second problem (losing even more bits in the roundtrip to numeric
and back) suggests that in the other direction, numeric_float4 and
numeric_float8 need some love too, but right now I'm focused on the
to-numeric direction).

-Chap

Columns:
place - the place value of the 1 bit on the right; in the last row,
this is the machine epsilon for the type.

float4gt - whether 1+place is distinguishable from 1 using
float8gt all float4/float8 operations; true for every row.

numericgt - whether 1+place is still distinguishable from 1 after
casting to numeric.

rtgt - whether the roundtrip, 1+place cast to numeric and back,
is still distinguishable from 1. Ends up same as numericgt
(on my platform anyway).

rteq - whether the roundtrip, 1+place cast to numeric and back,
equals the original 1+place. Starts failing quite early.

WITH RECURSIVE
f4(place) AS (
VALUES (1::float4)
UNION
SELECT place/2::float4
FROM f4
WHERE 1::float4 + place/2::float4 > 1::float4
)
SELECT
place,
1::float4 + place > 1::float4 AS float4gt,
(1::float4 + place)::numeric > 1::numeric AS numericgt,
(1::float4 + place)::numeric::float4 > 1::float4 AS rtgt,
(1::float4 + place)::numeric::float4 = 1::float4 + place as rteq
FROM
f4;

place | float4gt | numericgt | rtgt | rteq
-------------+----------+-----------+------+------
1 | t | t | t | t
0.5 | t | t | t | t
0.25 | t | t | t | t
0.125 | t | t | t | t
0.0625 | t | t | t | t
0.03125 | t | t | t | t
0.015625 | t | t | t | f
0.0078125 | t | t | t | f
0.00390625 | t | t | t | f
0.00195312 | t | t | t | f
0.000976562 | t | t | t | f
0.000488281 | t | t | t | f
0.000244141 | t | t | t | f
0.00012207 | t | t | t | f
6.10352e-05 | t | t | t | f
3.05176e-05 | t | t | t | f
1.52588e-05 | t | t | t | f
7.62939e-06 | t | t | t | f
3.8147e-06 | t | f | f | f
1.90735e-06 | t | f | f | f
9.53674e-07 | t | f | f | f
4.76837e-07 | t | f | f | f
2.38419e-07 | t | f | f | f
1.19209e-07 | t | f | f | f
(24 rows)

WITH RECURSIVE
f8(place) AS (
VALUES (1::float8)
UNION
SELECT place/2::float8
FROM f8
WHERE 1::float8 + place/2::float8 > 1::float8
)
SELECT
place,
1::float8 + place > 1::float8 AS float8gt,
(1::float8 + place)::numeric > 1::numeric AS numericgt,
(1::float8 + place)::numeric::float8 > 1::float8 AS rtgt,
(1::float8 + place)::numeric::float8 = 1::float8 + place as rteq
FROM
f8;

place | float8gt | numericgt | rtgt | rteq
----------------------+----------+-----------+------+------
1 | t | t | t | t
0.5 | t | t | t | t
0.25 | t | t | t | t
0.125 | t | t | t | t
0.0625 | t | t | t | t
0.03125 | t | t | t | t
0.015625 | t | t | t | t
0.0078125 | t | t | t | t
0.00390625 | t | t | t | t
0.001953125 | t | t | t | t
0.0009765625 | t | t | t | t
0.00048828125 | t | t | t | t
0.000244140625 | t | t | t | t
0.0001220703125 | t | t | t | t
6.103515625e-05 | t | t | t | t
3.0517578125e-05 | t | t | t | f
1.52587890625e-05 | t | t | t | f
7.62939453125e-06 | t | t | t | f
3.814697265625e-06 | t | t | t | f
1.9073486328125e-06 | t | t | t | f
9.5367431640625e-07 | t | t | t | f
4.76837158203125e-07 | t | t | t | f
2.38418579101562e-07 | t | t | t | f
1.19209289550781e-07 | t | t | t | f
5.96046447753906e-08 | t | t | t | f
2.98023223876953e-08 | t | t | t | f
1.49011611938477e-08 | t | t | t | f
7.45058059692383e-09 | t | t | t | f
3.72529029846191e-09 | t | t | t | f
1.86264514923096e-09 | t | t | t | f
9.31322574615479e-10 | t | t | t | f
4.65661287307739e-10 | t | t | t | f
2.3283064365387e-10 | t | t | t | f
1.16415321826935e-10 | t | t | t | f
5.82076609134674e-11 | t | t | t | f
2.91038304567337e-11 | t | t | t | f
1.45519152283669e-11 | t | t | t | f
7.27595761418343e-12 | t | t | t | f
3.63797880709171e-12 | t | t | t | f
1.81898940354586e-12 | t | t | t | f
9.09494701772928e-13 | t | t | t | f
4.54747350886464e-13 | t | t | t | f
2.27373675443232e-13 | t | t | t | f
1.13686837721616e-13 | t | t | t | f
5.6843418860808e-14 | t | t | t | f
2.8421709430404e-14 | t | t | t | f
1.4210854715202e-14 | t | t | t | f
7.105427357601e-15 | t | t | t | f
3.5527136788005e-15 | t | f | f | f
1.77635683940025e-15 | t | f | f | f
8.88178419700125e-16 | t | f | f | f
4.44089209850063e-16 | t | f | f | f
2.22044604925031e-16 | t | f | f | f
(53 rows)

Chapman Flack

chap@anastigmatix.net

almost 8 years ago

In reply to: Chapman Flack (#1)

2 attachment(s)

Re: Precision loss casting float to numeric

Here are two patches.

The 0001-*.patch simply adds a regression test to numeric.sql that bits
aren't lost casting from float[48] to numeric. Naturally, it fails at this
stage.

The 0002-*.patch is a proof-of-concept patching float4_numeric and
float8_numeric in the trivial way (just using FLT_DECIMAL_DIG and
DBL_DECIMAL_DIG in place of FLT_DIG and DBL_DIG). It makes the new
regression test pass. (It will only work under a compiler that has
__FLT_DECIMAL_DIG__ and __DBL_DECIMAL_DIG__ available, and I used
those internal versions to avoid mucking with build tooling to change
the target C standard, which I assume wouldn't be welcome anyway.
This patch is just POC and not proposed as anything else.)

It changes the output of an existing numeric test and a few numeric
aggregate tests.

I have adjusted the numeric test: its purpose is to check the direction
of numeric rounding of ties, but the value to be rounded was computed
in float8 and then cast to numeric, and because negative powers of ten
aren't tidy in binary, it can turn out that the float8 computation
produces a correctly-rounded-53-bit-result that is on the nearer-to-zero
side of a tie, and now that that result is correctly cast, the resulting
numeric doesn't round in the away-from-zero direction.

I changed that test because I concluded it wasn't meant to test
float8-to-numeric casting, but only the rounding of tied numerics,
so I just made the inner expression be typed numeric, and the expected
output is unchanged.

The three aggregate tests with changed output are working from a table
of float4 values, and my assumption is they are now simply producing
the correct aggregate results given the full precision of the input
values, but I haven't confirmed that yet, so this patch leaves those
three failing for now.

-Chap

Attachments:

0001-Add-regression-test-for-cast-float-48-to-numeric.patchtext/x-patch; name=0001-Add-regression-test-for-cast-float-48-to-numeric.patchDownload

From 5b3b05009830b29d1d3167bcd3321228535161dc Mon Sep 17 00:00:00 2001
From: Chapman Flack <chap@anastigmatix.net>
Date: Sun, 25 Feb 2018 22:52:22 -0500
Subject: [PATCH 1/2] Add regression test for cast float[48] to numeric.

At this stage, as expected, it fails.
---
 src/test/regress/expected/numeric.out | 34 ++++++++++++++++++++++++++++++++++
 src/test/regress/sql/numeric.sql      | 31 +++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out
index 17985e8..c186385 100644
--- a/src/test/regress/expected/numeric.out
+++ b/src/test/regress/expected/numeric.out
@@ -2063,3 +2063,37 @@ SELECT SUM((-9999)::numeric) FROM generate_series(1, 100000);
  -999900000
 (1 row)
 
+--
+-- Tests for cast from float4, float8
+--
+-- check precision of mantissa not lost
+WITH RECURSIVE
+  f4bit(place) AS (
+    VALUES (1::float4)
+    UNION
+    SELECT place/2::float4
+    FROM f4bit
+    WHERE 1::float4 + place/2::float4 > 1::float4
+  ),
+  f4(epsilon) AS (select min(place) FROM f4bit),
+  f8bit(place) AS (
+    VALUES (1::float8)
+    UNION
+    SELECT place/2::float8
+    FROM f8bit
+    WHERE 1::float8 + place/2::float8 > 1::float8
+  ),
+  f8(epsilon) AS (SELECT min(place) FROM f8bit),
+  testvalues(f4, f8) AS (
+    SELECT 1::float4+f4.epsilon, 1::float8+f8.epsilon
+    FROM f4, f8
+  )
+SELECT
+  f4::numeric > 1::numeric AS f4numericpreserved,
+  f8::numeric > 1::numeric AS f8numericpreserved
+FROM testvalues;
+ f4numericpreserved | f8numericpreserved 
+--------------------+--------------------
+ t                  | t
+(1 row)
+
diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql
index d77504e..dab9ae2 100644
--- a/src/test/regress/sql/numeric.sql
+++ b/src/test/regress/sql/numeric.sql
@@ -1036,3 +1036,34 @@ select scale(-13.000000000000000);
 -- cases that need carry propagation
 SELECT SUM(9999::numeric) FROM generate_series(1, 100000);
 SELECT SUM((-9999)::numeric) FROM generate_series(1, 100000);
+
+--
+-- Tests for cast from float4, float8
+--
+
+-- check precision of mantissa not lost
+WITH RECURSIVE
+  f4bit(place) AS (
+    VALUES (1::float4)
+    UNION
+    SELECT place/2::float4
+    FROM f4bit
+    WHERE 1::float4 + place/2::float4 > 1::float4
+  ),
+  f4(epsilon) AS (select min(place) FROM f4bit),
+  f8bit(place) AS (
+    VALUES (1::float8)
+    UNION
+    SELECT place/2::float8
+    FROM f8bit
+    WHERE 1::float8 + place/2::float8 > 1::float8
+  ),
+  f8(epsilon) AS (SELECT min(place) FROM f8bit),
+  testvalues(f4, f8) AS (
+    SELECT 1::float4+f4.epsilon, 1::float8+f8.epsilon
+    FROM f4, f8
+  )
+SELECT
+  f4::numeric > 1::numeric AS f4numericpreserved,
+  f8::numeric > 1::numeric AS f8numericpreserved
+FROM testvalues;
-- 
2.7.3

0002-Proof-of-concept-fix-of-float-48-_numeric.patchtext/x-patch; name=0002-Proof-of-concept-fix-of-float-48-_numeric.patchDownload

From 4d721ac45967f64415bafb82039a652f0a451e2c Mon Sep 17 00:00:00 2001
From: Chapman Flack <chap@anastigmatix.net>
Date: Mon, 26 Feb 2018 01:26:01 -0500
Subject: [PATCH 2/2] Proof of concept fix of float[48]_numeric.

This patch simply uses the new FLT_DECIMAL_DIG/DBL_DECIMAL_DIG
constants in place of the FLT_DIG/DBL_DIG, just to see what happens.
It uses the double-underscore versions of the new constants, to
avoid mucking with the build procedure to change the C standard
to C11 where they would be visible without underscores. Needless to say,
this POC patch won't work where those constants aren't available, and
isn't proposed as a final patch.

It passes the newly added cast-no-precision-loss regression test.
It changes the output of an existing numeric test and a few numeric
aggregate tests.

I have adjusted the numeric test: its purpose is to check the direction
of numeric rounding of ties, but the value to be rounded was computed
in float8 and then cast to numeric, and because negative powers of ten
aren't tidy in binary, it can turn out that the float8 computation
produces a correctly-rounded-53-bit-result that is on the nearer-to-zero
side of a tie, and now that that result is correctly cast, the resulting
numeric doesn't round in the away-from-zero direction.

I changed that test because I concluded it wasn't meant to test
float8-to-numeric casting, but only the rounding of tied numerics,
so I just made the inner expression be typed numeric, and the expected
output is unchanged.

The three aggregate tests with changed output are working from a table
of float4 values, and my assumption is they are now simply producing
the correct aggregate results given the full precision of the input
values, but I haven't confirmed that yet, so this patch leaves those
three failing for now.
---
 src/backend/utils/adt/numeric.c       |  4 ++--
 src/test/regress/expected/numeric.out | 12 ++++++------
 src/test/regress/sql/numeric.sql      | 12 ++++++------
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 6f40072..c05de08 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -3224,7 +3224,7 @@ float8_numeric(PG_FUNCTION_ARGS)
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("cannot convert infinity to numeric")));
 
-	snprintf(buf, sizeof(buf), "%.*g", DBL_DIG, val);
+	snprintf(buf, sizeof(buf), "%.*g", __DBL_DECIMAL_DIG__, val);
 
 	init_var(&result);
 
@@ -3295,7 +3295,7 @@ float4_numeric(PG_FUNCTION_ARGS)
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("cannot convert infinity to numeric")));
 
-	snprintf(buf, sizeof(buf), "%.*g", FLT_DIG, val);
+	snprintf(buf, sizeof(buf), "%.*g", __FLT_DECIMAL_DIG__, val);
 
 	init_var(&result);
 
diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out
index c186385..0c757a3 100644
--- a/src/test/regress/expected/numeric.out
+++ b/src/test/regress/expected/numeric.out
@@ -753,12 +753,12 @@ SELECT a, ceil(a), ceiling(a), floor(a), round(a) FROM ceil_floor_round;
 DROP TABLE ceil_floor_round;
 -- Check rounding, it should round ties away from zero.
 SELECT i as pow,
-	round((-2.5 * 10 ^ i)::numeric, -i),
-	round((-1.5 * 10 ^ i)::numeric, -i),
-	round((-0.5 * 10 ^ i)::numeric, -i),
-	round((0.5 * 10 ^ i)::numeric, -i),
-	round((1.5 * 10 ^ i)::numeric, -i),
-	round((2.5 * 10 ^ i)::numeric, -i)
+	round((-2.5 * 10::numeric ^ i), -i),
+	round((-1.5 * 10::numeric ^ i), -i),
+	round((-0.5 * 10::numeric ^ i), -i),
+	round((0.5 * 10::numeric ^ i), -i),
+	round((1.5 * 10::numeric ^ i), -i),
+	round((2.5 * 10::numeric ^ i), -i)
 FROM generate_series(-5,5) AS t(i);
  pow |  round   |  round   |  round   |  round  |  round  |  round  
 -----+----------+----------+----------+---------+---------+---------
diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql
index dab9ae2..c096c43 100644
--- a/src/test/regress/sql/numeric.sql
+++ b/src/test/regress/sql/numeric.sql
@@ -677,12 +677,12 @@ DROP TABLE ceil_floor_round;
 
 -- Check rounding, it should round ties away from zero.
 SELECT i as pow,
-	round((-2.5 * 10 ^ i)::numeric, -i),
-	round((-1.5 * 10 ^ i)::numeric, -i),
-	round((-0.5 * 10 ^ i)::numeric, -i),
-	round((0.5 * 10 ^ i)::numeric, -i),
-	round((1.5 * 10 ^ i)::numeric, -i),
-	round((2.5 * 10 ^ i)::numeric, -i)
+	round((-2.5 * 10::numeric ^ i), -i),
+	round((-1.5 * 10::numeric ^ i), -i),
+	round((-0.5 * 10::numeric ^ i), -i),
+	round((0.5 * 10::numeric ^ i), -i),
+	round((1.5 * 10::numeric ^ i), -i),
+	round((2.5 * 10::numeric ^ i), -i)
 FROM generate_series(-5,5) AS t(i);
 
 -- Testing for width_bucket(). For convenience, we test both the
-- 
2.7.3

Tom Lane

tgl@sss.pgh.pa.us

almost 8 years ago

In reply to: Chapman Flack (#2)

Re: Precision loss casting float to numeric

Chapman Flack <chap@anastigmatix.net> writes:

The 0002-*.patch is a proof-of-concept patching float4_numeric and
float8_numeric in the trivial way (just using FLT_DECIMAL_DIG and
DBL_DECIMAL_DIG in place of FLT_DIG and DBL_DIG). It makes the new
regression test pass. (It will only work under a compiler that has
__FLT_DECIMAL_DIG__ and __DBL_DECIMAL_DIG__ available, and I used
those internal versions to avoid mucking with build tooling to change
the target C standard, which I assume wouldn't be welcome anyway.

Nope. TBH, I'd think about just using "DBL_DIG + 3", given our existing
coding around extra_float_digits in places like pg_dump and postgres_fdw.
The knowledge that you need 2 or 3 extra digits is already well embedded.

Conceivably you could do it like

#ifndef DBL_DECIMAL_DIG
#ifdef __DBL_DECIMAL_DIG__
#define DBL_DECIMAL_DIG __DBL_DECIMAL_DIG__
#else
#define DBL_DECIMAL_DIG (DBL_DIG + 3)
#endif
#endif

but I'm not exactly seeing how that buys us anything.

The bigger question here is whether people actually want this behavioral
change. I think there's probably a bigger chance of complaints that
"casting 1.1::float8 to numeric now produces some weird,
incorrectly-rounded result" than that we make anyone happier.

I have a vague idea that at some point in the past we discussed making
this conversion use extra_float_digits, which'd allow satisfying both
camps, at the nontrivial price that the conversion would have to be
considered stable not immutable. We didn't pull the trigger, if this
memory is real at all, presumably because of the mutability issue.

Another idea would be to leave the cast alone and introduce a named
function that does the "exact" conversion. Possibly that makes nobody
happy, but at least both the cast and the function could be immutable.
It'd dodge backwards-compatibility objections, too.

regards, tom lane

Robert Haas

robertmhaas@gmail.com

almost 8 years ago

In reply to: Tom Lane (#3)

Re: Precision loss casting float to numeric

On Mon, Feb 26, 2018 at 1:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The bigger question here is whether people actually want this behavioral
change. I think there's probably a bigger chance of complaints that
"casting 1.1::float8 to numeric now produces some weird,
incorrectly-rounded result" than that we make anyone happier.

Yeah, I worry about that, too.

Of course, as you know, I also have a deep and abiding skepticism
about IEEE binary floats as a concept. Anomalies are unavoidable; we
can choose exactly which set users experience, but we can eliminate
them because the underlying storage format is poorly-adapted to the
behavior people actually want. It's too bad that IEEE decimal floats
weren't standardized until 2008.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Chapman Flack

chap@anastigmatix.net

almost 8 years ago

In reply to: Tom Lane (#3)

Re: Precision loss casting float to numeric

On 02/26/2018 01:29 PM, Tom Lane wrote:

I think there's probably a bigger chance of complaints that
"casting 1.1::float8 to numeric now produces some weird,
incorrectly-rounded result" than that we make anyone happier.

Arguably, though, that's a moment that can be used to explain
exactly what the correctly-rounded value of 1.1::float8 is and
why, and then people both know something new and understand more
precisely what's happening to their data, and can apply round()
to it in exactly the places they want, if they want.

In contrast, the current fact that 1.1::float8 looks like 1.1
when cast to numeric puts a superficial smile on people's faces,
while they haven't really been asked how they feel about losing
five, six, or thirty-eight bits of precision when casting one data
type into another of supposedly greater precision and back. I think
the typical assumption is that, sure, you may lose precision
if you cast to a *less* precise type, but the other direction's
assumed value-preserving.

I can see the concern about changing behavior for code that may
exist already. I would never have thought of making the behavior
of a cast sensitive to extra_float_digits (in fact, I hadn't
known about extra_float_digits; it's described in the "locale
and formatting" section, which I never dreamed of consulting
for a cast between internal value representations; am I weird
in that?).

I wonder if an alternative to making a cast that can't be immutable,
because it looks at a GUC, could be to offer a choice of cast
functions: if you need the other behavior, create a schema, do a
CREATE CAST in it, and put it on your search path ahead of pg_catalog.
Kind of ugly, but that can happen dealing with kind of ugly situations.

-Chap

Tom Lane

tgl@sss.pgh.pa.us

almost 8 years ago

In reply to: Chapman Flack (#5)

Re: Precision loss casting float to numeric

Chapman Flack <chap@anastigmatix.net> writes:

I wonder if an alternative to making a cast that can't be immutable,
because it looks at a GUC, could be to offer a choice of cast
functions: if you need the other behavior, create a schema, do a
CREATE CAST in it, and put it on your search path ahead of pg_catalog.

Nope, because casts aren't schema-local, since they don't have names.
There can be only one cast between given source and target types.

regards, tom lane

Bear Giles

bgiles@coyotesong.com

almost 8 years ago

In reply to: Tom Lane (#3)

Re: Precision loss casting float to numeric

On Mon, Feb 26, 2018 at 11:29 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Chapman Flack <chap@anastigmatix.net> writes:

The 0002-*.patch is a proof-of-concept patching float4_numeric and
float8_numeric in the trivial way (just using FLT_DECIMAL_DIG and
DBL_DECIMAL_DIG in place of FLT_DIG and DBL_DIG). It makes the new
regression test pass. (It will only work under a compiler that has
__FLT_DECIMAL_DIG__ and __DBL_DECIMAL_DIG__ available, and I used
those internal versions to avoid mucking with build tooling to change
the target C standard, which I assume wouldn't be welcome anyway.

Nope. TBH, I'd think about just using "DBL_DIG + 3", given our existing
coding around extra_float_digits in places like pg_dump and postgres_fdw.
The knowledge that you need 2 or 3 extra digits is already well embedded.

Conceivably you could do it like

#ifndef DBL_DECIMAL_DIG
#ifdef __DBL_DECIMAL_DIG__
#define DBL_DECIMAL_DIG __DBL_DECIMAL_DIG__
#else
#define DBL_DECIMAL_DIG (DBL_DIG + 3)
#endif
#endif

but I'm not exactly seeing how that buys us anything.

The bigger question here is whether people actually want this behavioral
change. I think there's probably a bigger chance of complaints that
"casting 1.1::float8 to numeric now produces some weird,
incorrectly-rounded result" than that we make anyone happier.

I have a vague idea that at some point in the past we discussed making
this conversion use extra_float_digits, which'd allow satisfying both
camps, at the nontrivial price that the conversion would have to be
considered stable not immutable. We didn't pull the trigger, if this
memory is real at all, presumably because of the mutability issue.

Another idea would be to leave the cast alone and introduce a named
function that does the "exact" conversion. Possibly that makes nobody
happy, but at least both the cast and the function could be immutable.
It'd dodge backwards-compatibility objections, too.

regards, tom lane

Working for a company that
has enterprise customers this can't be overemphasized.
Never require the user to do something so they keep getting the same
results.

It doesn't
matter if it's "wrong".

I would vote for a property. If you want the best effort to match the IEEE
spec
you need to execute 'set use_ieee_numbers' and you'll get the extra digits
and
rounding behavior. If not you'll get the existing behavior.

Bear

Emre Hasegeli

emre@hasegeli.com

almost 8 years ago

In reply to: Tom Lane (#6)

Re: Precision loss casting float to numeric

I wonder if an alternative to making a cast that can't be immutable,
because it looks at a GUC, could be to offer a choice of cast
functions: if you need the other behavior, create a schema, do a
CREATE CAST in it, and put it on your search path ahead of pg_catalog.

Nope, because casts aren't schema-local, since they don't have names.
There can be only one cast between given source and target types.

In this case, I cannot see any other option than adding those as
separate cast functions. Should we mark this entry as "returned with
feedback"?

We can also consider turning the current float to numeric casts to
explicit as they are causing data loss. I am not sure how much it
would impact backwards-compatibility. The counter argument is the
numeric to float casts being IMPLICIT. They are causing data loss for
sure, but I believe there are different reasons to keep them as
IMPLICIT.

Chapman Flack

chap@anastigmatix.net

almost 8 years ago

In reply to: Emre Hasegeli (#8)

Re: Precision loss casting float to numeric

On 03/09/18 12:05, Emre Hasegeli wrote:

In this case, I cannot see any other option than adding those as
separate cast functions. Should we mark this entry as "returned with
feedback"?

We can also consider turning the current float to numeric casts to
explicit as they are causing data loss. I am not sure how much it
would impact backwards-compatibility. The counter argument is the
numeric to float casts being IMPLICIT. They are causing data loss for
sure, but I believe there are different reasons to keep them as
IMPLICIT.

Thanks for the feedback. I will mark it RWF myself, as the backward-
compatibility issues are kind of paralyzing, and I don't think I'll
have time in this CF to give it enough further thought anyway.

I wonder whether even changing a formerly-implicit cast to explicit
would be too much of a behavior change for existing code that expects
the current behavior?

-Chap

#10

Tom Lane

tgl@sss.pgh.pa.us

almost 8 years ago

In reply to: Chapman Flack (#9)

Re: Precision loss casting float to numeric

Chapman Flack <chap@anastigmatix.net> writes:

I wonder whether even changing a formerly-implicit cast to explicit
would be too much of a behavior change for existing code that expects
the current behavior?

We did exactly that during the 8.3 cycle, and got one heck of a lot of
pushback about it, despite the existence of a lot of examples showing
that it would be a good idea. I won't say we can't do it again, but
it won't be an easy sell.

regards, tom lane

#11

Tom Lane

tgl@sss.pgh.pa.us

almost 8 years ago

In reply to: Emre Hasegeli (#8)

Re: Precision loss casting float to numeric

Emre Hasegeli <emre@hasegeli.com> writes:

We can also consider turning the current float to numeric casts to
explicit as they are causing data loss. I am not sure how much it
would impact backwards-compatibility. The counter argument is the
numeric to float casts being IMPLICIT. They are causing data loss for
sure, but I believe there are different reasons to keep them as
IMPLICIT.

FWIW, the behavior of these casts is intended to model what the SQL
standard says about the behavior of "exact numeric" vs "approximate
numeric" types. (In our implementation, the integral types plus numeric
are in the first category, float4 and float8 in the second.) The spec is
perfectly clear that you can assign an approximate numeric to an exact
numeric, or vice versa, without any explicit cast. It is also clear that
arithmetic operations combining approximate and exact are allowed without
a cast, and yield approximate results. These rules lead to our
conclusions that exact -> approximate is an implicit cast (so that the
parser will choose eg. float8 multiplication over numeric multiplication
if you write numericvar * float8var) while approximate -> exact is an
assignment cast (so that you can assign float8 to numeric without explicit
casting). If the decisions had been driven purely by "what risks silent
precision loss", no doubt we'd have done it differently ... but it's hard
to do that and still meet the letter of the spec.

regards, tom lane