[PATCH] Fix float8 parsing of denormal values (on some platforms?)
Hi list,
Back in June we had a discussion about parsing denormal floating-point
values. A float8->text conversion could result in a number that can't
be converted back to float8 anymore for some values. Among other
things, this could break backups (though my searches didn't turn up
any reports of this ever happening).
The reason is that Linux strtod() sets errno=ERANGE for denormal
numbers. This behavior is also explicitly allowed
(implementation-defined) by the C99 standard.
Further analysis was done by Jeroen Vermeulen here:
http://archives.postgresql.org/pgsql-hackers/2011-06/msg01773.php
I think the least invasive fix, as proposed by Jeroen, is to fail only
when ERANGE is set *and* the return value is 0.0 or +/-HUGE_VAL.
Reading relevant specifications, this seems to be a fairly safe
assumption. That's what the attached patch does.
(I also updated the float4in function, but that's not strictly
necessary -- it would fail later in CHECKFLOATVAL() anyway)
What I don't know is how many platforms are actually capable of
parsing denormal double values. I added some regression tests, it
would be interesting to see results from pgbuildfarm and potentially
revert these tests later.
Regards,
Marti
On Wed, Dec 21, 2011 at 18:21, Marti Raudsepp <marti@juffo.org> wrote:
I think the least invasive fix, as proposed by Jeroen, is to fail only
when ERANGE is set *and* the return value is 0.0 or +/-HUGE_VAL.
Reading relevant specifications, this seems to be a fairly safe
assumption. That's what the attached patch does.
Oops, now attached the patch too.
Regards,
Marti
Attachments:
float8-denormal-parsing.patchtext/x-patch; charset=US-ASCII; name=float8-denormal-parsing.patchDownload
diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
new file mode 100644
index 63b09a4..86e1661
*** a/src/backend/utils/adt/float.c
--- b/src/backend/utils/adt/float.c
*************** float4in(PG_FUNCTION_ARGS)
*** 238,247 ****
endptr = num + 9;
}
else if (errno == ERANGE)
! ereport(ERROR,
! (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
! errmsg("\"%s\" is out of range for type real",
! orig_num)));
else
ereport(ERROR,
(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
--- 238,257 ----
endptr = num + 9;
}
else if (errno == ERANGE)
! {
! /*
! * We only fail for ERANGE if the return value is also out of
! * range. Some platforms parse and return denormal values
! * correctly, but still set errno to ERANGE.
! */
! if (val == 0.0 || val == HUGE_VAL || val == -HUGE_VAL)
! {
! ereport(ERROR,
! (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
! errmsg("\"%s\" is out of range for type real",
! orig_num)));
! }
! }
else
ereport(ERROR,
(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
*************** float8in(PG_FUNCTION_ARGS)
*** 431,440 ****
endptr = num + 9;
}
else if (errno == ERANGE)
! ereport(ERROR,
! (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
! errmsg("\"%s\" is out of range for type double precision",
! orig_num)));
else
ereport(ERROR,
(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
--- 441,460 ----
endptr = num + 9;
}
else if (errno == ERANGE)
! {
! /*
! * We only fail for ERANGE if the return value is also out of
! * range. Some platforms parse and return denormal values
! * correctly, but still set errno to ERANGE.
! */
! if (val == 0.0 || val == HUGE_VAL || val == -HUGE_VAL)
! {
! ereport(ERROR,
! (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
! errmsg("\"%s\" is out of range for type double precision",
! orig_num)));
! }
! }
else
ereport(ERROR,
(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
diff --git a/src/test/regress/expected/float8.out b/src/test/regress/expected/float8.out
new file mode 100644
index 6221538..e3fb8d3
*** a/src/test/regress/expected/float8.out
--- b/src/test/regress/expected/float8.out
*************** SELECT '-10e-400'::float8;
*** 24,29 ****
--- 24,48 ----
ERROR: "-10e-400" is out of range for type double precision
LINE 1: SELECT '-10e-400'::float8;
^
+ -- test denormal value parsing
+ SELECT '4.95e-324'::float8 < '1.49e-323'::float8;
+ ?column?
+ ----------
+ t
+ (1 row)
+
+ SELECT '4.95e-324'::float8 > '0'::float8;
+ ?column?
+ ----------
+ t
+ (1 row)
+
+ SELECT substr('-4.95e-324'::float8::text, 1, 2);
+ substr
+ --------
+ -4
+ (1 row)
+
-- bad input
INSERT INTO FLOAT8_TBL(f1) VALUES ('');
ERROR: invalid input syntax for type double precision: ""
diff --git a/src/test/regress/sql/float8.sql b/src/test/regress/sql/float8.sql
new file mode 100644
index 92a574a..e13eb51
*** a/src/test/regress/sql/float8.sql
--- b/src/test/regress/sql/float8.sql
*************** SELECT '-10e400'::float8;
*** 16,21 ****
--- 16,26 ----
SELECT '10e-400'::float8;
SELECT '-10e-400'::float8;
+ -- test denormal value parsing
+ SELECT '4.95e-324'::float8 < '1.49e-323'::float8;
+ SELECT '4.95e-324'::float8 > '0'::float8;
+ SELECT substr('-4.95e-324'::float8::text, 1, 2);
+
-- bad input
INSERT INTO FLOAT8_TBL(f1) VALUES ('');
INSERT INTO FLOAT8_TBL(f1) VALUES (' ');
At 2011-12-21 18:24:17 +0200, marti@juffo.org wrote:
I think the least invasive fix, as proposed by Jeroen, is to fail only
when ERANGE is set *and* the return value is 0.0 or +/-HUGE_VAL.
Reading relevant specifications, this seems to be a fairly safe
assumption. That's what the attached patch does.Oops, now attached the patch too.
Approach seems sensible, patch applies, builds, and passes tests.
Marking ready for committer and crossing fingers about buildfarm
failures…
-- ams
Marti Raudsepp <marti@juffo.org> writes:
On Wed, Dec 21, 2011 at 18:21, Marti Raudsepp <marti@juffo.org> wrote:
I think the least invasive fix, as proposed by Jeroen, is to fail only
when ERANGE is set *and* the return value is 0.0 or +/-HUGE_VAL.
Reading relevant specifications, this seems to be a fairly safe
assumption. That's what the attached patch does.
Oops, now attached the patch too.
Applied with minor revisions. Notably, after staring at the code a bit
I got uncomfortable with its assumption that pg_strncasecmp() cannot
change errno, so I fixed it to not assume that. Also, on some platforms
HUGE_VAL isn't infinity but the largest finite value, so I made the
range tests be like ">= HUGE_VAL" not just "== HUGE_VAL". I know the
man page for strtod() specifies it should return exactly HUGE_VAL for
overflow, but who's to say that <math.h> is on the same page as the
actual function?
regards, tom lane
On Wed, Feb 1, 2012 at 20:17, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Applied with minor revisions.
Thanks! :)
We're already seeing first buildfarm failures, on system "narwhal"
using an msys/mingw compiler.
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=narwhal&dt=2012-02-02%2005%3A00%3A02
No idea which libc it uses though. The MSVC libc looks fine because
"currawong" passes these tests.
PS: it would be useful to have the output of 'cc -v' in buildfarm
reports since the "Compiler" field may be out of date.
Regards,
Marti
Marti Raudsepp <marti@juffo.org> writes:
We're already seeing first buildfarm failures, on system "narwhal"
using an msys/mingw compiler.
Yeah. After a full day's cycle, the answer seems to be that
denormalized input works fine everywhere except:
1. mingw on Windows (even though MSVC builds work)
2. some Gentoo builds fail (knowing that platform, the phase of the moon
is probably the most predictable factor determining this).
I'm inclined at this point to remove the regression test cases, because
we have no principled way to deal with case #2. We could install
alternative "expected" files that accept the failure, but then what's
the point of having the test case at all?
It might be worth pressuring the mingw folk to get with the program,
but I'm not going to be the one to do that.
regards, tom lane
On 02/03/2012 03:49 AM, Tom Lane wrote:
Marti Raudsepp<marti@juffo.org> writes:
We're already seeing first buildfarm failures, on system "narwhal"
using an msys/mingw compiler.Yeah. After a full day's cycle, the answer seems to be that
denormalized input works fine everywhere except:1. mingw on Windows (even though MSVC builds work)
2. some Gentoo builds fail (knowing that platform, the phase of the moon
is probably the most predictable factor determining this).I'm inclined at this point to remove the regression test cases, because
we have no principled way to deal with case #2. We could install
alternative "expected" files that accept the failure, but then what's
the point of having the test case at all?It might be worth pressuring the mingw folk to get with the program,
but I'm not going to be the one to do that.
No doubt they would tell us to upgrade the compiler. Narwhal's is
horribly old. Neither of my mingw-based members is failing.
cheers
andrew