Bug fix for glibc broke freebsd build in REL_11_STABLE
Collegues,
Few days ago commit
1f349aa7d9a6633e87db071390c73a39ac279ba4
Fix 8a934d67 for libc++ and make more include order resistant.
was introduced into branch REL_11_STABLE.
It seems that it deals with GLIBC-based systems (i.e Linux and Hurd)
only, and shouldn't affect systems with non-GNU libc (such as BSD
family).
It changes code which has a comment:
/*
* Glibc doesn't use the builtin for clang due to a *gcc* bug in a
version
* newer than the gcc compatibility clang claims to have. This would
cause a
* *lot* of superfluous function calls, therefore revert when using
clang. In
* C++ there's issues with libc++ (not libstdc++), so disable as well.
*/
However, this commit broke float8 test on 32-bit FreeBSD 11 with clang
3.8.0 compiler. Regressions.diff follows:
================== pgsql.build/src/test/regress/regression.diffs
===================
*** /usr/home/buildfarm/build-farm/REL_11_STABLE/pgsql.build/src/test/regress/expected/float8.out
Tue Sep 4 11:57:47 2018
--- /usr/home/buildfarm/build-farm/REL_11_STABLE/pgsql.build/src/test/regress/results/float8.out
Tue Sep 4 12:03:28 2018 *************** *** 418,424 **** SET f1 =
FLOAT8_TBL.f1 * '-1' WHERE FLOAT8_TBL.f1 > '0.0';
SELECT '' AS bad, f.f1 * '1e200' from FLOAT8_TBL f;
! ERROR: value out of range: overflow
SELECT '' AS bad, f.f1 ^ '1e200' from FLOAT8_TBL f;
ERROR: value out of range: overflow
SELECT 0 ^ 0 + 0 ^ 1 + 0 ^ 0.0 + 0 ^ 0.5;
--- 418,432 ----
SET f1 = FLOAT8_TBL.f1 * '-1'
WHERE FLOAT8_TBL.f1 > '0.0';
SELECT '' AS bad, f.f1 * '1e200' from FLOAT8_TBL f;
! bad | ?column?
! -----+------------------
! | 0
! | -3.484e+201
! | -1.0043e+203
! | -Infinity
! | -1.2345678901234
! (5 rows)
!
SELECT '' AS bad, f.f1 ^ '1e200' from FLOAT8_TBL f;
ERROR: value out of range: overflow
SELECT 0 ^ 0 + 0 ^ 1 + 0 ^ 0.0 + 0 ^ 0.5;
================================================================
Problem doesn't arise on 64-bit FreeBSD and MacOS. (we have no 32-bit
MacOs at hand).
Following patch fixes this problem:
diff --git a/src/include/port.h b/src/include/port.h
index 0ce72e50e5..7daaebf568 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -350,7 +350,7 @@ extern int isinf(double x);
* *lot* of superfluous function calls, therefore revert when using
clang. In
* C++ there's issues with libc++ (not libstdc++), so disable as well.
*/
-#if defined(__clang__) && !defined(__cplusplus)
+#if defined(__clang__) && !defined(__cplusplus) && defined(__linux__)
/* needs to be separate to not confuse other compilers */
#if __has_builtin(__builtin_isinf)
/* need to include before, to avoid getting overwritten */
Idea of the patch is that because patch is only GLIBC-related, we
should apply this logic only if we have linux (more precisely linux or
hurd, but adding hurd would make condition overcomplicated and no one
seems to use it) system.
--
On September 4, 2018 6:16:24 AM PDT, Victor Wagner <vitus@wagner.pp.ru> wrote:
Collegues,
Few days ago commit
1f349aa7d9a6633e87db071390c73a39ac279ba4
Fix 8a934d67 for libc++ and make more include order resistant.
was introduced into branch REL_11_STABLE.
It seems that it deals with GLIBC-based systems (i.e Linux and Hurd)
only, and shouldn't affect systems with non-GNU libc (such as BSD
family).
What libc is this using?
It changes code which has a comment:
/*
* Glibc doesn't use the builtin for clang due to a *gcc* bug in a
version
* newer than the gcc compatibility clang claims to have. This would
cause a
* *lot* of superfluous function calls, therefore revert when using
clang. In
* C++ there's issues with libc++ (not libstdc++), so disable as well.
*/However, this commit broke float8 test on 32-bit FreeBSD 11 with clang
3.8.0 compiler. Regressions.diff follows:
Does this happen with a newer clang version too?
================== pgsql.build/src/test/regress/regression.diffs =================== *** /usr/home/buildfarm/build-farm/REL_11_STABLE/pgsql.build/src/test/regress/expected/float8.out Tue Sep 4 11:57:47 2018 --- /usr/home/buildfarm/build-farm/REL_11_STABLE/pgsql.build/src/test/regress/results/float8.out Tue Sep 4 12:03:28 2018 *************** *** 418,424 **** SET f1 = FLOAT8_TBL.f1 * '-1' WHERE FLOAT8_TBL.f1 > '0.0'; SELECT '' AS bad, f.f1 * '1e200' from FLOAT8_TBL f; ! ERROR: value out of range: overflow SELECT '' AS bad, f.f1 ^ '1e200' from FLOAT8_TBL f; ERROR: value out of range: overflow SELECT 0 ^ 0 + 0 ^ 1 + 0 ^ 0.0 + 0 ^ 0.5; --- 418,432 ---- SET f1 = FLOAT8_TBL.f1 * '-1' WHERE FLOAT8_TBL.f1 > '0.0'; SELECT '' AS bad, f.f1 * '1e200' from FLOAT8_TBL f; ! bad | ?column? ! -----+------------------ ! | 0 ! | -3.484e+201 ! | -1.0043e+203 ! | -Infinity ! | -1.2345678901234 ! (5 rows) ! SELECT '' AS bad, f.f1 ^ '1e200' from FLOAT8_TBL f; ERROR: value out of range: overflow SELECT 0 ^ 0 + 0 ^ 1 + 0 ^ 0.0 + 0 ^ 0.5; ================================================================ Problem doesn't arise on 64-bit FreeBSD and MacOS. (we have no 32-bit MacOs at hand).
This seems like something that needs to be understood. If a compiler intrinsic doesn't work, it's serious.
Following patch fixes this problem:
diff --git a/src/include/port.h b/src/include/port.h index 0ce72e50e5..7daaebf568 100644 --- a/src/include/port.h +++ b/src/include/port.h @@ -350,7 +350,7 @@ extern int isinf(double x); * *lot* of superfluous function calls, therefore revert when using clang. In * C++ there's issues with libc++ (not libstdc++), so disable as well. */ -#if defined(__clang__) && !defined(__cplusplus) +#if defined(__clang__) && !defined(__cplusplus) && defined(__linux__) /* needs to be separate to not confuse other compilers */ #if __has_builtin(__builtin_isinf) /* need to include before, to avoid getting overwritten */Idea of the patch is that because patch is only GLIBC-related, we
should apply this logic only if we have linux (more precisely linux or
hurd, but adding hurd would make condition overcomplicated and no one
seems to use it) system.
This seems too restrictive. There's nothing Linux specific in the issue - causes slowdowns on other platforms too.
Thanks!
Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
"Andres" == Andres Freund <andres@anarazel.de> writes:
However, this commit broke float8 test on 32-bit FreeBSD 11 with
clang 3.8.0 compiler. Regressions.diff follows:
Andres> Does this happen with a newer clang version too?
float8 test (and all other tests) passes for me on clang 3.9.1 on fbsd11
on 32-bit ARM, and on -m32 builds on amd64.
I also confirmed that without #define isinf(x) __builtin_isinf(x), on
both 32bit and 64bit fbsd isinf() compiles as a function call, so the
OP's proposed change would not be desirable.
--
Andrew (irc:RhodiumToad)
On Tue, Sep 4, 2018 at 9:39 AM Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
"Andres" == Andres Freund <andres@anarazel.de> writes:
However, this commit broke float8 test on 32-bit FreeBSD 11 with
clang 3.8.0 compiler. Regressions.diff follows:Andres> Does this happen with a newer clang version too?
float8 test (and all other tests) passes for me on clang 3.9.1 on fbsd11
on 32-bit ARM, and on -m32 builds on amd64.I also confirmed that without #define isinf(x) __builtin_isinf(x), on
both 32bit and 64bit fbsd isinf() compiles as a function call, so the
OP's proposed change would not be desirable.
I installed FreeBSD 11.2 i386 on a virtual machine. I couldn't
reproduce the problem with either the base cc (clang 6.0.0) or clang38
(clang 3.8.1) installed via pkg.
The OP reported clang 3.8.0, so a minor version behind what I tested.
I did learn that "make check" fails in rolenames if your Unix user is
called "user".
--
Thomas Munro
http://www.enterprisedb.com
Thomas Munro <thomas.munro@enterprisedb.com> writes:
On Tue, Sep 4, 2018 at 9:39 AM Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:I also confirmed that without #define isinf(x) __builtin_isinf(x), on
both 32bit and 64bit fbsd isinf() compiles as a function call, so the
OP's proposed change would not be desirable.
Presumably what we are looking at here is a compiler codegen bug for
__builtin_isinf(). The proposed change dodges it precisely by
substituting the library function instead --- at a performance penalty.
I agree that this isn't a real desirable fix, and we definitely ought
not cause it to happen on platforms that haven't got the bug.
I installed FreeBSD 11.2 i386 on a virtual machine. I couldn't
reproduce the problem with either the base cc (clang 6.0.0) or clang38
(clang 3.8.1) installed via pkg.
The OP reported clang 3.8.0, so a minor version behind what I tested.
I tried to reproduce the problem, without success, on a few different
FreeBSD images I had laying about:
FreeBSD 11.0/x86_64, clang version 3.8.0
(this confirms OP's report that x86_64 is OK)
FreeBSD 10.3/ppc, gcc 4.2.1
FreeBSD 12.0-CURRENT (from around mid-May)/arm64, clang version 6.0.0
(I was testing PG HEAD, not the 11 branch, but I don't see a reason
to think that'd make a difference.)
I also looked for evidence of a bug of this sort in the clang bugzilla.
I couldn't find anything, but it's possible that "isinf" isn't what
I should have searched on.
Anyway, my estimation is that this is a compiler bug that's been
repaired, and it probably isn't widespread enough to justify our
inserting some klugy workaround.
regards, tom lane
В Tue, 4 Sep 2018 11:06:56 -0700
Thomas Munro <thomas.munro@enterprisedb.com> пишет:
On Tue, Sep 4, 2018 at 9:39 AM Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:"Andres" == Andres Freund <andres@anarazel.de> writes:
However, this commit broke float8 test on 32-bit FreeBSD 11 with
clang 3.8.0 compiler. Regressions.diff follows:Andres> Does this happen with a newer clang version too?
float8 test (and all other tests) passes for me on clang 3.9.1 on
fbsd11 on 32-bit ARM, and on -m32 builds on amd64.I also confirmed that without #define isinf(x) __builtin_isinf(x),
on both 32bit and 64bit fbsd isinf() compiles as a function call,
so the OP's proposed change would not be desirable.I installed FreeBSD 11.2 i386 on a virtual machine. I couldn't
reproduce the problem with either the base cc (clang 6.0.0) or clang38
(clang 3.8.1) installed via pkg.
Do you reproducing problem in REL_11_STABLE? It doesn't exist in master.
Also may be it might depend on some configure options. I usually
compile postgres with as much --with-something enabled as possible
Ive put my cconfigure/build/check logs to
https://wagner.pp.ru/~vitus/stuff/freebsd-i386-logs.tar.gz
Unfortunately there is a bit too much of them to attach to the list
message.
These logs are produced after I've upgraded my outdated system
to current 11.2 with clang 6.0.0. It haven't eliminated problem.
I can publish my KVM images both of old system (11.0 witth clang 3.8.0)
and new one (current 11.2 with clang 6.0.0)
The OP reported clang 3.8.0, so a minor version behind what I tested.
I did learn that "make check" fails in rolenames if your Unix user is
called "user".
--
Victor Wagner <vitus@wagner.pp.ru>
В Tue, 04 Sep 2018 15:48:24 -0400
Tom Lane <tgl@sss.pgh.pa.us> пишет:
I tried to reproduce the problem, without success, on a few different
FreeBSD images I had laying about:FreeBSD 11.0/x86_64, clang version 3.8.0
(this confirms OP's report that x86_64 is OK)FreeBSD 10.3/ppc, gcc 4.2.1
FreeBSD 12.0-CURRENT (from around mid-May)/arm64, clang version 6.0.0
(I was testing PG HEAD, not the 11 branch, but I don't see a reason
to think that'd make a difference.)
Alas, it does. First thing I've done after discovering this bug, it is
to look if it exists in master. And master passes this test on the same
machine where problem was discovered.
I also looked for evidence of a bug of this sort in the clang
bugzilla. I couldn't find anything, but it's possible that "isinf"
isn't what I should have searched on.Anyway, my estimation is that this is a compiler bug that's been
repaired, and it probably isn't widespread enough to justify our
inserting some klugy workaround.
It doesn't look so, as bug persists after I've upgraded system to
current 11.2-RELEASE with clang 6.0.0.
--
Victor Wagner <vitus@wagner.pp.ru>
On Tue, Sep 4, 2018 at 12:59 PM Victor Wagner <vitus@wagner.pp.ru> wrote:
В Tue, 04 Sep 2018 15:48:24 -0400
Tom Lane <tgl@sss.pgh.pa.us> пишет:I tried to reproduce the problem, without success, on a few different
FreeBSD images I had laying about:FreeBSD 11.0/x86_64, clang version 3.8.0
(this confirms OP's report that x86_64 is OK)FreeBSD 10.3/ppc, gcc 4.2.1
FreeBSD 12.0-CURRENT (from around mid-May)/arm64, clang version 6.0.0
(I was testing PG HEAD, not the 11 branch, but I don't see a reason
to think that'd make a difference.)Alas, it does. First thing I've done after discovering this bug, it is
to look if it exists in master. And master passes this test on the same
machine where problem was discovered.I also looked for evidence of a bug of this sort in the clang
bugzilla. I couldn't find anything, but it's possible that "isinf"
isn't what I should have searched on.Anyway, my estimation is that this is a compiler bug that's been
repaired, and it probably isn't widespread enough to justify our
inserting some klugy workaround.It doesn't look so, as bug persists after I've upgraded system to
current 11.2-RELEASE with clang 6.0.0.
Ah, right it fails for me too on 32 bit FreeBSD 11.2 with
REL_11_STABLE. Investigating...
--
Thomas Munro
http://www.enterprisedb.com
Victor Wagner <vitus@wagner.pp.ru> writes:
Tom Lane <tgl@sss.pgh.pa.us> пишет:
(I was testing PG HEAD, not the 11 branch, but I don't see a reason
to think that'd make a difference.)
Alas, it does. First thing I've done after discovering this bug, it is
to look if it exists in master. And master passes this test on the same
machine where problem was discovered.
Oh really ... I'll go try again. Now I agree with Andres, we need
to understand *exactly* what is failing here rather than guess.
My first thought was that the C99 changeover might explain the
difference. But clang doesn't require any special switch to select
C99 mode, so in principle that should've made no difference to clang
builds.
regards, tom lane
"Victor" == Victor Wagner <vitus@wagner.pp.ru> writes:
Victor> Do you reproducing problem in REL_11_STABLE? It doesn't exist
Victor> in master.
Oh, I missed that. It is reproducible on REL_11_STABLE with clang 3.9.1,
I'll dig into it more.
--
Andrew (irc:RhodiumToad)
On 2018-09-04 16:13:45 -0400, Tom Lane wrote:
Victor Wagner <vitus@wagner.pp.ru> writes:
Tom Lane <tgl@sss.pgh.pa.us> пишет:
(I was testing PG HEAD, not the 11 branch, but I don't see a reason
to think that'd make a difference.)Alas, it does. First thing I've done after discovering this bug, it is
to look if it exists in master. And master passes this test on the same
machine where problem was discovered.Oh really ... I'll go try again. Now I agree with Andres, we need
to understand *exactly* what is failing here rather than guess.My first thought was that the C99 changeover might explain the
difference. But clang doesn't require any special switch to select
C99 mode, so in principle that should've made no difference to clang
builds.
Thomas and I are sitting in a cafe and are trying to figure out what's
going on...
Greetings,
Andres Freund
"Andres" == Andres Freund <andres@anarazel.de> writes:
Andres> Thomas and I are sitting in a cafe and are trying to figure out
Andres> what's going on...
I have a standalone test case:
#include <stdio.h>
#include <math.h>
int main(int argc, char **argv)
{
double d1 = (argc ? 1e180 : 0);
double d2 = (argv ? 1e200 : 0);
int r2 = __builtin_isinf(d1 * d2);
int r1 = isinf(d1 * d2);
printf("r1 = %d, r2 = %d\n", r1, r2);
return 0;
}
Note that swapping the r1 and r2 lines makes the problem disappear (!).
on amd64, clang 3.9.1:
cc -O2 -m32 flttst.c && ./a.out
r1 = 1, r2 = 0
cc -O2 flttst.c && ./a.out
r1 = 1, r2 = 1
Can't reproduce on 32-bit arm.
--
Andrew (irc:RhodiumToad)
"Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
"Andres" == Andres Freund <andres@anarazel.de> writes:
Andres> Thomas and I are sitting in a cafe and are trying to figure out
Andres> what's going on...
Andrew> I have a standalone test case:
Andrew> #include <stdio.h>
Andrew> #include <math.h>
Andrew> int main(int argc, char **argv)
Andrew> {
Andrew> double d1 = (argc ? 1e180 : 0);
Andrew> double d2 = (argv ? 1e200 : 0);
Andrew> int r2 = __builtin_isinf(d1 * d2);
Andrew> int r1 = isinf(d1 * d2);
Andrew> printf("r1 = %d, r2 = %d\n", r1, r2);
Andrew> return 0;
Andrew> }
Andrew> Note that swapping the r1 and r2 lines makes the problem
Andrew> disappear (!).
And that's the clue to why it happens.
The reason it behaves oddly is this: on i387 FPU (and NOT on arm32 or on
32-bit i386 with a modern architecture specified to the compiler), the
result of 1e200 * 1e180 is not in fact infinite, because it fits in an
80-bit long double. So __builtin_isinf reports that it is finite; but if
it gets stored to memory as a double (e.g. to pass as a parameter to a
function), it then becomes infinite.
Andrew> cc -O2 -m32 flttst.c && ./a.out
Andrew> r1 = 1, r2 = 0
Specifying a recent microarch makes it use 64-bit FP registers rather
than 80-bit ones:
cc -O2 -m32 -march=skylake flttst.c && ./a.out
r1 = 1, r2 = 1
--
Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
The reason it behaves oddly is this: on i387 FPU (and NOT on arm32 or on
32-bit i386 with a modern architecture specified to the compiler), the
result of 1e200 * 1e180 is not in fact infinite, because it fits in an
80-bit long double. So __builtin_isinf reports that it is finite; but if
it gets stored to memory as a double (e.g. to pass as a parameter to a
function), it then becomes infinite.
Ah-hah. Can we fix it by explicitly casting the argument of isinf
to double?
regards, tom lane
"Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
Andrew> Specifying a recent microarch makes it use 64-bit FP registers
Andrew> rather than 80-bit ones:
Andrew> cc -O2 -m32 -march=skylake flttst.c && ./a.out
Andrew> r1 = 1, r2 = 1
where "recent" means "since about 2000 or so":
cc -O2 -m32 -march=pentium4 flttst.c && ./a.out
r1 = 1, r2 = 1
(the actual requirement is for SSE2 support)
--
Andrew (irc:RhodiumToad)
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
The reason it behaves oddly is this: on i387 FPU (and NOT on arm32
or on 32-bit i386 with a modern architecture specified to the
compiler), the result of 1e200 * 1e180 is not in fact infinite,
because it fits in an 80-bit long double. So __builtin_isinf reports
that it is finite; but if it gets stored to memory as a double (e.g.
to pass as a parameter to a function), it then becomes infinite.
Tom> Ah-hah. Can we fix it by explicitly casting the argument of isinf
Tom> to double?
No; the generated code doesn't change. Presumably the compiler regards
the value as already being of type "double", just one that happens to be
stored in a longer register.
--
Andrew (irc:RhodiumToad)
On 2018-09-04 17:46:29 -0400, Tom Lane wrote:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
The reason it behaves oddly is this: on i387 FPU (and NOT on arm32 or on
32-bit i386 with a modern architecture specified to the compiler), the
result of 1e200 * 1e180 is not in fact infinite, because it fits in an
80-bit long double. So __builtin_isinf reports that it is finite; but if
it gets stored to memory as a double (e.g. to pass as a parameter to a
function), it then becomes infinite.Ah-hah. Can we fix it by explicitly casting the argument of isinf
to double?
No, that doesn't work, afaict the cast has vanished long before the
optimizer stage. Note that we've previously encountered similar issues
on gcc, which is why we've tried to force gcc's hand with
-fexcess-precision=standard.
The apparent reason this doesn't fail on master is that there
check_float8_val() is an inline function, rather than a macro, but the
inline function doesn't get inlined (probably due to the number of
callsites). As the parameter passing convention doesn't do 80bit FPU
registers, the value is infinity by the time it gets to
__builtin_isinf().
I think this is pretty clearly a C99 violation by clang (note how gcc
automatically enables -fexcess-precision=standard in C99 mode). I'll
report something, but I've little hope this getting fixed quickly - and
it'll definitely not get fixed for such old clang versions as the OP used.
While we obviously could just neuter the isinf() macro hackery in this
edge case, my concern is that it seems likely that further such issues
are hidden in code that doesn't immediately cause regression test
failure.
I kinda wonder if we should add -mno-x87 or such in configure when we
detect clang, obviously it doesn't deal correctly with this. SSE2 based
floating point math is far from new... Or just error out in configure,
although that seems a bit harder.
I'm now looking at how it comes that clang on linux doesn't default to
x87 math, but freebsd does.
Greetings,
Andres Freund
On 2018-09-04 15:16:27 -0700, Andres Freund wrote:
I'm now looking at how it comes that clang on linux doesn't default to
x87 math, but freebsd does.
Oh well. I present you, without comments, the following:
switch (Triple.getOS()) {
case llvm::Triple::FreeBSD:
case llvm::Triple::NetBSD:
case llvm::Triple::OpenBSD:
return "i486";
case llvm::Triple::Haiku:
return "i586";
default:
// Fallback to p4.
return "pentium4";
}
Greetings,
Andres Freund
Hi,
On 2018-09-04 15:16:27 -0700, Andres Freund wrote:
I think this is pretty clearly a C99 violation by clang (note how gcc
automatically enables -fexcess-precision=standard in C99 mode). I'll
report something, but I've little hope this getting fixed quickly - and
it'll definitely not get fixed for such old clang versions as the OP used.
https://bugs.llvm.org/show_bug.cgi?id=38833
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
... Note that we've previously encountered similar issues
on gcc, which is why we've tried to force gcc's hand with
-fexcess-precision=standard.
Right. Annoying that clang doesn't have that. We can't realistically
program around an issue that might or might not show up depending on the
whims of the compiler's register allocation.
I kinda wonder if we should add -mno-x87 or such in configure when we
detect clang, obviously it doesn't deal correctly with this.
Seems worth looking into, but what happens if someone tries to compile
for x87 hardware? Or do we care anymore?
regards, tom lane