powerpc(32) point/polygon regression failures on Debian Jessie
The point/polygon regression tests have started to fail on 32-bit
powerpc on Debian Jessie. So far I could reproduce the problem with
PostgreSQL 9.4.10+11 and 9.6.1, on several different machines. Debian
unstable is unaffected.
The failure looks like this:
******** build/src/test/regress/regression.diffs ********
*** /«PKGBUILDDIR»/build/../src/test/regress/expected/point.out Mon Oct 24 20:08:51 2016
--- /«PKGBUILDDIR»/build/src/test/regress/results/point.out Mon Jan 23 15:17:51 2017
***************
*** 125,131 ****
| (-3,4) | 5
| (-10,0) | 10
| (-5,-12) | 13
! | (10,10) | 14.142135623731
| (5.1,34.5) | 34.8749193547455
(6 rows)
--- 125,131 ----
| (-3,4) | 5
| (-10,0) | 10
| (-5,-12) | 13
! | (10,10) | 14.1421356237309
| (5.1,34.5) | 34.8749193547455
(6 rows)
***************
*** 150,157 ****
| (-5,-12) | (-10,0) | 13
| (-5,-12) | (0,0) | 13
| (0,0) | (-5,-12) | 13
! | (0,0) | (10,10) | 14.142135623731
! | (10,10) | (0,0) | 14.142135623731
| (-3,4) | (10,10) | 14.3178210632764
| (10,10) | (-3,4) | 14.3178210632764
| (-5,-12) | (-3,4) | 16.1245154965971
--- 150,157 ----
| (-5,-12) | (-10,0) | 13
| (-5,-12) | (0,0) | 13
| (0,0) | (-5,-12) | 13
! | (0,0) | (10,10) | 14.1421356237309
! | (10,10) | (0,0) | 14.1421356237309
| (-3,4) | (10,10) | 14.3178210632764
| (10,10) | (-3,4) | 14.3178210632764
| (-5,-12) | (-3,4) | 16.1245154965971
***************
*** 221,227 ****
| (-10,0) | (0,0) | 10
| (-10,0) | (-5,-12) | 13
| (-5,-12) | (0,0) | 13
! | (0,0) | (10,10) | 14.142135623731
| (-3,4) | (10,10) | 14.3178210632764
| (-5,-12) | (-3,4) | 16.1245154965971
| (-10,0) | (10,10) | 22.3606797749979
--- 221,227 ----
| (-10,0) | (0,0) | 10
| (-10,0) | (-5,-12) | 13
| (-5,-12) | (0,0) | 13
! | (0,0) | (10,10) | 14.1421356237309
| (-3,4) | (10,10) | 14.3178210632764
| (-5,-12) | (-3,4) | 16.1245154965971
| (-10,0) | (10,10) | 22.3606797749979
======================================================================
*** /«PKGBUILDDIR»/build/../src/test/regress/expected/polygon.out Mon Oct 24 20:08:51 2016
--- /«PKGBUILDDIR»/build/src/test/regress/results/polygon.out Mon Jan 23 15:17:51 2017
***************
*** 222,229 ****
'(2,2)'::point <-> '((0,0),(1,4),(3,1))'::polygon as inside,
'(3,3)'::point <-> '((0,2),(2,0),(2,2))'::polygon as near_corner,
'(4,4)'::point <-> '((0,0),(0,3),(4,0))'::polygon as near_segment;
! on_corner | on_segment | inside | near_corner | near_segment
! -----------+------------+--------+-----------------+--------------
! 0 | 0 | 0 | 1.4142135623731 | 3.2
(1 row)
--- 222,229 ----
'(2,2)'::point <-> '((0,0),(1,4),(3,1))'::polygon as inside,
'(3,3)'::point <-> '((0,2),(2,0),(2,2))'::polygon as near_corner,
'(4,4)'::point <-> '((0,0),(0,3),(4,0))'::polygon as near_segment;
! on_corner | on_segment | inside | near_corner | near_segment
! -----------+------------+--------+------------------+--------------
! 0 | 0 | 0 | 1.41421356237309 | 3.2
(1 row)
The 9.4.11 log contains the same point.out diff, but not polygon.out:
https://buildd.debian.org/status/fetch.php?pkg=postgresql-9.4&arch=powerpc&ver=9.4.11-0%2Bdeb8u1&stamp=1487517299&raw=0
Does that ring a bell? Since Debian unstable is unaffected, the
toolchain is likely to blame, though the same tests passed on Debian
Jessie before.
Christoph
--
Senior Berater, Tel.: +49 2166 9901 187
credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Christoph Berg <christoph.berg@credativ.de> writes:
> The point/polygon regression tests have started to fail on 32-bit
> powerpc on Debian Jessie. So far I could reproduce the problem with
> PostgreSQL 9.4.10+11 and 9.6.1, on several different machines. Debian
> unstable is unaffected.
Hmph. We haven't touched that code in a while, and certainly not in the
9.4.x branch. I'd have to agree that this must be a toolchain change.
regards, tom lane
Re: Tom Lane 2017-02-20 <30737.1487598355@sss.pgh.pa.us>
> Hmph. We haven't touched that code in a while, and certainly not in the
> 9.4.x branch. I'd have to agree that this must be a toolchain change.
FYI, in the meantime we were indeed able to trace it back to a libc
issue on Jessie:
$ cat sqrt.c
#include <math.h>
#include <stdio.h>
#include <fenv.h>

double
pg_hypot(double x, double y)
{
    double yx;

    /* Some PG-specific code deleted here */

    /* Else, drop any minus signs */
    x = fabs(x);
    y = fabs(y);

    /* Swap x and y if needed to make x the larger one */
    if (x < y)
    {
        double temp = x;

        x = y;
        y = temp;
    }

    /*
     * If y is zero, the hypotenuse is x. This test saves a few cycles in
     * such cases, but more importantly it also protects against
     * divide-by-zero errors, since now x >= y.
     */
    if (y == 0.0)
        return x;

    /* Determine the hypotenuse */
    yx = y / x;
    return x * sqrt(1.0 + (yx * yx));
}

int main ()
{
    //fesetround(FE_TONEAREST);
    printf("fegetround is %d\n", fegetround());
    double r = pg_hypot(10.0, 10.0);
    printf("14 %.14g\n", r);
    printf("15 %.15g\n", r);
    printf("16 %.16g\n", r);
    printf("17 %.17g\n", r);
    return 0;
}
Jessie output:
fegetround is 0
14 14.142135623731
15 14.1421356237309
16 14.14213562373095
17 14.142135623730949
Sid output:
fegetround is 0
14 14.142135623731
15 14.142135623731
16 14.14213562373095
17 14.142135623730951
The Sid output is what the point and polygon tests are expecting.
Possible culprit is this bug report from November:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=843904
(Though that doesn't explain why it affects only 32-bit powerpc.)
Christoph
Christoph Berg <christoph.berg@credativ.de> writes:
> Re: Tom Lane 2017-02-20 <30737.1487598355@sss.pgh.pa.us>
>> Hmph. We haven't touched that code in a while, and certainly not in the
>> 9.4.x branch. I'd have to agree that this must be a toolchain change.
> FYI, in the meantime we could indeed trace it back to a libc issue on
> Jessie:
I wonder whether it's a compiler change, maybe along the lines of
rearranging the computation so that it gives a slightly different result.
Although you'd think that 10.0/10.0 would give exactly 1.0 no matter what.
Still, it'd be worth comparing the assembly code for your test program.
regards, tom lane
Re: Tom Lane 2017-02-20 <13825.1487607143@sss.pgh.pa.us>
> Still, it'd be worth comparing the assembly code for your test program.
I compiled the program on both jessie and sid. Running the jessie
binary on sid gave the same output as the sid binary, so the
difference isn't in the binary but in some system library.
Christoph
Re: To Tom Lane 2017-02-20 <20170220161556.5ukosuj5o572b4rn@msg.credativ.de>
> I was compiling the program on jessie and on sid, and running the
> jessie binary on sid made it output the same as the sid binary, so the
> difference isn't in the binary, but in some system library.
FWIW, the problem will be fixed in Jessie's glibc by backporting this update:

2015-02-12  Joseph Myers  <joseph@codesourcery.com>

	[BZ #17964]
	* sysdeps/powerpc/fpu/e_sqrt.c (__slow_ieee754_sqrt): Use
	__builtin_fma instead of relying on contraction of a * b + c.
(Upstream it's probably one of these, didn't dig deeper:
https://sourceware.org/git/?p=glibc.git&a=search&h=HEAD&st=commit&s=__builtin_fma)
Thanks for the input,
Christoph