OpenBSD/Sparc status

Started by Andrew Dunstan about 21 years ago, 23 messages
#1Andrew Dunstan
andrew@dunslane.net

The fix for unflushed changes to pg_database records seems to have fixed
the problem we were seeing on spoonbill ... but it is now seeing
problems with the seg module:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=spoonbill&dt=2004-11-18%2016:02:58

cheers

andrew

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#1)
Re: OpenBSD/Sparc status

Andrew Dunstan <andrew@dunslane.net> writes:

The fix for unflushed changes to pg_database records seems to have fixed
the problem we were seeing on spoonbill ... but it is now seeing
problems with the seg module:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=spoonbill&dt=2004-11-18%2016:02:58

Don't tell me that just started happening? We haven't touched seg in
weeks...

I'm unsure how this could fail when float4 passes, because it's using
float4in to convert the strings.

regards, tom lane

#3Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#2)
Re: OpenBSD/Sparc status

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

The fix for unflushed changes to pg_database records seems to have fixed
the problem we were seeing on spoonbill ... but it is now seeing
problems with the seg module:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=spoonbill&dt=2004-11-18%2016:02:58

Don't tell me that just started happening? We haven't touched seg in
weeks...

I'm unsure how this could fail when float4 passes, because it's using
float4in to convert the strings.

We're only seeing it now because up to now the run on this platform was
bombing out on the error you so brilliantly fixed last night.

You might recall I wanted to patch contrib/Makefile to force
installcheck on all modules regardless of error - if we had that we'd
have seen this before.

cheers

andrew

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#3)
Re: OpenBSD/Sparc status

Andrew Dunstan <andrew@dunslane.net> writes:

We're only seeing it now because up to now the run on this platform was
bombing out on the error you so brilliantly fixed last night.

Consistently? I'd have thought that problem would only fail once in a
while. It's hard to believe the timing would work out to make it a 100%
failure.

regards, tom lane

#5Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#4)
Re: OpenBSD/Sparc status

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

We're only seeing it now because up to now the run on this platform was
bombing out on the error you so brilliantly fixed last night.

Consistently? I'd have thought that problem would only fail once in a
while. It's hard to believe the timing would work out to make it a 100%
failure.

You can see the history of the latest build runs here:

http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=spoonbill&br=HEAD

cheers

andrew

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#5)
Re: OpenBSD/Sparc status

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

Consistently? I'd have thought that problem would only fail once in a
while. It's hard to believe the timing would work out to make it a 100%
failure.

You can see the history of the latest build runs here:
http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=spoonbill&br=HEAD

Remarkable. There is one run (2004-11-15) where it got past the rtree
test (and did indeed fail at seg) but the failure rate is certainly
upwards of 90%. Curious. There must be some effect that is
synchronizing the bgwriter's actions with the test sequence.

Back at the ranch, I am even more surprised to note that the bogus
seg output in the 11-15 run is different from what it is in today's.
There's not much I can do about it without access to a machine where
it's failing though. Can we get personal accounts on the buildfarm
machines?

regards, tom lane

#7Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#6)
Re: OpenBSD/Sparc status

Tom Lane wrote:

Can we get personal accounts on the buildfarm
machines?

That's up to the owner of each machine - it's a distributed system.

I've sent email to the owner of this one.

When I get a few minutes soon I hope to start some discussion on
-hackers about what members we want in the buildfarm and what our
expectations are about help with solving problems.

cheers

andrew

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#7)
Re: OpenBSD/Sparc status

The answer is: it's a gcc bug. The attached program should print
x = 12.3
y = 12.3

but if compiled with -O or -O2 on Stefan's machine, I get garbage:

$ gcc -O ftest.c
$ ./a.out
x = 12.3
y = 1.47203e-39
$ gcc -v
Reading specs from /usr/lib/gcc-lib/sparc64-unknown-openbsd3.6/3.3.2/specs
Configured with:
Thread model: single
gcc version 3.3.2 (propolice)
$

regards, tom lane

#include <stdio.h>

float
returnfloat(float *x)
{
    return *x;
}

int
main()
{
    float x = 12.3;
    union {
        float f;
        char *t;
    } y;

    y.f = returnfloat(&x);

    printf("x = %g\n", x);
    printf("y = %g\n", y.f);

    return 0;
}

#9Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Tom Lane (#8)
Re: OpenBSD/Sparc status

Tom Lane wrote:

The answer is: it's a gcc bug. The attached program should print
x = 12.3
y = 12.3

but if compiled with -O or -O2 on Stefan's machine, I get garbage:

$ gcc -O ftest.c
$ ./a.out
x = 12.3
y = 1.47203e-39

woa - scary. I will report that to the OpenBSD-folks upstream - many
thanks for the nice testcase!

Stefan

#10Andrew Dunstan
andrew@dunslane.net
In reply to: Stefan Kaltenbrunner (#9)
Re: OpenBSD/Sparc status

Stefan Kaltenbrunner wrote:

Tom Lane wrote:

The answer is: it's a gcc bug. The attached program should print
x = 12.3
y = 12.3

but if compiled with -O or -O2 on Stefan's machine, I get garbage:

$ gcc -O ftest.c
$ ./a.out
x = 12.3
y = 1.47203e-39

woa - scary. I will report that to the OpenBSD-folks upstream - many
thanks for the nice testcase!

very scary.

Meanwhile, what do we do? Turn off -O in src/template/openbsd for
some/all releases?

cheers

andrew

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#10)
Re: OpenBSD/Sparc status

Andrew Dunstan <andrew@dunslane.net> writes:

Meanwhile, what do we do? Turn off -O in src/template/openbsd for
some/all releases?

Certainly not. This problem is only known to exist in one gcc version
for one architecture, and besides it's only affecting (so far as we can
tell) one rather inessential contrib module. I'd say ignore the test
failure until Stefan can get a fixed gcc.

regards, tom lane

#12Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Tom Lane (#11)
Re: OpenBSD/Sparc status

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Meanwhile, what do we do? Turn off -O in src/template/openbsd for
some/all releases?

Certainly not. This problem is only known to exist in one gcc version
for one architecture, and besides it's only affecting (so far as we can
tell) one rather inessential contrib module. I'd say ignore the test
failure until Stefan can get a fixed gcc.

FWIW: I got the bug confirmed by Miod Vallat (OpenBSD hacker) on IRC, it
looks like at least OpenBSD 3.6-STABLE and OpenBSD-current on Sparc64
with the stock system compiler are affected.

Stefan

#13Andrew Dunstan
andrew@dunslane.net
In reply to: Stefan Kaltenbrunner (#12)
Re: OpenBSD/Sparc status

Stefan Kaltenbrunner said:

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Meanwhile, what do we do? Turn off -O in src/template/openbsd for
some/all releases?

Certainly not. This problem is only known to exist in one gcc version
for one architecture, and besides it's only affecting (so far as we
can tell) one rather inessential contrib module. I'd say ignore the
test failure until Stefan can get a fixed gcc.

FWIW: I got the bug confirmed by Miod Vallat (OpenBSD hacker) on IRC,
it looks like at least OpenBSD 3.6-STABLE and OpenBSD-current on
Sparc64 with the stock system compiler are affected.

I guess my concern is that on Sparc64/OpenBSD-3.6* at least, this bug is
exposed by the seg tests but might well occur elsewhere and bite us in
various unpleasant ways.

I have no idea how many people out there are using this combination. Of
course, even if it's only one (and I suspect that's the right order of
magnitude) we should want to be careful with their data.

cheers

andrew

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#13)
Re: OpenBSD/Sparc status

"Andrew Dunstan" <andrew@dunslane.net> writes:

I guess my concern is that on Sparc64/OpenBSD-3.6* at least, this bug is
exposed by the seg tests but might well occur elsewhere and bite us in
various unpleasant ways.

The experimentation I did to develop the test case suggested that the
problem only occurs when the result of a function returning float is
stored directly into a union member. That's a sufficiently weird case
that I'm reasonably confident it doesn't occur elsewhere in the backend.
It might be worth Stefan's time to vary the test case a bit (eg try
double instead of float, struct instead of union, etc) and see just how
general the bug is.

regards, tom lane

#15Darcy Buskermolen
darcy@wavefire.com
In reply to: Tom Lane (#8)
Re: OpenBSD/Sparc status

On November 19, 2004 10:55 am, you wrote:

The answer is: it's a gcc bug. The attached program should print
x = 12.3
y = 12.3

but if compiled with -O or -O2 on Stefan's machine, I get garbage:

$ gcc -O ftest.c
$ ./a.out
x = 12.3
y = 1.47203e-39
$ gcc -v
Reading specs from /usr/lib/gcc-lib/sparc64-unknown-openbsd3.6/3.3.2/specs
Configured with:
Thread model: single
gcc version 3.3.2 (propolice)
$

I can confirm this behavior on Solaris 8/sparc 64 as well.

bash-2.03$ gcc -O -m64 test.c
bash-2.03$ ./a.out
x = 12.3
y = 2.51673e-42
bash-2.03$ file a.out
a.out: ELF 64-bit MSB executable SPARCV9 Version 1, dynamically
linked, not stripped
bash-2.03$ gcc -v
Reading specs from /usr/local/lib/gcc-lib/sparc-sun-solaris2.8/3.3.2/specs
Configured with: ../configure --with-as=/usr/ccs/bin/as
--with-ld=/usr/ccs/bin/ld --disable-nls
Thread model: posix
gcc version 3.3.2
bash-2.03$ gcc -m64 test.c
bash-2.03$ ./a.out
x = 12.3
y = 12.3
bash-2.03$ gcc -m64 -02 test.c
gcc: unrecognized option `-02'
bash-2.03$ gcc -m64 -O2 test.c
bash-2.03$ ./a.out
x = 12.3
y = 2.51673e-42
bash-2.03$ gcc -m64 -O3 test.c
bash-2.03$ ./a.out
x = 12.3
y = 12.3
bash-2.03$

--
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx: 250.763.1759
http://www.wavefire.com

#16Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Tom Lane (#8)
Re: OpenBSD/Sparc status

Tom Lane wrote:

Darcy Buskermolen <darcy@wavefire.com> writes:

I can confirm this behavior on Solaris 8/sparc 64 as well.

bash-2.03$ gcc -m64 -O2 test.c
bash-2.03$ ./a.out
x = 12.3
y = 2.51673e-42
bash-2.03$ gcc -m64 -O3 test.c
bash-2.03$ ./a.out
x = 12.3
y = 12.3
bash-2.03$

Hmm. I hadn't bothered to try -O3 ... interesting that it works
correctly again at that level.

-O3 works on my box too

Anyway, this proves that it is an upstream gcc bug and not something
OpenBSD broke.

I just tried on Solaris 9 with gcc 3.4.2 - seems the bug is fixed in this
version. Unfortunately it is quite problematic to change the compiler,
at least on OpenBSD: gcc 3.3.2 is quite heavily modified on that platform,
and switching the base system compiler might screw a boatload of other
tools.
The actual recommendation I got from the OpenBSD folks was to add
"-mfaster-structs" to the compiler flags, which seems to work around the
issue - I'm currently doing a full build to verify that though ...

Stefan

#17Michael Fuhr
mike@fuhr.org
In reply to: Darcy Buskermolen (#15)
Re: OpenBSD/Sparc status

On Tue, Nov 23, 2004 at 09:57:03AM -0800, Darcy Buskermolen wrote:

I can confirm this behavior on Solaris 8/sparc 64 as well.

gcc 3.4.2 on Solaris 9/sparc 64 appears to be okay.

% gcc -v
Reading specs from /usr/local/lib/gcc/sparc-sun-solaris2.9/3.4.2/specs
Configured with: ../configure --with-as=/usr/ccs/bin/as --with-ld=/usr/ccs/bin/ld --disable-nls
Thread model: posix
gcc version 3.4.2
% gcc -m64 test.c
% ./a.out
x = 12.3
y = 12.3
% gcc -O -m64 test.c
% ./a.out
x = 12.3
y = 12.3
% gcc -O2 -m64 test.c
% ./a.out
x = 12.3
y = 12.3
% gcc -O3 -m64 test.c
% ./a.out
x = 12.3
y = 12.3
% file a.out
a.out: ELF 64-bit MSB executable SPARCV9 Version 1, dynamically linked, not stripped

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

#18Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Darcy Buskermolen (#15)
Re: OpenBSD/Sparc status

Darcy Buskermolen wrote:

On November 19, 2004 10:55 am, you wrote:

The answer is: it's a gcc bug. The attached program should print
x = 12.3
y = 12.3

but if compiled with -O or -O2 on Stefan's machine, I get garbage:

$ gcc -O ftest.c
$ ./a.out
x = 12.3
y = 1.47203e-39
$ gcc -v
Reading specs from /usr/lib/gcc-lib/sparc64-unknown-openbsd3.6/3.3.2/specs
Configured with:
Thread model: single
gcc version 3.3.2 (propolice)
$

I can confirm this behavior on Solaris 8/sparc 64 as well.

some more datapoints:

solaris 2.9 with gcc 3.1 is broken(-O3 does not help here)
linux/sparc64 (debian) with gcc 3.3.5 is broken too

So it looks like at least gcc 3.1 and gcc 3.3.x are affected on Sparc64
on all operating systems.

Stefan

#19Jim Seymour
jseymour@linxnet.com
In reply to: Stefan Kaltenbrunner (#18)
Re: OpenBSD/Sparc status

Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:

Darcy Buskermolen wrote:

On November 19, 2004 10:55 am, you wrote:

The answer is: it's a gcc bug. The attached program should print
x = 12.3
y = 12.3

but if compiled with -O or -O2 on Stefan's machine, I get garbage:

$ gcc -O ftest.c
$ ./a.out
x = 12.3
y = 1.47203e-39
$ gcc -v
Reading specs from /usr/lib/gcc-lib/sparc64-unknown-openbsd3.6/3.3.2/specs
Configured with:
Thread model: single
gcc version 3.3.2 (propolice)
$

I can confirm this behavior on Solaris 8/sparc 64 as well.

some more datapoints:

solaris 2.9 with gcc 3.1 is broken(-O3 does not help here)
linux/sparc64 (debian) with gcc 3.3.5 is broken too

So it looks like at least gcc 3.1 and gcc 3.3.x are affected on Sparc64
on all operating systems.

Yet Another Datapoint:

$ uname -a
SunOS jimsun 5.7 Generic_106541-29 sun4u sparc SUNW,UltraSPARC-IIi-Engine
$ gcc -v
...
gcc version 3.3.1
$ gcc -O -m64 test.c
$ a.out
x = 12.3
y = 2.55036e-42

Same on a "real" UltraSparc box, running Solaris 8 and gcc 3.3.1
at work.

Looks like it's time for a gcc upgrade.

Jim

#20Darcy Buskermolen
darcy@wavefire.com
In reply to: Jim Seymour (#19)
Re: OpenBSD/Sparc status

On November 23, 2004 11:37 am, Jim Seymour wrote:

Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:

Darcy Buskermolen wrote:

On November 19, 2004 10:55 am, you wrote:

The answer is: it's a gcc bug. The attached program should print
x = 12.3
y = 12.3

but if compiled with -O or -O2 on Stefan's machine, I get garbage:

$ gcc -O ftest.c
$ ./a.out
x = 12.3
y = 1.47203e-39
$ gcc -v
Reading specs from
/usr/lib/gcc-lib/sparc64-unknown-openbsd3.6/3.3.2/specs Configured
with:
Thread model: single
gcc version 3.3.2 (propolice)
$

I can confirm this behavior on Solaris 8/sparc 64 as well.

some more datapoints:

solaris 2.9 with gcc 3.1 is broken(-O3 does not help here)
linux/sparc64 (debian) with gcc 3.3.5 is broken too

So it looks like at least gcc 3.1 and gcc 3.3.x are affected on Sparc64
on all operating systems.

Yet Another Datapoint:

$ uname -a
SunOS jimsun 5.7 Generic_106541-29 sun4u sparc SUNW,UltraSPARC-IIi-Engine
$ gcc -v
...
gcc version 3.3.1
$ gcc -O -m64 test.c
$ a.out
x = 12.3
y = 2.55036e-42

Same on a "real" UltraSparc box, running Solaris 8 and gcc 3.3.1
at work.

Looks like it's time for a gcc upgrade.

Jim

The following compilers work fine producing 12.3 at all optimization levels:

Sun C 5.5 2003/03/12
and
sparc-sun-solaris2.9-gcc (GCC) 3.4.1

I'm guessing we need to add some more configure logic to detect gcc versions
3.4 on sparc trying to produce 64bit code and disable optimizations, or else
bail out and ask them to upgrade.

--
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx: 250.763.1759
http://www.wavefire.com

#21Michael Fuhr
mike@fuhr.org
In reply to: Darcy Buskermolen (#20)
Re: OpenBSD/Sparc status

On Tue, Nov 23, 2004 at 12:47:28PM -0800, Darcy Buskermolen wrote:

I'm guessing we need to add some more configure logic to detect gcc versions
3.4 on sparc trying to produce 64bit code and disable optimizations, or else
bail out and ask them to upgrade.

Shouldn't that be gcc versions 3.3?

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

#22Michael Fuhr
mike@fuhr.org
In reply to: Michael Fuhr (#17)
Re: OpenBSD/Sparc status

On Tue, Nov 23, 2004 at 11:34:44AM -0700, Michael Fuhr wrote:

gcc 3.4.2 on Solaris 9/sparc 64 appears to be okay.

But gcc 3.3.2 on Solaris 9/sparc 64 isn't.

% gcc -m64 test.c
% ./a.out
x = 12.3
y = 12.3

% gcc -O -m64 test.c
% ./a.out
x = 12.3
y = 2.51673e-42

% gcc -O2 -m64 test.c
% ./a.out
x = 12.3
y = 2.51673e-42

% gcc -O3 -m64 test.c
% ./a.out
x = 12.3
y = 12.3

% file a.out
a.out: ELF 64-bit MSB executable SPARCV9 Version 1, dynamically linked, not stripped

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

#23Darcy Buskermolen
darcy@wavefire.com
In reply to: Michael Fuhr (#21)
Re: OpenBSD/Sparc status

On November 23, 2004 06:18 pm, Michael Fuhr wrote:

On Tue, Nov 23, 2004 at 12:47:28PM -0800, Darcy Buskermolen wrote:

I'm guessing we need to add some more configure logic to detect gcc
versions 3.4 on sparc trying to produce 64bit code and disable
optimizations, or else bail out and ask them to upgrade.

Shouldn't that be gcc versions 3.3?

My bad, it should have read "prior to 3.4".

--
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx: 250.763.1759
http://www.wavefire.com