2nd try @NetBSD/2.0 Alpha

Started by Larry Rosenman about 20 years ago, 15 messages
#1 Larry Rosenman
ler@lerctr.org

Upped the stack to 8Mb. Now it dies in Plcheck.

Logs/bt.out in: http://www.lerctr.org/~ler/alphadeath2.tar.gz

$ tar tzvf alphadeath2.tar.gz
drwxr-xr-x 2 ler users 0 Oct 18 16:01 lastrun-logs
-rw-r--r-- 1 ler users 22708 Oct 18 14:50 lastrun-logs/CVS.log
-rw-r--r-- 1 ler users 11889 Oct 18 14:56 lastrun-logs/configure.log
-rw-r--r-- 1 ler users 228889 Oct 18 14:56 lastrun-logs/config.log
-rw-r--r-- 1 ler users 156157 Oct 18 15:32 lastrun-logs/make.log
-rw-r--r-- 1 ler users 201363 Oct 18 15:38 lastrun-logs/check.log
-rw-r--r-- 1 ler users 40733 Oct 18 15:45 lastrun-logs/make-contrib.log
-rw-r--r-- 1 ler users 49672 Oct 18 15:48 lastrun-logs/make-install.log
-rw-r--r-- 1 ler users 1531 Oct 18 15:49 lastrun-logs/initdb.log
-rw-r--r-- 1 ler users 61 Oct 18 15:49 lastrun-logs/startdb-1.log
-rw-r--r-- 1 ler users 128491 Oct 18 15:55 lastrun-logs/install-check.log
-rw-r--r-- 1 ler users 68 Oct 18 15:55 lastrun-logs/stopdb-1.log
-rw-r--r-- 1 ler users 61 Oct 18 15:55 lastrun-logs/startdb-2.log
-rw-r--r-- 1 ler users 2929 Oct 18 15:56 lastrun-logs/pl-install-check.log
-rw-r--r-- 1 ler users 6996 Oct 18 15:56 lastrun-logs/web-txn.data
-rw-r--r-- 1 ler users 3844 Oct 18 16:01 lastrun-logs/bt.out
tar: ustar vol 1, 16 files, 870400 bytes read, 0 bytes written in 1 secs (870400 bytes/sec)
$

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: ler@lerctr.org
US Mail: 3535 Gaspar Drive, Dallas, TX 75220-3611 US

#2 Martijn van Oosterhout
kleptog@svana.org
In reply to: Larry Rosenman (#1)
Re: 2nd try @NetBSD/2.0 Alpha

On Tue, Oct 18, 2005 at 04:04:42PM -0500, Larry Rosenman wrote:

Upped the stack to 8Mb. Now it dies in Plcheck.

Logs/bt.out in: http://www.lerctr.org/~ler/alphadeath2.tar.gz

Weird, it's dying in malloc() because the C library called kill() from
__libc_mutex_unlock().

You're not the only one though:

http://archive.netbsd.se/?ml=netbsd-users&a=2004-01&m=18027

No-one answered that one either...
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martijn van Oosterhout (#2)
Re: 2nd try @NetBSD/2.0 Alpha

Martijn van Oosterhout <kleptog@svana.org> writes:

On Tue, Oct 18, 2005 at 04:04:42PM -0500, Larry Rosenman wrote:

Upped the stack to 8Mb. Now it dies in Plcheck.

Weird, it's dying in malloc() because the C library called kill() from
__libc_mutex_unlock().

I wonder if this is related to the "threaded libpython doesn't work"
problem we've seen on some BSDen. Does this platform have separate
implementations of libc for threaded and unthreaded applications?
If so, and if libperl is trying to pull in a threaded libc along with
itself, maybe this is the symptom you'd see. It's reasonably probable
that this is the first call to malloc() after libperl has been loaded
into the backend ...

regards, tom lane

#4 Larry Rosenman
ler@lerctr.org
In reply to: Tom Lane (#3)
Re: 2nd try @NetBSD/2.0 Alpha

Tom Lane wrote:

Martijn van Oosterhout <kleptog@svana.org> writes:

On Tue, Oct 18, 2005 at 04:04:42PM -0500, Larry Rosenman wrote:

Upped the stack to 8Mb. Now it dies in Plcheck.

Weird, it's dying in malloc() because the C library called kill()
from __libc_mutex_unlock().

I wonder if this is related to the "threaded libpython doesn't work"
problem we've seen on some BSDen. Does this platform have separate
implementations of libc for threaded and unthreaded applications?
If so, and if libperl is trying to pull in a threaded libc along with
itself, maybe this is the symptom you'd see. It's reasonably
probable that this is the first call to malloc() after libperl has
been loaded into the backend ...

regards, tom lane

Doesn't appear to have a separate libc, HOWEVER, -lpthread may be screwing
us:

$ ldd perl
perl:
-lm.0 => /usr/lib/libm.so.0
-lcrypt.0 => /usr/lib/libcrypt.so.0
-lpthread.0 => /usr/lib/libpthread.so.0
-lperl => /usr/pkg/lib/perl5/5.8.0/alpha-netbsd-thread-multi/CORE/libperl.so
-lc.12 => /usr/lib/libc.so.12
$

I'm not the machine's owner, but I can ask if we can get a NON-threaded PERL.
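
The same check can be repeated on other binaries with a small helper. This is only a minimal sketch: the function name links_pthread and its "threaded"/"unthreaded" labels are made up for illustration, and all it does is look for libpthread in ldd output read from stdin.

```shell
# Report whether ldd output (read from stdin) lists libpthread.
# Prints "threaded" if it does, "unthreaded" otherwise.
links_pthread() {
    if grep -q 'libpthread\.so'; then
        echo "threaded"
    else
        echo "unthreaded"
    fi
}

# Typical use (the postgres path is a placeholder for this box):
#   ldd "$(command -v perl)" | links_pthread
#   ldd /path/to/postgres    | links_pthread
```

If perl reports "threaded" and the postgres binary reports "unthreaded", that would at least be consistent with the mismatch Tom describes, though it does not prove it is the cause.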

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: ler@lerctr.org
US Mail: 3535 Gaspar Drive, Dallas, TX 75220-3611 US

#5 Martijn van Oosterhout
kleptog@svana.org
In reply to: Larry Rosenman (#4)
Re: 2nd try @NetBSD/2.0 Alpha

On Tue, Oct 18, 2005 at 05:03:35PM -0500, Larry Rosenman wrote:

Doesn't appear to have a separate libc, HOWEVER, -lpthread may be screwing
us:

<snip>

If it is that, does it work if you compile postgres with -lpthread?
Sure, we don't use the functions but maybe it's a prerequisite to be
able to dlopen() thread libs.

Should be quicker to test, just rerun the final link command by hand
with the extra option...

--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.

#6 Andrew Dunstan
andrew@dunslane.net
In reply to: Larry Rosenman (#1)
Re: 2nd try @NetBSD/2.0 Alpha

I think in this case you need to try to install the PL manually and see
what happens - run psql, attach the debugger to the backend, and issue
"create language plperl ..."

Having it die at this stage is rather strange.

cheers

andrew

Larry Rosenman wrote:

Upped the stack to 8Mb. Now it dies in Plcheck.

#7 Larry Rosenman
ler@lerctr.org
In reply to: Martijn van Oosterhout (#5)
Re: 2nd try @NetBSD/2.0 Alpha

On Oct 18, 2005, at 5:11 PM, Martijn van Oosterhout wrote:

On Tue, Oct 18, 2005 at 05:03:35PM -0500, Larry Rosenman wrote:

Doesn't appear to have a separate libc, HOWEVER, -lpthread may be screwing
us:

<snip>

If it is that, does it work if you compile postgres with -lpthread?
Sure, we don't use the functions but maybe it's a prerequisite to be
able to dlopen() thread libs.

Should be quicker to test, just rerun the final link command by hand
with the extra option...

I added a LIBS += -lpthread to the end of src/makefiles/Makefile.netbsd
and got a LOOP on the make check :(

More ideas?

LER

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-351-4152 E-Mail: ler@lerctr.org
US Mail: 3535 Gaspar Drive, Dallas, TX 75220-3611

#8 Larry Rosenman
ler@lerctr.org
In reply to: Larry Rosenman (#7)
Re: 2nd try @NetBSD/2.0 Alpha

On Oct 18, 2005, at 8:49 PM, Larry Rosenman wrote:

On Oct 18, 2005, at 5:11 PM, Martijn van Oosterhout wrote:

On Tue, Oct 18, 2005 at 05:03:35PM -0500, Larry Rosenman wrote:

Doesn't appear to have a separate libc, HOWEVER, -lpthread may be screwing
us:

<snip>

If it is that, does it work if you compile postgres with -lpthread?
Sure, we don't use the functions but maybe it's a prerequisite to be
able to dlopen() thread libs.

Should be quicker to test, just rerun the final link command by hand
with the extra option...

I added a LIBS += -lpthread to the end of src/makefiles/Makefile.netbsd
and got a LOOP on the make check :(

More ideas?

LER

I've removed the --with-perl from the config for now, and am re-running
it yet again :)

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-351-4152 E-Mail: ler@lerctr.org
US Mail: 3535 Gaspar Drive, Dallas, TX 75220-3611

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Larry Rosenman (#7)
Re: 2nd try @NetBSD/2.0 Alpha

Larry Rosenman <ler@lerctr.org> writes:

I added a LIBS += -lpthread to the end of src/makefiles/Makefile.netbsd
and got a LOOP on the make check :(

Er ... define "LOOP"?

regards, tom lane

#10 Larry Rosenman
ler@lerctr.org
In reply to: Tom Lane (#9)
Re: 2nd try @NetBSD/2.0 Alpha

On Oct 18, 2005, at 9:39 PM, Tom Lane wrote:

Larry Rosenman <ler@lerctr.org> writes:

I added a LIBS += -lpthread to the end of src/makefiles/Makefile.netbsd
and got a LOOP on the make check :(

Er ... define "LOOP"?

postgres master process sitting with 98%+ cpu for >1hour and NO
progress being made.

I could not find a truss/strace binary on the box :(

regards, tom lane

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-351-4152 E-Mail: ler@lerctr.org
US Mail: 3535 Gaspar Drive, Dallas, TX 75220-3611

#11 Michael Fuhr
mike@fuhr.org
In reply to: Larry Rosenman (#10)
Re: 2nd try @NetBSD/2.0 Alpha

On Tue, Oct 18, 2005 at 09:41:21PM -0500, Larry Rosenman wrote:

I could not find a truss/strace binary on the box :(

In BSD land try ktrace.
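
A rough sketch of what that could look like for the spinning postgres process. The helper name trace_pid, the trace file path and the 10-second default are assumptions for illustration only; ktrace -p/-c/-f and kdump -f are the standard BSD options. If the process is looping purely in user space, the dump may show few or no system calls, which would itself be a hint.

```shell
# Trace a running process for a few seconds, then show the tail of the trace.
# Usage: trace_pid PID [SECONDS]
trace_pid() {
    pid=$1
    secs=${2:-10}
    trfile=/tmp/pg.ktrace                     # hypothetical trace file location

    ktrace -f "$trfile" -p "$pid" || return 1 # start tracing the process
    sleep "$secs"                             # let it run for a while
    ktrace -c -f "$trfile" -p "$pid"          # clear the trace points again
    kdump -f "$trfile" | tail -n 50           # decode and show the last entries
}

# Example (PID is a placeholder for the process sitting at 98% CPU):
#   trace_pid PID 10
```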

--
Michael Fuhr

#12 Michael Fuhr
mike@fuhr.org
In reply to: Michael Fuhr (#11)
Re: 2nd try @NetBSD/2.0 Alpha

On Tue, Oct 18, 2005 at 08:59:23PM -0600, Michael Fuhr wrote:

On Tue, Oct 18, 2005 at 09:41:21PM -0500, Larry Rosenman wrote:

I could not find a truss/strace binary on the box :(

In BSD land try ktrace.

...or attach with a debugger like gdb.

--
Michael Fuhr

#13 Larry Rosenman
ler@lerctr.org
In reply to: Michael Fuhr (#12)
Re: 2nd try @NetBSD/2.0 Alpha

On Oct 18, 2005, at 10:03 PM, Michael Fuhr wrote:

On Tue, Oct 18, 2005 at 08:59:23PM -0600, Michael Fuhr wrote:

On Tue, Oct 18, 2005 at 09:41:21PM -0500, Larry Rosenman wrote:

I could not find a truss/strace binary on the box :(

In BSD land try ktrace.

...or attach with a debugger like gdb.

d'oh. I go stupid occasionally :)

If someone wants me to, I can try that.

As for the build without perl, it dies in contribcheck. Logs in:
http://www.lerctr.org/~ler/alphacontribdeath.tar.gz

$ tar tzvf alphacontribdeath.tar.gz
drwxr-xr-x 2 ler users 0 Oct 18 22:00 lastrun-logs
-rw-r--r-- 1 ler users 22708 Oct 18 21:25 lastrun-logs/CVS.log
-rw-r--r-- 1 ler users 11453 Oct 18 21:29 lastrun-logs/configure.log
-rw-r--r-- 1 ler users 227987 Oct 18 21:29 lastrun-logs/config.log
-rw-r--r-- 1 ler users 154407 Oct 18 21:47 lastrun-logs/make.log
-rw-r--r-- 1 ler users 201363 Oct 18 21:50 lastrun-logs/check.log
-rw-r--r-- 1 ler users 40733 Oct 18 21:54 lastrun-logs/make-contrib.log
-rw-r--r-- 1 ler users 49358 Oct 18 21:54 lastrun-logs/make-install.log
-rw-r--r-- 1 ler users 1531 Oct 18 21:55 lastrun-logs/initdb.log
-rw-r--r-- 1 ler users 60 Oct 18 21:55 lastrun-logs/startdb-1.log
-rw-r--r-- 1 ler users 128491 Oct 18 21:57 lastrun-logs/install-check.log
-rw-r--r-- 1 ler users 65 Oct 18 21:57 lastrun-logs/stopdb-1.log
-rw-r--r-- 1 ler users 60 Oct 18 21:57 lastrun-logs/startdb-2.log
-rw-r--r-- 1 ler users 711 Oct 18 21:57 lastrun-logs/pl-install-check.log
-rw-r--r-- 1 ler users 64 Oct 18 21:57 lastrun-logs/stopdb-2.log
-rw-r--r-- 1 ler users 60 Oct 18 21:57 lastrun-logs/startdb-3.log
-rw-r--r-- 1 ler users 18425 Oct 18 21:57 lastrun-logs/install-contrib.log
-rw-r--r-- 1 ler users 41443 Oct 18 22:00 lastrun-logs/contrib-install-check.log
-rw-r--r-- 1 ler users 45714 Oct 18 22:00 lastrun-logs/web-txn.data
tar: ustar vol 1, 19 files, 962560 bytes read, 0 bytes written in 1 secs (962560 bytes/sec)
$

--
Michael Fuhr

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-351-4152 E-Mail: ler@lerctr.org
US Mail: 3535 Gaspar Drive, Dallas, TX 75220-3611

#14 Martijn van Oosterhout
kleptog@svana.org
In reply to: Larry Rosenman (#13)
Re: 2nd try @NetBSD/2.0 Alpha

On Tue, Oct 18, 2005 at 10:07:26PM -0500, Larry Rosenman wrote:

...or attach with a debugger like gdb.

d'oh. I go stupid occasionally :)

If someone wants me to, I can try that.

Yes, actually. See, it's dying in the seg test already with:

-- Open intervals
SELECT '0..'::seg AS seg;
! ERROR: floating-point exception
! DETAIL: An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
SELECT '0...'::seg AS seg;
! ERROR: floating-point exception
! DETAIL: An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.

You need to attach a debugger to find out where that error is actually
happening. Just start up the backend, connect to it, attach gdb to
the newly spawned backend and run that query by hand. Then you
should get the backtrace at SIGFPE.
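
Those steps could be sketched roughly as follows. The helper name sigfpe_gdb_cmds, the /tmp path, the database name and the postgres binary path are all placeholders I made up; pg_backend_pid() and gdb's -x option are real, but treat the whole thing as an outline rather than a tested recipe for this box.

```shell
# Print the gdb commands to run after attaching to the backend:
# stop on SIGFPE, let the backend continue, then dump a backtrace
# once it has stopped again.
sigfpe_gdb_cmds() {
    cat <<'EOF'
handle SIGFPE stop print
continue
bt
EOF
}

# Usage sketch (PID, database name and binary path are placeholders):
#   session 1:  psql contrib_regression
#               SELECT pg_backend_pid();        -- note the PID
#   session 2:  sigfpe_gdb_cmds > /tmp/sigfpe.gdb
#               gdb -x /tmp/sigfpe.gdb /path/to/postgres PID
#   session 1:  SELECT '0..'::seg AS seg;       -- should trigger the SIGFPE
```

The seg contrib module has to be installed in whichever database you connect to, otherwise the cast fails before reaching the interesting code.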

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.

#15 Larry Rosenman
ler@lerctr.org
In reply to: Martijn van Oosterhout (#14)
Re: 2nd try @NetBSD/2.0 Alpha

Martijn van Oosterhout wrote:

On Tue, Oct 18, 2005 at 10:07:26PM -0500, Larry Rosenman wrote:

...or attach with a debugger like gdb.

d'oh. I go stupid occasionally :)

If someone wants me to, I can try that.

Yes, actually. See, it's dying in the seg test already with:

-- Open intervals
SELECT '0..'::seg AS seg;
! ERROR: floating-point exception
! DETAIL: An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
SELECT '0...'::seg AS seg;
! ERROR: floating-point exception
! DETAIL: An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.

You need to attach a debugger to find out where that error is
actually happening. Just start up the backend, connect to it,
attach gdb to the newly spawned backend and run that query by
hand. Then you should get the backtrace at SIGFPE.

I don't have the time today (need to do some paying work). However,
if someone wants, I can pass an ID/PW along so that they
can look into it (or it can wait till the weekend).

LER

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: ler@lerctr.org
US Mail: 3535 Gaspar Drive, Dallas, TX 75220-3611 US