Re: [HACKERS] Re: Bug#41277: postgresql 6.5.1-3 + sparc (sun4u) == nasty nasty crashes

Started by Tom Lane over 26 years ago; 3 messages
#1 Tom Lane
tgl@sss.pgh.pa.us

"Oliver Elphick" <olly@lfix.co.uk> writes:

Yes. On followup, I am getting intermittent hard crashes when running
regress.sh or doing any operation with postgresql. Obviously, this is
more on the level of a sparc64 kernel problem, even, than a purely
postgres problem -- after all, no user process should be able to take
out the system this way.

Yipes...

Can postgresql developers tell from this what routine we are in when the
crash occurs? I suppose that log output is buffered; where can we turn
off buffering so that all possible output is saved to disk before the
crash?

The log is not nearly detailed enough to tell what routine we're in,
even if there weren't the buffering problem. Also, given that this is
a kernel crash, I'm not sure I'd assume that even fsync() after every
line of output would ensure that the last line made it to disk.

What you really want is a truss or strace log of kernel calls, anyhow,
but there's still the problem of getting it out to disk before the
crash. Better find a kernel-debugging expert to ask for advice...

regards, tom lane
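
For readers unfamiliar with the "fsync() after every line" idea discussed
above, here is a minimal sketch in C of what such logging could look like.
The helper names open_synced_log and log_line_synced are hypothetical and
are not part of PostgreSQL; the sketch assumes a POSIX system. As the
message above cautions, even this may not get the last line onto disk if
the kernel itself crashes.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: open (or create) a log file in append mode. */
int open_synced_log(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
}

/* Write the whole buffer, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    size_t off = 0;

    while (off < len) {
        ssize_t n = write(fd, buf + off, len - off);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error, errno is set */
        }
        off += (size_t) n;
    }
    return 0;
}

/*
 * Hypothetical helper: append one line (a newline is added) and then
 * call fsync() so the kernel is asked to push it to stable storage
 * before we return. Returns 0 on success, -1 on failure.
 */
int log_line_synced(int fd, const char *line)
{
    if (write_all(fd, line, strlen(line)) < 0)
        return -1;
    if (write_all(fd, "\n", 1) < 0)
        return -1;
    return fsync(fd);
}
```

A caller would open the file once with open_synced_log() and then call
log_line_synced() for each message. This trades a great deal of speed for
durability, which is usually acceptable only while chasing a crash.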

#2 Michael Alan Dorman
mdorman-pgsql.hackers@debian.org
In reply to: Tom Lane (#1)

Tom Lane <tgl@sss.pgh.pa.us> writes:

What you really want is a truss or strace log of kernel calls, anyhow,
but there's still the problem of getting it out to disk before the
crash. Better find a kernel-debugging expert to ask for advice...

Serial terminal, or printer or some such hooked up to a serial port.

Mike.

#3 Adam Di Carlo
adam@onshore.com
In reply to: Tom Lane (#1)

Can postgresql developers tell from this what routine we are in when the
crash occurs? I suppose that log output is buffered; where can we turn
off buffering so that all possible output is saved to disk before the
crash?

The log is not nearly detailed enough to tell what routine we're in,
even if there weren't the buffering problem. Also, given that this is
a kernel crash, I'm not sure I'd assume that even fsync() after every
line of output would ensure that the last line made it to disk.

What you really want is a truss or strace log of kernel calls, anyhow,
but there's still the problem of getting it out to disk before the
crash. Better find a kernel-debugging expert to ask for advice...

Hopefully someone from the sparc or sparc64 team at Debian can look
into this. I am going on business travel for 4 days so will be away
from any Debian/SPARC machines for a while.

These are the questions which need to be answered:

* Are other people running Debian SPARC seeing the problem when using
the recipe I mentioned in my previous email?

* Is it 2.2.9 specific? Sun4u specific?

* get strace output as Tom suggests

* shouldn't we notify the Sparc/Linux folks?

--
.....Adam Di Carlo....adam@onShore.com.....<URL:http://www.onShore.com/>