osprey dumped core on 8.2
Osprey is a NetBSD machine running on m68k
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=osprey&dt=2007-02-22%2023:00:18
It dumped core running VACUUM:
--- 1,5 ----
VACUUM;
! server closed the connection unexpectedly
! This probably means the server terminated abnormally
! before or while processing the request.
! connection to server was lost
The stack trace report looks incomplete:
================== stack trace: pgsql.27009/src/test/regress/tmp_check/data/postgres.core ==================
Core was generated by `postgres'.
Program terminated with signal 11, Segmentation fault.
#0 0x001f74d6 in AllocSetAlloc (context=0x307d10, size=16777212) at aset.c:546
546 if (set->blocks != NULL)
It's missing the "bt" part.
I don't understand how this can happen, given that "set" cannot be NULL
at this point.
--
Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4
"You can choose the color of your car -- as long as it's black."
(Henry Ford)
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Osprey is a NetBSD machine running on m68k
Yeah, it's been failing consistently on the 8.2 branch for a while, but
not on either 8.1 or HEAD, which is awfully strange.
> Program terminated with signal 11, Segmentation fault.
> #0 0x001f74d6 in AllocSetAlloc (context=0x307d10, size=16777212) at aset.c:546
> 546 if (set->blocks != NULL)
> I don't understand how this can happen, given that "set" cannot be NULL
> at this point.
I talked to Remi about this last month, and we concluded that the core
dump is probably really at the line just prior, where it's trying to
stick a marker at the end of the used space:
((char *) AllocChunkGetPointer(chunk))[size] = 0x7E;
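For anyone who has not seen this kind of marker before, here is a minimal sketch of the idea, assuming a toy header layout. ToyChunk, toy_alloc, toy_sentinel_ok, toy_free and toy_demo are hypothetical names made up for illustration and are not the aset.c code; only the 0x7E value and the store at index [size] are taken from the line above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_SENTINEL 0x7E   /* same marker value as the aset.c line above */

/* Hypothetical toy header; NOT PostgreSQL's real chunk header. */
typedef struct ToyChunk
{
    size_t size;            /* usable bytes that follow the header */
    size_t requested_size;  /* bytes the caller actually asked for */
} ToyChunk;

/* Allocate `rounded` usable bytes for a request of `requested` bytes. */
void *
toy_alloc(size_t requested, size_t rounded)
{
    ToyChunk *chunk;
    char     *data;

    if (requested > rounded)
        return NULL;
    chunk = malloc(sizeof(ToyChunk) + rounded);
    if (chunk == NULL)
        return NULL;
    chunk->size = rounded;
    chunk->requested_size = requested;
    data = (char *) (chunk + 1);
    /* The store lands just past the requested bytes, near the far end. */
    if (requested < rounded)
        data[requested] = TOY_SENTINEL;
    return data;
}

/* Return 1 if the marker is intact (or there was no room for one). */
int
toy_sentinel_ok(const void *ptr)
{
    const ToyChunk *chunk = (const ToyChunk *) ptr - 1;

    if (chunk->requested_size >= chunk->size)
        return 1;
    return ((const unsigned char *) ptr)[chunk->requested_size] == TOY_SENTINEL;
}

/* Release a pointer previously returned by toy_alloc. */
void
toy_free(void *ptr)
{
    if (ptr != NULL)
        free((ToyChunk *) ptr - 1);
}

/* Demo: in-bounds writes keep the marker, a one-byte overrun destroys it. */
int
toy_demo(void)
{
    char *p = toy_alloc(12, 16);
    int   ok_before;
    int   ok_after;

    if (p == NULL)
        return -1;
    memset(p, 0, 12);           /* stays within the 12 requested bytes */
    ok_before = toy_sentinel_ok(p);
    p[12] = 0;                  /* one byte past the request, still allocated */
    ok_after = toy_sentinel_ok(p);
    toy_free(p);
    return ok_before == 1 && ok_after == 0;
}
```

The relevant point for this crash is that the marker store may be the first write that touches the tail of a freshly malloc'ed block, so if the memory at the far end is not really there, this is a plausible place for the fault to surface.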
But neither of us could see how that could happen unless malloc is
outright broken. Remi did some gdb'ing that seemed to indicate
that malloc had failed to provide a block as large as it claimed:
: Rémi Zara <remi_zara@mac.com> writes:
: > (gdb) info locals
: > block = 0x4395000
: > chunk = 0x4395010
: > priorfree = 0x5395020
: > chunk_size = 16777216
: > blksize = 70864912
: > (gdb) p *block
: > $5 = {aset = 0x306d10, next = 0x0, freeptr = 0x5395020 <Address 0x5395020 out of bounds>, endptr = 0x5395020 <Address 0x5395020 out of bounds>}
:
: Well, that's pretty dang interesting. If the end of the block is indeed
: out of bounds as gdb claims, that'd explain why it crashes right here
: (actually the crash would be induced by the preceding line of code,
: where it tries to store a marker byte). But how can that be, unless
: malloc is completely broken? And if it is, why's it only affecting the
: 8.2 branch? I'm confused.
and it kinda tailed off there ...
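
As a sanity check on the numbers in that gdb session, the sketch below redoes the address arithmetic. It uses only the values printed above; the function names are hypothetical, and the chunk header size is inferred on the assumption that the single chunk fills the block (which freeptr == endptr suggests), not read from aset.c.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the gdb output quoted above. */
#define GDB_BLOCK      UINT32_C(0x4395000)
#define GDB_CHUNK      UINT32_C(0x4395010)
#define GDB_ENDPTR     UINT32_C(0x5395020)
#define GDB_CHUNK_SIZE UINT32_C(16777216)
#define GDB_REQUEST    UINT32_C(16777212)   /* size argument in frame #0 */
#define GDB_BLKSIZE    UINT32_C(70864912)   /* blksize as printed by gdb */

/* Bytes between the start of the block and the chunk header. */
uint32_t
block_header_bytes(void)
{
    return GDB_CHUNK - GDB_BLOCK;
}

/*
 * Inferred chunk header size. ASSUMPTION: the one chunk occupies the
 * whole block, so (endptr - chunk) = header + chunk_size.
 */
uint32_t
chunk_header_bytes(void)
{
    return (GDB_ENDPTR - GDB_CHUNK) - GDB_CHUNK_SIZE;
}

/* Address the marker byte would be stored at: data start + requested size. */
uint32_t
marker_address(void)
{
    return GDB_CHUNK + chunk_header_bytes() + GDB_REQUEST;
}

/* Is the marker byte inside [block, endptr), i.e. inside what malloc claimed? */
int
marker_inside_block(void)
{
    uint32_t addr = marker_address();

    return addr >= GDB_BLOCK && addr < GDB_ENDPTR;
}

/* Does the printed blksize merely repeat the chunk address in decimal? */
int
blksize_equals_chunk_address(void)
{
    return GDB_BLKSIZE == GDB_CHUNK;
}
```

Under those assumptions both headers come out at 16 bytes and the marker byte lands at 0x539501c, four bytes short of endptr. So the store is inside the block malloc claimed to return, which fits the reading that the block's tail is not actually mapped. Note also that the printed blksize, 70864912, is exactly 0x4395010, the chunk address; that local may simply be stale in the core dump, though that is only a guess.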
regards, tom lane