Re: Why is "osprey" dumping core in REL8_2 branch?

Started by Tom Lane almost 19 years ago, 3 messages
#1 Tom Lane
tgl@sss.pgh.pa.us

I wrote:

Rémi Zara <remi_zara@mac.com> writes:

(gdb) info locals
block = 0x4395000
chunk = 0x4395010
priorfree = 0x5395020
chunk_size = 16777216
blksize = 70864912
(gdb) p *block
$5 = {aset = 0x306d10, next = 0x0, freeptr = 0x5395020 <Address 0x5395020 out of bounds>, endptr = 0x5395020 <Address 0x5395020 out of bounds>}

Well, that's pretty dang interesting. If the end of the block is indeed
out of bounds as gdb claims, that'd explain why it crashes right here
(actually the crash would be induced by the preceding line of code,
where it tries to store a marker byte). But how can that be, unless
malloc is completely broken? And if it is, why's it only affecting the
8.2 branch? I'm confused.
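To make the numbers in the gdb output easier to read, here is a small arithmetic sketch. The constants are copied from the session quoted above; the function name `summarize` and the dictionary keys are invented for illustration. The reading that the printed `blksize` is unreliable is an inference from the arithmetic, not something stated in the thread.

```python
# Values copied from the gdb session quoted above.
BLOCK = 0x4395000               # block
CHUNK = 0x4395010               # chunk
ENDPTR = 0x5395020              # block->endptr (same as freeptr and priorfree)
CHUNK_SIZE = 16777216           # chunk_size, i.e. 16 MiB
BLKSIZE_AS_PRINTED = 70864912   # blksize as gdb displayed it


def summarize():
    """Relate the printed pointers to each other (hypothetical helper)."""
    span = ENDPTR - BLOCK
    return {
        # Distance from the start of the block to its recorded end.
        "span_bytes": span,
        # Whatever is left once the 16 MiB chunk is subtracted.
        "overhead_bytes": span - CHUNK_SIZE,
        # Offset of the chunk inside the block.
        "chunk_offset": CHUNK - BLOCK,
        # True if the printed blksize is numerically the chunk address.
        "blksize_equals_chunk_address": BLKSIZE_AS_PRINTED == CHUNK,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(key, value)
```

The recorded end of the block lies 16 MiB plus 32 bytes past its start, which is what one would expect for a single 16 MiB chunk plus headers. The printed `blksize`, on the other hand, is numerically identical to the chunk address, so gdb's view of that local is probably not trustworthy here.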

Whoa ... osprey just went green in the 8.2 branch, following what is
most surely an unrelated patch in vacuum.c. Can anyone explain that?
I distrust gift horses ...

regards, tom lane

#2 Rémi Zara
remi_zara@mac.com
In reply to: Tom Lane (#1)
1 attachment(s)

Hi,

I know the answer :)

I tried to find the patch that caused the failure, and while doing so
I rechecked a build that had previously succeeded: it now failed. So
this was an environment problem.

The solution was to change the ulimit for data segment size. I hadn't
thought of that, since I had originally enabled this configuration
because pg would not otherwise BUILD...
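For readers who want to inspect the same limit on their own machine, here is a minimal sketch using Python's standard `resource` module (Unix only). `RLIMIT_DATA` is the limit that the shell's `ulimit -d` reports. The helper names `data_segment_limits` and `describe` are invented for this example, and nothing here reflects the actual settings on osprey.

```python
import resource


def data_segment_limits():
    """Return the (soft, hard) data segment limits of this process, in bytes."""
    soft, hard = resource.getrlimit(resource.RLIMIT_DATA)
    return soft, hard


def describe(limit):
    """Render one limit value in a readable form (hypothetical helper)."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit} bytes"


if __name__ == "__main__":
    soft, hard = data_segment_limits()
    print("soft data segment limit:", describe(soft))
    print("hard data segment limit:", describe(hard))
```

A process can normally lower its limits, or raise the soft limit up to the hard limit, with `resource.setrlimit`; raising the hard limit usually requires extra privileges.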

Doesn't this mean that there is some place where the return value of
malloc is not checked for null?

Regards,

Rémi Zara

On 11 March 07 at 08:32, Tom Lane wrote:



Attachments:

smime.p7s (application/pkcs7-signature)
#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rémi Zara (#2)

Rémi Zara <remi_zara@mac.com> writes:

The solution was to change the ulimit for data segment size.

Oh really ...

Doesn't this mean that there is some place where the return value of
malloc is not checked for null?

You can see for yourself that the value *is* checked in the routine
that's at issue --- see line 520 in 8.2's aset.c. Also the gdb'ing
you did showed that a nonzero value had been returned.

I think what you're looking at is a platform-specific bug in malloc().

regards, tom lane