User functions and AIX

Started by D'Arcy J.M. Cain almost 25 years ago | 9 messages | hackers
#1 D'Arcy J.M. Cain
darcy@druid.net

IBM is trying to find the answer to this but I thought I would throw
this out here to see if anyone can help me. I am compiling a user
defined type on AIX and it fails when I try to use it. The type is
chkpass and it is in the contrib directory. It fails with a core dump
at line 88 in chkpass.c. The line reads as follows.

result = (chkpass *) palloc(sizeof(chkpass));

The top of the backtrace looks like this.

#0 0x0 in ?? () from (unknown load module)
#1 0xd1087a60 in chkpass_in (fcinfo=0x0) at chkpass.c:88
#2 0x10045cf4 in or_clause (clause=0x0) at clauses.c:211
#3 0x10075d68 in int82ge (fcinfo=0x1015cfc8) at int8.c:343
#4 0x1005909c in _readArrayRef () at readfuncs.c:924
#5 0x10059b68 in _readSeqScan () at readfuncs.c:600

It looks like the dynamically loaded object (chkpass.so) can't determine
the address of palloc() from the parent. I assume I need a flag for the
compile either on the main build to export the addresses or on the build
of chkpass to tell it where to look up the addresses. Anyone been
through this that might be able to shed some light?

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: D'Arcy J.M. Cain (#1)
Re: User functions and AIX

darcy@druid.net (D'Arcy J.M. Cain) writes:

> The top of the backtrace looks like this.
>
> #0 0x0 in ?? () from (unknown load module)
> #1 0xd1087a60 in chkpass_in (fcinfo=0x0) at chkpass.c:88
> #2 0x10045cf4 in or_clause (clause=0x0) at clauses.c:211
> #3 0x10075d68 in int82ge (fcinfo=0x1015cfc8) at int8.c:343
> #4 0x1005909c in _readArrayRef () at readfuncs.c:924
> #5 0x10059b68 in _readSeqScan () at readfuncs.c:600

I don't believe a word of that backtrace, and neither should you.
The alleged call arcs at levels below #1 do not exist in the code.
Ergo, I doubt the top two levels can be trusted either.

regards, tom lane

#3 D'Arcy J.M. Cain
darcy@druid.net
In reply to: Tom Lane (#2)
Re: User functions and AIX

Thus spake Tom Lane

> darcy@druid.net (D'Arcy J.M. Cain) writes:
>
>> The top of the backtrace looks like this.
>>
>> #0 0x0 in ?? () from (unknown load module)
>> #1 0xd1087a60 in chkpass_in (fcinfo=0x0) at chkpass.c:88
>> #2 0x10045cf4 in or_clause (clause=0x0) at clauses.c:211
>> #3 0x10075d68 in int82ge (fcinfo=0x1015cfc8) at int8.c:343
>> #4 0x1005909c in _readArrayRef () at readfuncs.c:924
>> #5 0x10059b68 in _readSeqScan () at readfuncs.c:600
>
> I don't believe a word of that backtrace, and neither should you.
> The alleged call arcs at levels below #1 do not exist in the code.
> Ergo, I doubt the top two levels can be trusted either.

Can you clarify? I see or_clause takes a clause arg and I assumed that
the fcinfo is hidden in the macro. I don't understand how the arg for
chkpass_in can be NULL. I'm also not sure why these functions are involved
in reading the chkpass type.

Hmm. I just rebooted and reran the test (SELECT 'hello'::chkpass) and
it gave me a different stacktrace. It looks like this.

#0 0x0 in ?? () from (unknown load module)
#1 0xd1085a60 in chkpass_in (fcinfo=0x0) at chkpass.c:88
#2 0x1004b874 in OidFunctionCall3 (functionId=269952520, arg1=269952532,
    arg2=269952540, arg3=269952548) at fmgr.c:1136
#3 0x1007f350 in stringTypeDatum (tp=0x10172694, string=0x101726a0 "pendant",
    atttypmod=269952680) at parse_type.c:181
#4 0x10060630 in parser_typecast_constant (expr=0x10172794,
    typename=0x101727a0) at parse_expr.c:876
#5 0x10061188 in transformExpr (pstate=0x10172910, expr=0x10172920,
    precedence=269953332) at parse_expr.c:118
#6 0x10076f28 in transformTargetEntry (pstate=0x258, node=0x5c,
    expr=0x2ff1df70, colname=0x101729f4 "inner", resjunk=16 '\020')
    at parse_target.c:56
#7 0x10077198 in transformTargetList (pstate=0x10172ab8,
    targetlist=0x10172ac0) at parse_target.c:158
#8 0x10093c10 in transformSelectStmt (pstate=0x10172b80, stmt=0x10172b88)
    at analyze.c:1835
#9 0x1009497c in transformStmt (pstate=0x20000890, parseTree=0x2001f43c)
    at analyze.c:226
#10 0x10094ca4 in parse_analyze (parseTree=0x100195f8,
    parentParseState=0x200008a4) at analyze.c:86

I still can't follow the logic through the code. And chkpass_in is still
being called with a null pointer according to this.

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: D'Arcy J.M. Cain (#3)
Re: User functions and AIX

darcy@druid.net (D'Arcy J.M. Cain) writes:

> I'm also not sure why these functions are involved
> in reading the chkpass type.

Precisely my point: they're not. That backtrace is false data.

> Hmm. I just rebooted and reran the test (SELECT 'hello'::chkpass) and
> it gave me a different stacktrace. It looks like this.
>
> #0 0x0 in ?? () from (unknown load module)
> #1 0xd1085a60 in chkpass_in (fcinfo=0x0) at chkpass.c:88
> #2 0x1004b874 in OidFunctionCall3 (functionId=269952520, arg1=269952532,
>     arg2=269952540, arg3=269952548) at fmgr.c:1136
> #3 0x1007f350 in stringTypeDatum (tp=0x10172694, string=0x101726a0 "pendant",
>     atttypmod=269952680) at parse_type.c:181
> #4 0x10060630 in parser_typecast_constant (expr=0x10172794,
>     typename=0x101727a0) at parse_expr.c:876

This one I believe to the extent of the series of function calls, but
it's still giving you wrong info about the passed parameters, which
is pretty common if you compiled at -O2 or higher. Try recompiling with
"-O0 -g" if you need trustworthy parameter info from the backtrace.

regards, tom lane

#5 D'Arcy J.M. Cain
darcy@druid.net
In reply to: Tom Lane (#4)
Re: User functions and AIX

Thus spake Tom Lane

> darcy@druid.net (D'Arcy J.M. Cain) writes:
>
>> I'm also not sure why these functions are involved
>> in reading the chkpass type.
>
> Precisely my point: they're not. That backtrace is false data.
>
>> Hmm. I just rebooted and reran the test (SELECT 'hello'::chkpass) and
>> it gave me a different stacktrace. It looks like this.
>>
>> #0 0x0 in ?? () from (unknown load module)
>> #1 0xd1085a60 in chkpass_in (fcinfo=0x0) at chkpass.c:88
>> #2 0x1004b874 in OidFunctionCall3 (functionId=269952520, arg1=269952532,
>>     arg2=269952540, arg3=269952548) at fmgr.c:1136
>> #3 0x1007f350 in stringTypeDatum (tp=0x10172694, string=0x101726a0 "pendant",
>>     atttypmod=269952680) at parse_type.c:181
>> #4 0x10060630 in parser_typecast_constant (expr=0x10172794,
>>     typename=0x101727a0) at parse_expr.c:876
>
> This one I believe to the extent of the series of function calls, but
> it's still giving you wrong info about the passed parameters, which
> is pretty common if you compiled at -O2 or higher. Try recompiling with
> "-O0 -g" if you need trustworthy parameter info from the backtrace.

Is that an AIX thing? I generally get reasonable traces on NetBSD.

Anyway, I took your advice and now I get this.

#0 0x0 in ?? () from (unknown load module)
#1 0xd1085aac in chkpass_in (fcinfo=0x2ff1dcb8) at chkpass.c:88
#2 0x1004b874 in OidFunctionCall3 (functionId=269952520, arg1=269952532,
    arg2=269952540, arg3=269952548) at fmgr.c:1136
#3 0x1007f350 in stringTypeDatum (tp=0x10172694, string=0x101726a0 "pendant",
    atttypmod=269952680) at parse_type.c:181
#4 0x10060630 in parser_typecast_constant (expr=0x10172794,
    typename=0x101727a0) at parse_expr.c:876
#5 0x10061188 in transformExpr (pstate=0x10172910, expr=0x10172920,
    precedence=269953332) at parse_expr.c:118
#6 0x10076f28 in transformTargetEntry (pstate=0x258, node=0x5c,
    expr=0x2ff1df70, colname=0x101729f4 "inner", resjunk=16 '\020')
    at parse_target.c:56
#7 0x10077198 in transformTargetList (pstate=0x10172ab8,
    targetlist=0x10172ac0) at parse_target.c:158
#8 0x10093c10 in transformSelectStmt (pstate=0x10172b80, stmt=0x10172b88)
    at analyze.c:1835
#9 0x1009497c in transformStmt (pstate=0x20000890, parseTree=0x2001f43c)
    at analyze.c:226
#10 0x10094ca4 in parse_analyze (parseTree=0x100195f8,
    parentParseState=0x200008a4) at analyze.c:86

Looking better. It still seems to be the same error I saw to start with
though. It seems that the loaded dynamic object can't find the address
for palloc() and so jumps to 0. I'm sure that it is an AIX thing but
even IBM can't seem to find the problem.

#6 Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: D'Arcy J.M. Cain (#5)
AW: User functions and AIX

> IBM is trying to find the answer to this but I thought I would throw
> this out here to see if anyone can help me. I am compiling a user
> defined type on AIX and it fails when I try to use it. The type is
> chkpass and it is in the contrib directory. It fails with a core dump
> at line 88 in chkpass.c. The line reads as follows.
>
> result = (chkpass *) palloc(sizeof(chkpass));
>
> The top of the backtrace looks like this.
>
> #0 0x0 in ?? () from (unknown load module)
> #1 0xd1087a60 in chkpass_in (fcinfo=0x0) at chkpass.c:88
> #2 0x10045cf4 in or_clause (clause=0x0) at clauses.c:211
> #3 0x10075d68 in int82ge (fcinfo=0x1015cfc8) at int8.c:343
> #4 0x1005909c in _readArrayRef () at readfuncs.c:924
> #5 0x10059b68 in _readSeqScan () at readfuncs.c:600
>
> It looks like the dynamically loaded object (chkpass.so) can't determine
> the address of palloc() from the parent. I assume I need a flag for the
> compile either on the main build to export the addresses or on the build
> of chkpass to tell it where to look up the addresses. Anyone been
> through this that might be able to shed some light?

Tell me your link line, OS and compiler version.
And have you forgotten to include -bI:postgres.imp ?
In general it is imho a good idea to copy the appropriate
compile and link flags from the regression test, that compiles
shared libs in .../contrib.
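
As a rough illustration of what -bI: supplies: an AIX import file is a plain
text list of the symbols a module expects the loading program to provide.
The shell sketch below writes a hypothetical example.imp (a stand-in, not the
real postgres.imp that the PostgreSQL build generates) and checks whether
palloc is listed in it. The file name, its contents, the bare "#!" header
line, and the has_symbol helper are all assumptions made for illustration;
the real file's header and symbol list may differ.

```shell
#!/bin/sh
# Hypothetical stand-in for postgres.imp. The real file is generated by the
# PostgreSQL build; its contents are not shown in this thread.
# Assumption: an import file starts with a "#!" line, then one symbol per line.
cat > example.imp <<'EOF'
#!
palloc
pfree
EOF

# has_symbol FILE SYMBOL: succeed if SYMBOL sits on a line of its own in FILE.
has_symbol() {
    grep -qxF -- "$2" "$1"
}

if has_symbol example.imp palloc; then
    echo "palloc is listed for import"
else
    echo "palloc is missing from the import file"
fi
```

Running the same check against the installed postgres.imp is a quick way to
confirm that the symbol the module needs is actually offered by the backend,
assuming that file also lists one symbol per line.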

Andreas

#7 D'Arcy J.M. Cain
darcy@druid.net
In reply to: Zeugswetter Andreas SB (#6)
Re: AW: User functions and AIX

Thus spake Zeugswetter Andreas SB

>> IBM is trying to find the answer to this but I thought I would throw

...

> Tell me your link line, OS and compiler version.
> And have you forgotten to include -bI:postgres.imp ?

Bingo! I can't believe that IBM has been wrestling with this for a week.
Part of the reason we are thinking of going with IBM is for the support.

Here is my Makefile now. I'm not sure about that -lc there as I get duplicate
symbol warnings but it appears to work fine.

#
# Local PostgreSQL types
# Written by D'Arcy J.M. Cain (darcy@druid.net)
#
# $Id: Makefile,v 1.1 2000/06/23 17:03:40 root Exp $

PGDIR = /usr/local/pgsql
PGINCDIR = /home/darcy/postgresql-7.1/src/include
PGLIBDIR = /usr/local/pgsql/lib
CFLAGS = -g -O0 -pipe -ansi -Wall -Wshadow -Wpointer-arith -Wcast-qual \
	-I ${PGINCDIR} -L ${PGLIBDIR} \
	-Wwrite-strings -Wmissing-prototypes
OBJS = chkpass.o
SH_OBJS = chkpass.so

.SUFFIXES: .so

.o.so:
	ld -G -o $@ $< -L ${PGLIBDIR} -bI:/usr/local/pgsql/lib/postgres.imp \
		-bexpall -bnoentry -lc

.c.o:
	gcc ${CFLAGS} -c $<

all: ${SH_OBJS}

install: all
	cp ${SH_OBJS} ${PGDIR}/modules
	sed "s+%%PGDIR%%+${PGDIR}+g" < chkpass.sql > ${PGDIR}/modules/chkpass.sql

clean:
	rm -f ${OBJS} ${SH_OBJS}

#8 Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: D'Arcy J.M. Cain (#7)
AW: AW: User functions and AIX

>>> IBM is trying to find the answer to this but I thought I would throw ...
>>
>> Tell me your link line, OS and compiler version.
>> And have you forgotten to include -bI:postgres.imp ?
>
> Bingo! I can't believe that IBM has been wrestling with this for a week.
> Part of the reason we are thinking of going with IBM is for the support.

Shared libs are obviously not their strong side :-)
Basically we are very happy with their RS6000's and AIX though.

> Here is my Makefile now. I'm not sure about that -lc there
> as I get duplicate symbol warnings but it appears to work fine.

they don't matter

> CFLAGS = -g -O0 -pipe -ansi -Wall -Wshadow -Wpointer-arith

gcc and not xlc :-) actually xlc produces faster code, but I don't think that makes a
noticeable difference.

> .o.so:
>     ld -G -o $@ $< -L ${PGLIBDIR} -bI:/usr/local/pgsql/lib/postgres.imp \
>         -bexpall -bnoentry -lc

Always use the compiler for linking instead of ld:
gcc -Wl,-H512 -Wl,-bM:SRE -o $@ $< -L ${PGLIBDIR} -bI:/usr/local/pgsql/lib/postgres.imp \
-bexpall -bnoentry

You are not allowed to leave anything unresolved, thus do not use -G, or you won't notice
unresolved externals (-G includes -berok which you don't want at all).
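
One caveat about the gcc command above: it gives -bI:, -bexpall and -bnoentry
to gcc without a -Wl, prefix, while -H512 and -bM:SRE do have one. The
documented way to hand an option to the linker through gcc is -Wl,option, so
a more cautious form prefixes every linker option. The shell sketch below
only assembles and prints such a command; it attempts the link solely when
run on AIX with chkpass.o present. The variable names are illustrative, the
path comes from the Makefile earlier in this thread, and whether this exact
flag set suits a given AIX and gcc version is an assumption, not something
confirmed here.

```shell
#!/bin/sh
PGLIBDIR=/usr/local/pgsql/lib    # path taken from the Makefile in this thread

# Prefix each AIX linker option with -Wl, so gcc passes it on to ld.
# The options themselves are the ones quoted in this thread; -G is left out.
LDOPTS=""
for opt in -H512 -bM:SRE "-bI:${PGLIBDIR}/postgres.imp" -bexpall -bnoentry; do
    LDOPTS="${LDOPTS} -Wl,${opt}"
done

LINK_CMD="gcc${LDOPTS} -o chkpass.so chkpass.o"
echo "$LINK_CMD"

# Only attempt the link on AIX, and only if the object file exists.
if [ "$(uname -s)" = "AIX" ] && [ -f chkpass.o ]; then
    $LINK_CMD
fi
```

In a Makefile rule the same options would go on the recipe line, with $@ and
$< in place of the fixed file names.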

Andreas

#9 D'Arcy J.M. Cain
darcy@druid.net
In reply to: Zeugswetter Andreas SB (#8)
Re: AW: AW: User functions and AIX

Thus spake Zeugswetter Andreas SB

>> Bingo! I can't believe that IBM has been wrestling with this for a week.
>> Part of the reason we are thinking of going with IBM is for the support.
>
> Shared libs are obviously not their strong side :-)

Tell me about it.

> Basically we are very happy with their RS6000's and AIX though.

With PostgreSQL? See below.

>> Here is my Makefile now. I'm not sure about that -lc there
>> as I get duplicate symbol warnings but it appears to work fine.
>
> they don't matter
>
>> CFLAGS = -g -O0 -pipe -ansi -Wall -Wshadow -Wpointer-arith
>
> gcc and not xlc :-) actually xlc produces faster code, but I don't think that makes a
> noticeable difference.

Hmm. Should I get rid of gcc and build PostgreSQL with xlc do you think?
Some people have told me that gcc is actually faster.

>> .o.so:
>>     ld -G -o $@ $< -L ${PGLIBDIR} -bI:/usr/local/pgsql/lib/postgres.imp \
>>         -bexpall -bnoentry -lc
>
> Always use the compiler for linking instead of ld:
> gcc -Wl,-H512 -Wl,-bM:SRE -o $@ $< -L ${PGLIBDIR} -bI:/usr/local/pgsql/lib/postgres.imp \
>     -bexpall -bnoentry

I'll do that. Thanks.

> You are not allowed to leave anything unresolved, thus do not use -G, or you won't notice
> unresolved externals (-G includes -berok which you don't want at all).

I wasn't sure if I needed that. I will remove it.

OK, so I built it and loaded my database. I tried to load a very big
table (383969 rows) and the copy failed because it was too big. I split
the input into smaller chunks but when I ran it I got the following error.

ERROR: copy: line 1, Memory exhausted in AllocSetAlloc(858864139)

There is no way that I could have used that much memory in the first row.
I dropped the table and recreated it and the load worked. Although it
works now I still feel a little uneasy.
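
A small aside on the number in that error, offered only as an observation:
written in hexadecimal, 858864139 is 0x33313a0b, and its first three bytes
(0x33, 0x31, 0x3a) are the ASCII codes for '3', '1' and ':'. That pattern
would be consistent with text being read where a length was expected, but
nothing in this thread confirms it, so it is a guess and not a diagnosis.
The shell sketch below just does the conversion; the variable names are
illustrative.

```shell
#!/bin/sh
size=858864139                  # size reported in the AllocSetAlloc error
hex=$(printf '%08x' "$size")    # the same value as eight hex digits
echo "0x$hex"                   # prints 0x33313a0b
```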

Thanks for your help.
