porting question: funky uid names?
Hi pgsql-hackers,
I'm currently porting 7.0.3 to the HP MPE/iX OS to join my other ports of
Apache, BIND, sendmail, Perl, and others. I'm at the point where I'm trying to
run the "make runcheck" regression tests, and I've just run into a problem
where I need to seek the advice of pgsql-hackers.
MPE is a proprietary OS with a POSIX layer on top. The concept of POSIX uids
and gids has been mapped to the concept of MPE usernames and MPE accountnames.
An example MPE username would be "MGR.BIXBY", and if you do a POSIX
getpwuid(getuid()), the contents of pw_name will be the same "MGR.BIXBY".
The fact that pw_name contains a period on MPE has been confusing to some
previous ports I've done, and it now appears PostgreSQL is being confused too.
Make runcheck is dying in the initdb phase:
Creating global relations in /blah/blah/blah
ERROR: pg_atoi: error in "BIXBY": can't parse "BIXBY"
ERROR: pg_atoi: error in "BIXBY": can't parse "BIXBY"
syntax error 25 : -> .
I'm guessing that something tried to parse "MGR.BIXBY", saw the decimal point
character and passed the string to pg_atoi() thinking it's a number instead of
a name. This seems like a really bad omen hinting at trouble on a fundamental
level.
What are my options here?
1) I'm screwed; go try porting MySQL instead. ;-)
2) Somehow modify username parsing to be tolerant of the "." character? I was
able to do this when I ported sendmail. Where should I be looking in the
PostgreSQL source? Is this going to require language grammar changes?
3) Always specify numeric uids instead of user names. Is this even possible?
Your advice will be greatly appreciated. MPE users are currently whining on
their mailing list about the lack of standard databases for the platform, and I
wanted to surprise them by releasing a PostgreSQL port.
Thanks!
--
mark@bixby.org
Remainder of .sig suppressed to conserve scarce California electrons...
Mark Bixby <mark@bixby.org> writes:
MPE is a proprietary OS with a POSIX layer on top. The concept of
POSIX uids and gids has been mapped to the concept of MPE usernames
and MPE accountnames. An example MPE username would be "MGR.BIXBY",
and if you do a POSIX getpwuid(getuid()), the contents of pw_name will
be the same "MGR.BIXBY".
Hm. And what is returned in pw_uid?
I think you are getting burnt by initdb's attempt to assign the postgres
superuser's numeric ID to be the same as the Unix userid number of the
user running initdb. Look at the uses of pg_id in the initdb script,
and experiment with running pg_id by hand to see what it produces.
A quick and dirty experiment would be to run "initdb -i 42" (or
whatever) to override the result of pg_id. If that succeeds, the
real answer may be that pg_id needs a patch to behave reasonably on MPE.
Let us know...
regards, tom lane
Tom Lane wrote:
Mark Bixby <mark@bixby.org> writes:
MPE is a proprietary OS with a POSIX layer on top. The concept of
POSIX uids and gids has been mapped to the concept of MPE usernames
and MPE accountnames. An example MPE username would be "MGR.BIXBY",
and if you do a POSIX getpwuid(getuid()), the contents of pw_name will
be the same "MGR.BIXBY".
Hm. And what is returned in pw_uid?
A valid numeric uid.
I think you are getting burnt by initdb's attempt to assign the postgres
superuser's numeric ID to be the same as the Unix userid number of the
user running initdb. Look at the uses of pg_id in the initdb script,
and experiment with running pg_id by hand to see what it produces.
pg_id without parameters returns uid=484(MGR.BIXBY), which matches what I get
from MPE's native id command.
The pg_id -n and -u options behave as expected.
A quick and dirty experiment would be to run "initdb -i 42" (or
whatever) to override the result of pg_id. If that succeeds, the
real answer may be that pg_id needs a patch to behave reasonably on MPE.
I just hacked src/test/regress/run_check.sh to invoke initdb with --show. The
user name/id is behaving "correctly" for an MPE machine:
SUPERUSERNAME: MGR.BIXBY
SUPERUSERID: 484
The initdb -i option will only override the SUPERUSERID, but it's already
correct.
--
mark@bixby.org
Mark Bixby <mark@bixby.org> writes:
I just hacked src/test/regress/run_check.sh to invoke initdb with
--show. The user name/id is behaving "correctly" for an MPE machine:
SUPERUSERNAME: MGR.BIXBY
SUPERUSERID: 484
Okay, so much for that theory.
Can you set a breakpoint at elog() and provide a stack backtrace so we
can see where this is happening? I can't think where else in the code
might be affected, but obviously the problem is somewhere else...
regards, tom lane
Mark Bixby writes:
Creating global relations in /blah/blah/blah
ERROR: pg_atoi: error in "BIXBY": can't parse "BIXBY"
ERROR: pg_atoi: error in "BIXBY": can't parse "BIXBY"
syntax error 25 : -> .
I'm curious about that last line. Is that the shell complaining?
The offending command seems to be
insert OID = 0 ( POSTGRES PGUID t t t t _null_ _null_ )
in the file global1.bki.source. (This is the file that creates the global
relations.) The POSTGRES and PGUID quantities are substituted when initdb
runs:
cat "$GLOBAL" \
| sed -e "s/POSTGRES/$POSTGRES_SUPERUSERNAME/g" \
-e "s/PGUID/$POSTGRES_SUPERUSERID/g" \
| "$PGPATH"/postgres $BACKENDARGS template1
For some reason the line probably ends up being
insert OID = 0 ( MGR BIXBY 484 t t t t _null_ _null_ )
^
which causes the observed failure to parse BIXBY as user id. This brings
us back to why the dot disappears, which seems to be related to the error
message
syntax error 25 : -> .
^^^
Can you try using a different sed command (e.g., GNU sed)?
--
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
Peter Eisentraut <peter_e@gmx.net> writes:
cat "$GLOBAL" \
| sed -e "s/POSTGRES/$POSTGRES_SUPERUSERNAME/g" \
-e "s/PGUID/$POSTGRES_SUPERUSERID/g" \
| "$PGPATH"/postgres $BACKENDARGS template1
For some reason the line probably ends up being
insert OID = 0 ( MGR BIXBY 484 t t t t _null_ _null_ )
^
which causes the observed failure to parse BIXBY as user id.
Good thought. Just looking at this, I wonder if we shouldn't flip the
order of the sed patterns --- as is, won't it mess up if the superuser
name contains PGUID?
A further exercise would be to make it not foul up if the superuser name
contains '/'. I'd be kind of inclined to use ':' for the pattern
delimiter, since in normal Unix practice usernames can't contain colons
(cf. passwd file format). Of course one doesn't generally put a slash
in a username either, but I think it's physically possible to do it...
But none of these fully explain Mark's problem. If we knew where the
"syntax error 25 : -> ." came from, we'd be closer to an answer.
regards, tom lane
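The two sed concerns raised above (pattern order and delimiter choice) can be illustrated with a small sketch. The user names `PGUID_admin` and `ops/admin` are hypothetical, chosen only to trigger the problems, and the one-line template stands in for the real .bki file; this is not the actual initdb code.

```shell
# One-line stand-in for the template line quoted in this thread.
bki_line='insert OID = 0 ( POSTGRES PGUID t t t t _null_ _null_ )'
uid=484

# Hypothetical user name that happens to contain the PGUID placeholder.
name='PGUID_admin'

# Original order: the name is substituted first, so the PGUID inside the
# name is then rewritten by the second expression.
original_order=$(printf '%s\n' "$bki_line" \
    | sed -e "s/POSTGRES/$name/g" -e "s/PGUID/$uid/g")

# Flipped order with ':' as the delimiter: the numeric uid goes in first,
# so the name is inserted last and left untouched.
flipped_order=$(printf '%s\n' "$bki_line" \
    | sed -e "s:PGUID:$uid:g" -e "s:POSTGRES:$name:g")

# A hypothetical name containing '/' is also safe with the ':' delimiter.
slash_name='ops/admin'
slash_ok=$(printf '%s\n' "$bki_line" \
    | sed -e "s:PGUID:$uid:g" -e "s:POSTGRES:$slash_name:g")

echo "original order: $original_order"
echo "flipped order:  $flipped_order"
echo "slash in name:  $slash_ok"
```

Names containing ':', '&', or a backslash would still need escaping with this approach, so it narrows the problem rather than eliminating it.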
Tom Lane wrote:
But none of these fully explain Mark's problem. If we knew where the
"syntax error 25 : -> ." came from, we'd be closer to an answer.
After scanning the source for "syntax error", line 126 of
backend/bootstrap/bootscanner.l seems to be the likely culprit.
--
mark@bixby.org
Mark Bixby <mark@bixby.org> writes:
Tom Lane wrote:
But none of these fully explain Mark's problem. If we knew where the
"syntax error 25 : -> ." came from, we'd be closer to an answer.
After scanning the source for "syntax error", line 126 of
backend/bootstrap/bootscanner.l seems to be the likely culprit.
Oh, of course: foo.bar is not a single token to the boot scanner.
It needs to be in quotes. Try this patch (line numbers are for 7.1
but probably OK for 7.0.*)
*** src/include/catalog/pg_shadow.h~ Wed Jan 24 16:01:30 2001
--- src/include/catalog/pg_shadow.h Fri Mar 9 16:57:53 2001
***************
*** 73,78 ****
* user choices.
* ----------------
*/
! DATA(insert OID = 0 ( POSTGRES PGUID t t t t _null_ _null_ ));
#endif /* PG_SHADOW_H */
--- 73,78 ----
* user choices.
* ----------------
*/
! DATA(insert OID = 0 ( "POSTGRES" PGUID t t t t _null_ _null_ ));
#endif /* PG_SHADOW_H */
You'll need to rebuild global.bki (over in src/backend/catalog)
afterwards, but the executables don't change.
regards, tom lane
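The effect of this patch on the text fed to the bootstrap backend can be sketched as below. The `substitute` helper is hypothetical and simply applies the two sed expressions quoted earlier in the thread to a single template line; the name and uid are the values reported by initdb --show. The boot scanner's tokenizing is not reproduced here, only its input.

```shell
# Values reported by "initdb --show" earlier in the thread.
POSTGRES_SUPERUSERNAME='MGR.BIXBY'
POSTGRES_SUPERUSERID=484

# Hypothetical helper: apply initdb's two substitutions to one template line.
substitute() {
    printf '%s\n' "$1" \
        | sed -e "s/POSTGRES/$POSTGRES_SUPERUSERNAME/g" \
              -e "s/PGUID/$POSTGRES_SUPERUSERID/g"
}

# Template line before and after the pg_shadow.h patch.
unpatched=$(substitute 'insert OID = 0 ( POSTGRES PGUID t t t t _null_ _null_ )')
patched=$(substitute 'insert OID = 0 ( "POSTGRES" PGUID t t t t _null_ _null_ )')

echo "unpatched: $unpatched"
echo "patched:   $patched"
```

In the unpatched line the name reaches the scanner as bare MGR.BIXBY, which it does not treat as a single token; in the patched line the same name arrives inside double quotes.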
Tom Lane wrote:
Mark Bixby <mark@bixby.org> writes:
I just hacked src/test/regress/run_check.sh to invoke initdb with
--show. The user name/id is behaving "correctly" for an MPE machine:
SUPERUSERNAME: MGR.BIXBY
SUPERUSERID: 484
Okay, so much for that theory.
Can you set a breakpoint at elog() and provide a stack backtrace so we
can see where this is happening? I can't think where else in the code
might be affected, but obviously the problem is somewhere else...
Here's a stack trace from the native MPE debugger (we don't have gdb support
yet). I'm assuming that all results after the initdb failure should be
suspect, and that's possibly why pg_log wasn't created. I won't try
troubleshooting the pg_log problem until after I resolve the uid names
issue.
=============== Initializing check database instance ================
DEBUG/iX C.25.06
DEBUG Intrinsic at: 129.0009d09c ?$START$
$1 ($4b) nmdebug > b elog
added: NM [1] PROG 129.001ad7d8 elog
$2 ($4b) nmdebug > c
Break at: NM [1] PROG 129.001ad7d8 elog
$3 ($4b) nmdebug > tr
PC=129.001ad7d8 elog
* 0) SP=41843ef0 RP=129.0018f7a4 pg_atoi+$b4
1) SP=41843ef0 RP=129.00182994 int4in+$14
2) SP=41843e70 RP=129.0018296c ?int4in+$8
export stub: 129.001aed28 $CODE$+$138
3) SP=41843e30 RP=129.001af428 fmgr+$98
4) SP=41843db0 RP=129.000c3354 InsertOneValue+$264
5) SP=41843cf0 RP=129.000c05d4 Int_yyparse+$924
6) SP=41843c70 RP=129.00000000
(end of NM stack)
$4 ($4b) nmdebug > c
=============== Starting regression postmaster ================
Regression postmaster is running - PID=125239393 PGPORT=65432
=============== Creating regression database... ================
NOTICE: mdopen: couldn't open /BIXBY/PUB/src/postgresql-7.0.3-mpe/src/test/regress/tmp_check/data/pg_log: No such file or directory
NOTICE: mdopen: couldn't open /BIXBY/PUB/src/postgresql-7.0.3-mpe/src/test/regress/tmp_check/data/pg_log: No such file or directory
psql: FATAL 1: cannot open relation pg_log
createdb: database creation failed
createdb failed
make: *** [runcheck] Error 1
--
mark@bixby.org
Tom Lane wrote:
Oh, of course: foo.bar is not a single token to the boot scanner.
It needs to be in quotes. Try this patch (line numbers are for 7.1
but probably OK for 7.0.*)
...snip...
--- src/include/catalog/pg_shadow.h Fri Mar 9 16:57:53 2001
...snip...
! DATA(insert OID = 0 ( "POSTGRES" PGUID t t t t _null_ _null_ ));
#endif /* PG_SHADOW_H */
You'll need to rebuild global.bki (over in src/backend/catalog)
afterwards, but the executables don't change.
I modified pg_shadow.h as instructed and ran a make from src, and that rebuilt
global1.bki.source in src/backend/catalog.
However, when I did make runtest, it appears to install from
src/backend/global1.bki.source which was still the old version. I modified
that old version by hand and reran make runtest. The uid name error has been
solved. Thanks!
So why is there a backend/global1.bki.source *and* a
backend/catalog/global1.bki.source?
But now runcheck dies during the install of PL/pgSQL, with createlang
complaining about a missing lib/plpgsql.sl.
I did do an MPE implementation of dynloader.c, but I was under the dim
impression this was only used for user-added functions, not core
functionality. Am I mistaken? Are you dynaloading core functionality too?
It seems that plpgsql.sl didn't get built. Might be an autoconf issue, since
quite frequently config scripts don't know about shared libraries on MPE. I
will investigate this further.
--
mark@bixby.org
Mark Bixby wrote:
It seems that plpgsql.sl didn't get built. Might be an autoconf issue, since
quite frequently config scripts don't know about shared libraries on MPE. I
will investigate this further.
Ah. I found src/Makefile.shlib and added the appropriate stuff.
Woohoo! We have test output! The regression README was clear about how some
platform dependent errors can be expected, and how to code for these
differences in the expected outputs.
Now I'm off to examine the individual failures....
MULTIBYTE=;export MULTIBYTE; \
/bin/sh ./run_check.sh hppa1.0-hp-mpeix
=============== Removing old ./tmp_check directory ... ================
=============== Create ./tmp_check directory ================
=============== Installing new build into ./tmp_check ================
=============== Initializing check database instance ================
=============== Starting regression postmaster ================
Regression postmaster is running - PID=125042790 PGPORT=65432
=============== Creating regression database... ================
CREATE DATABASE
=============== Installing PL/pgSQL... ================
=============== Running regression queries... ================
parallel group1 (12 tests) ...
boolean text name oid float4 varchar char int4 int2 float8 int8 numeric
test boolean ... ok
test char ... ok
test name ... ok
test varchar ... ok
test text ... ok
test int2 ... ok
test int4 ... ok
test int8 ... ok
test oid ... ok
test float4 ... ok
test float8 ... FAILED
test numeric ... ok
sequential test strings ... ok
sequential test numerology ... ok
parallel group2 (15 tests) ...
comments path polygon lseg point box reltime interval tinterval circle
inet timestamp type_sanity opr_sanity oidjoins
test point ... ok
test lseg ... ok
test box ... ok
test path ... ok
test polygon ... ok
test circle ... ok
test interval ... FAILED
test timestamp ... FAILED
test reltime ... ok
test tinterval ... ok
test inet ... ok
test comments ... ok
test oidjoins ... ok
test type_sanity ... ok
test opr_sanity ... ok
sequential test abstime ... ok
sequential test geometry ... FAILED
sequential test horology ... FAILED
sequential test create_function_1 ... ok
sequential test create_type ... ok
sequential test create_table ... ok
sequential test create_function_2 ... ok
sequential test copy ... ok
parallel group3 (6 tests) ...
create_aggregate create_operator triggers constraints create_misc create_index
test constraints ... ok
test triggers ... ok
test create_misc ... ok
test create_aggregate ... ok
test create_operator ... ok
test create_index ... ok
sequential test create_view ... ok
sequential test sanity_check ... ok
sequential test errors ... ok
sequential test select ... ok
parallel group4 (16 tests) ...
arrays union select_having transactions portals join select_implicit
select_distinct_on subselect case random select_distinct select_into
aggregates hash_index btree_index
test select_into ... ok
test select_distinct ... ok
test select_distinct_on ... ok
test select_implicit ... ok
test select_having ... ok
test subselect ... ok
test union ... ok
test case ... ok
test join ... ok
test aggregates ... ok
test transactions ... ok
test random ... ok
test portals ... ok
test arrays ... ok
test btree_index ... ok
test hash_index ... ok
sequential test misc ... ok
parallel group5 (5 tests) ...
portals_p2 foreign_key rules alter_table select_views
test select_views ... ok
test alter_table ... ok
test portals_p2 ... ok
test rules ... ok
test foreign_key ... ok
parallel group6 (3 tests) ...
temp limit plpgsql
test limit ... ok
test plpgsql ... FAILED
test temp ... ok
=============== Terminating regression postmaster ================
ACTUAL RESULTS OF REGRESSION TEST ARE NOW IN FILES run_check.out
AND regress.out
To run the optional big test(s) too, type 'make bigcheck'
These big tests can take over an hour to complete
These actually are: numeric_big
--
mark@bixby.org
Mark Bixby <mark@bixby.org> writes:
So why is there a backend/global1.bki.source *and* a
backend/catalog/global1.bki.source?
You don't want to know ;-) ... it's all cleaned up for 7.1 anyway.
I think in 7.0 you have to run make install in src/backend to get the
.bki files installed.
But now runcheck dies during the install of PL/pgSQL, with createlang
complaining about a missing lib/plpgsql.sl.
I did do an MPE implementation of dynloader.c, but I was under the dim
impression this was only used for user-added functions, not core
functionality. Am I mistaken? Are you dynaloading core functionality too?
No, but the regress tests try to test plpgsql too ... you should be able
to dike out the createlang call and have all tests except the plpgsql
regress test work.
regards, tom lane
Tom Lane wrote:
But now runcheck dies during the install of PL/pgSQL, with createlang
complaining about a missing lib/plpgsql.sl.
I did do an MPE implementation of dynloader.c, but I was under the dim
impression this was only used for user-added functions, not core
functionality. Am I mistaken? Are you dynaloading core functionality too?
No, but the regress tests try to test plpgsql too ... you should be able
to dike out the createlang call and have all tests except the plpgsql
regress test work.
Is it possible to re-run failing regression tests individually? It took
somewhere between 30 and 45 minutes for me to run the entire suite, and if I have
to run the whole thing every time when I'm only trying to fix just a single
test, that will get old pretty fast, and so will I. ;-)
Thanks.
--
mark@bixby.org
Mark Bixby <mark@bixby.org> writes:
Is it possible to re-run failing regression tests individually?
I believe so, but it's not very convenient in the "runcheck" mode, since
that normally wants to make a fresh install and start a temporary
postmaster. Instead, do a real install, start a real postmaster, and
do "make runtest" to create the regression DB in the real installation.
Then you can basically just do "psql regression <foo.sql" --- look at
the regression driver script to get the details of what switches to
pass and how to do the output comparison.
There are some order dependencies among the tests, but I think all the
ones you were having trouble with should be able to work this way in
an end-state regression DB.
Also, rerunning the whole suite is much quicker this way, since you
don't have to go through install/initdb/start postmaster each time.
BTW, the results you posted looked good --- with the exception of
plpgsql, the failing tests all seemed to be ones that are notorious
for platform-dependent output.
regards, tom lane
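A minimal sketch of the single-test approach described above follows. The `run_one_test` function and the `PSQL_CMD` variable are hypothetical, the sql/, expected/, and results/ directory names are assumed to match the layout under src/test/regress, and the extra psql switches used by the real driver script are deliberately not reproduced.

```shell
# Hypothetical helper: run one regression script against an existing
# "regression" database and compare the output with the expected file.
# PSQL_CMD is an assumed override; by default plain "psql regression" is
# used, without the additional switches the real driver script passes.
run_one_test() {
    name=$1
    ${PSQL_CMD:-psql regression} < "sql/$name.sql" > "results/$name.out" 2>&1
    # Keep the differences in a file so a failure can be inspected later.
    if diff "expected/$name.out" "results/$name.out" > "results/$name.diff"; then
        echo "test $name ... ok"
    else
        echo "test $name ... FAILED"
    fi
}
```

From the regression directory, something like `run_one_test float8` would then rerun just that test, subject to the order dependencies mentioned above.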