Alpha initdb fixed!

Started by Dwayne Bailey, almost 28 years ago, 25 messages
#1Dwayne Bailey
dwayne@mika.com

-----BEGIN PGP SIGNED MESSAGE-----

I've gotten 6.3 initdb to run to a successful completion on my
Alpha running OSF/1 V3.2c. Forget the change that I sent in
earlier. While I still think that there's something funky with
that code, it does not need to be modified. Actually, the
modifications are minuscule. The only files that need to be
changed are backend/main/main.c and template/alpha.

The real trick is to add -Dalpha to the CFLAGS setting. The
changes to main.c are only to add some extra includes to support
some code that's suddenly being used.

The #define ASSEMBLER is to prevent most of the code of
sys/proc.h from being included, as it ends up conflicting with
some of the postgresql definitions. This may or may not work on
other versions of Digital Unix.

As far as I'm concerned, this is a hack fix. There's still some
underlying 32/64 bit assumptions that this is masking. Perhaps
I'll make that my pet project.

Here are the diffs for the two files that I modified:

*** backend/main/main.c Mon Mar 16 15:53:26 1998
- --- backend/main/main.c.orig Mon Mar 16 16:05:07 1998
***************
*** 15,28 ****
#include <string.h>
#include <unistd.h>

- - #ifdef alpha
- - #include <sys/sysinfo.h>
- - #include <machine/hal_sysinfo.h>
- - #define ASSEMBLER
- - #include <sys/proc.h>
- - #undef ASSEMBLER
- - #endif
- -
#include "postgres.h"
#ifdef USE_LOCALE
#include <locale.h>
- --- 15,20 ----

*** template/alpha Mon Mar 16 16:06:08 1998
- --- template/alpha.orig Mon Mar 16 16:11:25 1998
***************
*** 5,11 ****
# This is defined here because a bunch of clients include tmp/c.h,
# which is where the work is done on HP-UX. It only affects the
# backend on Ultrix and OSF/1.
! CFLAGS:-DNOFIXADE -Dalpha
SHARED_LIB:
ALL:
SRCH_INC:
- --- 5,11 ----
# This is defined here because a bunch of clients include tmp/c.h,
# which is where the work is done on HP-UX. It only affects the
# backend on Ultrix and OSF/1.
! CFLAGS:-DNOFIXADE
SHARED_LIB:
ALL:
SRCH_INC:

- --
Dwayne Bailey + WHAT is your name? Sir Galahad
MIKA Systems, Bingham Farms, MI + WHAT is your quest? I Seek the Holy Grail
dwayne@mika.com + What is your favorite color?
http://www.mika.com/~dwayne + Blue ... no, Yelloooooooooooooooooow
finger dwayne@mika20.mika.com for PGP Public Key

-----BEGIN PGP SIGNATURE-----
Version: 2.6.2

iQB1AwUBNQ2YLaA2uleK7maRAQG50gMAne7myS15kxEjkC95WexnZKxBobKGFG8L
NRNv0u7JeNSuDTHR5xf4UDSiacGLXlDvMwhUk83W+GnUdwACsQuX1ASfVfc2mCAP
IN6HiMK+DQuzpYfrf4gT3sdymQGyPl00
=F/Mt
-----END PGP SIGNATURE-----

#2Pedro J. Lobo
pjlobo@euitt.upm.es
In reply to: Dwayne Bailey (#1)
Re: [HACKERS] Alpha initdb fixed!

On Mon, 16 Mar 1998, Dwayne Bailey wrote:

I've gotten 6.3 initdb to run to a successful completion on my
Alpha running OSF/1 V3.2c. Forget the change that I sent in
earlier. While I still think that there's something funky with
that code, it does not need to be modified. Actually, the
modifications are minuscule. The only files that need to be
changed are backend/main/main.c and template/alpha.

The real trick is to add -Dalpha to the CFLAGS setting. The
changes to main.c are only to add some extra includes to support
some code that's suddenly being used.

The #define ASSEMBLER is to prevent most of the code of
sys/proc.h from being included, as it ends up conflicting with
some of the postgresql definitions. This may or may not work on
other versions of Digital Unix.

I'll try it immediately, but I have a suggestion. On my DU 3.2c system, cc
defines automatically the symbols "__osf__" and "__alpha", and gcc defines
"__osf__", "__alpha" and "__alpha__". I think it would be easier to change
every "#ifdef alpha" to "#ifdef __alpha", and stop worrying about it in
the Makefiles.

Can any of the linux-alpha folks check which symbols the compiler
defines there? And someone who has DU 4.0x installed?
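
For anyone who wants to check their own compiler, a small probe along these lines should do. This is only a sketch: the helper name report_platform_macros is made up for illustration, and the list covers just the symbols mentioned in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Append one macro name (plus a space) to buf without overflowing it. */
#define APPEND_NAME(name) strncat(buf, name " ", len - strlen(buf) - 1)

/*
 * Fill buf with the names of the predefined platform macros that this
 * compiler defines. report_platform_macros is a hypothetical helper,
 * not something from the PostgreSQL tree.
 */
const char *report_platform_macros(char *buf, size_t len)
{
    if (len == 0)
        return buf;
    buf[0] = '\0';
#ifdef __osf__
    APPEND_NAME("__osf__");
#endif
#ifdef __alpha
    APPEND_NAME("__alpha");
#endif
#ifdef __alpha__
    APPEND_NAME("__alpha__");
#endif
#ifdef linux
    APPEND_NAME("linux");
#endif
#ifdef __linux
    APPEND_NAME("__linux");
#endif
#ifdef __linux__
    APPEND_NAME("__linux__");
#endif
    return buf;
}
```

Calling it with a buffer of 128 bytes or so and printing the result shows at a glance which of the names can be relied on with a given compiler and flag set.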

-------------------------------------------------------------------
Pedro José Lobo Perea Tel: +34 1 336 78 19
Centro de Cálculo Fax: +34 1 331 92 29
EUIT Telecomunicación - UPM e-mail: pjlobo@euitt.upm.es

#3Pedro J. Lobo
pjlobo@euitt.upm.es
In reply to: Pedro J. Lobo (#2)
Re: [HACKERS] Alpha initdb fixed!

On Tue, 17 Mar 1998, Pedro J. Lobo wrote:

On Mon, 16 Mar 1998, Dwayne Bailey wrote:

I've gotten 6.3 initdb to run to a successful completion on my
Alpha running OSF/1 V3.2c. Forget the change that I sent in
earlier. While I still think that there's something funky with
that code, it does not need to be modified. Actually, the
modifications are minuscule. The only files that need to be
changed are backend/main/main.c and template/alpha.

The real trick is to add -Dalpha to the CFLAGS setting. The
changes to main.c are only to add some extra includes to support
some code that's suddenly being used.

The #define ASSEMBLER is to prevent most of the code of
sys/proc.h from being included, as it ends up conflicting with
some of the postgresql definitions. This may or may not work on
other versions of Digital Unix.

I'll try it immediately, but I have a suggestion. On my DU 3.2c system, cc
defines automatically the symbols "__osf__" and "__alpha", and gcc defines
"__osf__", "__alpha" and "__alpha__". I think it would be easier to change
every "#ifdef alpha" to "#ifdef __alpha", and stop worrying about it in
the Makefiles.

I've just tried it, and it works partially. The initdb works fine, so I've
tried to run the regression tests. Here is the output:

==============================================================
boolean .. ok
char .. ok
char2 .. ok
char4 .. ok
char8 .. ok
char16 .. ok
varchar .. ok
text .. ok
strings .. ok
int2 .. failed
int4 .. failed
oid .. ok
oidint2 .. failed
oidint4 .. failed
oidname .. failed
[...]
==============================================================

All tests after oid fail, because the postmaster dies with this message:

========================
[...]
ERROR: pg_atoi: error reading "123456": Result too large
ERROR: pg_atoi: error in "asdfasd": can't parse "asdfasd"
semget: No space left on device
This type of error is usually caused by improper
shared memory or System V IPC semaphore configuration.
See the FAQ for more detailed information
FATAL 1: AttachSLockMemory: could not attach segment
=========================

Running the regression test after starting the postmaster with "-d 2"
gives:

========================
[...]
/usr/local/pgsql.beta/bin/postmaster child[0]:
execv(/usr/local/pgsql.beta/bin/postgres, -p, -d2, -P4, -F, -e, -B, 256, -v 65536, regression, )
/usr/local/pgsql.beta/bin/postmaster: BackendStartup: pid 6011 user pgbeta
db regression socket 4
FindBackend: found "/usr/local/pgsql.beta/bin/postgres" using argv[0]
binding ShmemCreate(key=0, size=2414376)
semget: No space left on device
This type of error is usually caused by improper
shared memory or System V IPC semaphore configuration.
See the FAQ for more detailed information
---debug info---
Quiet = f
Noversion = f
timings = f
dates = European
bufsize = 256
sortmem = 512
query echo = f
DatabaseName = [regression]
----------------

InitPostgres()..
/usr/local/pgsql.beta/bin/postmaster: reaping dead processes...
/usr/local/pgsql.beta/bin/postmaster: CleanupProc: pid 6011 exited with
status 768
/usr/local/pgsql.beta/bin/postmaster: CleanupProc: reinitializing shared
memory and semaphores
FATAL 1: AttachSLockMemory: could not attach segment
===========================

I am using these options (they worked fine with 6.2.1 and 6.2.1p6):

postmaster -d 2 -o '-F -e' -B 256 -D/usr/local/pgsql.beta/data

Also, after the postmaster dies I have to manually remove (using ipcrm) 15
semaphores and one shared memory area. Since there is one more semaphore
owned by root, there are 16 semaphores allocated when the postmaster dies.
I have looked at my system configuration, and that's the system limit. I
can raise it to, say, 32, but the 6.2.1 system worked fine with my current
configuration. I suspect that the postmaster is allocating semaphores and
never releasing them.
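
To illustrate what "never releasing them" means for System V semaphores, here is a minimal sketch (not the postmaster's actual code; the function name is made up). A semaphore set stays allocated after the creating process exits unless someone removes it with IPC_RMID, which is what ipcrm does by hand. When the system-wide limit is reached, semget fails with ENOSPC, which is the "No space left on device" text in the log above.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/*
 * Create a private semaphore set with nsems semaphores and remove it
 * again. Returns 0 on success, or the errno value of the call that
 * failed. create_and_remove_semaphores is a hypothetical name.
 */
int create_and_remove_semaphores(int nsems)
{
    int semid = semget(IPC_PRIVATE, nsems, IPC_CREAT | 0600);

    if (semid == -1)
        return errno;   /* ENOSPC here means the system limit is hit */

    /*
     * Skipping this call is how sets leak: the kernel keeps the set
     * until it is removed explicitly, even if the process dies.
     */
    if (semctl(semid, 0, IPC_RMID) == -1)
        return errno;

    return 0;
}
```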

Any hints?

-------------------------------------------------------------------
Pedro José Lobo Perea Tel: +34 1 336 78 19
Centro de Cálculo Fax: +34 1 331 92 29
EUIT Telecomunicación - UPM e-mail: pjlobo@euitt.upm.es

#4Pedro J. Lobo
pjlobo@euitt.upm.es
In reply to: Pedro J. Lobo (#3)
Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

On Tue, 17 Mar 1998, Pedro J. Lobo wrote:

On Tue, 17 Mar 1998, Pedro J. Lobo wrote:

I've just tried it, and it works partially. The initdb works fine, so I've
tried to run the regression tests. Here is the output:

==============================================================
boolean .. ok
char .. ok
char2 .. ok
char4 .. ok
char8 .. ok
char16 .. ok
varchar .. ok
text .. ok
strings .. ok
int2 .. failed
int4 .. failed
oid .. ok
oidint2 .. failed
oidint4 .. failed
oidname .. failed
[...]
==============================================================

I've done more tests. The problem is that if you start the postmaster
without the '-p' option and without assigning a value to the PGPORT
environment variable, then all the ipc stuff is messed up. No shared
memory regions are created, and the semaphores are created but never
freed. When a port number is specified, the semaphores (and the shared
memory regions) have a 'key' value that contains the port number. Without
port number, there is no shared memory and the semaphores have 0 as the
key value.
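
A sketch of why a key of 0 would be so harmful, under two assumptions: the key formula below is invented for illustration (it is not the one in the backend), and IPC_PRIVATE is 0 on this platform, as it is on many systems. With IPC_PRIVATE as the key, semget creates a brand-new set on every call, and nobody can find that set again by key to clean it up.

```c
#include <sys/types.h>
#include <sys/ipc.h>

/*
 * Hypothetical key scheme: derive a distinct key per port and per
 * resource slot. With port 0 and slot 0 the key collapses to 0.
 */
key_t ipc_key_for_port(int port, int slot)
{
    return (key_t) port * 1000 + slot;
}

/*
 * True if the key is IPC_PRIVATE, i.e. semget/shmget would always
 * create a fresh object that cannot be looked up by key later.
 */
int key_is_private(key_t key)
{
    return key == IPC_PRIVATE;
}
```

With port 5440 every slot gets its own non-zero key, so a restarted postmaster can find and reuse or remove the old objects. With port 0 that is no longer guaranteed.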

I don't know if this behaviour is due to the use of a non-standard port
(5440), but since it's been specified in configure (--with-pgport=5440) it
should work, shouldn't it?

These are the regression tests when a port number is specified (note that
you *must* assign a value to PGPORT before running the tests):

===============================================================
boolean .. ok
char .. ok
char2 .. ok
char4 .. ok
char8 .. ok
char16 .. ok
varchar .. ok
text .. ok
strings .. ok
int2 .. failed
int4 .. failed
oid .. ok
oidint2 .. failed
oidint4 .. failed
oidname .. ok
float4 .. ok
float8 .. failed
numerology .. ok
point .. ok
lseg .. ok
box .. ok
path .. ok
polygon .. ok
circle .. ok
geometry .. failed
timespan .. ok
datetime .. failed
reltime .. ok
abstime .. failed
tinterval .. failed
horology .. failed
comments .. ok
create_function_1 .. ok
create_type .. ok
create_table .. ok
create_function_2 .. ok
constraints .. ok
triggers .. ok
copy .. ok
create_misc .. ok
create_aggregate .. ok
create_operator .. ok
create_view .. ok
create_index .. ok
sanity_check .. ok
errors .. ok
select .. ok
select_into .. ok
select_distinct .. ok
select_distinct_on .. ok
subselect .. ok
aggregates .. ok
transactions .. ok
random .. failed
portals .. ok
misc .. ok
arrays .. ok
btree_index .. ok
hash_index .. ok
select_views .. ok
alter_table .. ok
portals_p2 .. ok
==========================================

Some of them fail (most notably int2, int4 and float8), but anyway it's
better than before :-)

-------------------------------------------------------------------
Pedro José Lobo Perea Tel: +34 1 336 78 19
Centro de Cálculo Fax: +34 1 331 92 29
EUIT Telecomunicación - UPM e-mail: pjlobo@euitt.upm.es

#5Dwayne Bailey
dwayne@mika.com
In reply to: Pedro J. Lobo (#4)
Re: Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

-----BEGIN PGP SIGNED MESSAGE-----

On Tue, 17 Mar 1998, Pedro J. Lobo wrote:

I've done more tests. The problem is that if you start the postmaster
without the '-p' option and without assigning a value to the PGPORT
environment variable, then all the ipc stuff is messed up. No shared
memory regions are created, and the semaphores are created but never
freed. When a port number is specified, the semaphores (and the shared
memory regions) have a 'key' value that contains the port number. Without
port number, there is no shared memory and the semaphores have 0 as the
key value.

I don't know if this behaviour is due to the use of a non-standard port
(5440), but since it's been specified in configure (--with-pgport=5440) it
should work, shouldn't it?

I got the same results that you did. I was planning on
investigating this morning, but it looks like you beat me to it.
I ALSO built 6.3 with a non-standard port, so that I could keep
my current database live while I work on this.

I'll try your suggestion, but I'll also try rebuilding using the
standard port, to see if it makes any difference.

Re: your suggestion to use __alpha and not worry about the
makefile, I'm a little uncomfortable with that. DEC's cc will
actually output different symbols, depending on the use of the
- -std flag. I'd rather have something that we have explicit
control over, rather than relying on the compiler like this. I'm
not violently opposed to using __alpha or anything, it's just a
preference against it.

- --
Dwayne Bailey + WHAT is your name? Sir Galahad
MIKA Systems, Bingham Farms, MI + WHAT is your quest? I Seek the Holy Grail
dwayne@mika.com + What is your favorite color?
http://www.mika.com/~dwayne + Blue ... no, Yelloooooooooooooooooow
finger dwayne@mika20.mika.com for PGP Public Key

-----BEGIN PGP SIGNATURE-----
Version: 2.6.2

iQB1AwUBNQ5svqA2uleK7maRAQHJ1gL/ULW54HyDSjLZv++z2j1taxfdchgpPAL1
9WDrJAdPHmEjm1iAZfQT6gqIpwZ70fp2VpRneqZZyoUw1ZCHE3ufcDHz29t43Rbb
QJL6lDl99J0R3ZH6rA8JHhd6Mn0uV9YM
=eZ4b
-----END PGP SIGNATURE-----

#6The Hermit Hacker
scrappy@hub.org
In reply to: Pedro J. Lobo (#4)
Re: Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

On Tue, 17 Mar 1998, Pedro J. Lobo wrote:

I don't know if this behaviour is due to the use of a non-standard port
(5440), but since it's been specified in configure (--with-pgport=5440) it
should work, shouldn't it?

Yes, and there was a fix submitted and applied for this... it's
misdefined in configure...

#7Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Pedro J. Lobo (#4)
Re: Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

I've done more tests. The problem is that if you start the postmaster
without the '-p' option and without assigning a value to the PGPORT
environment variable, then all the ipc stuff is messed up. No shared
memory regions are created, and the semaphores are created but never
freed. When a port number is specified, the semaphores (and the shared
memory regions) have a 'key' value that contains the port number. Without
port number, there is no shared memory and the semaphores have 0 as the
key value.

I don't know if this behaviour is due to the use of a non-standard port
(5440), but since it's been specified in configure (--with-pgport=5440) it
should work, shouldn't it?

These are the regression tests when a port number is specified (note that
you *must* assign a value to PGPORT before running the tests):

Let's get a patch for this alpha fix. Not sure about the pgport problem.

-- 
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
#8Dwayne Bailey
dwayne@mika.com
In reply to: Bruce Momjian (#7)
Re: Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

-----BEGIN PGP SIGNED MESSAGE-----

On Tue, 17 Mar 1998, Bruce Momjian wrote:

Let's get a patch for this alpha fix. Not sure about the pgport problem.

I included a diff in my original report. I can resend it to the
patches list, if required. However, I would prefer to hear that
somebody tested it on DU 4.0. Thus far, AFAIK, only 3.2 has been
tested.

I'm confident that the patched template/alpha file will be fine,
but the corresponding changes to backend/main/main.c leave me
less comfortable. There's a #define ASSEMBLER there to prevent
the loading of wholesale portions of sys/proc.h. I'd like to
know if that works as expected on other versions of DU.

The pgport problem has been identified as a problem with
configure, which had been previously reported. (A report that I
must have missed.)

- --
Dwayne Bailey + WHAT is your name? Sir Galahad
MIKA Systems, Bingham Farms, MI + WHAT is your quest? I Seek the Holy Grail
dwayne@mika.com + What is your favorite color?
http://www.mika.com/~dwayne + Blue ... no, Yelloooooooooooooooooow
finger dwayne@mika20.mika.com for PGP Public Key

-----BEGIN PGP SIGNATURE-----
Version: 2.6.2

iQB1AwUBNQ6hI6A2uleK7maRAQEZcQMAgBQGn9smBHdf1aIGGz5a22qVSSOE4wBe
lpvCCvWzc0X09Qa1I2xdr4+Tln5gp1iWUQfi/0jaADuI/RgzRDABTcTjBt2vXY8S
7z/GKfxsXWie54LyrviDAxqfAGlpI16z
=rCSI
-----END PGP SIGNATURE-----

#9The Hermit Hacker
scrappy@hub.org
In reply to: Bruce Momjian (#7)
Re: Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

On Tue, 17 Mar 1998, Bruce Momjian wrote:

I've done more tests. The problem is that if you start the postmaster
without the '-p' option and without assigning a value to the PGPORT
environment variable, then all the ipc stuff is messed up. No shared
memory regions are created, and the semaphores are created but never
freed. When a port number is specified, the semaphores (and the shared
memory regions) have a 'key' value that contains the port number. Without
port number, there is no shared memory and the semaphores have 0 as the
key value.

I don't know if this behaviour is due to the use of a non-standard port
(5440), but since it's been specified in configure (--with-pgport=5440) it
should work, shouldn't it?

These are the regression tests when a port number is specified (note that
you *must* assign a value to PGPORT before running the tests):

Let's get a patch for this alpha fix. Not sure about the pgport problem.

The pgport problem, I *think*, is the one that was configure
related, where the port is set wrong by default.

Try this:

Index: pgsql/src/configure
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/configure,v
retrieving revision 1.132
retrieving revision 1.134
diff -r1.132 -r1.134
811c811
< #define DEF_PGPORT "${DEF_PGPORT}"
---
> #define DEF_PGPORT "${withval}"
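
To see how a bad DEF_PGPORT could end up as port 0, here is a rough sketch of the lookup order described earlier in the thread (-p option, then PGPORT, then the compiled-in default). The function name is made up, and it is an assumption on my part that the mis-expanded define came out as an empty string.

```c
#include <stdlib.h>

/*
 * Pick the port from, in order: the -p option, the PGPORT environment
 * value, the compiled-in default string. resolve_port is a
 * hypothetical helper, not the postmaster's real code.
 */
int resolve_port(const char *opt_p, const char *env_pgport,
                 const char *compiled_default)
{
    const char *src = compiled_default;

    if (opt_p != NULL && *opt_p != '\0')
        src = opt_p;
    else if (env_pgport != NULL && *env_pgport != '\0')
        src = env_pgport;

    /* An empty string parses as 0, so an empty default means port 0. */
    return (int) strtol(src, NULL, 10);
}
```

With an empty default and neither -p nor PGPORT set, this returns 0, which would match the zero IPC keys seen above. Passing -p or setting PGPORT sidesteps the bad default, which is consistent with the workaround that was reported to help.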

#10Ryan Kirkpatrick
rkirkpat@nag.cs.colorado.edu
In reply to: Pedro J. Lobo (#2)
Re: [HACKERS] Alpha initdb fixed!

On Tue, 17 Mar 1998, Pedro J. Lobo wrote:

I'll try it immediately, but I have a suggestion. On my DU 3.2c system, cc
defines automatically the symbols "__osf__" and "__alpha", and gcc defines
"__osf__", "__alpha" and "__alpha__". I think it would be easier to change
every "#ifdef alpha" to "#ifdef __alpha", and stop worrying about it in
the Makefiles.

Can any of the linux-alpha folks check which symbols the compiler
defines there? And someone who has DU 4.0x installed?

Linux/Alpha provides the following useful/relevant symbols:
linux
__alpha
__alpha__
__linux
__linux__

I had gone through the pgsql 6.2.1 source trying to fix/replace
all instances of 'linuxalpha' and such used as defines with '(defined
__alpha__) && (defined __linux__)'. But I hit a few snags in testing (i.e.
lack of time), and by the time I got things about sorted out, 6.3 came out
and changed so much I need to go through again and do it all anew. The
basic problem, which it looks like you hit as well, is that non-standard
define names were used, and then never included in the platform specific defines.
This was the reason Linux/Alpha couldn't even get initdb to run (probably
same for you). Of course, the regression tests are still not perfect, and
there is a good deal of cleanup on the Linux/Alpha end of things as well.
It will be a while, but things are moving.

----------------------------------------------------------------------------
| "For to me to live is Christ, and to die is gain." |
| --- Philippians 1:21 (KJV) |
----------------------------------------------------------------------------
| Ryan Kirkpatrick | Boulder, Colorado | rkirkpat@nag.cs.colorado.edu |
----------------------------------------------------------------------------
| http://www-ugrad.cs.colorado.edu/~rkirkpat/ |
----------------------------------------------------------------------------

#11Thomas G. Lockhart
lockhart@alumni.caltech.edu
In reply to: Pedro J. Lobo (#4)
Re: Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

These are the regression tests when a port number is specified (note that
you *must* assign a value to PGPORT before running the tests):

===============================================================
boolean .. ok
char .. ok
char2 .. ok
char4 .. ok
char8 .. ok
char16 .. ok
varchar .. ok
text .. ok
strings .. ok
int2 .. failed
int4 .. failed
oid .. ok
oidint2 .. failed
oidint4 .. failed
oidname .. ok
float4 .. ok
float8 .. failed
numerology .. ok
point .. ok
lseg .. ok
box .. ok
path .. ok
polygon .. ok
circle .. ok
geometry .. failed
timespan .. ok
datetime .. failed
reltime .. ok
abstime .. failed
tinterval .. failed
horology .. failed
comments .. ok
create_function_1 .. ok
create_type .. ok
create_table .. ok
create_function_2 .. ok
constraints .. ok
triggers .. ok
copy .. ok
create_misc .. ok
create_aggregate .. ok
create_operator .. ok
create_view .. ok
create_index .. ok
sanity_check .. ok
errors .. ok
select .. ok
select_into .. ok
select_distinct .. ok
select_distinct_on .. ok
subselect .. ok
aggregates .. ok
transactions .. ok
random .. failed
portals .. ok
misc .. ok
arrays .. ok
btree_index .. ok
hash_index .. ok
select_views .. ok
alter_table .. ok
portals_p2 .. ok
==========================================

Some of them fail (most notably int2, int4 and float8), but anyway it's
better than before :-)

Oooh. I think you might have a running system now! Those int2, int4,
float8 "failures" are probably just error message differences and are
expected. The date and time stuff may or may not be a problem, and the
geometry stuff is probably OK (rounding trouble in the math libraries).

Make sure your date/time stuff looks OK, at least for simple tests; it
may be, for example, that your timezone database is just different for
dates before 1960...

- Tom

#12Pedro J. Lobo
pjlobo@euitt.upm.es
In reply to: Dwayne Bailey (#5)
Re: Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

On Tue, 17 Mar 1998, Dwayne Bailey wrote:

Re: your suggestion to use __alpha and not worry about the
makefile, I'm a little uncomfortable with that. DEC's cc will
actually output different symbols, depending on the use of the
- -std flag. I'd rather have something that we have explicit
control over, rather than relying on the compiler like this. I'm
not violently opposed to using __alpha or anything, it's just a
preference against it.

Here's an extract from the DEC's cc man page:

The following table shows which macros are defined for each of the -std
flags.

-----------------------------------------------
Macro                    std0       std    std1
                       (default)
-----------------------------------------------
LANGUAGE_C               yes        no     no
__LANGUAGE_C__           yes        yes    yes
unix                     yes        no     no
__unix__                 yes        yes    yes
__osf__                  yes        yes    yes
__alpha                  yes        yes    yes
SYSTYPE_BSD              yes        no     no
_SYSTYPE_BSD             yes        yes    yes
LANGUAGE_ASSEMBLY        yes        yes    yes
__LANGUAGE_ASSEMBLY__    yes        yes    yes
-----------------------------------------------

As you can see, __alpha and __osf__ are always defined. However, I
understand your point. If we define 'alpha' in the template file, we are
protected from mind-changing vendors that define __alpha in DU 3.2 and
__alpha__ in DU 4.0 and alpha__ in DU 5.0 (just an example). From this
point of view, the current approach is better. And, it's always easier
(and safer) to leave things untouched.

-------------------------------------------------------------------
Pedro José Lobo Perea Tel: +34 1 336 78 19
Centro de Cálculo Fax: +34 1 331 92 29
EUIT Telecomunicación - UPM e-mail: pjlobo@euitt.upm.es

#13Pedro J. Lobo
pjlobo@euitt.upm.es
In reply to: Pedro J. Lobo (#12)
Re: Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

On Wed, 18 Mar 1998, Pedro J. Lobo wrote:

On Tue, 17 Mar 1998, Dwayne Bailey wrote:

Re: your suggestion to use __alpha and not worry about the
makefile, I'm a little uncomfortable with that. DEC's cc will
actually output different symbols, depending on the use of the
- -std flag. I'd rather have something that we have explicit
control over, rather than relying on the compiler like this. I'm
not violently opposed to using __alpha or anything, it's just a
preference against it.

[stuff deleted...]

As you can see, __alpha and __osf__ are always defined. However, I
understand your point. If we define 'alpha' in the template file, we are
protected from mind-changing vendors that define __alpha in DU 3.2 and
__alpha__ in DU 4.0 and alpha__ in DU 5.0 (just an example). From this
point of view, the current approach is better. And, it's always easier
(and safer) to leave things untouched.

Just a thought: I think we should make a distinction between architecture
(i.e., define 'alpha') and OS (i.e., define 'osf' or something like that),
now that linux runs also on alpha (and NT, if someone ever makes a port).
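
Something like the following is what I have in mind, as a sketch only. The PG_ARCH_* and PG_OS_* names are invented here; the point is that the compiler's symbols are mapped once, in one place, onto names we control, so architecture and OS can be tested separately everywhere else.

```c
/* Map compiler-provided symbols onto project-owned names, once. */
#if defined(__alpha) || defined(__alpha__)
#define PG_ARCH_ALPHA 1     /* hypothetical architecture macro */
#endif

#if defined(__osf__)
#define PG_OS_OSF 1         /* hypothetical OS macro */
#elif defined(__linux__) || defined(__linux)
#define PG_OS_LINUX 1       /* hypothetical OS macro */
#endif

/* Name of the detected architecture, for diagnostics. */
const char *platform_arch(void)
{
#ifdef PG_ARCH_ALPHA
    return "alpha";
#else
    return "other";
#endif
}

/* Name of the detected operating system, for diagnostics. */
const char *platform_os(void)
{
#if defined(PG_OS_OSF)
    return "osf";
#elif defined(PG_OS_LINUX)
    return "linux";
#else
    return "other";
#endif
}
```

Code that cares about 64-bit issues would test only PG_ARCH_ALPHA, and code that cares about system headers would test only PG_OS_OSF or PG_OS_LINUX. If a vendor renames a symbol, only the mapping at the top changes.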

-------------------------------------------------------------------
Pedro José Lobo Perea Tel: +34 1 336 78 19
Centro de Cálculo Fax: +34 1 331 92 29
EUIT Telecomunicación - UPM e-mail: pjlobo@euitt.upm.es

#14Pedro J. Lobo
pjlobo@euitt.upm.es
In reply to: Thomas G. Lockhart (#11)
Re: Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

On Wed, 18 Mar 1998, Thomas G. Lockhart wrote:

hash_index .. ok
select_views .. ok
alter_table .. ok
portals_p2 .. ok
==========================================

Some of them fail (most notably int2, int4 and float8), but anyway it's
better than before :-)

Oooh. I think you might have a running system now! Those int2, int4,

Yes, it seems so.

float8 "failures" are probably just error message differences and are
expected.

Yes. For int2: Expected:
! ERROR: pg_atoi: error reading "100000": Math result not representable

Got:
! ERROR: pg_atoi: error reading "100000": Result too large

For int4: Expected:
! ERROR: pg_atoi: error reading "1000000000000": Math result not
representable

Got:
! ERROR: pg_atoi: error reading "1000000000000": Result too large

The same goes for oidint2 and oidint4.
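
My guess at why the text differs: both strings look like each platform's message for the same errno value (ERANGE), so the check itself behaves the same and only the wording changes. A sketch of that kind of check follows; it is not the real pg_atoi, and the function names are made up.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse s as a decimal integer that must fit in `size` bytes (2 or 4).
 * Returns 0 on success, EINVAL if s is not a number, ERANGE if the
 * value does not fit. parse_sized_int is a hypothetical stand-in.
 */
int parse_sized_int(const char *s, int size, long *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0')
        return EINVAL;                  /* e.g. "asdfasd" */
    if (errno == ERANGE)
        return ERANGE;                  /* overflowed long itself */
    if (size == 2 && (v < SHRT_MIN || v > SHRT_MAX))
        return ERANGE;
    if (size == 4 && (v < INT_MIN || v > INT_MAX))
        return ERANGE;
    *out = v;
    return 0;
}

/* The platform's wording for ERANGE; this is the part that varies. */
const char *range_error_text(void)
{
    return strerror(ERANGE);
}
```

Printing range_error_text() on the machine that generated the expected output and on the machine under test should show whether the difference is only in the C library's wording.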

For float8: Expected:
! ERROR: Bad float8 input format -- overflow

Got:
! ERROR: floating point exception! The last floating point operation
either exceeded legal ranges or was a divide by zero

This one was harmless, but there is another one: Expected:
QUERY: SELECT '' AS bad, : (f.f1) from FLOAT8_TBL f;
! bad| ?column?
! ---+--------------------
! | 1
! |7.39912306090513e-16
! | 0
! | 0
! | 1
! (5 rows)
!

Got:
QUERY: SELECT '' AS bad, : (f.f1) from FLOAT8_TBL f;
! ERROR: exp() result is out of range

Can someone comment on this?

The date and time stuff may or may not be a problem, and the
geometry stuff is probably OK (rounding trouble in the math libraries).

You are right on the geometry stuff. I am not sure about the date stuff.
Some are differences of one second between the expected and the actual
results, some others are dates that appear displaced by 19 years (for
example, expected year 1997 becomes 2016, expected 1957 becomes 1976...).
The diff output is very long on this.

Make sure your date/time stuff looks OK, at least for simple tests; it
may be, for example, that your timezone database is just different for
dates before 1960...

The date/time stuff has never worked completely right. And, if the problem
lies in postgres, that's ok. Sooner or later it will be fixed. But if, as
it seems, the problem lies in the timezone databases, we might be in big
trouble. Perhaps we could make a test, so we can say for sure "your
timezone database is incorrect, go and ask your vendor for a patch".

Also, the test fails for the random stuff:
*** expected/random.out ma 29 abr 07:23:40 1997
--- results/random.out  ma 17 mar 03:51:57 1998
***************
*** 7,18 ****
  QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
  count
  -----
!    92
  (1 row)

  QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
  count
  -----
!    98
  (1 row)

--- 7,18 ----
  QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
  count
  -----
!    95
  (1 row)

  QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
  count
  -----
!    88
  (1 row)

----------------------

Yes, the results are different, but... aren't they random? O:-)

-------------------------------------------------------------------
Pedro José Lobo Perea Tel: +34 1 336 78 19
Centro de Cálculo Fax: +34 1 331 92 29
EUIT Telecomunicación - UPM e-mail: pjlobo@euitt.upm.es

#15Thomas G. Lockhart
lockhart@alumni.caltech.edu
In reply to: Pedro J. Lobo (#14)
Re: Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

This one was harmless, but there is another one: Expected:
QUERY: SELECT '' AS bad, : (f.f1) from FLOAT8_TBL f;
! bad| ?column?
! ---+--------------------
! | 1
! |7.39912306090513e-16
! | 0
! | 0
! | 1
! (5 rows)
!

Got:
QUERY: SELECT '' AS bad, : (f.f1) from FLOAT8_TBL f;
! ERROR: exp() result is out of range

Can someone comment on this?

I think you are getting a better result than the regression test machine
gets. That's good.

Some are differences of one second between the expected and the actual
results, some others are dates that appear displaced by 19 years (for
example, expected year 1997 becomes 2016, expected 1957 becomes
1976...). The diff output is very long on this.
The date/time stuff has never worked completely right. And, if the
problem lies in postgres, that's ok. Sooner or later it will be fixed.
But if, as it seems, the problem lies in the timezone databases, we
might be in big trouble. Perhaps we could make a test, so we can say
for sure "your timezone database is incorrect, go and ask your vendor
for a patch".

No, you still have date/time trouble, and it looks as though the
timezone stuff is not being set properly. By definition, it is a problem
with your machine, since the code works on several other platforms, and
no, it isn't likely to get fixed eventually unless you pursue it, since
it does work on the ~20 other OS/processor combinations listed as
supported platforms.

OK, what I meant by "timezone database" trouble would have been sort of
obvious in that only dates from times before computers existed would
have shown problems, and then usually 1 hour differences due to daylight
savings time settings. That is not what you are seeing.

The 19 year differences usually seem to come from mis-handling the
HAVE_INT_TIMEZONE compile-time option. How is yours set? Try changing it
in config.h and see if it helps.
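
For reference, the option chooses between two ways of getting the local offset from UTC. The sketch below shows the two alternatives side by side as I understand them; the function names are made up and this is not the exact backend code. If the wrong alternative is compiled in for a platform, the offset it reads may be garbage.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* glibc: expose timezone and tm_gmtoff */
#endif
#include <time.h>

/*
 * Offset in seconds west of UTC, taken from the global integer
 * variable `timezone` (the HAVE_INT_TIMEZONE style). Assumes a
 * one-hour daylight savings shift.
 */
long tz_west_from_global(const struct tm *tm)
{
    return (tm->tm_isdst > 0) ? timezone - 3600 : timezone;
}

/*
 * Same offset taken from the tm_gmtoff field of struct tm, which
 * counts seconds east of UTC, hence the sign flip.
 */
long tz_west_from_gmtoff(const struct tm *tm)
{
    return -tm->tm_gmtoff;
}
```

On a platform that provides both, the two functions should agree for a struct tm returned by localtime() after tzset(). A disagreement would point at the platform's headers or libraries rather than at the date arithmetic.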

Yes, the results are different, but... aren't they random? O:-)

Right. OK for random to be different.

- Tom

#16Pedro J. Lobo
pjlobo@euitt.upm.es
In reply to: Thomas G. Lockhart (#15)
Re: Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

On Wed, 18 Mar 1998, Thomas G. Lockhart wrote:

Got:
QUERY: SELECT '' AS bad, : (f.f1) from FLOAT8_TBL f;
! ERROR: exp() result is out of range

Can someone comment on this?

I think you are getting a better result than the regression test machine
gets. That's good.

Ok.

Some are differences of one second between the expected and the actual
results, some others are dates that appear displaced by 19 years (for
example, expected year 1997 becomes 2016, expected 1957 becomes
1976...). The diff output is very long on this.
The date/time stuff has never worked completely right. And, if the
problem lies in postgres, that's ok. Sooner or later it will be fixed.
But if, as it seems, the problem lies in the timezone databases, we
might be in big trouble. Perhaps we could make a test, so we can say
for sure "your timezone database is incorrect, go and ask your vendor
for a patch".

No, you still have date/time trouble, and it looks as though the
timezone stuff is not being set properly. By definition, it is a problem
with your machine, since the code works on several other platforms, and
no, it isn't likely to get fixed eventually unless you pursue it, since
it does work on the ~20 other OS/processor combinations listed as
supported platforms.

You have misinterpreted me. What I mean is that if the problem lies in
postgres, we can hunt it and fix it, but if the problem lies in the
timezone libraries then it is out of our hands. Of course, the problem
isn't going to vanish into nothingness by itself (although it would be
very nice, wouldn't it? :-)

OK, what I meant by "timezone database" trouble would have been sort of
obvious in that only dates from times before computers existed would
have shown problems, and then usually 1 hour differences due to daylight
savings time settings. That is not what you are seeing.

The 19 year differences usually seem to come from mis-handling the
HAVE_INT_TIMEZONE compile-time option. How is yours set? Try changing it
in config.h and see if it helps.

I am going to be offline for 4 days, until next Monday. I will dig into
that problem then.

-------------------------------------------------------------------
Pedro José Lobo Perea Tel: +34 1 336 78 19
Centro de Cálculo Fax: +34 1 331 92 29
EUIT Telecomunicación - UPM e-mail: pjlobo@euitt.upm.es

#17Mattias Kregert
matti@algonet.se
In reply to: Pedro J. Lobo (#14)
Timezone problems / HAVE_INT_TIMEZONE

Thomas G. Lockhart wrote:

The 19 year differences usually seem to come from mis-handling the
HAVE_INT_TIMEZONE compile-time option. How is yours set? Try changing it
in config.h and see if it helps.

Couldn't this be tested for, just like there is a "flex test" which finds
out if flex is ok or not?
Can the configure script find out and add HAVE_INT_TIMEZONE if appropriate?

/* m */

#18Thomas G. Lockhart
lockhart@alumni.caltech.edu
In reply to: Pedro J. Lobo (#14)
Re: Timezone problems / HAVE_INT_TIMEZONE

Couldn't this be tested for, just like there is a "flex test" which
finds out if flex is ok or not? Can the configure script find out and
add HAVE_INT_TIMEZONE if appropriate?

Uh, it does a test already by trying to compile a program referencing a
global integer variable called "timezone". Somehow a few systems will
compile that but don't really have a useful integer timezone
(RH5.0/glibc2.0 is one of those).

I'm wondering if we could change the sense of the test, to try instead
to test for the presence of a timezone field in the tm structure? That
might fix the glibc2.0 port (assuming it still has problems at v2.0.7;
haven't tested recently) but I don't know which other ports might break.

Can we experiment with this Marc?? Post-megapatch of course :)

- Tom

#19The Hermit Hacker
scrappy@hub.org
In reply to: Thomas G. Lockhart (#18)
Re: Timezone problems / HAVE_INT_TIMEZONE

On Thu, 19 Mar 1998, Thomas G. Lockhart wrote:

Couldn't this be tested for, just like there is a "flex test" which
finds out if flex is ok or not? Can the configure script find out and
add HAVE_INT_TIMEZONE if appropriate?

Uh, it does a test already by trying to compile a program referencing a
global integer variable called "timezone". Somehow a few systems will
compile that but don't really have a useful integer timezone
(RH5.0/glibc2.0 is one of those).

I'm wondering if we could change the sense of the test, to try instead
to test for the presence of a timezone field in the tm structure? That
might fix the glibc2.0 port (assuming it still has problems at v2.0.7;
haven't tested recently) but I don't know which other ports might break.

Can we experiment with this Marc?? Post-megapatch of course :)

Sounds reasonable to me...so you want the test changed to:

===========================================================================
#include <stdio.h>
#include <time.h>

main() { struct tm *tmstruct; printf("%s\n", tmstruct->timezone); }
===========================================================================

And, if the compile fails...how is HAVE_INT_TIMEZONE set? to
FALSE?

Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org

#20Thomas G. Lockhart
lockhart@alumni.caltech.edu
In reply to: The Hermit Hacker (#19)
Re: Timezone problems / HAVE_INT_TIMEZONE

Sounds reasonable to me...so you want the test changed to:

========================================================================

#include <stdio.h>
#include <time.h>

main() { struct tm *tmstruct; printf("%s\n", tmstruct->timezone); }
========================================================================

The structure member looks like tm->tm_gmtoff (an integer). There would
need to be other calls to set it up, unless something like

main() {struct tm tmstruct, *tm = &tmstruct; tm->tm_gmtoff = 0; }

would be acceptable.

And, if the compile fails...how is HAVE_INT_TIMEZONE set? to
FALSE?

Actually, if the test fails, then we need to #undef HAVE_INT_TIMEZONE,
although if it would be easier to set it to FALSE then I can pretty
easily fix up the sources to use that.
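
A minimal sketch of how the two sources of the UTC offset discussed
above might be selected at compile time. This is an illustration under
stated assumptions, not the code in the PostgreSQL sources: the helper
name utc_offset_seconds is hypothetical, the one-hour DST adjustment is
a simplification, and tm_gmtoff is a BSD/glibc extension that is not
present on every platform.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* assumption: lets glibc expose tm_gmtoff and the timezone global */
#endif
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper: offset of local time from UTC, in seconds east of
 * UTC, for the zone named by tz at the instant "when".
 */
long utc_offset_seconds(const char *tz, time_t when)
{
    struct tm *tm;

    setenv("TZ", tz, 1); /* select the zone for this process */
    tzset();             /* make libc re-read TZ */

    tm = localtime(&when);
    if (tm == NULL)
        return 0;

#ifdef HAVE_INT_TIMEZONE
    /* The global "timezone" is seconds WEST of UTC and ignores DST;
     * a one-hour DST shift is assumed here for simplicity. */
    return -timezone + (tm->tm_isdst > 0 ? 3600 : 0);
#else
    /* tm_gmtoff is already seconds EAST of UTC, DST included. */
    return tm->tm_gmtoff;
#endif
}
```

With the POSIX zone string "EST5" (five hours west of UTC, no DST rule),
either path should return -18000.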

- Tom

#21Dwayne Bailey
dwayne@mika.com
In reply to: Mattias Kregert (#17)
Re: [HACKERS] Timezone problems / HAVE_INT_TIMEZONE

-----BEGIN PGP SIGNED MESSAGE-----

Thomas G. Lockhart wrote:

The 19 year differences usually seem to come from mis-handling the
HAVE_INT_TIMEZONE compile-time option. How is yours set? Try changing it
in config.h and see if it helps.

As far as I've been able to determine, the correct setting for
HAVE_INT_TIMEZONE (1) is being used in the Alpha port. It does
in fact define 'long timezone' (not 'int timezone') as being
available, as part of the tzset() man page. I have to admit that
I'm not familiar with the way that this is supposed to work, so
this may seem kind of dumb, but I did some experimenting on the
value of 'timezone' and 'tzname', since the contents of those
variables weren't documented anywhere that I could find in DEC's
man pages. I of course now know that tzname[0] is the base
timezone name, tzname[1] is the dst name, and timezone is the
number of seconds offset from GMT.

However, what I also discovered is that these values are not set
until after the tzset() routine is called. Is that normal
behavior? Doing a grep for tzset in the PG sources revealed
that it's only called for a few SQL commands. Is it called
anywhere as part of startup processing, and I'm just missing it?
Or is the DEC implementation the only one that requires an
explicit tzset() call before the use of these variables?
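
A minimal sketch of the experiment described above, assuming a POSIX
libc that provides tzset() and the global 'timezone'. The helper name
seconds_west_of_utc and the zone string "EST5" are illustrative only and
do not come from the PostgreSQL sources.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* assumption: lets glibc expose the timezone global */
#endif
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper: set TZ, call tzset() explicitly, and only then
 * read the global. Reading "timezone" without a prior tzset() (or a
 * libc call that invokes it internally) may yield an unset value.
 */
long seconds_west_of_utc(const char *tz)
{
    setenv("TZ", tz, 1); /* choose the zone for this process */
    tzset();             /* populates timezone and tzname[] */
    return timezone;     /* seconds west of GMT for standard time */
}
```

With TZ set to "EST5", the helper should report 18000, that is, five
hours west of GMT.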

- --
Dwayne Bailey + WHAT is your name? Sir Galahad
MIKA Systems, Bingham Farms, MI + WHAT is your quest? I Seek the Holy Grail
dwayne@mika.com + What is your favorite color?
http://www.mika.com/~dwayne + Blue ... no, Yelloooooooooooooooooow
finger dwayne@mika20.mika.com for PGP Public Key

-----BEGIN PGP SIGNATURE-----
Version: 2.6.2

iQB1AwUBNREaJqA2uleK7maRAQGvdwL9F5t3M1dK8Qf9MVWGa3CfKguegHyG/f9+
1Oe3OETtA5gI0GLUJkxgpVBQFMzT6kczju1AR6l7JcM2N+wXMk1lE5ULrLH96axd
T8sLQwkdjTWhNsnBBFulyocyoLPF7TzK
=SbKH
-----END PGP SIGNATURE-----

#22Maarten Boekhold
maartenb@dutepp2.et.tudelft.nl
In reply to: Dwayne Bailey (#21)
Re: [HACKERS] Timezone problems / HAVE_INT_TIMEZONE

However, what I also discovered is that these values are not set
until after the tzset() routine is called. Is that normal
behavior? Doing a grep for tzset in the PG sources revealed
that it's only called for a few SQL commands. Is it called
anywhere as part of startup processing, and I'm just missing it?
Or is the DEC implementation the only one that requires an
explicit tzset() call before the use of these variables?

AFAIK tzset() is called automagically by all time-related libc routines
when they detect it is not set yet (at least I think with Linux it is
done this way. It's been a long time since I looked at that).

Maarten

_____________________________________________________________________________
| TU Delft, The Netherlands, Faculty of Information Technology and Systems |
| Department of Electrical Engineering |
| Computer Architecture and Digital Technique section |
| M.Boekhold@et.tudelft.nl |
-----------------------------------------------------------------------------

#23Dwayne Bailey
dwayne@mika.com
In reply to: Maarten Boekhold (#22)
Re: [HACKERS] Timezone problems / HAVE_INT_TIMEZONE

-----BEGIN PGP SIGNED MESSAGE-----

On Thu, 19 Mar 1998, Maarten Boekhold wrote:

AFAIK tzset() is called automagically by all time-related libc routines
when they detect it is not set yet (at least I think with Linux it is
done this way. It's been a long time since I looked at that).

That would explain it then. I was just accessing the variables
directly, without any intervening calls.

It's a moot point, anyway. I put explicit calls in to the
startup, and it made no difference in the result. It's likely to
be a 32/64 bit issue somewhere that I haven't located yet. It
really shouldn't be that hard to track down. Since the output is
different from the input by a consistent amount (19 years +- a
few days) it can only be in one of 4 places, AFAIK: parsing
input, storing value, retrieving value, or generating output. My
bet is on the retrieve phase, but we'll see.

- --
Dwayne Bailey + WHAT is your name? Sir Galahad
MIKA Systems, Bingham Farms, MI + WHAT is your quest? I Seek the Holy Grail
dwayne@mika.com + What is your favorite color?
http://www.mika.com/~dwayne + Blue ... no, Yelloooooooooooooooooow
finger dwayne@mika20.mika.com for PGP Public Key

-----BEGIN PGP SIGNATURE-----
Version: 2.6.2

iQB1AwUBNRE5eqA2uleK7maRAQGqPQMAgajIzCAK8cBRmqCHw83mVyI8i5YI7yo4
j0jhJXG3vEauLST0B+6ompKw0+KQvRoOfgFWOoyqelZ08zo6qCBrJJmuAbGSM1/b
EbBtsORCpSymqaeDIIPHoPdaq+jG9c8e
=BiGQ
-----END PGP SIGNATURE-----

#24Thomas G. Lockhart
lockhart@alumni.caltech.edu
In reply to: Dwayne Bailey (#23)
Re: [HACKERS] Timezone problems / HAVE_INT_TIMEZONE

It's a moot point, anyway. I put explicit calls in to the
startup, and it made no difference in the result. It's likely to
be a 32/64 bit issue somewhere that I haven't located yet. It
really shouldn't be that hard to track down. Since the output is
different from the input by a consistent amount (19 years +- a
few days) it can only be in one of 4 places, AFAIK: parsing
input, storing value, retrieving value, or generating output. My
bet is on the retrieve phase, but we'll see.

Didn't this stuff work for v6.2.1, even on Alpha? afaik nothing around
this adt code changed recently...

- Tom

I moved to another job recently so left my dozen Alphas and don't have
access to man pages on them :( Have you tried compiling with
HAVE_INT_TIMEZONE disabled?

#25Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Dwayne Bailey (#1)
Re: [HACKERS] Alpha initdb fixed!

Applied.

-----BEGIN PGP SIGNED MESSAGE-----

I've gotten 6.3 initdb to run to a successful completion on my
Alpha running OSF/1 V3.2c. Forget the change that I sent in
earlier. While I still think that there's something funky with
that code, it does not need to be modified. Actually, the
modifications are minuscule. The only files that need to be
changed are backend/main/main.c and template/alpha.

The real trick is to add -Dalpha to the CFLAGS setting. The
changes to main.c are only to add some extra includes to support
some code that's suddenly being used.

The #define ASSEMBLER is to prevent most of the code of
sys/proc.h from being included, as it ends up conflicting with
some of the postgresql definitions. This may or may not work on
other versions of Digital Unix.

As far as I'm concerned, this is a hack fix. There's still some
underlying 32/64 bit assumptions that this is masking. Perhaps

-- 
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)