regress failed tests.. SERIOUS?
PLEASE NOTE: I'm brand new to PostgreSQL as of today. I've just moved from
MySQL because it's not stable on NetBSD/Alpha. I don't know enough about
pgsql to see if these failed tests would make it unstable for production.
i start the server like this:
$ postmaster -D /usr/pkg/pgsql/data > /var/pgsql/logfile 2>&1 &
I saw many messages on the console while running the regression
tests. However, none of the warning, debug, or status messages went to
the logfile as they should have. The logfile is still empty.
NetBSD/DEC-Alpha (ELF) 1.5.1_ALPHA
PostgreSQL 7.0.3
$ uname -a
NetBSD ns01 1.5.1_ALPHA NetBSD 1.5.1_ALPHA (ALPHA-$Revision: 1.127.2.2 $) #2: Fri Dec 15 16:45:58 CST 2000 tom@ns01:/usr/src/sys/arch/alpha/compile/ns01 alpha
$ grep failed regress.out
int8 .. failed
float8 .. failed
numerology .. failed
timestamp .. failed
oidjoins .. failed
type_sanity .. failed
opr_sanity .. failed
horology .. failed
rules .. failed
int8 and float8 seemed OK. int8 just had numbers with '-' or '+' signs
instead of the '<' or '>' around numbers. float8 reported the same error in a
different format.
the rest of the tests failed pretty badly. see the attachment.
Attachments:
regression.diffs (text/plain; charset=US-ASCII; name=regression.diffs) +2038-2042
"Thomas T. Thai" <tom@minnesota.com> writes:
PLEASE NOTE: I'm brand new to PostgreSQL as of today. I've just moved from
MySQL because it's not stable on NetBSD/Alpha. I don't know enough about
pgsql to see if these failed tests would make it unstable for production.
Postgres 7.0.* will not work very well on Alpha unless you apply Ryan
Kirkpatrick's patch set (I forget the URL offhand, but dig around in our
archives and you'll find it). 7.1 should be a lot better. If you'd
like to help out testing 7.1, please grab current sources from the CVS
server, or grab a snapshot tarball dated tomorrow or later.
regards, tom lane
On Fri, 29 Dec 2000, Tom Lane wrote:
Date: Fri, 29 Dec 2000 23:20:58 -0500
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Thomas T. Thai <tom@minnesota.com>
Cc: PostgreSQL General <pgsql-general@postgresql.org>
Subject: Re: regress failed tests.. SERIOUS?
"Thomas T. Thai" <tom@minnesota.com> writes:
PLEASE NOTE: I'm brand new to PostgreSQL as of today. I've just moved from
MySQL because it's not stable on NetBSD/Alpha. I don't know enough about
pgsql to see if these failed tests would make it unstable for production.
Postgres 7.0.* will not work very well on Alpha unless you apply Ryan
Kirkpatrick's patch set (I forget the URL offhand, but dig around in our
archives and you'll find it). 7.1 should be a lot better. If you'd
like to help out testing 7.1, please grab current sources from the CVS
server, or grab a snapshot tarball dated tomorrow or later.
i did just that. i applied the patch that is available at:
http://www.rkirkpat.net/software/#linux-alpha
to my NetBSD/Alpha 1.5.1_ALPHA PostgreSQL 7.0.3 package. it compiled without
errors. there were some warnings about casting between wrong pointer types etc.,
but they seem harmless.
even though Kirkpatrick said his patch was for Linux/Alpha, most of
his modifications weren't so much Linux-centric as Alpha-centric.
consequently, the patch worked out well for NetBSD/Alpha as well.
with the above patch, the regression run now reports only 3 failures, and only 2 of them are real:
$ grep failed regress.out
float8 .. failed
timestamp .. failed
horology .. failed
float8 did pass; the error message just has a different format. 'timestamp' and
'horology' not only failed but caused many 'fatal user trap' messages to be
logged in '/var/log/messages':
<cut>
Dec 30 01:22:33 ns01 /netbsd: fatal user trap:
Dec 30 01:22:33 ns01 /netbsd:
Dec 30 01:22:33 ns01 /netbsd: trap entry = 0x1 (arithmetic trap)
Dec 30 01:22:33 ns01 /netbsd: a0 = 0x2
Dec 30 01:22:33 ns01 /netbsd: a1 = 0x40000000000
Dec 30 01:22:33 ns01 /netbsd: a2 = 0xffffffffffffffff
Dec 30 01:22:33 ns01 /netbsd: pc = 0x1201449f8
Dec 30 01:22:33 ns01 /netbsd: ra = 0x120029ca4
Dec 30 01:22:33 ns01 /netbsd: curproc = 0xfffffc0023bb6c98
Dec 30 01:22:33 ns01 /netbsd: pid = 1705, comm = postgres
</cut>
the 'fatal user trap' errors seem to happen whenever a query
results in the SQL error message "ERROR: floating point exception! The
last floating point operation either exceeded legal ranges or was a
divide by zero."
the 'strings' test passed, but this line in 'strings.sql'
SELECT CAST(f1 AS char(10)) AS "char(text)" FROM TEXT_TBL;
caused this output on the console:
<cut>
pid 1684 (postgres): unaligned access: va=0x1a007dd25 pc=0x12014bd10 ra=0x12014bcac op=ldl
pid 1684 (postgres): unaligned access: va=0x1a007dd26 pc=0x12014bd10 ra=0x12014bcac op=ldl
pid 1684 (postgres): unaligned access: va=0x1a007dd27 pc=0x12014bd10 ra=0x12014bcac op=ldl
pid 1684 (postgres): unaligned access: va=0x1a007dced pc=0x12014bd10 ra=0x12014bce4 op=ldl
pid 1684 (postgres): unaligned access: va=0x1a007dcee pc=0x12014bd10 ra=0x12014bce4 op=ldl
pid 1684 (postgres): unaligned access: va=0x1a007dcef pc=0x12014bd10 ra=0x12014bce4 op=ldl
pid 1684 (postgres): unaligned access: va=0x1a007dcf1 pc=0x12014bd10 ra=0x12014bce4 op=ldl
pid 1684 (postgres): unaligned access: va=0x1a007dcf2 pc=0x12014bd10 ra=0x12014bce4 op=ldl
pid 1684 (postgres): unaligned access: va=0x1a007dcf3 pc=0x12014bd10 ra=0x12014bce4 op=ldl
pid 1684 (postgres): unaligned access: va=0x1a007dcf5 pc=0x12014bd10 ra=0x12014bce4 op=ldl
</cut>
(but nothing in '/var/log/messages').
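For readers unfamiliar with these warnings: `ldl` is the Alpha instruction that loads a 32-bit longword, and the kernel is reporting loads from addresses that are not multiples of 4. The thread does not identify the PostgreSQL source line responsible, so the following is only a generic sketch of how such a load is commonly avoided in C. The function name `load_int32_unaligned` is hypothetical and does not come from the PostgreSQL sources.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL code): read a 32-bit value from an
 * address that may not be 4-byte aligned. Copying the bytes with memcpy
 * avoids dereferencing a misaligned int32_t pointer, which is the kind of
 * access that raises an alignment trap on strict-alignment machines.
 */
int32_t
load_int32_unaligned(const void *src)
{
    int32_t value;

    memcpy(&value, src, sizeof(value));
    return value;
}
```

A caller would pass a pointer into a byte buffer, for example `load_int32_unaligned(buf + 1)`, instead of casting `buf + 1` to `int32_t *` and dereferencing it.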
i'm attaching the regression.diffs file. in addition, i'm going to move
this thread to pgsql-bugs instead of pgsql-general.
Attachments:
regression.diffs (text/plain; charset=US-ASCII; name=regression.diffs) +662-662
On Sat, 30 Dec 2000, Thomas T. Thai wrote:
i grabbed the CVS ball last night and tried to build it. i'm attaching a
patch that made it possible to build -current on NetBSD/Alpha
1.5.1_ALPHA. i would appreciate it if you have cvs write access to
integrate my patch back into the tree.
after install, i did the regression test and it failed in the same way
that 7.0.3+rkirkpat.patch did as described below (copy of my last post).
Date: Sat, 30 Dec 2000 01:42:11 -0600 (CST)
From: Thomas T. Thai <tom@minnesota.com>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: pgsql-bugs@postgresql.org, Brent Verner <brent@rcfile.org>,
Ryan Kirkpatrick <pgsql@rkirkpat.net>,
Adriaan Joubert <a.joubert@albourne.com>,
Arrigo Triulzi <arrigo@albourne.com>
Subject: NetBSD/Alpha and rkirkpat's patch [was Re: regress failed
tests.. SERIOUS?]
[...snip quoted copy of the previous post...]
Attachments:
pgsql-current.diff (text/plain; charset=US-ASCII; name=pgsql-current.diff) +23-7
On Sat, 30 Dec 2000, Thomas T. Thai wrote:
[...snip mail header...]
i grabbed the CVS ball last night and tried to build it. i'm attaching a
patch that made it possible to build -current on NetBSD/Alpha
1.5.1_ALPHA. i would appreciate it if you have cvs write access to
integrate my patch back into the tree.
after install, i did the regression test and it failed in the same way
that 7.0.3+rkirkpat.patch did as described below (copy of my last post).
[...snip regression test outputs...]
i forgot to mention that i wasn't able to do the serial regression test
because it didn't find the right socket file in /tmp. however the parallel
test worked (with failed tests). i did run psql to verify that it can talk
to the running postmaster. serial regression worked in 7.0.3 though.
### Verify that postmaster is running ###################################
$ ps axj | grep postmaster
pgsql 18355 1 18355 3c280 0 I p0 0:00.04 ./postmaster -D /var/pgsql/data (postgres
$ whoami
pgsql
$ pwd
/usr/local/build/pgsql-current/src/test/regress
### start the serial regression test ####################################
$ gmake runtest
gmake -C ../../../contrib/spi REFINT_VERBOSE=1 refint.so autoinc.so
gmake[1]: Entering directory `/usr/local/build/pgsql-current/contrib/spi'
gmake[1]: `refint.so' is up to date.
gmake[1]: `autoinc.so' is up to date.
gmake[1]: Leaving directory `/usr/local/build/pgsql-current/contrib/spi'
/bin/sh ./pg_regress --schedule=./serial_schedule --multibyte=
(using postmaster on Unix socket, default port)
============== dropping database "regression" ==============
psql: connectDBStart() -- connect() failed: No such file or directory
Is the postmaster running locally
and accepting connections on Unix socket '/tmp/.s.PGSQL.0'?
dropdb: database removal failed
============== creating database "regression" ==============
psql: connectDBStart() -- connect() failed: No such file or directory
Is the postmaster running locally
and accepting connections on Unix socket '/tmp/.s.PGSQL.0'?
createdb: database creation failed
pg_regress: createdb failed
### Show that postmaster is still running ###############################
$ ps axj | grep postmaster
pgsql 18355 1 18355 3c280 0 I p0 0:00.04 ./postmaster -D /var/pgsql/data (postgres
### Verify that there is a socket file ##################################
$ ls -la /tmp | grep PGSQL
srwxrwxrwx 1 pgsql wheel 0 Dec 30 18:01 .s.PGSQL.5432
-rw------- 1 pgsql wheel 22 Dec 30 18:01 .s.PGSQL.5432.lock
### Verify that postmaster will respond to local clients ################
$ /usr/local/install/pgsql-current/bin/psql mydb
Welcome to psql, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? for help on internal slash commands
\g or terminate with semicolon to execute query
\q to quit
mydb=# select version();
version
----------------------------------------------------------------------------------
PostgreSQL 7.1beta1 on alpha-unknown-netbsdelf1.5.1., compiled by GCC egcs-1.1.2
(1 row)
mydb=#
"Thomas T. Thai" <tom@minnesota.com> writes:
psql: connectDBStart() -- connect() failed: No such file or directory
Is the postmaster running locally
and accepting connections on Unix socket '/tmp/.s.PGSQL.0'?
Hmm, do you have an environment definition for PGPORT?
I notice that pg_regress.sh contains
export PGPORT
but it doesn't necessarily set any value for PGPORT. It seems possible
that some shells may take this as license to invent an empty-string
value for PGPORT, which would cause libpq to think that port 0 is being
specified.
My feeling is that libpq ought to ignore an empty-string PGPORT
environment value, rather than treat it as selecting port 0.
Comments anyone?
regards, tom lane
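The behaviour Tom proposes can be sketched as follows. This is a minimal illustration and not the actual libpq code: the function names `resolve_port` and `port_from_environment` and the `DEFAULT_PORT` macro are hypothetical, and 5432 is taken from the `.s.PGSQL.5432` socket file shown earlier in the thread. Falling back to the default for non-numeric or out-of-range values is an extra assumption that goes beyond what was suggested.

```c
#include <stdlib.h>

/* Hypothetical default; 5432 matches the .s.PGSQL.5432 socket seen above. */
#define DEFAULT_PORT 5432

/*
 * Hypothetical helper, not libpq's real code: turn a PGPORT-style string
 * into a port number, treating unset (NULL) and empty-string values as
 * "use the default" instead of letting them parse as port 0.
 */
int
resolve_port(const char *value)
{
    char *end;
    long  port;

    if (value == NULL || value[0] == '\0')
        return DEFAULT_PORT;        /* unset or empty: fall back */

    port = strtol(value, &end, 10);
    if (*end != '\0' || port <= 0 || port > 65535)
        return DEFAULT_PORT;        /* assumption: reject unusable values */

    return (int) port;
}

/* Look the value up in the environment, as a client library might. */
int
port_from_environment(void)
{
    return resolve_port(getenv("PGPORT"));
}
```

Without the empty-string check, a plain numeric conversion of "" yields 0, which would match the '/tmp/.s.PGSQL.0' socket name in the error message.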
i concur with this.
On Sat, 30 Dec 2000, Tom Lane wrote:
Date: Sat, 30 Dec 2000 20:10:58 -0500
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Thomas T. Thai <tom@minnesota.com>
Cc: pgsql-bugs@postgresql.org, Brent Verner <brent@rcfile.org>,
Ryan Kirkpatrick <pgsql@rkirkpat.net>,
Adriaan Joubert <a.joubert@albourne.com>,
Arrigo Triulzi <arrigo@albourne.com>
Subject: Re: NetBSD/Alpha and PostgreSQL-current [was Re: NetBSD/Alpha
and rkirkpat's patch]
[...snip quoted message...]
Hmm, do you have an environment definition for PGPORT?
I notice that pg_regress.sh contains
export PGPORT
but it doesn't necessarily set any value for PGPORT. It seems possible
that some shells may take this as license to invent an empty-string
value for PGPORT, which would cause libpq to think that port 0 is being
specified.
My feeling is that libpq ought to ignore an empty-string PGPORT
environment value, rather than treat it as selecting port 0.
Comments anyone?
Agreed. I have already committed changes to ignore an empty-string pgport
parameter of PQsetdbLogin(). The same should be applied to the PGPORT
environment variable too, I think.
--
Tatsuo Ishii
"Thomas T. Thai" <tom@minnesota.com> writes:
i grabbed the CVS ball last night and tried to build it. i'm attaching a
patch that made it possible to build -current on NetBSD/Alpha
1.5.1_ALPHA.
Partially applied, per comments below.
after install, i did the regression test and it failed in the same way
that 7.0.3+rkirkpat.patch did as described below (copy of my last post).
Hmm, no idea what's going on here. Could you compile with -g and then
use gdb to track the reported PC addresses to particular source lines?
That might give us a clue.
--- /usr/local/source/postgresql/pgsql/src/backend/main/main.c Fri Nov 24 21:45:47 2000
+++ /usr/local/build/pgsql-current/src/backend/main/main.c Sat Dec 30 15:06:34 2000
-#if defined(__alpha) && !defined(linux) && !defined(__FreeBSD__)
+#if defined(__alpha) && !defined(linux) && !defined(__FreeBSD__) && !defined(__NetBSD__)
#include <sys/sysinfo.h>
#include "machine/hal_sysinfo.h"
Applied, but I begin to think that we should be testing here for the
*presence* of a Tru64 symbol, rather than the absence of a bunch of
other OSes. Anyone know what would be suitable?
+#include <sys/param.h>
I inserted this conditionally on #if defined(__NetBSD__). It seems
a bad idea to risk breaking other ports to fix yours.
--- /usr/local/source/postgresql/pgsql/src/include/port/netbsd.h Sun Oct 29 07:17:34 2000
+++ /usr/local/build/pgsql-current/src/include/port/netbsd.h Sat Dec 30 14:59:06 2000
netbsd.h changes look good, applied.
--- /usr/local/source/postgresql/pgsql/src/include/storage/s_lock.h Fri Dec 29 20:34:56 2000
+++ /usr/local/build/pgsql-current/src/include/storage/s_lock.h Sat Dec 30 14:59:37 2000
@@ -241,7 +241,17 @@
#if defined(NEED_NS32K_TAS_ASM)
#define TAS(lock) tas(lock)
+#if defined(__GNUC__)
+/*
+ * GCC on the Alpha doesn't appear to handle inlining of assembly with
+ * %0 or %1 properly. This removes the inlining of the tas (test-and-set)
+ * function, which probably slows things down considerably, but correctness
+ * first!
+ */
+static int
+#else
static __inline__ int
+#endif
tas(volatile slock_t *lock)
{
register _res;
Uh, why are you altering NS32K code in an Alpha patch? I did not apply
this.
regards, tom lane
On Sat, 30 Dec 2000, Tom Lane wrote:
[snipped header]
"Thomas T. Thai" <tom@minnesota.com> writes:
i grabbed the CVS ball last night and tried to build it. i'm attaching a
patch that made it possible to build -current on NetBSD/Alpha
1.5.1_ALPHA.
Partially applied, per comments below.
after install, i did the regression test and it failed in the same way
that 7.0.3+rkirkpat.patch did as described below (copy of my last post).
Hmm, no idea what's going on here. Could you compile with -g and then
use gdb to track the reported PC addresses to particular source lines?
That might give us a clue.
will do.
--- /usr/local/source/postgresql/pgsql/src/backend/main/main.c Fri Nov 24 21:45:47 2000
+++ /usr/local/build/pgsql-current/src/backend/main/main.c Sat Dec 30 15:06:34 2000
-#if defined(__alpha) && !defined(linux) && !defined(__FreeBSD__)
+#if defined(__alpha) && !defined(linux) && !defined(__FreeBSD__) && !defined(__NetBSD__)
#include <sys/sysinfo.h>
#include "machine/hal_sysinfo.h"
Applied, but I begin to think that we should be testing here for the
*presence* of a Tru64 symbol, rather than the absence of a bunch of
other OSes. Anyone know what would be suitable?
i don't know what the symbol might be.
+#include <sys/param.h>
I inserted this conditionally on #if defined(__NetBSD__). It seems
a bad idea to risk breaking other ports to fix yours.
agreed.
--- /usr/local/source/postgresql/pgsql/src/include/port/netbsd.h Sun Oct 29 07:17:34 2000
+++ /usr/local/build/pgsql-current/src/include/port/netbsd.h Sat Dec 30 14:59:06 2000
netbsd.h changes look good, applied.
--- /usr/local/source/postgresql/pgsql/src/include/storage/s_lock.h Fri Dec 29 20:34:56 2000
+++ /usr/local/build/pgsql-current/src/include/storage/s_lock.h Sat Dec 30 14:59:37 2000
@@ -241,7 +241,17 @@
#if defined(NEED_NS32K_TAS_ASM)
#define TAS(lock) tas(lock)
+#if defined(__GNUC__)
+/*
+ * GCC on the Alpha doesn't appear to handle inlining of assembly with
+ * %0 or %1 properly. This removes the inlining of the tas (test-and-set)
+ * function, which probably slows things down considerably, but correctness
+ * first!
+ */
+static int
+#else
static __inline__ int
+#endif
tas(volatile slock_t *lock)
{
register _res;
Uh, why are you altering NS32K code in an Alpha patch? I did not apply
this.
because egcs on NetBSD/Alpha gives lots of errors during compile. we
don't have gcc 2.95.2 working on the alpha yet.
"Thomas T. Thai" <tom@minnesota.com> writes:
Uh, why are you altering NS32K code in an Alpha patch? I did not apply
this.
because egcs on NetBSD/Alpha gives lots of errors during compile. we
don't have gcc 2.95.2 working on the alpha yet.
But the proposed diff is inside #if defined(NEED_NS32K_TAS_ASM).
How can that affect an Alpha compilation at all?
regards, tom lane
Tom Lane writes:
--- /usr/local/source/postgresql/pgsql/src/backend/main/main.c Fri Nov 24 21:45:47 2000
+++ /usr/local/build/pgsql-current/src/backend/main/main.c Sat Dec 30 15:06:34 2000
-#if defined(__alpha) && !defined(linux) && !defined(__FreeBSD__)
+#if defined(__alpha) && !defined(linux) && !defined(__FreeBSD__) && !defined(__NetBSD__)
#include <sys/sysinfo.h>
#include "machine/hal_sysinfo.h"
Applied, but I begin to think that we should be testing here for the
*presence* of a Tru64 symbol, rather than the absence of a bunch of
other OSes. Anyone know what would be suitable?
__osf__
--
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
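Combining Peter's answer with Tom's suggestion, the guard in main.c could test for Tru64 positively instead of excluding every other Alpha OS. The following is a sketch only: the thread does not show what was finally committed, and both the `TRU64_ALPHA` macro and the `built_for_tru64_alpha` function are hypothetical names added here so the selected branch can be inspected at run time.

```c
/*
 * Sketch: select the Tru64-only code path by the presence of __osf__
 * rather than by the absence of Linux, FreeBSD, NetBSD, and so on.
 */
#if defined(__alpha) && defined(__osf__)
#define TRU64_ALPHA 1           /* hypothetical macro name */
/*
 * The Tru64-specific headers from the diff would be included here:
 *   #include <sys/sysinfo.h>
 *   #include "machine/hal_sysinfo.h"
 */
#else
#define TRU64_ALPHA 0
#endif

/* Hypothetical helper: reports which branch was compiled in. */
int
built_for_tru64_alpha(void)
{
    return TRU64_ALPHA;
}
```

With a positive test like this, adding a new Alpha port would not require touching the condition again, which is the maintenance problem the `!defined(...)` chain was running into.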