parallel regression test failure

Started by Bruce Momjian, over 22 years ago, 30 messages, hackers
#1Bruce Momjian
bruce@momjian.us

I am seeing the following parallel regression test failures. Any idea
on the cause?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Attachments:

/bjm/regression.diffs (text/plain, +23 -18)
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#1)
Re: parallel regression test failure

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I am seeing the following parallel regression test failures. Any idea
on the cause?

I don't see it here, on either of two different architectures. Maybe
you need a make distclean and rebuild?

regards, tom lane

#3Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#2)
Re: parallel regression test failure

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I am seeing the following parallel regression test failures. Any idea
on the cause?

I don't see it here, on either of two different architectures. Maybe
you need a make distclean and rebuild?

I do (I run tools/pgtest), and see the failure regularly. It is a
dual-cpu Xeon machine. I run it every night and it fails 25% of the
time.

#4Robert Creager
Robert_Creager@LogicalChaos.org
In reply to: Tom Lane (#2)
Re: parallel regression test failure

On Fri, 25 Jul 2003 19:57:04 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I am seeing the following parallel regression test failures. Any
idea on the cause?

I don't see it here, on either of two different architectures. Maybe
you need a make distclean and rebuild?

I was just able to reproduce some failures on my dual Athlon machine
(after about 10 runs) with a clean CVS download.

Linux thunder.mshome.net 2.4.21-0.13_test #35 SMP Wed Apr 9 07:29:10 MDT
2003 i686 unknown unknown GNU/Linux

gcc (GCC) 3.2.2 (Mandrake Linux 9.1 3.2.2-3mdk)

<./configure --with-pgport=5433 --prefix=/usr/local/pgsql_cvs>
<sh src/tools/pgtest>
<sh src/tools/pgtest -n>

*** ./expected/triggers.out	Thu Jul 24 11:52:50 2003
--- ./results/triggers.out	Fri Jul 25 21:20:34 2003
***************
*** 92,97 ****
--- 92,98 ----
  DROP TABLE pkeys;
  DROP TABLE fkeys;
  DROP TABLE fkeys2;
+ ERROR:  could not open relation with OID 119498
  -- -- I've disabled the funny_dup17 test because the new semantics
  -- -- of AFTER ROW triggers, which get now fired at the end of a
  -- -- query always, cause funny_dup17 to enter an endless loop.

======================================================================

*** ./expected/sanity_check.out	Wed May 28 10:04:02 2003
--- ./results/sanity_check.out	Fri Jul 25 21:20:37 2003
***************
*** 15,20 ****
--- 15,21 ----
   bt_name_heap        | t
   bt_txt_heap         | t
   fast_emp4000        | t
+  fkeys2              | t
   func_index_heap     | t
   hash_f8_heap        | t
   hash_i4_heap        | t
***************
*** 62,68 ****
   shighway            | t
   tenk1               | t
   tenk2               | t
! (52 rows)

  --
  -- another sanity check: every system catalog that has OIDs should have
--- 63,69 ----
   shighway            | t
   tenk1               | t
   tenk2               | t
! (53 rows)

  --
  -- another sanity check: every system catalog that has OIDs should have

======================================================================

*** ./expected/misc.out	Fri Jul 25 21:14:51 2003
--- ./results/misc.out	Fri Jul 25 21:20:39 2003
***************
*** 598,603 ****
--- 598,604 ----
   equipment_r
   f_star
   fast_emp4000
+  fkeys2
   float4_tbl
   float8_tbl
   func_index_heap
***************
*** 660,666 ****
   toyemp
   varchar_tbl
   xacttest
! (96 rows)

  --SELECT name(equipment(hobby_construct(text 'skywalking', text 'mer'))) AS equip_name;
  SELECT hobbies_by_name('basketball');
--- 661,667 ----
   toyemp
   varchar_tbl
   xacttest
! (97 rows)

  --SELECT name(equipment(hobby_construct(text 'skywalking', text 'mer'))) AS equip_name;
  SELECT hobbies_by_name('basketball');

======================================================================

--
21:23:44 up 8 days, 1:24, 2 users, load average: 0.11, 1.04, 1.31

#5Robert Creager
Robert_Creager@LogicalChaos.org
In reply to: Tom Lane (#2)
Re: parallel regression test failure

On Fri, 25 Jul 2003 19:57:04 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I am seeing the following parallel regression test failures. Any
idea on the cause?

I don't see it here, on either of two different architectures. Maybe
you need a make distclean and rebuild?

And another failure:

*** ./expected/constraints.out	Fri Jul 25 21:14:51 2003
--- ./results/constraints.out	Fri Jul 25 21:34:09 2003
***************
*** 212,244 ****
  DROP SEQUENCE INSERT_SEQ;
  CREATE SEQUENCE INSERT_SEQ START 4;
  CREATE TABLE tmp (xd INT, yd TEXT, zd INT);
  INSERT INTO tmp VALUES (null, 'Y', null);
  INSERT INTO tmp VALUES (5, '!check failed', null);
  INSERT INTO tmp VALUES (null, 'try again', null);
  INSERT INTO INSERT_TBL(y) select yd from tmp;
  SELECT '' AS three, * FROM INSERT_TBL;
   three | x |       y       | z  
! -------+---+---------------+----
!        | 4 | Y             | -4
!        | 5 | !check failed | -5
!        | 6 | try again     | -6
! (3 rows)

  INSERT INTO INSERT_TBL SELECT * FROM tmp WHERE yd = 'try again';
  INSERT INTO INSERT_TBL(y,z) SELECT yd, -7 FROM tmp WHERE yd = 'try again';
  INSERT INTO INSERT_TBL(y,z) SELECT yd, -8 FROM tmp WHERE yd = 'try again';
! ERROR:  new row for relation "insert_tbl" violates CHECK constraint "insert_con"
  SELECT '' AS four, * FROM INSERT_TBL;
   four | x |       y       | z  
! ------+---+---------------+----
!       | 4 | Y             | -4
!       | 5 | !check failed | -5
!       | 6 | try again     | -6
!       |   | try again     |   
!       | 7 | try again     | -7
! (5 rows)

  DROP TABLE tmp;
  --
  -- Check constraints on UPDATE
  --
--- 212,244 ----
  DROP SEQUENCE INSERT_SEQ;
  CREATE SEQUENCE INSERT_SEQ START 4;
  CREATE TABLE tmp (xd INT, yd TEXT, zd INT);
+ ERROR:  relation 126260 deleted while still in use
  INSERT INTO tmp VALUES (null, 'Y', null);
+ ERROR:  relation "tmp" does not exist
  INSERT INTO tmp VALUES (5, '!check failed', null);
+ ERROR:  relation "tmp" does not exist
  INSERT INTO tmp VALUES (null, 'try again', null);
+ ERROR:  relation "tmp" does not exist
  INSERT INTO INSERT_TBL(y) select yd from tmp;
+ ERROR:  relation "tmp" does not exist
  SELECT '' AS three, * FROM INSERT_TBL;
   three | x | y | z 
! -------+---+---+---
! (0 rows)

  INSERT INTO INSERT_TBL SELECT * FROM tmp WHERE yd = 'try again';
+ ERROR:  relation "tmp" does not exist
  INSERT INTO INSERT_TBL(y,z) SELECT yd, -7 FROM tmp WHERE yd = 'try again';
+ ERROR:  relation "tmp" does not exist
  INSERT INTO INSERT_TBL(y,z) SELECT yd, -8 FROM tmp WHERE yd = 'try again';
! ERROR:  relation "tmp" does not exist
  SELECT '' AS four, * FROM INSERT_TBL;
   four | x | y | z 
! ------+---+---+---
! (0 rows)

  DROP TABLE tmp;
+ ERROR:  table "tmp" does not exist
  --
  -- Check constraints on UPDATE
  --
***************
*** 246,261 ****
  UPDATE INSERT_TBL SET x = 6 WHERE x = 6;
  UPDATE INSERT_TBL SET x = -z, z = -x;
  UPDATE INSERT_TBL SET x = z, z = x;
- ERROR:  new row for relation "insert_tbl" violates CHECK constraint "insert_con"
  SELECT * FROM INSERT_TBL;
   x |       y       | z  
! ---+---------------+----
!  4 | Y             | -4
!    | try again     |   
!  7 | try again     | -7
!  5 | !check failed |   
!  6 | try again     | -6
! (5 rows)

  -- DROP TABLE INSERT_TBL;
  --
--- 246,255 ----
  UPDATE INSERT_TBL SET x = 6 WHERE x = 6;
  UPDATE INSERT_TBL SET x = -z, z = -x;
  UPDATE INSERT_TBL SET x = z, z = x;
  SELECT * FROM INSERT_TBL;
   x | y | z 
! ---+---+---
! (0 rows)

  -- DROP TABLE INSERT_TBL;
  --

======================================================================

--
21:34:48 up 8 days, 1:35, 2 users, load average: 0.89, 0.65, 0.85

#6Robert Creager
Robert_Creager@LogicalChaos.org
In reply to: Tom Lane (#2)
Re: parallel regression test failure

On Fri, 25 Jul 2003 19:57:04 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I am seeing the following parallel regression test failures. Any
idea on the cause?

I don't see it here, on either of two different architectures. Maybe
you need a make distclean and rebuild?

I've attached a little Perl script which runs pgtest over and over
(with -n option), checking for failures and saving the output
(runX.out) and the diffs (failX.diffs) in /tmp for each failing run.

Run it from the top level (as you would pgtest).
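
The attached Perl script is not reproduced in this archive. As a rough illustration only, the loop it describes could look like the following Python sketch. The function names `run_pgtest` and `loop` are hypothetical, and the location of `regression.diffs` is an assumption about where a failing run leaves its diffs; it may need adjusting for a given setup.

```python
import subprocess
from pathlib import Path


def run_pgtest():
    """Run one pgtest pass and return (passed, output_text, diffs_text).

    Assumptions (not confirmed by the thread): pgtest is started from the
    top of the source tree, and a failing run leaves a non-empty
    src/test/regress/regression.diffs behind.
    """
    proc = subprocess.run(
        ["sh", "src/tools/pgtest", "-n"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    diffs = Path("src/test/regress/regression.diffs")
    diffs_text = diffs.read_text() if diffs.exists() else ""
    return (diffs_text == "", proc.stdout, diffs_text)


def loop(iterations, run_once=run_pgtest, out_dir="/tmp"):
    """Repeat the test run, keeping output and diffs only for failing runs."""
    out = Path(out_dir)
    failed = 0
    for i in range(1, iterations + 1):
        passed, output, diffs = run_once()
        if not passed:
            failed += 1
            # File names follow the runX.out / failX.diffs scheme described above.
            (out / f"run{i}.out").write_text(output)
            (out / f"fail{i}.diffs").write_text(diffs)
        print(f"{i} of {iterations} - failed {failed} ({100 * failed / i:.0f}%)")
    return failed
```

Calling `loop(25)` from the top of the source tree would attempt 25 runs. The `run_once` parameter exists only so the loop can be exercised without a PostgreSQL build.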

Later,
Rob

--
22:25:11 up 8 days, 2:26, 2 users, load average: 2.40, 1.61, 1.57

Attachments:

pgtest_loop.pl (application/octet-stream)
#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#3)
Re: parallel regression test failure

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I run it every night and it fails 25% of the time.

When did you start seeing the problem?

I just wasted an hour running eighty-some iterations of "make check"
on two different machines/OSes/architectures. Zero failures. I also
eyeballed recent changes in the relcache/catcache area, which seems to
be what's unhappy, without finding anything.
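
For scale, here is a small sketch of how unlikely a clean streak of that length would be if these machines shared the reported failure rate. It assumes independent runs, takes the 25% per-run figure quoted above, and rounds "eighty-some" down to 80; the function name is hypothetical.

```python
def prob_all_pass(fail_rate, runs):
    """Chance that `runs` independent runs all pass at a given per-run failure rate."""
    return (1.0 - fail_rate) ** runs


# Roughly 1e-10 for a 25% failure rate over 80 runs.
print(f"{prob_all_pass(0.25, 80):.1e}")
```

Under those assumptions, chance alone is a very poor explanation for zero failures, which points at some difference between the machines or builds.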

I think it's up to yunz as are seeing misbehavior to roll up your
sleeves and debug the problem. There's nothing more I can do.

regards, tom lane

#8Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#7)
Re: parallel regression test failure

Let me get the patch queue applied, then use CVS to backtrack and find
the date it started failing. I think you need a dual cpu machine to see
the failures.

#9Robert Creager
Robert_Creager@LogicalChaos.org
In reply to: Tom Lane (#7)
Re: parallel regression test failure

On Sat, 26 Jul 2003 01:00:46 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I run it every night and it fails 25% of the time.

When did you start seeing the problem?

I just wasted an hour running eighty-some iterations of "make check"
on two different machines/OSes/architectures. Zero failures. I also
eyeballed recent changes in the relcache/catcache area, which seems to
be what's unhappy, without finding anything.

I think it's up to yunz as are seeing misbehavior to roll up your
sleeves and debug the problem. There's nothing more I can do.

Any suggestions on how those of us who are not pg developers might
help figure out what's up?

1 of 25 - failed 0 (0%)
2 of 25 - failed 0 (0%)
3 of 25 - failed 0 (0%)
4 of 25 - failed 0 (0%)
5 of 25 - failed 0 (0%)
6 of 25 - failed 0 (0%)
7 of 25 - failed 0 (0%)
8 of 25 - failed 0 (0%)
9 of 25 - failed 0 (0%)
10 of 25 - failed 0 (0%)
11 of 25 - failed 1 (9%)
12 of 25 - failed 2 (17%)
13 of 25 - failed 2 (15%)
14 of 25 - failed 2 (14%)
15 of 25 - failed 3 (20%)
16 of 25 - failed 3 (19%)
17 of 25 - failed 3 (18%)
18 of 25 - failed 4 (22%)
19 of 25 - failed 4 (21%)
20 of 25 - failed 4 (20%)
21 of 25 - failed 5 (24%)
22 of 25 - failed 6 (27%)
23 of 25 - failed 6 (26%)
24 of 25 - failed 7 (29%)
25 of 25 - failed 8 (32%)
constraints failed 1 times
cluster failed 1 times
foreign_key failed 1 times
misc failed 6 times
sanity_check failed 3 times
inherit failed 2 times
triggers failed 4 times
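
The per-test counts above can be derived from the saved diffs. A minimal sketch, assuming the diffs of each failing run are kept and that every failing test contributes one `*** ./expected/NAME.out` header, as in the diffs posted earlier in this thread. The names `failed_tests` and `tally` are hypothetical.

```python
import re
from collections import Counter

# Matches context-diff file headers such as "*** ./expected/triggers.out<TAB>date".
# Hunk headers ("*** 92,97 ****") and separators ("***************") do not match.
HEADER = re.compile(r"^\*\*\* \./expected/(\w+)\.out\b", re.MULTILINE)


def failed_tests(diffs_text):
    """Names of the tests that appear in one run's diffs, each listed once."""
    return sorted(set(HEADER.findall(diffs_text)))


def tally(all_diffs):
    """Count, per test, in how many failing runs it showed up."""
    counts = Counter()
    for text in all_diffs:
        counts.update(failed_tests(text))
    return counts
```

The counts can add up to more than the number of failing runs, since one run may fail several tests at once, as the totals above show.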

--
08:21:18 up 8 days, 12:22, 2 users, load average: 0.08, 0.65, 1.58

#10Bruce Momjian
bruce@momjian.us
In reply to: Robert Creager (#9)
Re: parallel regression test failure

I am going to use cvs -d to pull an older CVS snapshot and see if that
fails, so we can track down the date it started failing.

#11Bruce Momjian
bruce@momjian.us
In reply to: Robert Creager (#9)
Re: parallel regression test failure

If you would like to do the cvs -d testing yourself instead of me, let
me know. It will take me a few hours to get to it anyway.

#12Robert Creager
Robert_Creager@LogicalChaos.org
In reply to: Bruce Momjian (#11)
Re: parallel regression test failure

On Sat, 26 Jul 2003 10:47:12 -0400 (EDT)
Bruce Momjian <pgman@candle.pha.pa.us> said something like:

If you would like to do the cvs -d testing yourself instead of me, let
me know. It will take me a few hours to get to it anyway.

I will start pulling down old versions (once I figure out the -d
syntax). Do you recall how long you may have been seeing this?

Thanks,
Rob

--
08:54:59 up 8 days, 12:55, 2 users, load average: 2.38, 1.12, 1.14

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#8)
Re: parallel regression test failure

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I think you need a dual cpu machine to see the failures.

I was wondering about that myself, but we shouldn't fixate on that
assumption without more evidence. There could be some other factor
explaining why I can't reproduce it. A couple of questions for both
of you:
- what configure options are you using?
- can you reproduce the problem with serial tests (make installcheck)?
- exactly how repeatable is it --- when it fails, is it always at the
same places, or do the failures move around?

It would also be good to find out exactly where the failures are coming
from. Please try running the tests with LOG_ERROR_VERBOSITY set to
VERBOSE (probably the easiest way to hack this in make check's temp
installation is to modify src/backend/utils/misc/postgresql.conf.sample).
Then the postmaster log file created by make check will show the elog
calls' locations.
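
One way to script that edit is sketched below. It appends the setting to whichever configuration file is passed in; the helper name `enable_verbose_errors` is hypothetical, and the sketch relies on the last occurrence of a setting in the file being the one that takes effect.

```python
from pathlib import Path

SETTING = "log_error_verbosity = verbose"


def enable_verbose_errors(conf_path):
    """Append the setting unless an identical uncommented line is already there.

    Returns True if the file was changed, False otherwise.
    """
    path = Path(conf_path)
    lines = path.read_text().splitlines()
    if any(line.strip() == SETTING for line in lines):
        return False
    with path.open("a") as f:
        # Leading newline guards against a file that lacks a final newline.
        f.write("\n" + SETTING + "\n")
    return True
```

For the temp installation the path would be the sample file named above, for example `enable_verbose_errors("src/backend/utils/misc/postgresql.conf.sample")`, followed by a rebuild so that make check picks it up.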

regards, tom lane

#14Robert Creager
Robert_Creager@LogicalChaos.org
In reply to: Bruce Momjian (#11)
Re: parallel regression test failure

On Sat, 26 Jul 2003 10:47:12 -0400 (EDT)
Bruce Momjian <pgman@candle.pha.pa.us> said something like:

If you would like to do the cvs -d testing yourself instead of me, let
me know. It will take me a few hours to get to it anyway.

Just to make sure I've got this right:

cvs update -D yyyy-mm-dd
make maintainer-clean
./configure
make
test

--
09:05:56 up 8 days, 13:06, 2 users, load average: 2.59, 2.90, 2.14

#15Bruce Momjian
bruce@momjian.us
In reply to: Robert Creager (#12)
Re: parallel regression test failure

Robert Creager wrote:

On Sat, 26 Jul 2003 10:47:12 -0400 (EDT)
Bruce Momjian <pgman@candle.pha.pa.us> said something like:

If you would like to do the cvs -d testing yourself instead of me, let
me know. It will take me a few hours to get to it anyway.

I will start pulling down old versions (once I figure out the -d
syntax). Do you recall how long you may have been seeing this?

I think you just take a CVS checkout and do:

cvs update -D '2003-05-01 00:00:00 GMT' pgsql

and keep changing the dates to find the date it started breaking.
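
Because the failure is intermittent, a single clean run at a given date proves little. The sketch below shows one way to organize the search: a binary search over dates, plus a helper for how many clean runs are needed before calling a date good. It assumes independent runs and that the tree stays broken once it breaks. `is_bad` is a hypothetical callback that would check out the tree at a date, rebuild it, and repeat the tests.

```python
from datetime import date, timedelta


def runs_needed(fail_rate, miss_prob):
    """Smallest n such that a broken tree passes n runs in a row
    with probability at most miss_prob."""
    n = 1
    while (1.0 - fail_rate) ** n > miss_prob:
        n += 1
    return n


def first_bad_date(good, bad, is_bad):
    """Earliest failing date, given a passing `good` and a failing `bad`."""
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid
        else:
            good = mid
    return bad
```

With a 25% per-run failure rate, `runs_needed(0.25, 0.01)` gives 17, so each "good" verdict costs many test runs; the binary search keeps the number of dates to check small.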

#16Bruce Momjian
bruce@momjian.us
In reply to: Robert Creager (#14)
Re: parallel regression test failure

Yep, I think that is it, though the last one is pgtest or whatever you
are using for testing.

#17Bruce Momjian
bruce@momjian.us
In reply to: Robert Creager (#12)
Re: parallel regression test failure

Robert Creager wrote:

On Sat, 26 Jul 2003 10:47:12 -0400 (EDT)
Bruce Momjian <pgman@candle.pha.pa.us> said something like:

If you would like to do the cvs -d testing yourself instead of me, let
me know. It will take me a few hours to get to it anyway.

I will start pulling down old versions (once I figure out the -d
syntax). Do you recall how long you may have been seeing this?

Since it is random, I hadn't noticed when it started, and originally
suspected my hardware. I recently upgraded my hardware, around May 1, I
think.

#18Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#13)
Re: parallel regression test failure

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I think you need a dual cpu machine to see the failures.

I was wondering about that myself, but we shouldn't fixate on that
assumption without more evidence. There could be some other factor
explaining why I can't reproduce it. A couple of questions for both
of you:
- what configure options are you using?

configure \
--with-x \
--with-threads \
--with-tcl \
--with-perl \
--with-python \
--enable-pltcl-unknown \
--with-tclconfig=/u/lib \
--with-tkconfig=/u/lib \
--enable-cassert \
--with-includes="/usr/local/include/readline /usr/contrib/include" \
--with-libraries="/usr/local/lib /usr/contrib/lib" \
--enable-locale \
--enable-multibyte \
--with-recode \
--with-openssl

- can you reproduce the problem with serial tests (make installcheck)?

No, I have never seen a serial failure, and when I get a parallel
failure, I run the serial tests to make sure it is just the parallel
test, and serial always passes.

- exactly how repeatable is it --- when it fails, is it always at the
same places, or do the failures move around?

No, different, as reported by Robert, but it usually has to do with the
constraint, trigger, and sanity tests. I assumed we just had a dependency
in the parallel regression tests and just needed to make an adjustment,
but looking at the diffs more closely, I see it is more serious.

It would also be good to find out exactly where the failures are coming
from. Please try running the tests with LOG_ERROR_VERBOSITY set to
VERBOSE (probably the easiest way to hack this in make check's temp
installation is to modify src/backend/utils/misc/postgresql.conf.sample).
Then the postmaster log file created by make check will show the elog
calls' locations.

OK.

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Creager (#14)
Re: parallel regression test failure

Robert Creager <Robert_Creager@LogicalChaos.org> writes:

Just to make sure I've got this right:

cvs update -D yyyy-mm-dd
make maintainer-clean
./configure
make
test

I'd do the "make maintainer-clean" before cvs update'ing, but otherwise
probably right. Watch the output the first couple times and make sure
cvs is actually willing to replace files in both the forward and
backward directions.
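
The suggested ordering can be written down as a small sketch. It assumes the tree has already been configured once, so that "make maintainer-clean" has Makefiles to work with; `rebuild_steps` and `rebuild` are hypothetical names, and the test step is left out because it depends on the test driver in use.

```python
import subprocess


def rebuild_steps(snapshot_date):
    """Commands for one snapshot, cleaning before the update as suggested."""
    return [
        ["make", "maintainer-clean"],
        ["cvs", "update", "-D", snapshot_date],
        ["./configure"],
        ["make"],
    ]


def rebuild(snapshot_date, run=subprocess.check_call):
    """Run the steps in order; check_call stops at the first failing command."""
    for cmd in rebuild_steps(snapshot_date):
        run(cmd)
```

The cvs output still needs watching by hand for the first few dates, as noted above.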

regards, tom lane

#20Robert Creager
Robert_Creager@LogicalChaos.org
In reply to: Tom Lane (#13)
Re: parallel regression test failure

./configure --with-pgport=5433 --prefix=/usr/local/pgsql_cvs

The failures move around (out of 25 runs):

constraints failed 1 times
cluster failed 1 times
foreign_key failed 1 times
misc failed 6 times
sanity_check failed 3 times
inherit failed 2 times
triggers failed 4 times

I have not tried installcheck yet.

On Sat, 26 Jul 2003 11:06:21 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

- what configure options are you using?
- can you reproduce the problem with serial tests (make installcheck)?
- exactly how repeatable is it --- when it fails, is it always at the
same places, or do the failures move around?

--
09:22:25 up 8 days, 13:23, 2 users, load average: 1.36, 1.26, 1.70
