parallel regression test failure

Started by Bruce Momjian over 22 years ago, 30 messages
#1Bruce Momjian
pgman@candle.pha.pa.us
1 attachment(s)

I am seeing the following parallel regression test failures. Any idea
on the cause?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Attachments:

/bjm/regression.diffs (text/plain)
*** ./expected/constraints.out	Fri Jul 25 17:36:36 2003
--- ./results/constraints.out	Fri Jul 25 17:37:07 2003
***************
*** 80,102 ****
  CREATE TABLE CHECK2_TBL (x int, y text, z int,
  	CONSTRAINT SEQUENCE_CON
  	CHECK (x > 3 and y <> 'check failed' and z < 8));
  INSERT INTO CHECK2_TBL VALUES (4, 'check ok', -2);
  INSERT INTO CHECK2_TBL VALUES (1, 'x check failed', -2);
! ERROR:  new row for relation "check2_tbl" violates CHECK constraint "sequence_con"
  INSERT INTO CHECK2_TBL VALUES (5, 'z check failed', 10);
! ERROR:  new row for relation "check2_tbl" violates CHECK constraint "sequence_con"
  INSERT INTO CHECK2_TBL VALUES (0, 'check failed', -2);
! ERROR:  new row for relation "check2_tbl" violates CHECK constraint "sequence_con"
  INSERT INTO CHECK2_TBL VALUES (6, 'check failed', 11);
! ERROR:  new row for relation "check2_tbl" violates CHECK constraint "sequence_con"
  INSERT INTO CHECK2_TBL VALUES (7, 'check ok', 7);
  SELECT '' AS two, * from CHECK2_TBL;
!  two | x |    y     | z  
! -----+---+----------+----
!      | 4 | check ok | -2
!      | 7 | check ok |  7
! (2 rows)
! 
  --
  -- Check constraints on INSERT
  --
--- 80,100 ----
  CREATE TABLE CHECK2_TBL (x int, y text, z int,
  	CONSTRAINT SEQUENCE_CON
  	CHECK (x > 3 and y <> 'check failed' and z < 8));
+ ERROR:  cache lookup failed for relation 126262
  INSERT INTO CHECK2_TBL VALUES (4, 'check ok', -2);
+ ERROR:  relation "check2_tbl" does not exist
  INSERT INTO CHECK2_TBL VALUES (1, 'x check failed', -2);
! ERROR:  relation "check2_tbl" does not exist
  INSERT INTO CHECK2_TBL VALUES (5, 'z check failed', 10);
! ERROR:  relation "check2_tbl" does not exist
  INSERT INTO CHECK2_TBL VALUES (0, 'check failed', -2);
! ERROR:  relation "check2_tbl" does not exist
  INSERT INTO CHECK2_TBL VALUES (6, 'check failed', 11);
! ERROR:  relation "check2_tbl" does not exist
  INSERT INTO CHECK2_TBL VALUES (7, 'check ok', 7);
+ ERROR:  relation "check2_tbl" does not exist
  SELECT '' AS two, * from CHECK2_TBL;
! ERROR:  relation "check2_tbl" does not exist
  --
  -- Check constraints on INSERT
  --

======================================================================

*** ./expected/triggers.out	Fri Jul 25 12:38:34 2003
--- ./results/triggers.out	Fri Jul 25 17:37:06 2003
***************
*** 91,96 ****
--- 91,97 ----
  NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys2 are deleted
  DROP TABLE pkeys;
  DROP TABLE fkeys;
+ ERROR:  cache lookup failed for relation 122552
  DROP TABLE fkeys2;
  -- -- I've disabled the funny_dup17 test because the new semantics
  -- -- of AFTER ROW triggers, which get now fired at the end of a

======================================================================

*** ./expected/sanity_check.out	Wed May 28 12:04:02 2003
--- ./results/sanity_check.out	Fri Jul 25 17:37:14 2003
***************
*** 15,20 ****
--- 15,21 ----
   bt_name_heap        | t
   bt_txt_heap         | t
   fast_emp4000        | t
+  fkeys               | t
   func_index_heap     | t
   hash_f8_heap        | t
   hash_i4_heap        | t
***************
*** 62,68 ****
   shighway            | t
   tenk1               | t
   tenk2               | t
! (52 rows)
  
  --
  -- another sanity check: every system catalog that has OIDs should have
--- 63,69 ----
   shighway            | t
   tenk1               | t
   tenk2               | t
! (53 rows)
  
  --
  -- another sanity check: every system catalog that has OIDs should have

======================================================================

*** ./expected/misc.out	Fri Jul 25 17:36:36 2003
--- ./results/misc.out	Fri Jul 25 17:37:17 2003
***************
*** 580,586 ****
   c
   c_star
   char_tbl
-  check2_tbl
   check_seq
   check_tbl
   circle_tbl
--- 580,585 ----
***************
*** 598,603 ****
--- 597,603 ----
   equipment_r
   f_star
   fast_emp4000
+  fkeys
   float4_tbl
   float8_tbl
   func_index_heap

======================================================================

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#1)
Re: parallel regression test failure

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I am seeing the following parallel regression test failures. Any idea
on the cause?

I don't see it here, on either of two different architectures. Maybe
you need a make distclean and rebuild?

regards, tom lane

#3Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#2)
Re: parallel regression test failure

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I am seeing the following parallel regression test failures. Any idea
on the cause?

I don't see it here, on either of two different architectures. Maybe
you need a make distclean and rebuild?

I do (I run tools/pgtest), and see the failure regularly. It is a
dual-cpu Xeon machine. I run it every night and it fails 25% of the
time.

#4Robert Creager
Robert_Creager@LogicalChaos.org
In reply to: Tom Lane (#2)
Re: parallel regression test failure

On Fri, 25 Jul 2003 19:57:04 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I am seeing the following parallel regression test failures. Any
idea on the cause?

I don't see it here, on either of two different architectures. Maybe
you need a make distclean and rebuild?

I was just able to get some problems on my dual Athlon machine
(after about 10 runs) with a clean cvs download.

Linux thunder.mshome.net 2.4.21-0.13_test #35 SMP Wed Apr 9 07:29:10 MDT 2003 i686 unknown unknown GNU/Linux

gcc (GCC) 3.2.2 (Mandrake Linux 9.1 3.2.2-3mdk)

<./configure --with-pgport=5433 --prefix=/usr/local/pgsql_cvs>
<sh src/tools/pgtest>
<sh src/tools/pgtest -n>

*** ./expected/triggers.out	Thu Jul 24 11:52:50 2003
--- ./results/triggers.out	Fri Jul 25 21:20:34 2003
***************
*** 92,97 ****
--- 92,98 ----
  DROP TABLE pkeys;
  DROP TABLE fkeys;
  DROP TABLE fkeys2;
+ ERROR:  could not open relation with OID 119498
  -- -- I've disabled the funny_dup17 test because the new semantics
  -- -- of AFTER ROW triggers, which get now fired at the end of a
  -- -- query always, cause funny_dup17 to enter an endless loop.

======================================================================

*** ./expected/sanity_check.out	Wed May 28 10:04:02 2003
--- ./results/sanity_check.out	Fri Jul 25 21:20:37 2003
***************
*** 15,20 ****
--- 15,21 ----
   bt_name_heap        | t
   bt_txt_heap         | t
   fast_emp4000        | t
+  fkeys2              | t
   func_index_heap     | t
   hash_f8_heap        | t
   hash_i4_heap        | t
***************
*** 62,68 ****
   shighway            | t
   tenk1               | t
   tenk2               | t
! (52 rows)
  
  --
  -- another sanity check: every system catalog that has OIDs should have
--- 63,69 ----
   shighway            | t
   tenk1               | t
   tenk2               | t
! (53 rows)
  
  --
  -- another sanity check: every system catalog that has OIDs should have

======================================================================

*** ./expected/misc.out	Fri Jul 25 21:14:51 2003
--- ./results/misc.out	Fri Jul 25 21:20:39 2003
***************
*** 598,603 ****
--- 598,604 ----
   equipment_r
   f_star
   fast_emp4000
+  fkeys2
   float4_tbl
   float8_tbl
   func_index_heap
***************
*** 660,666 ****
   toyemp
   varchar_tbl
   xacttest
! (96 rows)
  
  --SELECT name(equipment(hobby_construct(text 'skywalking', text 'mer'))) AS equip_name;
  SELECT hobbies_by_name('basketball');
--- 661,667 ----
   toyemp
   varchar_tbl
   xacttest
! (97 rows)
  
  --SELECT name(equipment(hobby_construct(text 'skywalking', text 'mer'))) AS equip_name;
  SELECT hobbies_by_name('basketball');

======================================================================

--
21:23:44 up 8 days, 1:24, 2 users, load average: 0.11, 1.04, 1.31

#5Robert Creager
Robert_Creager@LogicalChaos.org
In reply to: Tom Lane (#2)
Re: parallel regression test failure

On Fri, 25 Jul 2003 19:57:04 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I am seeing the following parallel regression test failures. Any
idea on the cause?

I don't see it here, on either of two different architectures. Maybe
you need a make distclean and rebuild?

And another failure:

*** ./expected/constraints.out	Fri Jul 25 21:14:51 2003
--- ./results/constraints.out	Fri Jul 25 21:34:09 2003
***************
*** 212,244 ****
  DROP SEQUENCE INSERT_SEQ;
  CREATE SEQUENCE INSERT_SEQ START 4;
  CREATE TABLE tmp (xd INT, yd TEXT, zd INT);
  INSERT INTO tmp VALUES (null, 'Y', null);
  INSERT INTO tmp VALUES (5, '!check failed', null);
  INSERT INTO tmp VALUES (null, 'try again', null);
  INSERT INTO INSERT_TBL(y) select yd from tmp;
  SELECT '' AS three, * FROM INSERT_TBL;
   three | x |       y       | z  
! -------+---+---------------+----
!        | 4 | Y             | -4
!        | 5 | !check failed | -5
!        | 6 | try again     | -6
! (3 rows)

INSERT INTO INSERT_TBL SELECT * FROM tmp WHERE yd = 'try again';
INSERT INTO INSERT_TBL(y,z) SELECT yd, -7 FROM tmp WHERE yd = 'try
again'; INSERT INTO INSERT_TBL(y,z) SELECT yd, -8 FROM tmp WHERE yd =
'try again';! ERROR: new row for relation "insert_tbl" violates CHECK
constraint "insert_con" SELECT '' AS four, * FROM INSERT_TBL;
four | x | y | z
! ------+---+---------------+----
! | 4 | Y | -4
! | 5 | !check failed | -5
! | 6 | try again | -6
! | | try again |
! | 7 | try again | -7
! (5 rows)

  DROP TABLE tmp;
  --
  -- Check constraints on UPDATE
  --
--- 212,244 ----
  DROP SEQUENCE INSERT_SEQ;
  CREATE SEQUENCE INSERT_SEQ START 4;
  CREATE TABLE tmp (xd INT, yd TEXT, zd INT);
+ ERROR:  relation 126260 deleted while still in use
  INSERT INTO tmp VALUES (null, 'Y', null);
+ ERROR:  relation "tmp" does not exist
  INSERT INTO tmp VALUES (5, '!check failed', null);
+ ERROR:  relation "tmp" does not exist
  INSERT INTO tmp VALUES (null, 'try again', null);
+ ERROR:  relation "tmp" does not exist
  INSERT INTO INSERT_TBL(y) select yd from tmp;
+ ERROR:  relation "tmp" does not exist
  SELECT '' AS three, * FROM INSERT_TBL;
   three | x | y | z 
! -------+---+---+---
! (0 rows)

  INSERT INTO INSERT_TBL SELECT * FROM tmp WHERE yd = 'try again';
+ ERROR:  relation "tmp" does not exist
  INSERT INTO INSERT_TBL(y,z) SELECT yd, -7 FROM tmp WHERE yd = 'try again';
+ ERROR:  relation "tmp" does not exist
  INSERT INTO INSERT_TBL(y,z) SELECT yd, -8 FROM tmp WHERE yd = 'try again';
! ERROR:  relation "tmp" does not exist
  SELECT '' AS four, * FROM INSERT_TBL;
   four | x | y | z 
! ------+---+---+---
! (0 rows)

  DROP TABLE tmp;
+ ERROR:  table "tmp" does not exist
  --
  -- Check constraints on UPDATE
  --
***************
*** 246,261 ****
  UPDATE INSERT_TBL SET x = 6 WHERE x = 6;
  UPDATE INSERT_TBL SET x = -z, z = -x;
  UPDATE INSERT_TBL SET x = z, z = x;
- ERROR:  new row for relation "insert_tbl" violates CHECK constraint "insert_con"
  SELECT * FROM INSERT_TBL;
   x |       y       | z  
! ---+---------------+----
!  4 | Y             | -4
!    | try again     |   
!  7 | try again     | -7
!  5 | !check failed |   
!  6 | try again     | -6
! (5 rows)
  -- DROP TABLE INSERT_TBL;
  --
--- 246,255 ----
  UPDATE INSERT_TBL SET x = 6 WHERE x = 6;
  UPDATE INSERT_TBL SET x = -z, z = -x;
  UPDATE INSERT_TBL SET x = z, z = x;
  SELECT * FROM INSERT_TBL;
   x | y | z 
! ---+---+---
! (0 rows)

  -- DROP TABLE INSERT_TBL;
  --

======================================================================

--
21:34:48 up 8 days, 1:35, 2 users, load average: 0.89, 0.65, 0.85

#6Robert Creager
Robert_Creager@LogicalChaos.org
In reply to: Tom Lane (#2)
1 attachment(s)
Re: parallel regression test failure

On Fri, 25 Jul 2003 19:57:04 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I am seeing the following parallel regression test failures. Any
idea on the cause?

I don't see it here, on either of two different architectures. Maybe
you need a make distclean and rebuild?

I've attached a little Perl script which runs pgtest over and over
(with -n option), checking for failures and saving the output
(runX.out) and the diffs (failX.diffs) in /tmp for each failing run.

Run it from the top level (as you would pgtest).
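
The attachment itself is not reproduced in the archive. A minimal Python sketch of the same idea (run the driver repeatedly, note which runs fail, print a running failure rate) might look like the following. The function names are hypothetical, and the sketch assumes the driver's exit status signals a failed run, which the thread does not confirm. Saving runX.out and failX.diffs is left out.

```python
import subprocess
from typing import Callable, List


def run_pgtest_once() -> bool:
    """Run one pass of the test driver and return True if it passed.

    Assumption: a non-zero exit status means at least one test failed.
    The original Perl script may have detected failures differently.
    """
    result = subprocess.run(
        ["sh", "src/tools/pgtest", "-n"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    return result.returncode == 0


def loop(run_once: Callable[[], bool], iterations: int) -> List[int]:
    """Call run_once repeatedly; return the 1-based numbers of the failing runs."""
    failed: List[int] = []
    for i in range(1, iterations + 1):
        if not run_once():
            failed.append(i)
        # Running failure rate over the runs completed so far.
        pct = round(100 * len(failed) / i)
        print(f"{i} of {iterations} - failed {len(failed)} ({pct}%)")
    return failed
```

Calling `loop(run_pgtest_once, 25)` from the top of the source tree would drive 25 runs. Passing the runner in as a callable keeps the counting logic testable without a PostgreSQL build.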

Later,
Rob

--
22:25:11 up 8 days, 2:26, 2 users, load average: 2.40, 1.61, 1.57

Attachments:

pgtest_loop.pl (application/octet-stream)
#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#3)
Re: parallel regression test failure

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I run it every night and it fails 25% of the time.

When did you start seeing the problem?

I just wasted an hour running eighty-some iterations of "make check"
on two different machines/OSes/architectures. Zero failures. I also
eyeballed recent changes in the relcache/catcache area, which seems to
be what's unhappy, without finding anything.

I think it's up to yunz as are seeing misbehavior to roll up your
sleeves and debug the problem. There's nothing more I can do.

regards, tom lane
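
A rough check on how surprising eighty clean runs would be, assuming (hypothetically) that every run fails independently with the 25% rate reported above and that the rate carries over between machines. Neither assumption is established in the thread, and the function names are illustrative only.

```python
import math


def prob_all_pass(p_fail: float, runs: int) -> float:
    """Probability that `runs` independent runs all pass."""
    return (1.0 - p_fail) ** runs


def runs_needed(p_fail: float, alpha: float = 0.05) -> int:
    """Smallest number of clean runs whose chance of occurring,
    if the bug were present, is below alpha."""
    return math.ceil(math.log(alpha) / math.log(1.0 - p_fail))


if __name__ == "__main__":
    print(prob_all_pass(0.25, 80))  # chance of 80 clean runs at a 25% failure rate
    print(runs_needed(0.25))        # clean runs needed for 95% confidence
```

Under those assumptions the chance of eighty consecutive passes is far below one in a billion, so a machine that never fails is more likely to differ in some relevant way than to be lucky.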

#8Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#7)
Re: parallel regression test failure

Let me get the patch queue applied, then use CVS to backtrack and find
the date it started failing. I think you need a dual cpu machine to see
the failures.

---------------------------------------------------------------------------

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I run it every night and it fails 25% of the time.

When did you start seeing the problem?

I just wasted an hour running eighty-some iterations of "make check"
on two different machines/OSes/architectures. Zero failures. I also
eyeballed recent changes in the relcache/catcache area, which seems to
be what's unhappy, without finding anything.

I think it's up to yunz as are seeing misbehavior to roll up your
sleeves and debug the problem. There's nothing more I can do.

regards, tom lane

#9Robert Creager
Robert_Creager@LogicalChaos.org
In reply to: Tom Lane (#7)
Re: parallel regression test failure

On Sat, 26 Jul 2003 01:00:46 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I run it every night and it fails 25% of the time.

When did you start seeing the problem?

I just wasted an hour running eighty-some iterations of "make check"
on two different machines/OSes/architectures. Zero failures. I also
eyeballed recent changes in the relcache/catcache area, which seems to
be what's unhappy, without finding anything.

I think it's up to yunz as are seeing misbehavior to roll up your
sleeves and debug the problem. There's nothing more I can do.

Any suggestions for how those of us who are not pg developers might
help figure out what's up?

1 of 25 - failed 0 (0%)
2 of 25 - failed 0 (0%)
3 of 25 - failed 0 (0%)
4 of 25 - failed 0 (0%)
5 of 25 - failed 0 (0%)
6 of 25 - failed 0 (0%)
7 of 25 - failed 0 (0%)
8 of 25 - failed 0 (0%)
9 of 25 - failed 0 (0%)
10 of 25 - failed 0 (0%)
11 of 25 - failed 1 (9%)
12 of 25 - failed 2 (17%)
13 of 25 - failed 2 (15%)
14 of 25 - failed 2 (14%)
15 of 25 - failed 3 (20%)
16 of 25 - failed 3 (19%)
17 of 25 - failed 3 (18%)
18 of 25 - failed 4 (22%)
19 of 25 - failed 4 (21%)
20 of 25 - failed 4 (20%)
21 of 25 - failed 5 (24%)
22 of 25 - failed 6 (27%)
23 of 25 - failed 6 (26%)
24 of 25 - failed 7 (29%)
25 of 25 - failed 8 (32%)
constraints failed 1 times
cluster failed 1 times
foreign_key failed 1 times
misc failed 6 times
sanity_check failed 3 times
inherit failed 2 times
triggers failed 4 times
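
A per-test tally like the one above can be derived from the saved diff files, since each failing test contributes one `*** ./expected/NAME.out` header. The following is a sketch in Python, not the script that produced these numbers; the helper names are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, List

# Matches the file header of a context diff, e.g.
# "*** ./expected/triggers.out<TAB>Fri Jul 25 12:38:34 2003".
# Hunk headers such as "*** 91,96 ****" do not match.
HEADER = re.compile(r"^\*\*\* \./expected/(\w+)\.out\b")


def failed_tests(diff_text: str) -> List[str]:
    """Return the test names that appear in one regression.diffs file."""
    names = []
    for line in diff_text.splitlines():
        match = HEADER.match(line)
        if match:
            names.append(match.group(1))
    return names


def tally(diffs: Iterable[str]) -> Counter:
    """Count, across many diff files, in how many runs each test failed."""
    counts: Counter = Counter()
    for text in diffs:
        # A test is counted once per run, even if it has several hunks.
        counts.update(set(failed_tests(text)))
    return counts
```

Because one failing run usually breaks several tests at once (sanity_check and misc list the tables that other tests left behind), the per-test counts add up to more than the number of failed runs.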

--
08:21:18 up 8 days, 12:22, 2 users, load average: 0.08, 0.65, 1.58

#10Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Robert Creager (#9)
Re: parallel regression test failure

I am going to use cvs -d to pull an older CVS and see if that fails, so
we can track down the date it started failing.

#11Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Robert Creager (#9)
Re: parallel regression test failure

If you would like to do the cvs -d testing yourself instead of me, let
me know. It will take me a few hours to get to it anyway.

#12Robert Creager
Robert_Creager@LogicalChaos.org
In reply to: Bruce Momjian (#11)
Re: parallel regression test failure

On Sat, 26 Jul 2003 10:47:12 -0400 (EDT)
Bruce Momjian <pgman@candle.pha.pa.us> said something like:

If you would like to do the cvs -d testing yourself instead of me, let
me know. It will take me a few hours to get to it anyway.

I will start pulling down old versions (once I figure out the -d
syntax). Do you recall how long you may have been seeing this?

Thanks,
Rob

--
08:54:59 up 8 days, 12:55, 2 users, load average: 2.38, 1.12, 1.14

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#8)
Re: parallel regression test failure

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I think you need a dual cpu machine to see the failures.

I was wondering about that myself, but we shouldn't fixate on that
assumption without more evidence. There could be some other factor
explaining why I can't reproduce it. A couple of questions for both
of you:
- what configure options are you using?
- can you reproduce the problem with serial tests (make installcheck)?
- exactly how repeatable is it --- when it fails, is it always at the
same places, or do the failures move around?

It would also be good to find out exactly where the failures are coming
from. Please try running the tests with LOG_ERROR_VERBOSITY set to
VERBOSE (probably the easiest way to hack this in make check's temp
installation is to modify src/backend/utils/misc/postgresql.conf.sample).
Then the postmaster log file created by make check will show the elog
calls' locations.

regards, tom lane

#14Robert Creager
Robert_Creager@LogicalChaos.org
In reply to: Bruce Momjian (#11)
Re: parallel regression test failure

On Sat, 26 Jul 2003 10:47:12 -0400 (EDT)
Bruce Momjian <pgman@candle.pha.pa.us> said something like:

If you would like to do the cvs -d testing yourself instead of me, let
me know. It will take me a few hours to get to it anyway.

Just to make sure I've got this right:

cvs update -D yyyy-mm-dd
make maintainer-clean
./configure
make
test

--
09:05:56 up 8 days, 13:06, 2 users, load average: 2.59, 2.90, 2.14

#15Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Robert Creager (#12)
Re: parallel regression test failure

Robert Creager wrote:

On Sat, 26 Jul 2003 10:47:12 -0400 (EDT)
Bruce Momjian <pgman@candle.pha.pa.us> said something like:

If you would like to do the cvs -d testing yourself instead of me, let
me know. It will take me a few hours to get to it anyway.

I will start pulling down old versions (once I figure out the -d
syntax). Do you recall how long you may have been seeing this?

I think you just take a CVS checkout and do:

cvs update -D '2003-05-01 00:00:00 GMT' pgsql

and keep changing the dates to find the date it started breaking.
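
Changing the date by hand can be organised as a binary search. The following Python sketch assumes a known-clean date, a known-broken date, and that the tree stays broken once the bug is introduced. The `is_bad` callable is hypothetical: it stands for checking out, rebuilding, and testing the tree at that date.

```python
from datetime import date, timedelta
from typing import Callable


def first_bad_date(good: date, bad: date,
                   is_bad: Callable[[date], bool]) -> date:
    """Return the earliest date on which is_bad() reports a failure.

    Invariant: `good` is always a clean date and `bad` a broken one.
    """
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid   # the breakage is at mid or earlier
        else:
            good = mid  # the breakage is after mid
    return bad
```

Since the failure is intermittent, a single clean run at some date proves little. `is_bad` would need to repeat the test suite enough times at each date before declaring it clean, otherwise the search can converge on the wrong date.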

#16Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Robert Creager (#14)
Re: parallel regression test failure

Yep, I think that is it, though the last one is pgtest or whatever you
are using for testing.

---------------------------------------------------------------------------

Robert Creager wrote:

On Sat, 26 Jul 2003 10:47:12 -0400 (EDT)
Bruce Momjian <pgman@candle.pha.pa.us> said something like:

If you would like to do the cvs -d testing yourself instead of me, let
me know. It will take me a few hours to get to it anyway.

Just to make sure I've got this right:

cvs update -D yyyy-mm-dd
make maintainer-clean
./configure
make
test

--
09:05:56 up 8 days, 13:06, 2 users, load average: 2.59, 2.90, 2.14

#17Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Robert Creager (#12)
Re: parallel regression test failure

Robert Creager wrote:

On Sat, 26 Jul 2003 10:47:12 -0400 (EDT)
Bruce Momjian <pgman@candle.pha.pa.us> said something like:

If you would like to do the cvs -d testing yourself instead of me, let
me know. It will take me a few hours to get to it anyway.

I will start pulling down old versions (once I figure out the -d
syntax). Do you recall how long you may have been seeing this?

Since it is random, I hadn't noticed when it started, and originally
suspected my hardware. I recently upgraded my hardware, around May 1, I
think.

#18Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#13)
Re: parallel regression test failure

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I think you need a dual cpu machine to see the failures.

I was wondering about that myself, but we shouldn't fixate on that
assumption without more evidence. There could be some other factor
explaining why I can't reproduce it. A couple of questions for both
of you:
- what configure options are you using?

configure \
--with-x \
--with-threads \
--with-tcl \
--with-perl \
--with-python \
--enable-pltcl-unknown \
--with-tclconfig=/u/lib \
--with-tkconfig=/u/lib \
--enable-cassert \
--with-includes="/usr/local/include/readline /usr/contrib/include" \
--with-libraries="/usr/local/lib /usr/contrib/lib" \
--enable-locale \
--enable-multibyte \
--with-recode \
--with-openssl

- can you reproduce the problem with serial tests (make installcheck)?

No, I have never seen a serial failure, and when I get a parallel
failure, I run the serial tests to make sure it is just the parallel
test, and serial always passes.

- exactly how repeatable is it --- when it fails, is it always at the
same places, or do the failures move around?

No, different, as reported by Robert, but it usually has to do with the
constraint, trigger, and sanity tests. I had assumed we just had a
dependency in the parallel regression tests and just needed to make an
adjustment, but looking at the diffs more closely, I see it is more serious.

It would also be good to find out exactly where the failures are coming
from. Please try running the tests with LOG_ERROR_VERBOSITY set to
VERBOSE (probably the easiest way to hack this in make check's temp
installation is to modify src/backend/utils/misc/postgresql.conf.sample).
Then the postmaster log file created by make check will show the elog
calls' locations.

OK.

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Creager (#14)
Re: parallel regression test failure

Robert Creager <Robert_Creager@LogicalChaos.org> writes:

Just to make sure I've got this right:

cvs update -D yyyy-mm-dd
make maintainer-clean
./configure
make
test

I'd do the "make maintainer-clean" before cvs update'ing, but otherwise
probably right. Watch the output the first couple times and make sure
cvs is actually willing to replace files in both the forward and
backward directions.

regards, tom lane
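
The order suggested here (clean first, then update) can be written down as a command list. This is a Python sketch with a hypothetical helper name. The default test command is an assumption, since the thread uses both `make check` and src/tools/pgtest.

```python
from typing import List, Sequence


def rebuild_steps(checkout_date: str,
                  configure_args: Sequence[str] = (),
                  test_cmd: Sequence[str] = ("make", "check")) -> List[List[str]]:
    """Return the commands, in order, for testing the tree as of one date."""
    return [
        # Clean while the Makefiles still match the files on disk.
        ["make", "maintainer-clean"],
        ["cvs", "update", "-D", checkout_date],
        ["./configure", *configure_args],
        ["make"],
        list(test_cmd),
    ]
```

Each entry can be handed to `subprocess.run(step, check=True)` from the top of the source tree. Keeping the steps as data makes it easy to log exactly what was run for each date.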

#20Robert Creager
Robert_Creager@LogicalChaos.org
In reply to: Tom Lane (#13)
Re: parallel regression test failure

./configure --with-pgport=5433 --prefix=/usr/local/pgsql_cvs

The failure moves around (out of 25 tests):

constraints failed 1 times
cluster failed 1 times
foreign_key failed 1 times
misc failed 6 times
sanity_check failed 3 times
inherit failed 2 times
triggers failed 4 times

Have not tried installcheck yet.

On Sat, 26 Jul 2003 11:06:21 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

- what configure options are you using?
- can you reproduce the problem with serial tests (make installcheck)?
- exactly how repeatable is it --- when it fails, is it always at the
same places, or do the failures move around?

--
09:22:25 up 8 days, 13:23, 2 users, load average: 1.36, 1.26, 1.70

#21Robert Creager
Robert_Creager@LogicalChaos.org
In reply to: Tom Lane (#19)
Re: parallel regression test failure

On Sat, 26 Jul 2003 11:22:21 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

Robert Creager <Robert_Creager@LogicalChaos.org> writes:

Just to make sure I've got this right:

cvs update -D yyyy-mm-dd
make maintainer-clean
./configure
make
test

I'd do the "make maintainer-clean" before cvs update'ing, but
otherwise probably right. Watch the output the first couple times and
make sure cvs is actually willing to replace files in both the forward
and backward directions.

Yeah, and yeah, it just removed src/tools/pgtest when I went back to
April...

--
09:36:18 up 8 days, 13:37, 2 users, load average: 0.08, 0.86, 1.54

#22Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#13)
Re: parallel regression test failure

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I think you need a dual cpu machine to see the failures.

I was wondering about that myself, but we shouldn't fixate on that
assumption without more evidence. There could be some other factor
explaining why I can't reproduce it. A couple of questions for both
of you:
- what configure options are you using?
- can you reproduce the problem with serial tests (make installcheck)?
- exactly how repeatable is it --- when it fails, is it always at the
same places, or do the failures move around?

It would also be good to find out exactly where the failures are coming
from. Please try running the tests with LOG_ERROR_VERBOSITY set to
VERBOSE (probably the easiest way to hack this in make check's temp
installation is to modify src/backend/utils/misc/postgresql.conf.sample).
Then the postmaster log file created by make check will show the elog
calls' locations.

OK, I got a failure with verbose output. Error was:

*** ./expected/triggers.out	Fri Jul 25 12:38:34 2003
--- ./results/triggers.out	Sat Jul 26 12:52:02 2003
***************
*** 66,71 ****
--- 66,72 ----
  ERROR:  tuple references non-existent key
  DETAIL:  Trigger "check_fkeys2_pkey_exist" found tuple referencing non-existent key in "pkeys".
  insert into fkeys values (10, '1', 2);
+ ERROR:  could not open relation with OID 119980
  insert into fkeys values (30, '3', 3);
  insert into fkeys values (40, '4', 2);
  insert into fkeys values (50, '5', 2);
***************
*** 87,93 ****
  NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys are deleted
  ERROR:  "check_fkeys2_fkey_restrict": tuple is referenced in "fkeys"
  update pkeys set pkey1 = 7, pkey2 = '70' where pkey1 = 10 and pkey2 = '1';
! NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys are deleted
  NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys2 are deleted
  DROP TABLE pkeys;
  DROP TABLE fkeys;
--- 88,94 ----
  NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys are deleted
  ERROR:  "check_fkeys2_fkey_restrict": tuple is referenced in "fkeys"
  update pkeys set pkey1 = 7, pkey2 = '70' where pkey1 = 10 and pkey2 = '1';
! NOTICE:  check_pkeys_fkey_cascade: 0 tuple(s) of fkeys are deleted
  NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys2 are deleted
  DROP TABLE pkeys;
  DROP TABLE fkeys;

======================================================================

and logs show:

ERROR: 23514: new row for relation "check_tbl" violates CHECK
constraint "check_con"
LOCATION: ExecConstraints, execMain.c:1698
ERROR: 09000: tuple references non-existent key
DETAIL: Trigger "check_fkeys2_pkey_exist" found tuple referencing
non-existent key in "pkeys".
LOCATION: check_primary_key, refint.c:214
ERROR: 23514: new row for relation "check_tbl" violates CHECK
constraint "check_con"
LOCATION: ExecConstraints, execMain.c:1698

ERROR: XX000: could not open relation with OID 119980
LOCATION: relation_open, heapam.c:459

ERROR: 23502: null value for attribute "aa" violates NOT NULL
constraint
LOCATION: ExecConstraints, execMain.c:1686
ERROR: 09000: tuple references non-existent key

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#23Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Bruce Momjian (#22)
Re: parallel regression test failure

That is a very good guess. All the errors seem related to the parser.

---------------------------------------------------------------------------

Robert Creager wrote:

Could the failures have something to do with bison level? 2003-02-01
would not compile with 1.875, but compiles with 1.5. Which is running
now...

Later,
Rob

On Sat, 26 Jul 2003 14:12:35 -0400 (EDT)
Bruce Momjian <pgman@candle.pha.pa.us> said something like:

I just reproduced the same failure for the same date. Let me try
another date here.

---------------------------------------------------------------------------

Robert Creager wrote:

On Sat, 26 Jul 2003 11:09:54 -0400 (EDT)
Bruce Momjian <pgman@candle.pha.pa.us> said something like:

I think you just take a CVS checkout and do:

cvs update -D '2003-05-01 00:00:00 GMT' pgsql

and keep changing the dates to find the date it started breaking.
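
The date search described above can be mechanized as a bisection over day offsets. This is a rough sketch, not anything posted in the thread: bisect_days and check_day are hypothetical names, the example check assumes GNU date and a working copy in ./pgsql, and a failure that shows up only a few times in many runs would need the check to repeat the tests before calling a date good.

```shell
# Hypothetical helper: given a known-good day offset ($1), a known-bad
# day offset ($2), and a command ($3) that exits 0 when the checkout
# for a given offset passes, print the first bad offset.
bisect_days() {
    lo=$1       # last offset known to pass
    hi=$2       # first offset known to fail
    check=$3
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if "$check" "$mid"; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    echo "$hi"
}

# Example check (assumptions: GNU date, working copy in ./pgsql; not run here):
# check_day() {
#     d=$(date -u -d "2002-12-01 +$1 days" +%Y-%m-%d)
#     ( cd pgsql && make maintainer-clean
#       cvs update -D "$d 00:00:00 GMT" && ./configure && make && make check )
# }
# bisect_days 0 237 check_day
```

The search only makes sense if every date before the breakage passes and every date after it fails, which is worth confirming by hand at both ends first.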

I just want to make sure I'm not chasing my tail.

I just went to 2002-12-01 in an empty directory, and had the
following failures:

*** ./expected/strings.out	Sun Sep 22 11:27:25 2002
--- ./results/strings.out	Sat Jul 26 11:20:22 2003
***************
*** 18,24 ****
' - next line' /* this comment is not allowed here */
' - third line'
AS "Illegal comment within continuation";
! ERROR:  parser: parse error at or near "' - third line'" at character 75
--
-- test conversions between various string types
-- E021-10 implicit casting among the character data types
--- 18,24 ----
' - next line' /* this comment is not allowed here */
' - third line'
AS "Illegal comment within continuation";
! ERROR:  parser: syntax error at or near "' - third line'" at character 75
--
-- test conversions between various string types
-- E021-10 implicit casting among the character data types

======================================================================

*** ./expected/geometry.out	Fri Nov  8 13:09:55 2002
--- ./results/geometry.out	Sat Jul 26 11:20:23 2003
***************
*** 258,281 ****
twenty |                               rotation
--------+----------------------------------------------------------------------
| (0,-0),(-0.2,-0.2)
- | (-0.1,-0.1),(-0.3,-0.3)
- | (-0.25,-0.25),(-0.25,-0.35)
- | (-0.3,-0.3),(-0.3,-0.3)
| (0.08,-0),(0,-0.56)
- | (0.12,-0.28),(0.04,-0.84)
- | (0.26,-0.7),(0.1,-0.82)
- | (0.12,-0.84),(0.12,-0.84)
| (0.0651176557644,0),(0,-0.0483449262493)
- | (0.0976764836466,-0.0241724631247),(0.0325588278822,-0.072517389374)
- | (0.109762715209,-0.0562379754329),(0.0813970697055,-0.0604311578117)
- | (0.0976764836466,-0.072517389374),(0.0976764836466,-0.072517389374)
| (-0,0.0828402366864),(-0.201183431953,0)
- | (-0.100591715976,0.12426035503),(-0.301775147929,0.0414201183432)
- | (-0.251479289941,0.103550295858),(-0.322485207101,0.0739644970414)
- | (-0.301775147929,0.12426035503),(-0.301775147929,0.12426035503)
| (0.2,0),(0,0)
| (0.3,0),(0.1,0)
| (0.3,0.05),(0.25,0)
| (0.3,0),(0.3,0)
(20 rows)

--- 258,281 ----
twenty |                               rotation
--------+----------------------------------------------------------------------
| (0,-0),(-0.2,-0.2)
| (0.08,-0),(0,-0.56)
| (0.0651176557644,0),(0,-0.0483449262493)
| (-0,0.0828402366864),(-0.201183431953,0)
| (0.2,0),(0,0)
+         | (-0.1,-0.1),(-0.3,-0.3)
+         | (0.12,-0.28),(0.04,-0.84)
+         | (0.0976764836466,-0.0241724631247),(0.0325588278822,-0.072517389374)
+         | (-0.100591715976,0.12426035503),(-0.301775147929,0.0414201183432)
| (0.3,0),(0.1,0)
+         | (-0.25,-0.25),(-0.25,-0.35)
+         | (0.26,-0.7),(0.1,-0.82)
+         | (0.109762715209,-0.0562379754329),(0.0813970697055,-0.0604311578117)
+         | (-0.251479289941,0.103550295858),(-0.322485207101,0.0739644970414)
| (0.3,0.05),(0.25,0)
+         | (-0.3,-0.3),(-0.3,-0.3)
+         | (0.12,-0.84),(0.12,-0.84)
+         | (0.0976764836466,-0.072517389374),(0.0976764836466,-0.072517389374)
+         | (-0.301775147929,0.12426035503),(-0.301775147929,0.12426035503)
| (0.3,0),(0.3,0)
(20 rows)

======================================================================

*** ./expected/create_function_1.out	Sat Jul 26 11:19:18 2003
--- ./results/create_function_1.out	Sat Jul 26 11:20:24 2003
***************
*** 51,57 ****
ERROR:  return type mismatch in function: declared to return integer, returns "unknown"
CREATE FUNCTION test1 (int) RETURNS int LANGUAGE sql
AS 'not even SQL';
! ERROR:  parser: parse error at or near "not" at character 1
CREATE FUNCTION test1 (int) RETURNS int LANGUAGE sql
AS 'SELECT 1, 2, 3;';
ERROR:  function declared to return integer returns multiple columns in final SELECT
--- 51,57 ----
ERROR:  return type mismatch in function: declared to return integer, returns "unknown"
CREATE FUNCTION test1 (int) RETURNS int LANGUAGE sql
AS 'not even SQL';
! ERROR:  parser: syntax error at or near "not" at character 1
CREATE FUNCTION test1 (int) RETURNS int LANGUAGE sql
AS 'SELECT 1, 2, 3;';
ERROR:  function declared to return integer returns multiple columns in final SELECT

======================================================================

*** ./expected/constraints.out	Sat Jul 26 11:19:18 2003
--- ./results/constraints.out	Sat Jul 26 11:20:26 2003
***************
*** 45,56 ****
-- syntax errors
--  test for extraneous comma
CREATE TABLE error_tbl (i int DEFAULT (100, ));
! ERROR:  parser: parse error at or near "," at character 43
--  this will fail because gram.y uses b_expr not a_expr for defaults,
--  to avoid a shift/reduce conflict that arises from NOT NULL being
--  part of the column definition syntax:
CREATE TABLE error_tbl (b1 bool DEFAULT 1 IN (1, 2));
! ERROR:  parser: parse error at or near "IN" at character 43
--  this should work, however:
CREATE TABLE error_tbl (b1 bool DEFAULT (1 IN (1, 2)));
DROP TABLE error_tbl;
--- 45,56 ----
-- syntax errors
--  test for extraneous comma
CREATE TABLE error_tbl (i int DEFAULT (100, ));
! ERROR:  parser: syntax error at or near "," at character 43
--  this will fail because gram.y uses b_expr not a_expr for defaults,
--  to avoid a shift/reduce conflict that arises from NOT NULL being
--  part of the column definition syntax:
CREATE TABLE error_tbl (b1 bool DEFAULT 1 IN (1, 2));
! ERROR:  parser: syntax error at or near "IN" at character 43
--  this should work, however:
CREATE TABLE error_tbl (b1 bool DEFAULT (1 IN (1, 2)));
DROP TABLE error_tbl;

======================================================================

*** ./expected/errors.out	Mon Sep  2 00:05:16 2002
--- ./results/errors.out	Sat Jul 26 11:20:28 2003
***************
*** 22,34 ****
-- missing relation name 
select;
! ERROR:  parser: parse error at or near ";" at character 7
-- no such relation 
select * from nonesuch;
ERROR:  Relation "nonesuch" does not exist
-- missing target list
select from pg_database;
! ERROR:  parser: parse error at or near "from" at character 8
-- bad name in target list
select nonesuch from pg_database;
ERROR:  Attribute "nonesuch" not found
--- 22,34 ----
-- missing relation name 
select;
! ERROR:  parser: syntax error at or near ";" at character 7
-- no such relation 
select * from nonesuch;
ERROR:  Relation "nonesuch" does not exist
-- missing target list
select from pg_database;
! ERROR:  parser: syntax error at or near "from" at character 8
-- bad name in target list
select nonesuch from pg_database;
ERROR:  Attribute "nonesuch" not found
***************
*** 40,46 ****
ERROR:  Attribute "nonesuch" not found
-- bad select distinct on syntax, distinct attribute missing
select distinct on (foobar) from pg_database;
! ERROR:  parser: parse error at or near "from" at character 29
-- bad select distinct on syntax, distinct attribute not in target list
select distinct on (foobar) * from pg_database;
ERROR:  Attribute "foobar" not found
--- 40,46 ----
ERROR:  Attribute "nonesuch" not found
-- bad select distinct on syntax, distinct attribute missing
select distinct on (foobar) from pg_database;
! ERROR:  parser: syntax error at or near "from" at character 29
-- bad select distinct on syntax, distinct attribute not in target list
select distinct on (foobar) * from pg_database;
ERROR:  Attribute "foobar" not found
***************
*** 49,55 ****
-- missing relation name (this had better not wildcard!) 
delete from;
! ERROR:  parser: parse error at or near ";" at character 12
-- no such relation 
delete from nonesuch;
ERROR:  Relation "nonesuch" does not exist
--- 49,55 ----

-- missing relation name (this had better not wildcard!)
delete from;
! ERROR:  parser: syntax error at or near ";" at character 12
-- no such relation
delete from nonesuch;
ERROR:  Relation "nonesuch" does not exist
***************
*** 58,64 ****

-- missing relation name (this had better not wildcard!) 
drop table;
! ERROR:  parser: parse error at or near ";" at character 11
-- no such relation 
drop table nonesuch;
ERROR:  table "nonesuch" does not exist
--- 58,64 ----
-- missing relation name (this had better not wildcard!) 
drop table;
! ERROR:  parser: syntax error at or near ";" at character 11
-- no such relation 
drop table nonesuch;
ERROR:  table "nonesuch" does not exist
***************
*** 68,74 ****
-- relation renaming 
-- missing relation name 
alter table rename;
! ERROR:  parser: parse error at or near ";" at character 19
-- no such relation 
alter table nonesuch rename to newnonesuch;
ERROR:  Relation "nonesuch" does not exist
--- 68,74 ----
-- relation renaming 
-- missing relation name 
alter table rename;
! ERROR:  parser: syntax error at or near ";" at character 19
-- no such relation 
alter table nonesuch rename to newnonesuch;
ERROR:  Relation "nonesuch" does not exist
***************
*** 122,131 ****
-- missing index name 
drop index;
! ERROR:  parser: parse error at or near ";" at character 11
-- bad index name 
drop index 314159;
! ERROR:  parser: parse error at or near "314159" at character 12
-- no such index 
drop index nonesuch;
ERROR:  index "nonesuch" does not exist
--- 122,131 ----

-- missing index name
drop index;
! ERROR:  parser: syntax error at or near ";" at character 11
-- bad index name
drop index 314159;
! ERROR:  parser: syntax error at or near "314159" at character 12
-- no such index
drop index nonesuch;
ERROR:  index "nonesuch" does not exist
***************
*** 134,146 ****

-- missing aggregate name 
drop aggregate;
! ERROR:  parser: parse error at or near ";" at character 15
-- missing aggregate type
drop aggregate newcnt1;
! ERROR:  parser: parse error at or near ";" at character 23
-- bad aggregate name 
drop aggregate 314159 (int);
! ERROR:  parser: parse error at or near "314159" at character 16
-- bad aggregate type
drop aggregate newcnt (nonesuch);
ERROR:  Type "nonesuch" does not exist
--- 134,146 ----

-- missing aggregate name
drop aggregate;
! ERROR:  parser: syntax error at or near ";" at character 15
-- missing aggregate type
drop aggregate newcnt1;
! ERROR:  parser: syntax error at or near ";" at character 23
-- bad aggregate name
drop aggregate 314159 (int);
! ERROR:  parser: syntax error at or near "314159" at character 16
-- bad aggregate type
drop aggregate newcnt (nonesuch);
ERROR:  Type "nonesuch" does not exist
***************
*** 155,164 ****

-- missing function name 
drop function ();
! ERROR:  parser: parse error at or near "(" at character 15
-- bad function name 
drop function 314159();
! ERROR:  parser: parse error at or near "314159" at character 15
-- no such function 
drop function nonesuch();
ERROR:  RemoveFunction: function nonesuch() does not exist
--- 155,164 ----

-- missing function name
drop function ();
! ERROR:  parser: syntax error at or near "(" at character 15
-- bad function name
drop function 314159();
! ERROR:  parser: syntax error at or near "314159" at character 15
-- no such function
drop function nonesuch();
ERROR:  RemoveFunction: function nonesuch() does not exist
***************
*** 167,176 ****

-- missing type name 
drop type;
! ERROR:  parser: parse error at or near ";" at character 10
-- bad type name 
drop type 314159;
! ERROR:  parser: parse error at or near "314159" at character 11
-- no such type 
drop type nonesuch;
ERROR:  Type "nonesuch" does not exist
--- 167,176 ----

-- missing type name
drop type;
! ERROR:  parser: syntax error at or near ";" at character 10
-- bad type name
drop type 314159;
! ERROR:  parser: syntax error at or near "314159" at character 11
-- no such type
drop type nonesuch;
ERROR:  Type "nonesuch" does not exist
***************
*** 179,200 ****

-- missing everything 
drop operator;
! ERROR:  parser: parse error at or near ";" at character 14
-- bad operator name 
drop operator equals;
! ERROR:  parser: parse error at or near ";" at character 21
-- missing type list 
drop operator ===;
! ERROR:  parser: parse error at or near ";" at character 18
-- missing parentheses 
drop operator int4, int4;
! ERROR:  parser: parse error at or near "," at character 19
-- missing operator name 
drop operator (int4, int4);
! ERROR:  parser: parse error at or near "(" at character 15
-- missing type list contents 
drop operator === ();
! ERROR:  parser: parse error at or near ")" at character 20
-- no such operator 
drop operator === (int4);
ERROR:  parser: argument type missing (use NONE for unary operators)
--- 179,200 ----
-- missing everything 
drop operator;
! ERROR:  parser: syntax error at or near ";" at character 14
-- bad operator name 
drop operator equals;
! ERROR:  parser: syntax error at or near ";" at character 21
-- missing type list 
drop operator ===;
! ERROR:  parser: syntax error at or near ";" at character 18
-- missing parentheses 
drop operator int4, int4;
! ERROR:  parser: syntax error at or near "," at character 19
-- missing operator name 
drop operator (int4, int4);
! ERROR:  parser: syntax error at or near "(" at character 15
-- missing type list contents 
drop operator === ();
! ERROR:  parser: syntax error at or near ")" at character 20
-- no such operator 
drop operator === (int4);
ERROR:  parser: argument type missing (use NONE for unary operators)
***************
*** 206,212 ****
ERROR:  parser: argument type missing (use NONE for unary operators)
-- no such type1
drop operator = ( , int4);
! ERROR:  parser: parse error at or near "," at character 19
-- no such type1 
drop operator = (nonesuch, int4);
ERROR:  Type "nonesuch" does not exist
--- 206,212 ----
ERROR:  parser: argument type missing (use NONE for unary operators)
-- no such type1
drop operator = ( , int4);
! ERROR:  parser: syntax error at or near "," at character 19
-- no such type1 
drop operator = (nonesuch, int4);
ERROR:  Type "nonesuch" does not exist
***************
*** 215,239 ****
ERROR:  Type "nonesuch" does not exist
-- no such type2 
drop operator = (int4, );
! ERROR:  parser: parse error at or near ")" at character 24
--
-- DROP RULE
-- missing rule name 
drop rule;
! ERROR:  parser: parse error at or near ";" at character 10
-- bad rule name 
drop rule 314159;
! ERROR:  parser: parse error at or near "314159" at character 11
-- no such rule 
drop rule nonesuch on noplace;
ERROR:  Relation "noplace" does not exist
-- bad keyword 
drop tuple rule nonesuch;
! ERROR:  parser: parse error at or near "tuple" at character 6
-- no such rule 
drop instance rule nonesuch on noplace;
! ERROR:  parser: parse error at or near "instance" at character 6
-- no such rule 
drop rewrite rule nonesuch;
! ERROR:  parser: parse error at or near "rewrite" at character 6
--- 215,239 ----
ERROR:  Type "nonesuch" does not exist
-- no such type2 
drop operator = (int4, );
! ERROR:  parser: syntax error at or near ")" at character 24
--
-- DROP RULE

-- missing rule name
drop rule;
! ERROR:  parser: syntax error at or near ";" at character 10
-- bad rule name
drop rule 314159;
! ERROR:  parser: syntax error at or near "314159" at character 11
-- no such rule
drop rule nonesuch on noplace;
ERROR:  Relation "noplace" does not exist
-- bad keyword
drop tuple rule nonesuch;
! ERROR:  parser: syntax error at or near "tuple" at character 6
-- no such rule
drop instance rule nonesuch on noplace;
! ERROR:  parser: syntax error at or near "instance" at character 6
-- no such rule
drop rewrite rule nonesuch;
! ERROR:  parser: syntax error at or near "rewrite" at character 6

======================================================================

--
11:22:06 up 8 days, 15:22, 2 users, load average: 0.30, 0.51, 0.53

-- 
Bruce Momjian                        |  http://candle.pha.pa.us
pgman@candle.pha.pa.us               |  (610) 359-1001
+  If your life is a hard drive,     |  13 Roberts Road
+  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

--
12:41:22 up 8 days, 16:42, 2 users, load average: 2.58, 1.05, 0.97

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#24Robert Creager
Robert_Creager@LogicalChaos.org
In reply to: Bruce Momjian (#23)
Re: parallel regression test failure

On Sat, 26 Jul 2003 16:40:27 -0400 (EDT)
Bruce Momjian <pgman@candle.pha.pa.us> said something like:

That is a very good guess. All the errors seem related to the parser.

Everyone gets lucky now and then ;-)

I'm now using bison 1.5

2003-01-22 did not fail in 50 tests.
2003-01-26 has not failed yet in 33 of 50 tests.

2003-01-28 and 2003-02-15 are compiled and waiting...

2003-02-01 fails, but only 2 times in 50 tests:

*** ./expected/domain.out	Sat Jul 26 12:24:18 2003
--- ./results/domain.out	Sat Jul 26 12:56:01 2003
***************
*** 263,269 ****
  insert into domcontest values (5);
  alter domain con drop constraint t;
  insert into domcontest values (-5); --fails
! ERROR:  ExecEvalConstraintTest: Domain con constraint $1 failed
  insert into domcontest values (42);
  -- cleanup
  drop domain ddef1 restrict;
--- 263,269 ----
  insert into domcontest values (5);
  alter domain con drop constraint t;
  insert into domcontest values (-5); --fails
! ERROR:  ExecEvalConstraintTest: Domain con constraint  failed
  insert into domcontest values (42);
  -- cleanup
  drop domain ddef1 restrict;

======================================================================
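
Counts like "2 times in 50 tests" can be gathered with a small loop. The following is only a sketch: count_failures is a made-up helper, and it assumes the test command exits non-zero when any regression test fails.

```shell
# Hypothetical helper: run a command N times and report how many runs
# exited non-zero. Output of the command itself is discarded.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures of $runs runs failed"
}

# Usage (not run here):
#   count_failures 50 make check
```

Because the command's output is thrown away, a run that fails would need its regression.diffs copied aside inside the loop if the diffs are to be kept.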

--
14:52:02 up 8 days, 18:52, 2 users, load average: 3.69, 3.40, 2.57

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#23)
Re: parallel regression test failure

Bruce Momjian <pgman@candle.pha.pa.us> writes:

That is a very good guess. All the errors seem related to the parser.

No, I don't think bison's got anything to do with it. AFAICS all the
reported failures look more like syscache-level problems. I'm betting
on a locking issue. It'll be easier to find once you guys home in on
the date we broke it.

regards, tom lane

#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#1)
Re: parallel regression test failure

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I am seeing the following parallel regression test failures. Any idea
on the cause?

For the record, I believe this is explained by the bug I just fixed in
_bt_search().

The bug occurs only when one backend is trying to search a btree index
at the same time another backend is doing the first page split in that
index (that is, when the aboriginal root-and-leaf page gets split into
two leaf pages). In the present form of the parallel regression tests,
pg_class_oid_index and pg_type_oid_index suffer that split during the
third group of parallel tests, which is why the failures were bunched
in constraints/triggers/vacuum.

My guess is that the reason different vintages of CVS show or don't show
the problem is that modifications of the test scripts have caused more
or fewer pg_class and pg_type entries to get created, possibly moving
the critical split point before or after that set of parallel tests.
If the split occurs during a sequential test step then we'd never see
a failure. This may explain why we've not become aware of the bug till
now, even though it's certainly been there a long time.

We need to think about whether this bug is serious enough to justify a
quick 7.3.5 release. I'm leaning to the idea that it is not, because
if it were, we'd have heard about it from the field before now. In
pre-7.4 code there is only one instant in the lifespan of an index where
the bug could occur, and then only if the index is created empty.

regards, tom lane

#27Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#26)
Re: parallel regression test failure

Tom Lane wrote:

We need to think about whether this bug is serious enough to justify a
quick 7.3.5 release. I'm leaning to the idea that it is not, because
if it were, we'd have heard about it from the field before now. In
pre-7.4 code there is only one instant in the lifespan of an index where
the bug could occur, and then only if the index is created empty.

Agreed, I don't think 7.3.5 is warranted, but it would have been nice to
get this in 7.3.4. Let's keep our eyes open for maybe a 7.3.5 later.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#28Mendola Gaetano
mendola@bigfoot.com
In reply to: Tom Lane (#26)
Re: parallel regression test failure

"Bruce Momjian" <pgman@candle.pha.pa.us> wrote:

Tom Lane wrote:

We need to think about whether this bug is serious enough to justify a
quick 7.3.5 release. I'm leaning to the idea that it is not, because
if it were, we'd have heard about it from the field before now. In
pre-7.4 code there is only one instant in the lifespan of an index where
the bug could occur, and then only if the index is created empty.

Agreed, I don't think 7.3.5 is warranted, but it would have been nice to
get this in 7.3.4. Let's keep our eyes open for maybe a 7.3.5 later.

7.3.4 is already out; I agree that doing a 7.3.5 build is not necessary.

Regards
Gaetano Mendola

#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mendola Gaetano (#28)
Re: parallel regression test failure

"Mendola Gaetano" <mendola@bigfoot.com> writes:

"Bruce Momjian" <pgman@candle.pha.pa.us> wrote:

Tom Lane wrote:

We need to think about whether this bug is serious enough to justify a
quick 7.3.5 release. I'm leaning to the idea that it is not, because
if it were, we'd have heard about it from the field before now. In
pre-7.4 code there is only one instant in the lifespan of an index where
the bug could occur, and then only if the index is created empty.

Agreed, I don't think 7.3.5 is warranted, but it would have been nice to
get this in 7.3.4. Let's keep our eyes open for maybe a 7.3.5 later.

7.3.4 is already out; I agree that doing a 7.3.5 build is not necessary.

I dug back in the archives and convinced myself that this bug existed as
far back as Postgres 4.2 (nbtsearch.c dated 1994-02-07); it's probably
much older but I have nothing to look at to check. Since it's managed
to go undetected for a decade, it's probably not a "must fix today!"
kind of bug.

I do think we ought to put out a 7.3.5 that includes the fix at some
point, but let's wait a few weeks and see if anything else turns up
to include.

regards, tom lane

In reply to: Bruce Momjian (#1)
Re: parallel regression test failure

On Fri, Jul 25, 2003 at 05:47:50PM -0400, Bruce Momjian wrote:

I am seeing the following parallel regression test failures. Any idea
on the cause?

I think I saw about the same thing once, but I ran the test again
and it didn't show up anymore at all. I'm not sure what exactly it
was, but it looked a bit similar to yours.

Kurt