random isolation test failures

Started by Andrew Dunstanover 14 years ago8 messages

andrew@dunslane.net

over 14 years ago

We are seeing numerous occasional buildfarm failures of the fk-deadlock2
isolation test, that look like this:

***************
*** 32,39 ****
step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;<waiting ...>
step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s1u2:<... completed>
ERROR: deadlock detected
step s2c: COMMIT;
step s1c: COMMIT;

    --- 32,39 ----
       step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
       step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;<waiting ...>
       step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
       ERROR:  deadlock detected
    + step s1u2:<... completed>
       step s2c: COMMIT;
       step s1c: COMMIT;

If this is harmless, we could provide an alternative results file as a
simple fix. If it's not harmless, it should be fixed.

cheers

andrew

Kevin Grittner

Kevin.Grittner@wicourts.gov

over 14 years ago

In reply to: Andrew Dunstan (#1)

Re: random isolation test failures

Andrew Dunstan <andrew@dunslane.net> wrote:

We are seeing numerous occasional buildfarm failures of the
fk-deadlock2 isolation test

If this is harmless, we could provide an alternative results file
as a simple fix. If it's not harmless, it should be fixed.

I agree, but don't look at me. I'm not the one who added the tests,
nor are they related to serializable snapshot isolation. Tom
recently raised the same issue on this thread:

http://archives.postgresql.org/pgsql-hackers/2011-09/msg00991.php

Alvaro?

-Kevin

Tom Lane

tgl@sss.pgh.pa.us

over 14 years ago

In reply to: Andrew Dunstan (#1)

Re: random isolation test failures

Andrew Dunstan <andrew@dunslane.net> writes:

We are seeing numerous occasional buildfarm failures of the fk-deadlock2
isolation test,

Yeah, I complained about this already, but Kevin disclaims all
responsibility for the fk isolation tests. It looks like Alvaro
and Noah Misch are the people to be harassing.

regards, tom lane

Noah Misch

noah@leadboat.com

over 14 years ago

In reply to: Tom Lane (#3)

1 attachment(s)

Re: random isolation test failures

On Mon, Sep 26, 2011 at 01:10:27PM -0400, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

We are seeing numerous occasional buildfarm failures of the fk-deadlock2
isolation test,

Yeah, I complained about this already, but Kevin disclaims all
responsibility for the fk isolation tests. It looks like Alvaro
and Noah Misch are the people to be harassing.

Yep; I took advantage of Kevin's test harness for some unrelated tests.

These sporadic failures happen whenever the test case takes longer than
deadlock_timeout (currently 100ms for these tests) to setup the deadlock. I
outlined some mitigating strategies here:
http://archives.postgresql.org/message-id/20110727171438.GE18910@tornado.leadboat.com

I'd vote for #1: let's double the deadlock_timeout until the failures stop.
Other opinions?

Thanks,
nm

Attachments:

fk-isolation-200ms.patchtext/plain; charset=us-asciiDownload

*** a/src/test/isolation/specs/fk-deadlock.spec
--- b/src/test/isolation/specs/fk-deadlock.spec
***************
*** 19,25 **** teardown
  }
  
  session "s1"
! setup		{ BEGIN; SET deadlock_timeout = '100ms'; }
  step "s1i"	{ INSERT INTO child VALUES (1, 1); }
  step "s1u"	{ UPDATE parent SET aux = 'bar'; }
  step "s1c"	{ COMMIT; }
--- 19,25 ----
  }
  
  session "s1"
! setup		{ BEGIN; SET deadlock_timeout = '200ms'; }
  step "s1i"	{ INSERT INTO child VALUES (1, 1); }
  step "s1u"	{ UPDATE parent SET aux = 'bar'; }
  step "s1c"	{ COMMIT; }
*** a/src/test/isolation/specs/fk-deadlock2.spec
--- b/src/test/isolation/specs/fk-deadlock2.spec
***************
*** 24,30 **** teardown
  }
  
  session "s1"
! setup		{ BEGIN; SET deadlock_timeout = '100ms'; }
  step "s1u1"	{ UPDATE A SET Col1 = 1 WHERE AID = 1; }
  step "s1u2"	{ UPDATE B SET Col2 = 1 WHERE BID = 2; }
  step "s1c"	{ COMMIT; }
--- 24,30 ----
  }
  
  session "s1"
! setup		{ BEGIN; SET deadlock_timeout = '200ms'; }
  step "s1u1"	{ UPDATE A SET Col1 = 1 WHERE AID = 1; }
  step "s1u2"	{ UPDATE B SET Col2 = 1 WHERE BID = 2; }
  step "s1c"	{ COMMIT; }

Alvaro Herrera

alvherre@commandprompt.com

over 14 years ago

In reply to: Noah Misch (#4)

1 attachment(s)

Re: random isolation test failures

Excerpts from Noah Misch's message of lun sep 26 21:57:40 -0300 2011:

These sporadic failures happen whenever the test case takes longer than
deadlock_timeout (currently 100ms for these tests) to setup the deadlock. I
outlined some mitigating strategies here:
http://archives.postgresql.org/message-id/20110727171438.GE18910@tornado.leadboat.com

I'd vote for #1: let's double the deadlock_timeout until the failures stop.
Other opinions?

I just tweaked isolationtester so that it collects the error messages
and displays them all together at the end of the test. After seeing it
run, I didn't like it -- I think I prefer something more local, so that
in the only case where we call try_complete_step twice in the loop, we
report any errors in either. AFAICS this would make both expected cases
behave identically in test output. The only thing left to figure out is
where to store the error message between calls ... clearly Step is not
the right place for it. I'm on it now, anyway.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachments:

isolation-fix.patchapplication/octet-stream; name=isolation-fix.patchDownload

diff --git a/src/test/isolation/isolationtester.c b/src/test/isolation/isolationtester.c
index 0164012..852ba77 100644
--- a/src/test/isolation/isolationtester.c
+++ b/src/test/isolation/isolationtester.c
@@ -455,10 +455,29 @@ run_permutation(TestSpec * testspec, int nsteps, Step ** steps)
 		else if (try_complete_step(step, STEP_NONBLOCK))
 			waiting = step;
 	}
+	for (i = 0; i < nsteps; i++)
+	{
+		Step *step = steps[i];
+
+		if (step->errormsg)
+		{
+			fprintf(stdout, "%s\n", step->errormsg);
+			free(step->errormsg);
+			step->errormsg = NULL;
+		}
+	}
 
 	/* Finish any waiting query. */
 	if (waiting != NULL)
+	{
 		try_complete_step(waiting, STEP_RETRY);
+		if (waiting->errormsg)
+		{
+			fprintf(stdout, "%s\n", waiting->errormsg);
+			free(waiting->errormsg);
+			waiting->errormsg = NULL;
+		}
+	}
 
 	/* Perform per-session teardown */
 	for (i = 0; i < testspec->nsessions; i++)
@@ -505,6 +524,9 @@ run_permutation(TestSpec * testspec, int nsteps, Step ** steps)
  * When calling this function on behalf of a given step for a second or later
  * time, pass the STEP_RETRY flag.  This only affects the messages printed.
  *
+ * If the connection returns an error, the message is saved in step->errormsg.
+ * Caller is responsible for reporting it as appropriate.
+ *
  * If the STEP_NONBLOCK flag was specified and the query is waiting to acquire
  * a lock, returns true.  Otherwise, returns false.
  */
@@ -579,9 +601,17 @@ try_complete_step(Step *step, int flags)
 				printResultSet(res);
 				break;
 			case PGRES_FATAL_ERROR:
+				if (step->errormsg != NULL)
+				{
+					printf("WARNING: this step had already emitted an error message\n");
+					printf("%s\n", step->errormsg);
+				}
 				/* Detail may contain xid values, so just show primary. */
-				printf("%s:  %s\n", PQresultErrorField(res, PG_DIAG_SEVERITY),
-					   PQresultErrorField(res, PG_DIAG_MESSAGE_PRIMARY));
+				step->errormsg = malloc(5 + strlen(PQresultErrorField(res, PG_DIAG_SEVERITY)) + 
+										strlen(PQresultErrorField(res, PG_DIAG_MESSAGE_PRIMARY)));
+				sprintf(step->errormsg, "%s:  %s",
+						PQresultErrorField(res, PG_DIAG_SEVERITY),
+						PQresultErrorField(res, PG_DIAG_MESSAGE_PRIMARY));
 				break;
 			default:
 				printf("unexpected result status: %s\n",
diff --git a/src/test/isolation/isolationtester.h b/src/test/isolation/isolationtester.h
index 377c10c..1e286df 100644
--- a/src/test/isolation/isolationtester.h
+++ b/src/test/isolation/isolationtester.h
@@ -31,6 +31,7 @@ struct Step
 	int			session;
 	char	   *name;
 	char	   *sql;
+	char	   *errormsg;
 };
 
 typedef struct

Tom Lane

tgl@sss.pgh.pa.us

over 14 years ago

In reply to: Alvaro Herrera (#5)

Re: random isolation test failures

Alvaro Herrera <alvherre@commandprompt.com> writes:

I just tweaked isolationtester so that it collects the error messages
and displays them all together at the end of the test. After seeing it
run, I didn't like it -- I think I prefer something more local, so that
in the only case where we call try_complete_step twice in the loop, we
report any errors in either. AFAICS this would make both expected cases
behave identically in test output.

Hmm, is that really an appropriate fix? I'm worried that it might mask
event-ordering differences that actually are significant.

regards, tom lane

Alvaro Herrera

alvherre@commandprompt.com

over 14 years ago

In reply to: Tom Lane (#6)

1 attachment(s)

Re: random isolation test failures

Excerpts from Tom Lane's message of mar sep 27 01:11:39 -0300 2011:

Alvaro Herrera <alvherre@commandprompt.com> writes:

I just tweaked isolationtester so that it collects the error messages
and displays them all together at the end of the test. After seeing it
run, I didn't like it -- I think I prefer something more local, so that
in the only case where we call try_complete_step twice in the loop, we
report any errors in either. AFAICS this would make both expected cases
behave identically in test output.

Hmm, is that really an appropriate fix? I'm worried that it might mask
event-ordering differences that actually are significant.

In the attached, it only affects the case where there is one blocking
command and another command that unblocks it; this is only exercised by
the much-beaten fk-deadlock cases. If either of the steps fails with a
deadlock error, it is reported identically, i.e. the error message is
emitted as

"error in s1u1 s2u1: ERROR: deadlock detected"

So the deadlock could have been detected in either s1u1 or s2u1; we
don't really care.

The way error messages are reported in all the other cases is not
changed, and these do not have a prefix; so if anything were to behave
differently, we would find out because a spurious prefix would appear.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachments:

isolation-fix-2.patchapplication/octet-stream; name=isolation-fix-2.patchDownload

*** a/src/test/isolation/expected/fk-deadlock.out
--- b/src/test/isolation/expected/fk-deadlock.out
***************
*** 23,29 **** step s2i: INSERT INTO child VALUES (2, 1);
  step s1u: UPDATE parent SET aux = 'bar'; <waiting ...>
  step s2u: UPDATE parent SET aux = 'baz';
  step s1u: <... completed>
! ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
--- 23,29 ----
  step s1u: UPDATE parent SET aux = 'bar'; <waiting ...>
  step s2u: UPDATE parent SET aux = 'baz';
  step s1u: <... completed>
! error in steps s2u s1u: ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
***************
*** 32,39 **** step s1i: INSERT INTO child VALUES (1, 1);
  step s2i: INSERT INTO child VALUES (2, 1);
  step s2u: UPDATE parent SET aux = 'baz'; <waiting ...>
  step s1u: UPDATE parent SET aux = 'bar';
- ERROR:  deadlock detected
  step s2u: <... completed>
  step s2c: COMMIT;
  step s1c: COMMIT;
  
--- 32,39 ----
  step s2i: INSERT INTO child VALUES (2, 1);
  step s2u: UPDATE parent SET aux = 'baz'; <waiting ...>
  step s1u: UPDATE parent SET aux = 'bar';
  step s2u: <... completed>
+ error in steps s1u s2u: ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;
  
***************
*** 43,49 **** step s1i: INSERT INTO child VALUES (1, 1);
  step s1u: UPDATE parent SET aux = 'bar'; <waiting ...>
  step s2u: UPDATE parent SET aux = 'baz';
  step s1u: <... completed>
! ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
--- 43,49 ----
  step s1u: UPDATE parent SET aux = 'bar'; <waiting ...>
  step s2u: UPDATE parent SET aux = 'baz';
  step s1u: <... completed>
! error in steps s2u s1u: ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
***************
*** 52,59 **** step s2i: INSERT INTO child VALUES (2, 1);
  step s1i: INSERT INTO child VALUES (1, 1);
  step s2u: UPDATE parent SET aux = 'baz'; <waiting ...>
  step s1u: UPDATE parent SET aux = 'bar';
- ERROR:  deadlock detected
  step s2u: <... completed>
  step s2c: COMMIT;
  step s1c: COMMIT;
  
--- 52,59 ----
  step s1i: INSERT INTO child VALUES (1, 1);
  step s2u: UPDATE parent SET aux = 'baz'; <waiting ...>
  step s1u: UPDATE parent SET aux = 'bar';
  step s2u: <... completed>
+ error in steps s1u s2u: ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;
  
*** a/src/test/isolation/expected/fk-deadlock2.out
--- b/src/test/isolation/expected/fk-deadlock2.out
***************
*** 23,29 **** step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: <... completed>
! ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
--- 23,29 ----
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: <... completed>
! error in steps s2u2 s1u2: ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
***************
*** 33,39 **** step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: <... completed>
! ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;
  
--- 33,39 ----
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: <... completed>
! error in steps s2u2 s1u2: ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;
  
***************
*** 42,49 **** step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
  step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- ERROR:  deadlock detected
  step s2u2: <... completed>
  step s1c: COMMIT;
  step s2c: COMMIT;
  
--- 42,49 ----
  step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u2: <... completed>
+ error in steps s1u2 s2u2: ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
***************
*** 52,59 **** step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
  step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- ERROR:  deadlock detected
  step s2u2: <... completed>
  step s2c: COMMIT;
  step s1c: COMMIT;
  
--- 52,59 ----
  step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u2: <... completed>
+ error in steps s1u2 s2u2: ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;
  
***************
*** 63,69 **** step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: <... completed>
! ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
--- 63,69 ----
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: <... completed>
! error in steps s2u2 s1u2: ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
***************
*** 73,79 **** step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: <... completed>
! ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;
  
--- 73,79 ----
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: <... completed>
! error in steps s2u2 s1u2: ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;
  
***************
*** 82,89 **** step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- ERROR:  deadlock detected
  step s2u2: <... completed>
  step s1c: COMMIT;
  step s2c: COMMIT;
  
--- 82,89 ----
  step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u2: <... completed>
+ error in steps s1u2 s2u2: ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
***************
*** 92,99 **** step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- ERROR:  deadlock detected
  step s2u2: <... completed>
  step s2c: COMMIT;
  step s1c: COMMIT;
  
--- 92,99 ----
  step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u2: <... completed>
+ error in steps s1u2 s2u2: ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;
  
*** a/src/test/isolation/expected/fk-deadlock2_1.out
--- b/src/test/isolation/expected/fk-deadlock2_1.out
***************
*** 14,20 **** step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1c: COMMIT;
  step s2u1: <... completed>
! ERROR:  could not serialize access due to concurrent update
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  ERROR:  current transaction is aborted, commands ignored until end of transaction block
  step s2c: COMMIT;
--- 14,20 ----
  step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1c: COMMIT;
  step s2u1: <... completed>
! error in steps s1c s2u1: ERROR:  could not serialize access due to concurrent update
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  ERROR:  current transaction is aborted, commands ignored until end of transaction block
  step s2c: COMMIT;
***************
*** 25,31 **** step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: <... completed>
! ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
--- 25,31 ----
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: <... completed>
! error in steps s2u2 s1u2: ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
***************
*** 35,41 **** step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: <... completed>
! ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;
  
--- 35,41 ----
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: <... completed>
! error in steps s2u2 s1u2: ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;
  
***************
*** 44,51 **** step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
  step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- ERROR:  deadlock detected
  step s2u2: <... completed>
  step s1c: COMMIT;
  step s2c: COMMIT;
  
--- 44,51 ----
  step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u2: <... completed>
+ error in steps s1u2 s2u2: ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
***************
*** 54,61 **** step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
  step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- ERROR:  deadlock detected
  step s2u2: <... completed>
  step s2c: COMMIT;
  step s1c: COMMIT;
  
--- 54,61 ----
  step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u2: <... completed>
+ error in steps s1u2 s2u2: ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;
  
***************
*** 65,71 **** step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: <... completed>
! ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
--- 65,71 ----
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: <... completed>
! error in steps s2u2 s1u2: ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
***************
*** 75,81 **** step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: <... completed>
! ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;
  
--- 75,81 ----
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: <... completed>
! error in steps s2u2 s1u2: ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;
  
***************
*** 84,91 **** step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- ERROR:  deadlock detected
  step s2u2: <... completed>
  step s1c: COMMIT;
  step s2c: COMMIT;
  
--- 84,91 ----
  step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u2: <... completed>
+ error in steps s1u2 s2u2: ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
***************
*** 94,101 **** step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- ERROR:  deadlock detected
  step s2u2: <... completed>
  step s2c: COMMIT;
  step s1c: COMMIT;
  
--- 94,101 ----
  step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s2u2: <... completed>
+ error in steps s1u2 s2u2: ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;
  
*** a/src/test/isolation/expected/fk-deadlock2_2.out
--- /dev/null
***************
*** 1,110 ****
- Parsed test spec with 2 sessions
- 
- starting permutation: s1u1 s1u2 s1c s2u1 s2u2 s2c
- step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
- step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s1c: COMMIT;
- step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s2c: COMMIT;
- 
- starting permutation: s1u1 s1u2 s2u1 s1c s2u2 s2c
- step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
- step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
- step s1c: COMMIT;
- step s2u1: <... completed>
- ERROR:  could not serialize access due to concurrent update
- step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- ERROR:  current transaction is aborted, commands ignored until end of transaction block
- step s2c: COMMIT;
- 
- starting permutation: s1u1 s2u1 s1u2 s2u2 s1c s2c
- step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
- step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
- step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s1u2: <... completed>
- ERROR:  deadlock detected
- step s1c: COMMIT;
- step s2c: COMMIT;
- 
- starting permutation: s1u1 s2u1 s1u2 s2u2 s2c s1c
- step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
- step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
- step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s1u2: <... completed>
- ERROR:  deadlock detected
- step s2c: COMMIT;
- step s1c: COMMIT;
- 
- starting permutation: s1u1 s2u1 s2u2 s1u2 s1c s2c
- step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
- step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
- step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- ERROR:  deadlock detected
- step s2u2: <... completed>
- step s1c: COMMIT;
- step s2c: COMMIT;
- 
- starting permutation: s1u1 s2u1 s2u2 s1u2 s2c s1c
- step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
- step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
- step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- ERROR:  deadlock detected
- step s2u2: <... completed>
- step s2c: COMMIT;
- step s1c: COMMIT;
- 
- starting permutation: s2u1 s1u1 s1u2 s2u2 s1c s2c
- step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
- step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
- step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s1u2: <... completed>
- ERROR:  deadlock detected
- step s1c: COMMIT;
- step s2c: COMMIT;
- 
- starting permutation: s2u1 s1u1 s1u2 s2u2 s2c s1c
- step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
- step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
- step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s1u2: <... completed>
- ERROR:  deadlock detected
- step s2c: COMMIT;
- step s1c: COMMIT;
- 
- starting permutation: s2u1 s1u1 s2u2 s1u2 s1c s2c
- step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
- step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
- step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- ERROR:  deadlock detected
- step s2u2: <... completed>
- step s1c: COMMIT;
- step s2c: COMMIT;
- 
- starting permutation: s2u1 s1u1 s2u2 s1u2 s2c s1c
- step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1;
- step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
- step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- ERROR:  deadlock detected
- step s2u2: <... completed>
- step s2c: COMMIT;
- step s1c: COMMIT;
- 
- starting permutation: s2u1 s2u2 s1u1 s2c s1u2 s1c
- step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- step s1u1: UPDATE A SET Col1 = 1 WHERE AID = 1; <waiting ...>
- step s2c: COMMIT;
- step s1u1: <... completed>
- step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
- ERROR:  could not serialize access due to concurrent update
- step s1c: COMMIT;
--- 0 ----
*** a/src/test/isolation/expected/fk-deadlock_1.out
--- b/src/test/isolation/expected/fk-deadlock_1.out
***************
*** 14,20 **** step s1u: UPDATE parent SET aux = 'bar';
  step s2i: INSERT INTO child VALUES (2, 1); <waiting ...>
  step s1c: COMMIT;
  step s2i: <... completed>
! ERROR:  could not serialize access due to concurrent update
  step s2u: UPDATE parent SET aux = 'baz';
  ERROR:  current transaction is aborted, commands ignored until end of transaction block
  step s2c: COMMIT;
--- 14,20 ----
  step s2i: INSERT INTO child VALUES (2, 1); <waiting ...>
  step s1c: COMMIT;
  step s2i: <... completed>
! error in steps s1c s2i: ERROR:  could not serialize access due to concurrent update
  step s2u: UPDATE parent SET aux = 'baz';
  ERROR:  current transaction is aborted, commands ignored until end of transaction block
  step s2c: COMMIT;
***************
*** 25,31 **** step s2i: INSERT INTO child VALUES (2, 1);
  step s1u: UPDATE parent SET aux = 'bar'; <waiting ...>
  step s2u: UPDATE parent SET aux = 'baz';
  step s1u: <... completed>
! ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
--- 25,31 ----
  step s1u: UPDATE parent SET aux = 'bar'; <waiting ...>
  step s2u: UPDATE parent SET aux = 'baz';
  step s1u: <... completed>
! error in steps s2u s1u: ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
***************
*** 34,41 **** step s1i: INSERT INTO child VALUES (1, 1);
  step s2i: INSERT INTO child VALUES (2, 1);
  step s2u: UPDATE parent SET aux = 'baz'; <waiting ...>
  step s1u: UPDATE parent SET aux = 'bar';
- ERROR:  deadlock detected
  step s2u: <... completed>
  step s2c: COMMIT;
  step s1c: COMMIT;
  
--- 34,41 ----
  step s2i: INSERT INTO child VALUES (2, 1);
  step s2u: UPDATE parent SET aux = 'baz'; <waiting ...>
  step s1u: UPDATE parent SET aux = 'bar';
  step s2u: <... completed>
+ error in steps s1u s2u: ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;
  
***************
*** 45,51 **** step s1i: INSERT INTO child VALUES (1, 1);
  step s1u: UPDATE parent SET aux = 'bar'; <waiting ...>
  step s2u: UPDATE parent SET aux = 'baz';
  step s1u: <... completed>
! ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
--- 45,51 ----
  step s1u: UPDATE parent SET aux = 'bar'; <waiting ...>
  step s2u: UPDATE parent SET aux = 'baz';
  step s1u: <... completed>
! error in steps s2u s1u: ERROR:  deadlock detected
  step s1c: COMMIT;
  step s2c: COMMIT;
  
***************
*** 54,61 **** step s2i: INSERT INTO child VALUES (2, 1);
  step s1i: INSERT INTO child VALUES (1, 1);
  step s2u: UPDATE parent SET aux = 'baz'; <waiting ...>
  step s1u: UPDATE parent SET aux = 'bar';
- ERROR:  deadlock detected
  step s2u: <... completed>
  step s2c: COMMIT;
  step s1c: COMMIT;
  
--- 54,61 ----
  step s1i: INSERT INTO child VALUES (1, 1);
  step s2u: UPDATE parent SET aux = 'baz'; <waiting ...>
  step s1u: UPDATE parent SET aux = 'bar';
  step s2u: <... completed>
+ error in steps s1u s2u: ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;
  
***************
*** 65,71 **** step s2u: UPDATE parent SET aux = 'baz';
  step s1i: INSERT INTO child VALUES (1, 1); <waiting ...>
  step s2c: COMMIT;
  step s1i: <... completed>
! ERROR:  could not serialize access due to concurrent update
  step s1u: UPDATE parent SET aux = 'bar';
  ERROR:  current transaction is aborted, commands ignored until end of transaction block
  step s1c: COMMIT;
--- 65,71 ----
  step s1i: INSERT INTO child VALUES (1, 1); <waiting ...>
  step s2c: COMMIT;
  step s1i: <... completed>
! error in steps s2c s1i: ERROR:  could not serialize access due to concurrent update
  step s1u: UPDATE parent SET aux = 'bar';
  ERROR:  current transaction is aborted, commands ignored until end of transaction block
  step s1c: COMMIT;
*** a/src/test/isolation/isolationtester.c
--- b/src/test/isolation/isolationtester.c
***************
*** 384,389 **** step_bsearch_cmp(const void *a, const void *b)
--- 384,433 ----
  }
  
  /*
+  * If a step caused an error to be reported, print it out and clear it.
+  */
+ static void
+ report_error_message(Step *step)
+ {
+ 	if (step->errormsg)
+ 	{
+ 		fprintf(stdout, "%s\n", step->errormsg);
+ 		free(step->errormsg);
+ 		step->errormsg = NULL;
+ 	}
+ }
+ 
+ /*
+  * As above, but reports messages possibly emitted by two steps.  This is
+  * useful when we have a blocked command awakened by another one; we want to
+  * report both messages identically, for the case where we don't care which
+  * one fails due to a timeout such as deadlock timeout.
+  */
+ static void
+ report_two_error_messages(Step *step1, Step *step2)
+ {
+ 	char *prefix;
+ 
+ 	prefix = malloc(strlen(step1->name) + strlen(step2->name) + 2);
+ 	sprintf(prefix, "%s %s", step1->name, step2->name);
+ 
+ 	if (step1->errormsg)
+ 	{
+ 		fprintf(stdout, "error in steps %s: %s\n", prefix,
+ 				step1->errormsg);
+ 		free(step1->errormsg);
+ 		step1->errormsg = NULL;
+ 	}
+ 	if (step2->errormsg)
+ 	{
+ 		fprintf(stdout, "error in steps %s: %s\n", prefix,
+ 				step2->errormsg);
+ 		free(step2->errormsg);
+ 		step2->errormsg = NULL;
+ 	}
+ }
+ 
+ /*
   * Run one permutation
   */
  static void
***************
*** 448,464 **** run_permutation(TestSpec * testspec, int nsteps, Step ** steps)
  			/* Some other step is already waiting: just block. */
  			try_complete_step(step, 0);
  
! 			/* See if this step unblocked the waiting step. */
  			if (!try_complete_step(waiting, STEP_NONBLOCK | STEP_RETRY))
  				waiting = NULL;
  		}
- 		else if (try_complete_step(step, STEP_NONBLOCK))
- 			waiting = step;
  	}
  
  	/* Finish any waiting query. */
  	if (waiting != NULL)
  		try_complete_step(waiting, STEP_RETRY);
  
  	/* Perform per-session teardown */
  	for (i = 0; i < testspec->nsessions; i++)
--- 492,523 ----
  			/* Some other step is already waiting: just block. */
  			try_complete_step(step, 0);
  
! 			/*
! 			 * See if this step unblocked the waiting step; report both error
! 			 * messages together if so.
! 			 */
  			if (!try_complete_step(waiting, STEP_NONBLOCK | STEP_RETRY))
+ 			{
+ 				report_two_error_messages(step, waiting);
  				waiting = NULL;
+ 			}
+ 			else
+ 				report_error_message(step);
+ 		}
+ 		else
+ 		{
+ 			if (try_complete_step(step, STEP_NONBLOCK))
+ 				waiting = step;
+ 			report_error_message(step);
  		}
  	}
  
  	/* Finish any waiting query. */
  	if (waiting != NULL)
+ 	{
  		try_complete_step(waiting, STEP_RETRY);
+ 		report_error_message(waiting);
+ 	}
  
  	/* Perform per-session teardown */
  	for (i = 0; i < testspec->nsessions; i++)
***************
*** 505,510 **** run_permutation(TestSpec * testspec, int nsteps, Step ** steps)
--- 564,573 ----
   * When calling this function on behalf of a given step for a second or later
   * time, pass the STEP_RETRY flag.  This only affects the messages printed.
   *
+  * If the connection returns an error, the message is saved in step->errormsg.
+  * Caller should call report_error_message shortly after this, to have it
+  * printed and cleared.
+  *
   * If the STEP_NONBLOCK flag was specified and the query is waiting to acquire
   * a lock, returns true.  Otherwise, returns false.
   */
***************
*** 579,587 **** try_complete_step(Step *step, int flags)
  				printResultSet(res);
  				break;
  			case PGRES_FATAL_ERROR:
  				/* Detail may contain xid values, so just show primary. */
! 				printf("%s:  %s\n", PQresultErrorField(res, PG_DIAG_SEVERITY),
! 					   PQresultErrorField(res, PG_DIAG_MESSAGE_PRIMARY));
  				break;
  			default:
  				printf("unexpected result status: %s\n",
--- 642,660 ----
  				printResultSet(res);
  				break;
  			case PGRES_FATAL_ERROR:
+ 				if (step->errormsg != NULL)
+ 				{
+ 					printf("WARNING: this step had a leftover error message\n");
+ 					printf("%s\n", step->errormsg);
+ 				}
  				/* Detail may contain xid values, so just show primary. */
! 				step->errormsg = malloc(5 +
! 										strlen(PQresultErrorField(res, PG_DIAG_SEVERITY)) + 
! 										strlen(PQresultErrorField(res,
! 																  PG_DIAG_MESSAGE_PRIMARY)));
! 				sprintf(step->errormsg, "%s:  %s",
! 						PQresultErrorField(res, PG_DIAG_SEVERITY),
! 						PQresultErrorField(res, PG_DIAG_MESSAGE_PRIMARY));
  				break;
  			default:
  				printf("unexpected result status: %s\n",
*** a/src/test/isolation/isolationtester.h
--- b/src/test/isolation/isolationtester.h
***************
*** 31,36 **** struct Step
--- 31,37 ----
  	int			session;
  	char	   *name;
  	char	   *sql;
+ 	char	   *errormsg;
  };
  
  typedef struct
*** a/src/test/isolation/specparse.y
--- b/src/test/isolation/specparse.y
***************
*** 123,128 **** step:
--- 123,129 ----
  				$$ = malloc(sizeof(Step));
  				$$->name = $2;
  				$$->sql = $3;
+ 				$$->errormsg = NULL;
  			}
  		;

Tom Lane

tgl@sss.pgh.pa.us

over 14 years ago

In reply to: Alvaro Herrera (#7)

Re: random isolation test failures

Alvaro Herrera <alvherre@commandprompt.com> writes:

Excerpts from Tom Lane's message of mar sep 27 01:11:39 -0300 2011:

Hmm, is that really an appropriate fix? I'm worried that it might mask
event-ordering differences that actually are significant.

In the attached, it only affects the case where there is one blocking
command and another command that unblocks it; this is only exercised by
the much-beaten fk-deadlock cases. If either of the steps fails with a
deadlock error, it is reported identically, i.e. the error message is
emitted as
"error in s1u1 s2u1: ERROR: deadlock detected"
So the deadlock could have been detected in either s1u1 or s2u1; we
don't really care.

Hmm. For the case of "deadlock detected", we actually don't *want* to
care because the infrastructure is such that either process might report
it. So I agree that this is a good fix for that case. I'm just worried
whether it will obscure other situations where it's important to know
which command failed. But if you're convinced there aren't any, fine.

regards, tom lane