EvalPlanQual() doesn't follow LockTuple() pattern

Started by Simon Riggs · about 9 years ago · 2 messages
#1 Simon Riggs
simon@2ndquadrant.com
1 attachment(s)

src/backend/access/heap/README.tuplock says we do this...

LockTuple()
XactLockTableWait()
mark tuple as locked by me
UnlockTuple()

only problem is we don't... because EvalPlanQualFetch() does this

XactLockTableWait()
LockTuple()

If README.tuplock's reasons for the stated lock order are correct,
then this implies that EvalPlanQual updates could be starved
indefinitely, which is probably bad.
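
For illustration only, here is a minimal, self-contained C sketch of the
two orderings. The functions below are printf stubs standing in for the
backend's LockTuple()/XactLockTableWait()/UnlockTuple(), not the real
implementations; the comments spell out why taking the tuple lock before
sleeping on the xid avoids the starvation window.

#include <stdio.h>

/* Hypothetical stand-ins for the backend routines; they only print. */
static void LockTuple(const char *who)         { printf("%s: LockTuple()\n", who); }
static void UnlockTuple(const char *who)       { printf("%s: UnlockTuple()\n", who); }
static void XactLockTableWait(const char *who) { printf("%s: XactLockTableWait()\n", who); }

/*
 * Order documented in README.tuplock: queue on the heavyweight tuple
 * lock first, so any updater arriving later must line up behind us
 * before we go to sleep waiting for the in-progress transaction.
 */
static void documented_order(void)
{
	LockTuple("doc");
	XactLockTableWait("doc");
	printf("doc: mark tuple as locked by me\n");
	UnlockTuple("doc");
}

/*
 * Order used by EvalPlanQualFetch(): sleep on the xid first.  While we
 * sleep, a newly arriving updater can take the tuple lock and update
 * the row before we ever reach LockTuple(), which is the theoretical
 * starvation window described above.
 */
static void epq_fetch_order(void)
{
	XactLockTableWait("epq");
	LockTuple("epq");	/* reached via heap_lock_tuple() in the real code */
	UnlockTuple("epq");
}

int main(void)
{
	documented_order();
	epq_fetch_order();
	return 0;
}

(As I read README.tuplock, the point of taking the tuple lock first is
that later arrivals queue behind us; sleeping first leaves a window in
which they can jump ahead.)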

It might also be a bug of a more serious nature, though no actual bug
has been observed. This was found while debugging why wait_event was
not being set correctly.

In any case, I can't really see any reason for this coding in
EvalPlanQual and it isn't explained in comments. Simply removing the
wait allows the access pattern to follow the documented lock order,
and allows regression tests and isolation tests to pass without
problem. Perhaps I have made an error there.

Thoughts?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

epq_locktuple.v1.patch (application/octet-stream)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 32bb3f9..d34ce39 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -2288,19 +2288,14 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
 				elog(ERROR, "t_xmin is uncommitted in tuple to be updated");
 
 			/*
-			 * If tuple is being updated by other transaction then we have to
-			 * wait for its commit/abort, or die trying.
+			 * If tuple is being updated by other transaction then we may wish
+			 * to abort early with certain wait_policy types.
 			 */
 			if (TransactionIdIsValid(SnapshotDirty.xmax))
 			{
 				ReleaseBuffer(buffer);
 				switch (wait_policy)
 				{
-					case LockWaitBlock:
-						XactLockTableWait(SnapshotDirty.xmax,
-										  relation, &tuple.t_self,
-										  XLTW_FetchUpdated);
-						break;
 					case LockWaitSkip:
 						if (!ConditionalXactLockTableWait(SnapshotDirty.xmax))
 							return NULL;		/* skip instead of waiting */
@@ -2312,8 +2307,12 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
 									 errmsg("could not obtain lock on row in relation \"%s\"",
 										RelationGetRelationName(relation))));
 						break;
+					case LockWaitBlock:
+						/*
+						 * Drop through to lock tuple, if not ours
+						 */
+						break;
 				}
-				continue;		/* loop back to repeat heap_fetch */
 			}
 
 			/*
#2 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#1)
Re: EvalPlanQual() doesn't follow LockTuple() pattern

On Tue, Nov 29, 2016 at 6:26 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

> src/backend/access/heap/README.tuplock says we do this...
>
> LockTuple()
> XactLockTableWait()
> mark tuple as locked by me
> UnlockTuple()
>
> only problem is we don't... because EvalPlanQualFetch() does this
>
> XactLockTableWait()
> LockTuple()

Hmm. Yeah. Actually, it doesn't do LockTuple() directly but just
calls heap_lock_tuple(), which calls heap_acquire_tuplock(), which
calls LockTupleTuplock(), which calls LockTuple(). The words "lock"
and "tuple" are overloaded to the point of total confusion here, which
may account for the oversight you spotted.

> If README.tuplock's reasons for the stated lock order are correct,
> then this implies that EvalPlanQual updates could be starved
> indefinitely, which is probably bad.

I suspect that it's pretty hard to hit the starvation case in
practice, because EvalPlanQual updates are pretty rare. It's hard to
imagine a stream of updates all hitting the same tuple for long enough
to cause a serious problem. Eventually EvalPlanQualFetch would win
the race, I think.

> It might also be a bug of a more serious nature, though no actual bug
> has been observed. This was found while debugging why wait_event was
> not being set correctly.
>
> In any case, I can't really see any reason for this coding in
> EvalPlanQual and it isn't explained in comments. Simply removing the
> wait allows the access pattern to follow the documented lock order,
> and allows regression tests and isolation tests to pass without
> problem. Perhaps I have made an error there.

That might cause a problem because of this intervening test, for the
reasons explained in the comment:

/*
 * If tuple was inserted by our own transaction, we have to check
 * cmin against es_output_cid: cmin >= current CID means our
 * command cannot see the tuple, so we should ignore it. Otherwise
 * heap_lock_tuple() will throw an error, and so would any later
 * attempt to update or delete the tuple. (We need not check cmax
 * because HeapTupleSatisfiesDirty will consider a tuple deleted
 * by our transaction dead, regardless of cmax.) We just checked
 * that priorXmax == xmin, so we can test that variable instead of
 * doing HeapTupleHeaderGetXmin again.
 */
if (TransactionIdIsCurrentTransactionId(priorXmax) &&
	HeapTupleHeaderGetCmin(tuple.t_data) >= estate->es_output_cid)
{
	ReleaseBuffer(buffer);
	return NULL;
}

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers