bug in fast-path locking

Started by Robert Haas · almost 14 years ago · 13 messages
#1 Robert Haas
robertmhaas@gmail.com

On Sun, Apr 8, 2012 at 12:43 PM, Boszormenyi Zoltan <zb@cybertec.at> wrote:

Indeed, the unpatched GIT version crashes if you enter
 =#lock TABLE pgbench_accounts ;
the second time in session 2 after the first one failed. Also,
manually spelling it out:

Session 1:

$ psql
psql (9.2devel)
Type "help" for help.

zozo=# begin;
BEGIN
zozo=# lock table pgbench_accounts;
LOCK TABLE
zozo=#

Session 2:

zozo=# begin;
BEGIN
zozo=# savepoint a;
SAVEPOINT
zozo=# lock table pgbench_accounts;
ERROR:  canceling statement due to statement timeout
zozo=# rollback to a;
ROLLBACK
zozo=# savepoint b;
SAVEPOINT
zozo=# lock table pgbench_accounts;
The connection to the server was lost. Attempting reset: Failed.
!>

Server log after the second lock table:

TRAP: FailedAssertion("!(locallock->holdsStrongLockCount == 0)", File:
"lock.c", Line: 749)
LOG:  server process (PID 12978) was terminated by signal 6: Aborted

Robert, the Assert triggering with the above procedure
is in your "fast path" locking code with current GIT.

Yes, that sure looks like a bug. It seems that if the top-level
transaction is aborting, then LockReleaseAll() is called and
everything gets cleaned up properly; or if a subtransaction is
aborting after the lock is fully granted, then the locks held by the
subtransaction are released one at a time using LockRelease(), but if
the subtransaction is aborted *during the lock wait* then we only do
LockWaitCancel(), which doesn't clean up the LOCALLOCK. Before the
fast-lock patch, that didn't really matter, but now it does, because
that LOCALLOCK is tracking the fact that we're holding onto a shared
resource - the strong lock count. So I think that LockWaitCancel()
needs some kind of adjustment, but I haven't figured out exactly what
it is yet.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#2 Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#1)
1 attachment(s)
Re: bug in fast-path locking

On Sun, Apr 8, 2012 at 9:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Robert, the Assert triggering with the above procedure
is in your "fast path" locking code with current GIT.

Yes, that sure looks like a bug.  It seems that if the top-level
transaction is aborting, then LockReleaseAll() is called and
everything gets cleaned up properly; or if a subtransaction is
aborting after the lock is fully granted, then the locks held by the
subtransaction are released one at a time using LockRelease(), but if
the subtransaction is aborted *during the lock wait* then we only do
LockWaitCancel(), which doesn't clean up the LOCALLOCK.  Before the
fast-lock patch, that didn't really matter, but now it does, because
that LOCALLOCK is tracking the fact that we're holding onto a shared
resource - the strong lock count.  So I think that LockWaitCancel()
needs some kind of adjustment, but I haven't figured out exactly what
it is yet.

I looked at this more. The above analysis is basically correct, but
the problem goes a bit beyond an error in LockWaitCancel(). We could
also crap out with an error before getting as far as LockWaitCancel()
and have the same problem. I think that a correct statement of the
problem is this: from the time we bump the strong lock count, up until
the time we're done acquiring the lock (or give up on acquiring it),
we need to have an error-cleanup hook in place that will unbump the
strong lock count if we error out. Once we're done updating the
shared and local lock tables, the special handling ceases to be
needed, because any subsequent lock release will go through
LockRelease() or LockReleaseAll(), which will do the appropriate
cleanup.

The attached patch is an attempt at implementing that; any reviews appreciated.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

fix-strong-lock-cleanup.patch (application/octet-stream)
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index e22bdac..bc9a357 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -2258,7 +2258,7 @@ AbortTransaction(void)
 	 * Also clean up any open wait for lock, since the lock manager will choke
 	 * if we try to wait for another lock before doing this.
 	 */
-	LockWaitCancel();
+	LockErrorCleanup();
 
 	/*
 	 * check the current transaction state
@@ -4143,7 +4143,7 @@ AbortSubTransaction(void)
 	AbortBufferIO();
 	UnlockBuffers();
 
-	LockWaitCancel();
+	LockErrorCleanup();
 
 	/*
 	 * check the current transaction state
diff --git a/src/backend/storage/lmgr/README b/src/backend/storage/lmgr/README
index e3bb116..ee94725 100644
--- a/src/backend/storage/lmgr/README
+++ b/src/backend/storage/lmgr/README
@@ -560,7 +560,7 @@ examines every member of the wait queue (this was not true in the 7.0
 implementation, BTW).  Therefore, if ProcLockWakeup is always invoked
 after a lock is released or a wait queue is rearranged, there can be no
 failure to wake a wakable process.  One should also note that
-LockWaitCancel (abort a waiter due to outside factors) must run
+LockErrorCleanup (abort a waiter due to outside factors) must run
 ProcLockWakeup, in case the canceled waiter was soft-blocking other
 waiters.
 
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index a98dfca..3f937ab 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -250,7 +250,8 @@ static HTAB *LockMethodProcLockHash;
 static HTAB *LockMethodLocalHash;
 
 
-/* private state for GrantAwaitedLock */
+/* private state for error cleanup */
+static LOCALLOCK *StrongLockInProgress;
 static LOCALLOCK *awaitedLock;
 static ResourceOwner awaitedOwner;
 
@@ -338,6 +339,8 @@ static void RemoveLocalLock(LOCALLOCK *locallock);
 static PROCLOCK *SetupLockInTable(LockMethod lockMethodTable, PGPROC *proc,
 			     const LOCKTAG *locktag, uint32 hashcode, LOCKMODE lockmode);
 static void GrantLockLocal(LOCALLOCK *locallock, ResourceOwner owner);
+static void BeginStrongLockAcquire(LOCALLOCK *locallock, uint32 fasthashcode);
+static void FinishStrongLockAcquire(void);
 static void WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner);
 static void ReleaseLockForOwner(LOCALLOCK *locallock, ResourceOwner owner);
 static bool UnGrantLock(LOCK *lock, LOCKMODE lockmode,
@@ -738,22 +741,11 @@ LockAcquireExtended(const LOCKTAG *locktag,
 		}
 		else if (FastPathStrongMode(lockmode))
 		{
-			/*
-			 * Adding to a memory location is not atomic, so we take a
-			 * spinlock to ensure we don't collide with someone else trying
-			 * to bump the count at the same time.
-			 *
-			 * XXX: It might be worth considering using an atomic fetch-and-add
-			 * instruction here, on architectures where that is supported.
-			 */
-			Assert(locallock->holdsStrongLockCount == FALSE);
-			SpinLockAcquire(&FastPathStrongRelationLocks->mutex);
-			FastPathStrongRelationLocks->count[fasthashcode]++;
-			locallock->holdsStrongLockCount = TRUE;
-			SpinLockRelease(&FastPathStrongRelationLocks->mutex);
+			BeginStrongLockAcquire(locallock, fasthashcode);
 			if (!FastPathTransferRelationLocks(lockMethodTable, locktag,
 											   hashcode))
 			{
+				AbortStrongLockAcquire();
 				if (reportMemoryError)
 					ereport(ERROR,
 							(errcode(ERRCODE_OUT_OF_MEMORY),
@@ -779,6 +771,7 @@ LockAcquireExtended(const LOCKTAG *locktag,
 								hashcode, lockmode);
 	if (!proclock)
 	{
+		AbortStrongLockAcquire();
 		LWLockRelease(partitionLock);
 		if (reportMemoryError)
 			ereport(ERROR,
@@ -820,6 +813,7 @@ LockAcquireExtended(const LOCKTAG *locktag,
 		 */
 		if (dontWait)
 		{
+			AbortStrongLockAcquire();
 			if (proclock->holdMask == 0)
 			{
 				uint32		proclock_hashcode;
@@ -894,6 +888,12 @@ LockAcquireExtended(const LOCKTAG *locktag,
 		LOCK_PRINT("LockAcquire: granted", lock, lockmode);
 	}
 
+	/*
+	 * Lock state is fully up-to-date now; if we error out after this, no
+	 * special error cleanup is required.
+	 */
+	FinishStrongLockAcquire();
+
 	LWLockRelease(partitionLock);
 
 	/*
@@ -1350,6 +1350,64 @@ GrantLockLocal(LOCALLOCK *locallock, ResourceOwner owner)
 }
 
 /*
+ * BeginStrongLockAcquire - inhibit use of fastpath for a given LOCALLOCK,
+ * and arrange for error cleanup if it fails
+ */
+static void
+BeginStrongLockAcquire(LOCALLOCK *locallock, uint32 fasthashcode)
+{
+	Assert(StrongLockInProgress == NULL);
+	Assert(locallock->holdsStrongLockCount == FALSE);
+
+	/*
+	 * Adding to a memory location is not atomic, so we take a
+	 * spinlock to ensure we don't collide with someone else trying
+	 * to bump the count at the same time.
+	 *
+	 * XXX: It might be worth considering using an atomic fetch-and-add
+	 * instruction here, on architectures where that is supported.
+	 */
+
+	SpinLockAcquire(&FastPathStrongRelationLocks->mutex);
+	FastPathStrongRelationLocks->count[fasthashcode]++;
+	locallock->holdsStrongLockCount = TRUE;
+	StrongLockInProgress = locallock;
+	SpinLockRelease(&FastPathStrongRelationLocks->mutex);
+}
+
+/*
+ * FinishStrongLockAcquire - cancel pending cleanup for a strong lock
+ * acquisition once it's no longer needed
+ */
+static void
+FinishStrongLockAcquire(void)
+{
+	StrongLockInProgress = NULL;
+}
+
+/*
+ * AbortStrongLockAcquire - undo strong lock state changes performed by
+ * BeginStrongLockAcquire.
+ */
+void
+AbortStrongLockAcquire(void)
+{
+	uint32	fasthashcode;
+	LOCALLOCK  *locallock = StrongLockInProgress;
+	
+	if (locallock == NULL)
+		return;
+
+	fasthashcode = FastPathStrongLockHashPartition(locallock->hashcode);
+	Assert(locallock->holdsStrongLockCount == TRUE);
+	SpinLockAcquire(&FastPathStrongRelationLocks->mutex);
+	FastPathStrongRelationLocks->count[fasthashcode]--;
+	locallock->holdsStrongLockCount = FALSE;
+	StrongLockInProgress = NULL;
+	SpinLockRelease(&FastPathStrongRelationLocks->mutex);
+}
+
+/*
  * GrantAwaitedLock -- call GrantLockLocal for the lock we are doing
  *		WaitOnLock on.
  *
@@ -1414,7 +1472,7 @@ WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner)
 	 * We can and do use a PG_TRY block to try to clean up after failure, but
 	 * this still has a major limitation: elog(FATAL) can occur while waiting
 	 * (eg, a "die" interrupt), and then control won't come back here. So all
-	 * cleanup of essential state should happen in LockWaitCancel, not here.
+	 * cleanup of essential state should happen in LockErrorCleanup, not here.
 	 * We can use PG_TRY to clear the "waiting" status flags, since doing that
 	 * is unimportant if the process exits.
 	 */
@@ -1441,7 +1499,7 @@ WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner)
 	}
 	PG_CATCH();
 	{
-		/* In this path, awaitedLock remains set until LockWaitCancel */
+		/* In this path, awaitedLock remains set until LockErrorCleanup */
 
 		/* Report change to non-waiting status */
 		pgstat_report_waiting(false);
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 2926c15..d1bf264 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -635,17 +635,20 @@ IsWaitingForLock(void)
 }
 
 /*
- * Cancel any pending wait for lock, when aborting a transaction.
+ * Cancel any pending wait for lock, when aborting a transaction, and revert
+ * any strong lock count acquisition for a lock being acquired.
  *
  * (Normally, this would only happen if we accept a cancel/die
- * interrupt while waiting; but an ereport(ERROR) while waiting is
- * within the realm of possibility, too.)
+ * interrupt while waiting; but an ereport(ERROR) before or during the lock
+ * wait is within the realm of possibility, too.)
  */
 void
-LockWaitCancel(void)
+LockErrorCleanup(void)
 {
 	LWLockId	partitionLock;
 
+	AbortStrongLockAcquire();
+
 	/* Nothing to do if we weren't waiting for a lock */
 	if (lockAwaited == NULL)
 		return;
@@ -709,7 +712,7 @@ ProcReleaseLocks(bool isCommit)
 	if (!MyProc)
 		return;
 	/* If waiting, get off wait queue (should only be needed after error) */
-	LockWaitCancel();
+	LockErrorCleanup();
 	/* Release locks */
 	LockReleaseAll(DEFAULT_LOCKMETHOD, !isCommit);
 
@@ -1019,7 +1022,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 	 * NOTE: this may also cause us to exit critical-section state, possibly
 	 * allowing a cancel/die interrupt to be accepted. This is OK because we
 	 * have recorded the fact that we are waiting for a lock, and so
-	 * LockWaitCancel will clean up if cancel/die happens.
+	 * LockErrorCleanup will clean up if cancel/die happens.
 	 */
 	LWLockRelease(partitionLock);
 
@@ -1062,7 +1065,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 	 * don't, because we have no shared-state-change work to do after being
 	 * granted the lock (the grantor did it all).  We do have to worry about
 	 * updating the locallock table, but if we lose control to an error,
-	 * LockWaitCancel will fix that up.
+	 * LockErrorCleanup will fix that up.
 	 */
 	do
 	{
@@ -1207,7 +1210,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 	LWLockAcquire(partitionLock, LW_EXCLUSIVE);
 
 	/*
-	 * We no longer want LockWaitCancel to do anything.
+	 * We no longer want LockErrorCleanup to do anything.
 	 */
 	lockAwaited = NULL;
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index d5369f4..51d623f 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2575,7 +2575,7 @@ die(SIGNAL_ARGS)
 			/* bump holdoff count to make ProcessInterrupts() a no-op */
 			/* until we are done getting ready for it */
 			InterruptHoldoffCount++;
-			LockWaitCancel();	/* prevent CheckDeadLock from running */
+			LockErrorCleanup();	/* prevent CheckDeadLock from running */
 			DisableNotifyInterrupt();
 			DisableCatchupInterrupt();
 			InterruptHoldoffCount--;
@@ -2617,7 +2617,7 @@ StatementCancelHandler(SIGNAL_ARGS)
 			/* bump holdoff count to make ProcessInterrupts() a no-op */
 			/* until we are done getting ready for it */
 			InterruptHoldoffCount++;
-			LockWaitCancel();	/* prevent CheckDeadLock from running */
+			LockErrorCleanup();	/* prevent CheckDeadLock from running */
 			DisableNotifyInterrupt();
 			DisableCatchupInterrupt();
 			InterruptHoldoffCount--;
@@ -2776,7 +2776,7 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
 			/* bump holdoff count to make ProcessInterrupts() a no-op */
 			/* until we are done getting ready for it */
 			InterruptHoldoffCount++;
-			LockWaitCancel();	/* prevent CheckDeadLock from running */
+			LockErrorCleanup();	/* prevent CheckDeadLock from running */
 			DisableNotifyInterrupt();
 			DisableCatchupInterrupt();
 			InterruptHoldoffCount--;
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index 6e7a6e9..92b6d9d 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -489,6 +489,7 @@ extern LockAcquireResult LockAcquireExtended(const LOCKTAG *locktag,
 					bool sessionLock,
 					bool dontWait,
 					bool report_memory_error);
+extern void AbortStrongLockAcquire(void);
 extern bool LockRelease(const LOCKTAG *locktag,
 			LOCKMODE lockmode, bool sessionLock);
 extern void LockReleaseSession(LOCKMETHODID lockmethodid);
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 415c093..a66c484 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -244,7 +244,7 @@ extern int	ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable);
 extern PGPROC *ProcWakeup(PGPROC *proc, int waitStatus);
 extern void ProcLockWakeup(LockMethod lockMethodTable, LOCK *lock);
 extern bool IsWaitingForLock(void);
-extern void LockWaitCancel(void);
+extern void LockErrorCleanup(void);
 
 extern void ProcWaitForSignal(void);
 extern void ProcSendSignal(int pid);
#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#2)
Re: bug in fast-path locking

Robert Haas <robertmhaas@gmail.com> writes:

I looked at this more. The above analysis is basically correct, but
the problem goes a bit beyond an error in LockWaitCancel(). We could
also crap out with an error before getting as far as LockWaitCancel()
and have the same problem. I think that a correct statement of the
problem is this: from the time we bump the strong lock count, up until
the time we're done acquiring the lock (or give up on acquiring it),
we need to have an error-cleanup hook in place that will unbump the
strong lock count if we error out. Once we're done updating the
shared and local lock tables, the special handling ceases to be
needed, because any subsequent lock release will go through
LockRelease() or LockReleaseAll(), which will do the appropriate
cleanup.

Haven't looked at the code, but maybe it'd be better to not bump the
strong lock count in the first place until the final step of updating
the lock tables?

regards, tom lane

#4 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#3)
Re: bug in fast-path locking

On Mon, Apr 9, 2012 at 1:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I looked at this more.  The above analysis is basically correct, but
the problem goes a bit beyond an error in LockWaitCancel().  We could
also crap out with an error before getting as far as LockWaitCancel()
and have the same problem.  I think that a correct statement of the
problem is this: from the time we bump the strong lock count, up until
the time we're done acquiring the lock (or give up on acquiring it),
we need to have an error-cleanup hook in place that will unbump the
strong lock count if we error out.   Once we're done updating the
shared and local lock tables, the special handling ceases to be
needed, because any subsequent lock release will go through
LockRelease() or LockReleaseAll(), which will do the appropriate
cleanup.

Haven't looked at the code, but maybe it'd be better to not bump the
strong lock count in the first place until the final step of updating
the lock tables?

Well, unfortunately, that would break the entire mechanism. The idea
is that we bump the strong lock count first. That prevents anyone
from taking any more fast-path locks on the target relation. Then, we
go through and find any existing fast-path locks that have already
been taken, and turn them into regular locks. Finally, we resolve the
actual lock request and either grant the lock or block, depending on
whether conflicts exist. So there's some necessary separation between
the action of bumping the strong lock count and updating the lock
tables; the entire mechanism relies on being able to do non-trivial
processing in between. I thought that I had nailed down the error
exit cases in the original patch, but this test case, and some code
reading with fresh eyes, shows that I didn't do half so good a job as
I had thought.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#4)
Re: bug in fast-path locking

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Apr 9, 2012 at 1:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Haven't looked at the code, but maybe it'd be better to not bump the
strong lock count in the first place until the final step of updating
the lock tables?

Well, unfortunately, that would break the entire mechanism. The idea
is that we bump the strong lock count first. That prevents anyone
from taking any more fast-path locks on the target relation. Then, we
go through and find any existing fast-path locks that have already
been taken, and turn them into regular locks. Finally, we resolve the
actual lock request and either grant the lock or block, depending on
whether conflicts exist.

OK. (Is that explained somewhere in the comments? I confess I've not
paid any attention to this patch up to now.) I wonder though whether
you actually need a *count*. What if it were just a flag saying "do not
take any fast path locks on this object", and once set it didn't get
unset until there were no locks left at all on that object? In
particular, it's not clear from what you're saying here why it's okay
to let the value revert once you've changed some of the FP locks to
regular locks.

regards, tom lane

#6 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#5)
Re: bug in fast-path locking

On Mon, Apr 9, 2012 at 2:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Apr 9, 2012 at 1:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Haven't looked at the code, but maybe it'd be better to not bump the
strong lock count in the first place until the final step of updating
the lock tables?

Well, unfortunately, that would break the entire mechanism.  The idea
is that we bump the strong lock count first.  That prevents anyone
from taking any more fast-path locks on the target relation.  Then, we
go through and find any existing fast-path locks that have already
been taken, and turn them into regular locks.  Finally, we resolve the
actual lock request and either grant the lock or block, depending on
whether conflicts exist.

OK.  (Is that explained somewhere in the comments?  I confess I've not
paid any attention to this patch up to now.)

There's a new section in src/backend/storage/lmgr/README on Fast Path
Locking, plus comments at various places in the code. It's certainly
possible I've missed something that should be updated, but I did my
best.

I wonder though whether
you actually need a *count*.  What if it were just a flag saying "do not
take any fast path locks on this object", and once set it didn't get
unset until there were no locks left at all on that object?

I think if you read the above-referenced section of the README you'll
be deconfused. The short version is that we divide up the space of
lockable objects into 1024 partitions and the strong lock counts are
actually a count of all locks in the partition. It is therefore
theoretically possible for locking to get slower on table A because
somebody's got an AccessExclusiveLock on table B, if the low-order 10
bits of the locktag hashcodes happen to collide. In such a case, all
locks on both relations would be forced out of the fast path until the
AccessExclusiveLock was released. If it so happens that table A is
getting pounded with something that looks a lot like pgbench -S -c 32
-j 32 on a system with more than a couple of cores, the user will be
sad. I judge that real-world occurrences of this problem will be
quite rare, since few people hold long-lived strong table locks
anyway, and 1024 partitions seemed like enough to
keep most people from suffering too badly. I don't see any way to
eliminate the theoretical possibility of this while still having the
basic mechanism work, either, though we could certainly crank up the
partition count.

In
particular, it's not clear from what you're saying here why it's okay
to let the value revert once you've changed some of the FP locks to
regular locks.

It's always safe to convert a fast-path lock to a regular lock; it
just costs you some performance. The idea is that everything that
exists as a fast-path lock is something that's certain not to have any
lock conflicts. As soon as we discover that a particular lock might
be involved in a lock conflict, we have to turn it into a "real" lock.
So if backends 1, 2, and 3 take fast-path locks on A (to SELECT from
it, for example) and then backend 4 wants an AccessExclusiveLock, it
will pull the locks from those backends out of the fast-path mechanism
and make regular lock entries for them before checking for lock
conflicts. Then, it will discover that there are in fact conflicts
and go to sleep. When those backends go to release their locks, they
will notice that their locks have been moved to the main lock table
and will release them there, eventually waking up backend 4 to go do
his thing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#7 Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#6)
Re: bug in fast-path locking

On Mon, 2012-04-09 at 16:11 -0400, Robert Haas wrote:

I wonder though whether
you actually need a *count*. What if it were just a flag saying "do not
take any fast path locks on this object", and once set it didn't get
unset until there were no locks left at all on that object?

I think if you read the above-referenced section of the README you'll
be deconfused.

My understanding:

The basic reason for the count is that we need to preserve the
information that a strong lock is being acquired between the time it's
put in FastPathStrongRelationLocks and the time it actually acquires the
lock in the lock manager.

By definition, the lock manager doesn't know about it yet (so we can't
use a simple rule like "there are no locks on the object so we can unset
the flag"). Therefore, the backend must indicate whether it's in this
code path or not; meaning that it needs to do something on the error
path (in this case, decrement the count). That was the source of this
bug.

There may be a way around this problem, but nothing occurs to me right
now.

Regards,
Jeff Davis

PS: Oops, I missed this bug in the review, too.

PPS: In the README, FastPathStrongRelationLocks is referred to as
FastPathStrongLocks. Worth a quick update for easier symbol searching.

#8 Jim Nasby
jim@nasby.net
In reply to: Robert Haas (#2)
Re: bug in fast-path locking

On 4/9/12 12:32 PM, Robert Haas wrote:

I looked at this more. The above analysis is basically correct, but
the problem goes a bit beyond an error in LockWaitCancel(). We could
also crap out with an error before getting as far as LockWaitCancel()
and have the same problem. I think that a correct statement of the
problem is this: from the time we bump the strong lock count, up until
the time we're done acquiring the lock (or give up on acquiring it),
we need to have an error-cleanup hook in place that will unbump the
strong lock count if we error out. Once we're done updating the
shared and local lock tables, the special handling ceases to be
needed, because any subsequent lock release will go through
LockRelease() or LockReleaseAll(), which will do the appropriate
cleanup.

The attached patch is an attempt at implementing that; any reviews appreciated.

Dumb question... should operations in the various StrongLock functions take place in a critical section? Or is that already ensured outside of these functions?
--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net

#9 Jeff Davis
pgsql@j-davis.com
In reply to: Jim Nasby (#8)
Re: bug in fast-path locking

On Mon, 2012-04-09 at 17:42 -0500, Jim Nasby wrote:

Dumb question... should operations in the various StrongLock functions
take place in a critical section? Or is that already ensured outside of
these functions?

Do you mean CRITICAL_SECTION() in the postgres sense (that is, avoid
error paths by making all ERRORs into PANICs and preventing interrupts);
or the sense described here:
http://en.wikipedia.org/wiki/Critical_section ?

If you mean in the postgres sense, you'd have to hold the critical
section open from the time you incremented the strong lock count all the
way until you decremented it (which is normally at the time the lock is
released); which is a cure worse than the disease.

Regards,
Jeff Davis

#10 Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#2)
Re: bug in fast-path locking

On Mon, 2012-04-09 at 13:32 -0400, Robert Haas wrote:

I looked at this more. The above analysis is basically correct, but
the problem goes a bit beyond an error in LockWaitCancel(). We could
also crap out with an error before getting as far as LockWaitCancel()
and have the same problem. I think that a correct statement of the
problem is this: from the time we bump the strong lock count, up until
the time we're done acquiring the lock (or give up on acquiring it),
we need to have an error-cleanup hook in place that will unbump the
strong lock count if we error out. Once we're done updating the
shared and local lock tables, the special handling ceases to be
needed, because any subsequent lock release will go through
LockRelease() or LockReleaseAll(), which will do the appropriate
cleanup.

The attached patch is an attempt at implementing that; any reviews appreciated.

This path doesn't have an AbortStrongLockAcquire:

if (!(proclock->holdMask & LOCKBIT_ON(lockmode)))
{
...
elog(ERROR,...)

but other similar paths do:

if (!proclock)
{
AbortStrongLockAcquire();

I don't think it's necessary outside of LockErrorCleanup(), right?

I think there could be some more asserts. There are three sites that
decrement FastPathStrongRelationLocks, but in two of them
StrongLockInProgress should always be NULL.

Other than that, it looks good to me.

Regards,
Jeff Davis

#11 Jeff Davis
pgsql@j-davis.com
In reply to: Jeff Davis (#10)
Re: bug in fast-path locking

On Mon, 2012-04-09 at 22:47 -0700, Jeff Davis wrote:

but other similar paths do:

if (!proclock)
{
AbortStrongLockAcquire();

I don't think it's necessary outside of LockErrorCleanup(), right?

I take that back, it's necessary for the dontwait case, too.

Regards,
Jeff Davis

#12 Boszormenyi Zoltan
zb@cybertec.at
In reply to: Robert Haas (#2)
Re: bug in fast-path locking

2012-04-09 19:32 keltezéssel, Robert Haas írta:

On Sun, Apr 8, 2012 at 9:37 PM, Robert Haas<robertmhaas@gmail.com> wrote:

Robert, the Assert triggering with the above procedure
is in your "fast path" locking code with current GIT.

Yes, that sure looks like a bug. It seems that if the top-level
transaction is aborting, then LockReleaseAll() is called and
everything gets cleaned up properly; or if a subtransaction is
aborting after the lock is fully granted, then the locks held by the
subtransaction are released one at a time using LockRelease(), but if
the subtransaction is aborted *during the lock wait* then we only do
LockWaitCancel(), which doesn't clean up the LOCALLOCK. Before the
fast-lock patch, that didn't really matter, but now it does, because
that LOCALLOCK is tracking the fact that we're holding onto a shared
resource - the strong lock count. So I think that LockWaitCancel()
needs some kind of adjustment, but I haven't figured out exactly what
it is yet.

I looked at this more. The above analysis is basically correct, but
the problem goes a bit beyond an error in LockWaitCancel(). We could
also crap out with an error before getting as far as LockWaitCancel()
and have the same problem. I think that a correct statement of the
problem is this: from the time we bump the strong lock count, up until
the time we're done acquiring the lock (or give up on acquiring it),
we need to have an error-cleanup hook in place that will unbump the
strong lock count if we error out. Once we're done updating the
shared and local lock tables, the special handling ceases to be
needed, because any subsequent lock release will go through
LockRelease() or LockReleaseAll(), which will do the appropriate
cleanup.

The attached patch is an attempt at implementing that; any reviews appreciated.

This patch indeed fixes the scenario discovered by Cousin Marc.

Reading this patch also made me realize that my lock_timeout
patch needs adjusting, i.e. needs an AbortStrongLockAcquire()
call if waiting for a lock timed out.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/

#13 Jim Nasby
jim@nasby.net
In reply to: Jeff Davis (#9)
Re: bug in fast-path locking

On 4/9/12 6:12 PM, Jeff Davis wrote:

On Mon, 2012-04-09 at 17:42 -0500, Jim Nasby wrote:

Dumb question... should operations in the various StrongLock functions
take place in a critical section? Or is that already ensure outside of
these functions?

Do you mean CRITICAL_SECTION() in the postgres sense (that is, avoid
error paths by making all ERRORs into PANICs and preventing interrupts);
or the sense described here:

Postgres sense. I thought there was concern about multiple people trying to increment or decrement the count at the same time, and if that was the case perhaps there was an issue with it not being in a CRITICAL_SECTION as well. But I could certainly be wrong about this. :)

And yes, we'd definitely not want to be in a CRITICAL_SECTION for the duration of the operation...
--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net