Inconsistency in lock structure after reporting "lock already held" error
At the end of SetupLockInTable(), there is a check for the "lock already
held" error.
Because the nRequested and requested[lockmode] value of a lock is bumped
before "lock already held" error, and there is no way to reduce them later
for
this situation, then it will keep the inconsistency in lock structure until
cluster restart or reset.
The inconsistency is:
* nRequested will never reduce to zero, the lock will never be
garbage-collected
* if there is a waitMask for this lock, the waitMast will never be removed,
then
new proc will be blocked to wait for a lock with zero holder
(looks weird in the pg_locks table)
I think moving the "lock already held" error before the bump operation of
nRequested
and requested[lockmode] value in SetupLockInTable() will fix it.
(maybe also fix the lock_twophase_recover() function)
To recreate the inconsistency:
1. create a backend 1 to lock table a, keep it idle in transaction
2. terminate backend 1 and hack it to skip the LockReleaseAll() function
3. create another backend 2 to lock table a, it will wait for the lock to
release
4. reuse the backend 1 (reuse the same proc) to lock table a again,
it will trigger the "lock already held" error
5. quit both backend 1 and 2
6. create backend 3 to lock table a, it will wait for the lock's waitMask
7. check the pg_locks table
--
GaoZengqi
pgf00a@gmail.com
zengqigao@gmail.com
The attached patch attempts to fix this.
高增琦 <pgf00a@gmail.com> 于2024年9月11日周三 14:30写道:
At the end of SetupLockInTable(), there is a check for the "lock already
held" error.
Because the nRequested and requested[lockmode] value of a lock is bumped
before "lock already held" error, and there is no way to reduce them later
for
this situation, then it will keep the inconsistency in lock structure until
cluster restart or reset.The inconsistency is:
* nRequested will never reduce to zero, the lock will never be
garbage-collected
* if there is a waitMask for this lock, the waitMast will never be
removed, then
new proc will be blocked to wait for a lock with zero holder
(looks weird in the pg_locks table)I think moving the "lock already held" error before the bump operation of
nRequested
and requested[lockmode] value in SetupLockInTable() will fix it.
(maybe also fix the lock_twophase_recover() function)To recreate the inconsistency:
1. create a backend 1 to lock table a, keep it idle in transaction
2. terminate backend 1 and hack it to skip the LockReleaseAll() function
3. create another backend 2 to lock table a, it will wait for the lock to
release
4. reuse the backend 1 (reuse the same proc) to lock table a again,
it will trigger the "lock already held" error
5. quit both backend 1 and 2
6. create backend 3 to lock table a, it will wait for the lock's waitMask
7. check the pg_locks table--
GaoZengqi
pgf00a@gmail.com
zengqigao@gmail.com
--
GaoZengqi
pgf00a@gmail.com
zengqigao@gmail.com
Attachments:
fix-lock-already-held.patchapplication/octet-stream; name=fix-lock-already-held.patchDownload
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 83b99a98f0..a49dc6f802 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -1329,15 +1329,6 @@ SetupLockInTable(LockMethod lockMethodTable, PGPROC *proc,
#endif /* CHECK_DEADLOCK_RISK */
}
- /*
- * lock->nRequested and lock->requested[] count the total number of
- * requests, whether granted or waiting, so increment those immediately.
- * The other counts don't increment till we get the lock.
- */
- lock->nRequested++;
- lock->requested[lockmode]++;
- Assert((lock->nRequested > 0) && (lock->requested[lockmode] > 0));
-
/*
* We shouldn't already hold the desired lock; else locallock table is
* broken.
@@ -1348,6 +1339,15 @@ SetupLockInTable(LockMethod lockMethodTable, PGPROC *proc,
lock->tag.locktag_field1, lock->tag.locktag_field2,
lock->tag.locktag_field3);
+ /*
+ * lock->nRequested and lock->requested[] count the total number of
+ * requests, whether granted or waiting, so increment those immediately.
+ * The other counts don't increment till we get the lock.
+ */
+ lock->nRequested++;
+ lock->requested[lockmode]++;
+ Assert((lock->nRequested > 0) && (lock->requested[lockmode] > 0));
+
return proclock;
}
@@ -4296,14 +4296,6 @@ lock_twophase_recover(TransactionId xid, uint16 info,
Assert((proclock->holdMask & ~lock->grantMask) == 0);
}
- /*
- * lock->nRequested and lock->requested[] count the total number of
- * requests, whether granted or waiting, so increment those immediately.
- */
- lock->nRequested++;
- lock->requested[lockmode]++;
- Assert((lock->nRequested > 0) && (lock->requested[lockmode] > 0));
-
/*
* We shouldn't already hold the desired lock.
*/
@@ -4313,6 +4305,14 @@ lock_twophase_recover(TransactionId xid, uint16 info,
lock->tag.locktag_field1, lock->tag.locktag_field2,
lock->tag.locktag_field3);
+ /*
+ * lock->nRequested and lock->requested[] count the total number of
+ * requests, whether granted or waiting, so increment those immediately.
+ */
+ lock->nRequested++;
+ lock->requested[lockmode]++;
+ Assert((lock->nRequested > 0) && (lock->requested[lockmode] > 0));
+
/*
* We ignore any possible conflicts and just grant ourselves the lock. Not
* only because we don't bother, but also to avoid deadlocks when