Optimizing FastPathTransferRelationLocks()
Hi,
I've identified some opportunities to optimize FastPathTransferRelationLocks(),
which transfers locks with a specific lock tag from per-backend fast-path arrays
to the shared hash table. The attached patch includes these enhancements.
Currently, FastPathTransferRelationLocks() recalculates the fast-path group on
each loop iteration, even though it stays the same. This patch updates
the function to calculate the group once and reuse it, improving efficiency.
The patch also extends the function's logic to skip not only backends from
a different database but also backends with pid=0 (which don’t hold fast-path
locks) and groups with no registered fast-path locks.
Since MyProc->pid is reset to 0 when a backend exits but MyProc->databaseId
remains set, checking only databaseId isn’t enough. Backends with pid=0 also
should be skipped.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Attachments:
v1-0001-Optimize-lock-transfer-from-per-backend-fast-path.patch (text/plain)
From f0986e753d5e25fe9d9a941be621e0e0e43d47ae Mon Sep 17 00:00:00 2001
From: Fujii Masao <fujii@postgresql.org>
Date: Tue, 12 Nov 2024 09:29:52 +0900
Subject: [PATCH v1] Optimize lock transfer from per-backend fast-path arrays
to shared hash table.
This commit improves FastPathTransferRelationLocks() for transferring locks
with a specific lock tag from per-backend fast-path arrays to the shared
hash table.
Previously, the fast-path group was recalculated on each loop iteration,
though it remains the same. This update now calculates the group once
and reuses it, improving efficiency. Additionally, the function now skips
backends with pid=0 (as they hold no fast-path locks) and skips empty
fast-path groups, further enhancing performance.
---
src/backend/storage/lmgr/lock.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index edc5020c6a..69c4d156fe 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -2773,6 +2773,10 @@ FastPathTransferRelationLocks(LockMethod lockMethodTable, const LOCKTAG *locktag
LWLock *partitionLock = LockHashPartitionLock(hashcode);
Oid relid = locktag->locktag_field2;
uint32 i;
+ uint32 group;
+
+ /* fast-path group the lock belongs to */
+ group = FAST_PATH_REL_GROUP(relid);
/*
* Every PGPROC that can potentially hold a fast-path lock is present in
@@ -2783,8 +2787,7 @@ FastPathTransferRelationLocks(LockMethod lockMethodTable, const LOCKTAG *locktag
for (i = 0; i < ProcGlobal->allProcCount; i++)
{
PGPROC *proc = &ProcGlobal->allProcs[i];
- uint32 j,
- group;
+ uint32 j;
LWLockAcquire(&proc->fpInfoLock, LW_EXCLUSIVE);
@@ -2802,16 +2805,17 @@ FastPathTransferRelationLocks(LockMethod lockMethodTable, const LOCKTAG *locktag
* less clear that our backend is certain to have performed a memory
* fencing operation since the other backend set proc->databaseId. So
* for now, we test it after acquiring the LWLock just to be safe.
+ *
+ * Also skip backends with pid=0 as they don't hold fast-path locks,
+ * and skip groups without any registered fast-path locks.
*/
- if (proc->databaseId != locktag->locktag_field1)
+ if (proc->databaseId != locktag->locktag_field1 || proc->pid == 0 ||
+ proc->fpLockBits[group] == 0)
{
LWLockRelease(&proc->fpInfoLock);
continue;
}
- /* fast-path group the lock belongs to */
- group = FAST_PATH_REL_GROUP(relid);
-
for (j = 0; j < FP_LOCK_SLOTS_PER_GROUP; j++)
{
uint32 lockmode;
--
2.47.0
On 12/11/2024 03:16, Fujii Masao wrote:
Hi,
I've identified some opportunities to optimize
FastPathTransferRelationLocks(), which transfers locks with a
specific lock tag from per-backend fast-path arrays to the shared
hash table. The attached patch includes these enhancements.
Currently, FastPathTransferRelationLocks() recalculates the fast-path group on each loop iteration, even though it stays the same.
This patch updates the function to calculate the group once and
reuse it, improving efficiency.
Makes sense. GetLockConflicts() has similar code, the same optimizations
would apply there too.
The patch also extends the function's logic to skip not only
backends from a different database but also backends with pid=0
(which don’t hold fast-path locks) and groups with no registered
fast-path locks.
Since MyProc->pid is reset to 0 when a backend exits but MyProc->databaseId remains set, checking only databaseId isn’t enough.
Backends with pid=0 also should be skipped.
Hmm, a PGPROC entry that's not in use would also have
proc->fpLockBits[group] == 0, so I'm not sure if the check for proc->pid
== 0 is necessary.
And perhaps we should start clearing databaseid on backend exit.
--
Heikki Linnakangas
Neon (https://neon.tech)
On 2024/11/16 7:15, Heikki Linnakangas wrote:
On 12/11/2024 03:16, Fujii Masao wrote:
Hi,
I've identified some opportunities to optimize FastPathTransferRelationLocks(), which transfers locks with a
specific lock tag from per-backend fast-path arrays to the shared
hash table. The attached patch includes these enhancements.
Currently, FastPathTransferRelationLocks() recalculates the fast-path group on each loop iteration, even though it stays the same.
This patch updates the function to calculate the group once and
reuse it, improving efficiency.
Makes sense. GetLockConflicts() has similar code, the same optimizations would apply there too.
Yes, I've updated the patch as suggested.
The patch also extends the function's logic to skip not only
backends from a different database but also backends with pid=0
(which don’t hold fast-path locks) and groups with no registered
fast-path locks.
Since MyProc->pid is reset to 0 when a backend exits but MyProc->databaseId remains set, checking only databaseId isn’t enough.
Backends with pid=0 also should be skipped.
Hmm, a PGPROC entry that's not in use would also have proc->fpLockBits[group] == 0, so I'm not sure if the check for proc->pid == 0 is necessary.
You're right. I've removed the check for proc->pid == 0.
Also I added a check for proc->fpLockBits[group] == 0 in GetLockConflicts().
The latest version of the patch is attached.
And perhaps we should start clearing databaseid on backend exit.
Maybe, but I'm not sure if we really need this..
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Attachments:
v2-0001-Optimize-iteration-over-PGPROC-for-fast-path-lock.patch (text/plain)
From 8a8a9b678ad011bd55005bb10c28bd5c67d80455 Mon Sep 17 00:00:00 2001
From: Fujii Masao <fujii@postgresql.org>
Date: Sun, 17 Nov 2024 12:05:42 +0900
Subject: [PATCH v2] Optimize iteration over PGPROC for fast-path lock
searches.
This commit improves efficiency in FastPathTransferRelationLocks()
and GetLockConflicts(), which iterate over PGPROCs to search for
fast-path locks.
Previously, these functions recalculated the fast-path group during
every loop iteration, even though it remained constant. This update
optimizes the process by calculating the group once and reusing it
throughout the loop.
The functions also now skip empty fast-path groups, avoiding
unnecessary scans of their slots. Additionally, groups belonging to
inactive backends (with pid=0) are always empty, so checking
the group is sufficient to bypass these backends, further enhancing
performance.
Author: Fujii Masao
Reviewed-by: Heikki Linnakangas
Discussion: https://postgr.es/m/07d5fd6a-71f1-4ce8-8602-4cc6883f4bd1@oss.nttdata.com
---
src/backend/storage/lmgr/lock.c | 30 ++++++++++++++++++------------
1 file changed, 18 insertions(+), 12 deletions(-)
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index edc5020c6a..a1272e376f 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -2773,6 +2773,10 @@ FastPathTransferRelationLocks(LockMethod lockMethodTable, const LOCKTAG *locktag
LWLock *partitionLock = LockHashPartitionLock(hashcode);
Oid relid = locktag->locktag_field2;
uint32 i;
+ uint32 group;
+
+ /* fast-path group the lock belongs to */
+ group = FAST_PATH_REL_GROUP(relid);
/*
* Every PGPROC that can potentially hold a fast-path lock is present in
@@ -2783,8 +2787,7 @@ FastPathTransferRelationLocks(LockMethod lockMethodTable, const LOCKTAG *locktag
for (i = 0; i < ProcGlobal->allProcCount; i++)
{
PGPROC *proc = &ProcGlobal->allProcs[i];
- uint32 j,
- group;
+ uint32 j;
LWLockAcquire(&proc->fpInfoLock, LW_EXCLUSIVE);
@@ -2802,16 +2805,16 @@ FastPathTransferRelationLocks(LockMethod lockMethodTable, const LOCKTAG *locktag
* less clear that our backend is certain to have performed a memory
* fencing operation since the other backend set proc->databaseId. So
* for now, we test it after acquiring the LWLock just to be safe.
+ *
+ * Also skip groups without any registered fast-path locks.
*/
- if (proc->databaseId != locktag->locktag_field1)
+ if (proc->databaseId != locktag->locktag_field1 ||
+ proc->fpLockBits[group] == 0)
{
LWLockRelease(&proc->fpInfoLock);
continue;
}
- /* fast-path group the lock belongs to */
- group = FAST_PATH_REL_GROUP(relid);
-
for (j = 0; j < FP_LOCK_SLOTS_PER_GROUP; j++)
{
uint32 lockmode;
@@ -3025,8 +3028,12 @@ GetLockConflicts(const LOCKTAG *locktag, LOCKMODE lockmode, int *countp)
{
int i;
Oid relid = locktag->locktag_field2;
+ uint32 group;
VirtualTransactionId vxid;
+ /* fast-path group the lock belongs to */
+ group = FAST_PATH_REL_GROUP(relid);
+
/*
* Iterate over relevant PGPROCs. Anything held by a prepared
* transaction will have been transferred to the primary lock table,
@@ -3040,8 +3047,7 @@ GetLockConflicts(const LOCKTAG *locktag, LOCKMODE lockmode, int *countp)
for (i = 0; i < ProcGlobal->allProcCount; i++)
{
PGPROC *proc = &ProcGlobal->allProcs[i];
- uint32 j,
- group;
+ uint32 j;
/* A backend never blocks itself */
if (proc == MyProc)
@@ -3056,16 +3062,16 @@ GetLockConflicts(const LOCKTAG *locktag, LOCKMODE lockmode, int *countp)
*
* See FastPathTransferRelationLocks() for discussion of why we do
* this test after acquiring the lock.
+ *
+ * Also skip groups without any registered fast-path locks.
*/
- if (proc->databaseId != locktag->locktag_field1)
+ if (proc->databaseId != locktag->locktag_field1 ||
+ proc->fpLockBits[group] == 0)
{
LWLockRelease(&proc->fpInfoLock);
continue;
}
- /* fast-path group the lock belongs to */
- group = FAST_PATH_REL_GROUP(relid);
-
for (j = 0; j < FP_LOCK_SLOTS_PER_GROUP; j++)
{
uint32 lockmask;
--
2.47.0
Hi Fujii-san,
It seems that this was forgotten somehow.
The patch still applies.
Examining c4d5cb71d229095a39fda1121a75ee40e6069a2a, it seems that this patch
could have been part of that commit as well. But maybe it wasn't so apparent
at that time. I think it's a good improvement.
On Tue, Nov 19, 2024 at 10:14 PM Fujii Masao
<masao.fujii@oss.nttdata.com> wrote:
The latest version of the patch is attached.
@@ -2773,6 +2773,10 @@ FastPathTransferRelationLocks(LockMethod
lockMethodTable, const LOCKTAG *locktag
LWLock *partitionLock = LockHashPartitionLock(hashcode);
Oid relid = locktag->locktag_field2;
uint32 i;
+ uint32 group;
+
+ /* fast-path group the lock belongs to */
+ group = FAST_PATH_REL_GROUP(relid);
We could just combine variable declaration and initialization; similar
to partitionLock.
@@ -2802,16 +2805,16 @@ FastPathTransferRelationLocks(LockMethod
lockMethodTable, const LOCKTAG *locktag
* less clear that our backend is certain to have performed a memory
* fencing operation since the other backend set proc->databaseId. So
* for now, we test it after acquiring the LWLock just to be safe.
+ *
+ * Also skip groups without any registered fast-path locks.
*/
- if (proc->databaseId != locktag->locktag_field1)
+ if (proc->databaseId != locktag->locktag_field1 ||
+ proc->fpLockBits[group] == 0)
I like this. Moving the group computation out of the loop also allows
the computed group to be used for checking existence of lock. That's
double benefit. This test is similar to the test in
GetLockStatusData(), so we already have a precedence.
And perhaps we should start clearing databaseid on backend exit.
Maybe, but I'm not sure if we really need this..
There's a comment
* proc->databaseId is set at backend startup time and never changes
* thereafter, so it might be safe to perform this test before
* acquiring &proc->fpInfoLock.
that seems to assume that databaseid is never cleared, so maybe it's
not safe to do this.
The performance gain from this patch might be tiny and may not be
visible. Still, have you tried to measure the performance improvement?
--
Best Wishes,
Ashutosh Bapat
On 2025/03/11 20:55, Ashutosh Bapat wrote:
Hi Fujii-san,
It seems that this was forgotten somehow.
The patch still applies.
Examining c4d5cb71d229095a39fda1121a75ee40e6069a2a, it seems that this patch
could have been part of that commit as well. But maybe it wasn't so apparent
at that time. I think it's a good improvement.
Thanks for reviewing the patch!
On Tue, Nov 19, 2024 at 10:14 PM Fujii Masao
<masao.fujii@oss.nttdata.com> wrote:
The latest version of the patch is attached.
@@ -2773,6 +2773,10 @@ FastPathTransferRelationLocks(LockMethod lockMethodTable, const LOCKTAG *locktag
LWLock *partitionLock = LockHashPartitionLock(hashcode);
Oid relid = locktag->locktag_field2;
uint32 i;
+ uint32 group;
+
+ /* fast-path group the lock belongs to */
+ group = FAST_PATH_REL_GROUP(relid);
We could just combine variable declaration and initialization; similar
to partitionLock.
I’ve updated the patch as suggested. The updated patch is attached.
The performance gain from this patch might be tiny and may not be
visible. Still, have you tried to measure the performance improvement?
I haven’t measured the actual performance gain, since the patch optimizes
the logic in a clear and straightforward way. While we might see some
improvement in artificial scenarios, like with a large max_connections and
all backend slots having their databaseIds set, I’m not sure
how meaningful that would be.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Attachments:
v3-0001-Optimize-iteration-over-PGPROC-for-fast-path-lock.patch (text/plain)
From e2d0cca4f9032aa5f5fbab024d23562633e2707f Mon Sep 17 00:00:00 2001
From: Fujii Masao <fujii@postgresql.org>
Date: Tue, 11 Mar 2025 22:31:12 +0900
Subject: [PATCH v3] Optimize iteration over PGPROC for fast-path lock
searches.
This commit improves efficiency in FastPathTransferRelationLocks()
and GetLockConflicts(), which iterate over PGPROCs to search for
fast-path locks.
Previously, these functions recalculated the fast-path group during
every loop iteration, even though it remained constant. This update
optimizes the process by calculating the group once and reusing it
throughout the loop.
The functions also now skip empty fast-path groups, avoiding
unnecessary scans of their slots. Additionally, groups belonging to
inactive backends (with pid=0) are always empty, so checking
the group is sufficient to bypass these backends, further enhancing
performance.
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/07d5fd6a-71f1-4ce8-8602-4cc6883f4bd1@oss.nttdata.com
---
src/backend/storage/lmgr/lock.c | 28 ++++++++++++++++------------
1 file changed, 16 insertions(+), 12 deletions(-)
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index ccfe6b69bf5..cb95fd74fcf 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -2774,6 +2774,9 @@ FastPathTransferRelationLocks(LockMethod lockMethodTable, const LOCKTAG *locktag
Oid relid = locktag->locktag_field2;
uint32 i;
+ /* fast-path group the lock belongs to */
+ uint32 group = FAST_PATH_REL_GROUP(relid);
+
/*
* Every PGPROC that can potentially hold a fast-path lock is present in
* ProcGlobal->allProcs. Prepared transactions are not, but any
@@ -2783,8 +2786,7 @@ FastPathTransferRelationLocks(LockMethod lockMethodTable, const LOCKTAG *locktag
for (i = 0; i < ProcGlobal->allProcCount; i++)
{
PGPROC *proc = &ProcGlobal->allProcs[i];
- uint32 j,
- group;
+ uint32 j;
LWLockAcquire(&proc->fpInfoLock, LW_EXCLUSIVE);
@@ -2802,16 +2804,16 @@ FastPathTransferRelationLocks(LockMethod lockMethodTable, const LOCKTAG *locktag
* less clear that our backend is certain to have performed a memory
* fencing operation since the other backend set proc->databaseId. So
* for now, we test it after acquiring the LWLock just to be safe.
+ *
+ * Also skip groups without any registered fast-path locks.
*/
- if (proc->databaseId != locktag->locktag_field1)
+ if (proc->databaseId != locktag->locktag_field1 ||
+ proc->fpLockBits[group] == 0)
{
LWLockRelease(&proc->fpInfoLock);
continue;
}
- /* fast-path group the lock belongs to */
- group = FAST_PATH_REL_GROUP(relid);
-
for (j = 0; j < FP_LOCK_SLOTS_PER_GROUP; j++)
{
uint32 lockmode;
@@ -3027,6 +3029,9 @@ GetLockConflicts(const LOCKTAG *locktag, LOCKMODE lockmode, int *countp)
Oid relid = locktag->locktag_field2;
VirtualTransactionId vxid;
+ /* fast-path group the lock belongs to */
+ uint32 group = FAST_PATH_REL_GROUP(relid);
+
/*
* Iterate over relevant PGPROCs. Anything held by a prepared
* transaction will have been transferred to the primary lock table,
@@ -3040,8 +3045,7 @@ GetLockConflicts(const LOCKTAG *locktag, LOCKMODE lockmode, int *countp)
for (i = 0; i < ProcGlobal->allProcCount; i++)
{
PGPROC *proc = &ProcGlobal->allProcs[i];
- uint32 j,
- group;
+ uint32 j;
/* A backend never blocks itself */
if (proc == MyProc)
@@ -3056,16 +3060,16 @@ GetLockConflicts(const LOCKTAG *locktag, LOCKMODE lockmode, int *countp)
*
* See FastPathTransferRelationLocks() for discussion of why we do
* this test after acquiring the lock.
+ *
+ * Also skip groups without any registered fast-path locks.
*/
- if (proc->databaseId != locktag->locktag_field1)
+ if (proc->databaseId != locktag->locktag_field1 ||
+ proc->fpLockBits[group] == 0)
{
LWLockRelease(&proc->fpInfoLock);
continue;
}
- /* fast-path group the lock belongs to */
- group = FAST_PATH_REL_GROUP(relid);
-
for (j = 0; j < FP_LOCK_SLOTS_PER_GROUP; j++)
{
uint32 lockmask;
--
2.48.1
On Tue, Mar 11, 2025 at 8:46 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2025/03/11 20:55, Ashutosh Bapat wrote:
Hi Fujii-san,
It seems that this was forgotten somehow.
The patch still applies.
Examining c4d5cb71d229095a39fda1121a75ee40e6069a2a, it seems that this patch
could have been part of that commit as well. But maybe it wasn't so apparent
at that time. I think it's a good improvement.
Thanks for reviewing the patch!
On Tue, Nov 19, 2024 at 10:14 PM Fujii Masao
<masao.fujii@oss.nttdata.com> wrote:
The latest version of the patch is attached.
@@ -2773,6 +2773,10 @@ FastPathTransferRelationLocks(LockMethod lockMethodTable, const LOCKTAG *locktag
LWLock *partitionLock = LockHashPartitionLock(hashcode);
Oid relid = locktag->locktag_field2;
uint32 i;
+ uint32 group;
+
+ /* fast-path group the lock belongs to */
+ group = FAST_PATH_REL_GROUP(relid);
We could just combine variable declaration and initialization; similar
to partitionLock.
I’ve updated the patch as suggested. The updated patch is attached.
Thanks.
The performance gain from this patch might be tiny and may not be
visible. Still, have you tried to measure the performance improvement?
I haven’t measured the actual performance gain, since the patch optimizes
the logic in a clear and straightforward way. While we might see some
improvement in artificial scenarios, like with a large max_connections and
all backend slots having their databaseIds set, I’m not sure
how meaningful that would be.
Fair enough. The code is more readable this way. That itself is an improvement.
I stared at c4d5cb71d229095a39fda1121a75ee40e6069a2a to see whether
there's any reason this was left aside at that time. I am convinced
that it was just missed. I think this patch is good to be committed.
--
Best Wishes,
Ashutosh Bapat
On 2025/03/13 0:32, Ashutosh Bapat wrote:
Fair enough. The code is more readable this way. That itself is an improvement.
I stared at c4d5cb71d229095a39fda1121a75ee40e6069a2a to see whether
there's any reason this was left aside at that time. I am convinced
that it was just missed. I think this patch is good to be committed.
Thanks for reviewing the patch and confirming! I've pushed the patch.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION