pg_stat_statements: Avoid holding excessive lock
Hi hackers,
In pg_stat_statements there is entry's mutex spinlock which is used to be
able to modify counters without holding pgss->lock exclusively.
Here is a comment from the top of pg_stat_statements.c:
* Note about locking issues: to create or delete an entry in the shared
* hashtable, one must hold pgss->lock exclusively. Modifying any field
* in an entry except the counters requires the same. To look up an entry,
* one must hold the lock shared. To read or update the counters within
* an entry, one must hold the lock shared or exclusive (so the entry doesn't
* disappear!) and also take the entry's mutex spinlock.
* The shared state variable pgss->extent (the next free spot in the external
* query-text file) should be accessed only while holding either the
* pgss->mutex spinlock, or exclusive lock on pgss->lock. We use the mutex to
* allow reserving file space while holding only shared lock on pgss->lock.
* Rewriting the entire external query-text file, eg for garbage collection,
* requires holding pgss->lock exclusively; this allows individual entries
* in the file to be read or written while holding only shared lock.
And a comment from pgssEntry declaration:
slock_t mutex; /* protects the counters only */
Both comments say that spinlocking mutex should be used to protect
counters only.
But in pg_stat_statements_internal we do read entry->stats_since and
entry->minmax_stats_since holding the entry's mutex spinlock. This is
unnecessary since we only modify stats_since and minmax_stats_since while
holding pgss->lock exclusively, and in pg_stat_statements_internal we are
holding pgss->lock when reading them. So even without holding the entry's
mutex spinlock there should be no race.
I suggest eliminating holding the excessive lock. See the attached patch.
This would also restore the consistency between the code and the comments
about entry's mutex spinlock usage.
Best regards,
Karina Litskevich
Postgres Professional: http://postgrespro.com/
Attachments:
v1-0001-pg_stat_statements-Avoid-excessive-lock-holding.patchtext/x-patch; charset=US-ASCII; name=v1-0001-pg_stat_statements-Avoid-excessive-lock-holding.patchDownload
From c817fc9373ca08488088414a052eaf77dbb55bde Mon Sep 17 00:00:00 2001
From: Karina Litskevich <litskevichkarina@gmail.com>
Date: Tue, 5 Nov 2024 20:27:25 +0300
Subject: [PATCH v1] pg_stat_statements: Avoid excessive lock holding
Entry's mutex spinlock is used to be able to modify counters without
holding pgss->lock exclusively (though holding it at least shared so
the entry doesn't disappear). We never actually modify stats_since and
minmax_stats_since without holding pgss->lock exclusively, so there is
no need to hold entry's mutex spinlock when reading stats_since and
minmax_stats_since.
This also restores the consistency between the code and the comments
about entry's mutex spinlock usage.
---
contrib/pg_stat_statements/pg_stat_statements.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 1798e1d016..a66ad99a37 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1869,9 +1869,16 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
/* copy counters to a local variable to keep locking time short */
SpinLockAcquire(&entry->mutex);
tmp = entry->counters;
+ SpinLockRelease(&entry->mutex);
+
+ /*
+ * There is no need to hold entry->mutex when reading stats_since and
+ * minmax_stats_since for (unlike counters) they are always written
+ * while holding pgss->lock exclusively. We are holding pgss->lock
+ * shared so there should be no race here.
+ */
stats_since = entry->stats_since;
minmax_stats_since = entry->minmax_stats_since;
- SpinLockRelease(&entry->mutex);
/* Skip entry if unexecuted (ie, it's a pending "sticky" entry) */
if (IS_STICKY(tmp))
--
2.34.1
On Tue, Nov 05, 2024 at 08:37:08PM +0300, Karina Litskevich wrote:
I suggest eliminating holding the excessive lock. See the attached patch.
This would also restore the consistency between the code and the comments
about entry's mutex spinlock usage.
You are right. minmax_stats_since and stats_since are only set when
an entry is allocated or reset, so this is not going to matter.
+ /* + * There is no need to hold entry->mutex when reading stats_since and + * minmax_stats_since for (unlike counters) they are always written + * while holding pgss->lock exclusively. We are holding pgss->lock + * shared so there should be no race here. + */ stats_since = entry->stats_since; minmax_stats_since = entry->minmax_stats_since; - SpinLockRelease(&entry->mutex);
The comment could be simpler, say a "The spinlock is not required when
reading these two as they are always updated when holding pgss->lock
exclusively.". Or something like that.
--
Michael
On Thu, Nov 7, 2024 at 11:17 AM Michael Paquier <michael@paquier.xyz> wrote:
The comment could be simpler, say a "The spinlock is not required when
reading these two as they are always updated when holding pgss->lock
exclusively.". Or something like that.
Thank you for your feedback and the shorter wording of the comment.
I used it in the new version of the patch.
Best regards,
Karina Litskevich
Postgres Professional: http://postgrespro.com/
Attachments:
v2-0001-pg_stat_statements-Avoid-excessive-lock-holding.patchtext/x-patch; charset=US-ASCII; name=v2-0001-pg_stat_statements-Avoid-excessive-lock-holding.patchDownload
From 4b7f7ea0c9923fa9d2622d86aa1214f331c600ec Mon Sep 17 00:00:00 2001
From: Karina Litskevich <litskevichkarina@gmail.com>
Date: Thu, 7 Nov 2024 15:54:32 +0300
Subject: [PATCH v2] pg_stat_statements: Avoid excessive lock holding
Entry's mutex spinlock is used to be able to modify counters without
holding pgss->lock exclusively (though holding it at least shared so
the entry doesn't disappear). We never actually modify stats_since and
minmax_stats_since without holding pgss->lock exclusively, so there is
no need to hold entry's mutex spinlock when reading stats_since and
minmax_stats_since.
This also restores the consistency between the code and the comments
about entry's mutex spinlock usage.
---
contrib/pg_stat_statements/pg_stat_statements.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 1798e1d016..49c657b3e0 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1869,9 +1869,14 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
/* copy counters to a local variable to keep locking time short */
SpinLockAcquire(&entry->mutex);
tmp = entry->counters;
+ SpinLockRelease(&entry->mutex);
+
+ /*
+ * The spinlock is not required when reading these two as they are
+ * always updated when holding pgss->lock exclusively.
+ */
stats_since = entry->stats_since;
minmax_stats_since = entry->minmax_stats_since;
- SpinLockRelease(&entry->mutex);
/* Skip entry if unexecuted (ie, it's a pending "sticky" entry) */
if (IS_STICKY(tmp))
--
2.34.1
On Thu, Nov 07, 2024 at 04:08:30PM +0300, Karina Litskevich wrote:
Thank you for your feedback and the shorter wording of the comment.
I used it in the new version of the patch.
After a second look, sounds good to me. Let's wait a bit and see of
others have comments or thoughts to share.
--
Michael
Hi Karina Liskevich
+ /* + * There is no need to hold entry->mutex when reading
stats_since and
+ * minmax_stats_since for (unlike counters) they are always
written
+ * while holding pgss->lock exclusively. We are holding
pgss->lock
+ * shared so there should be no race here. + */ stats_since = entry->stats_since; minmax_stats_since = entry->minmax_stats_since; - SpinLockRelease(&entry->mutex);
The comment could be simpler, say a "The spinlock is not required when
reading these two as they are always updated when holding pgss->lock
exclusively.". Or something like that.
Agree , It reduces the lock time , The new comment are short and concise,
It sounds good .
Michael Paquier <michael@paquier.xyz> 于2024年11月8日周五 14:08写道:
Show quoted text
On Thu, Nov 07, 2024 at 04:08:30PM +0300, Karina Litskevich wrote:
Thank you for your feedback and the shorter wording of the comment.
I used it in the new version of the patch.After a second look, sounds good to me. Let's wait a bit and see of
others have comments or thoughts to share.
--
Michael