Support to define custom wait events for extensions
Hi,
Currently, only one PG_WAIT_EXTENSION event can be used as a
wait event for extensions. Therefore, in environments with multiple
extensions are installed, it could take time to identify which
extension is the bottleneck.
So, I'd like to support new APIs to define custom wait events
for extensions. It's discussed in [1]/messages/by-id/81290a48-b25c-22a5-31a6-3feff5864fe3@gmail.com.
I made patches to realize it. Although I have some TODOs,
I want to know your feedbacks. Please feel free to comment.
# Implementation of new APIs
I implemented 2 new APIs for extensions.
* RequestNamedExtensionWaitEventTranche()
* GetNamedExtensionWaitEventTranche()
Extensions can request custom wait events by calling
RequestNamedExtensionWaitEventTranche(). After that, use
GetNamedExtensionWaitEventTranche() to get the wait event information.
The APIs usage example by extensions are following.
```
shmem_request_hook = shmem_request;
shmem_startup_hook = shmem_startup;
static void
shmem_request(void)
{
/* request a custom wait event */
RequestNamedExtensionWaitEventTranche("custom_wait_event");
}
static void
shmem_startup(void)
{
/* get the wait event information */
custom_wait_event =
GetNamedExtensionWaitEventTranche("custom_wait_event");
}
void
extension_funtion()
{
(void) WaitLatch(MyLatch,
WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
10L * 1000,
custom_wait_event); /* notify core with custom wait event */
ResetLatch(MyLatch);
}
```
# Implementation overview
I referenced the implementation of
RequestNamedLWLockTranche()/GetNamedLWLockTranche().
(0001-Support-to-define-custom-wait-events-for-extensions.patch)
Extensions calls RequestNamedExtensionWaitEventTranche() in
shmem_request_hook() to request wait events to be used by each
extension.
In the core, the requested wait events are dynamically registered in
shared
memory. The extension then obtains the wait event information with
GetNamedExtensionWaitEventTranche() and uses the value to notify the
core
that it is waiting.
When a string representing of the wait event is requested,
it returns the name defined by calling
RequestNamedExtensionWaitEventTranche().
# PoC extension
I created the PoC extension and you can use it, as shown here:
(0002-Add-a-extension-to-test-custom-wait-event.patch)
1. start PostgreSQL with the following configuration
shared_preload_libraries = 'inject_wait_event'
2. check wait events periodically
psql-1=# SELECT query, wait_event_type, wait_event FROM pg_stat_activity
WHERE backend_type = 'client backend' AND pid != pg_backend_pid() ;
psql-1=# \watch
3. execute a function to inject a wait event
psql-2=# CREATE EXTENSION inject_wait_event;
psql-2=# SELECT inject_wait_event();
4. check the custom wait event
You can see the following results of psql-1.
(..snip..)
query | wait_event_type | wait_event
-----------------------------+-----------------+------------
SELECT inject_wait_event(); | Extension | Extension
(1 row)
(..snip..)
(...about 10 seconds later ..)
query | wait_event_type | wait_event
-----------------------------+-----------------+-------------------
SELECT inject_wait_event(); | Extension | custom_wait_event
# requested wait event by the extension!
(1 row)
(..snip..)
# TODOs
* tests on windows (since I tested on Ubuntu 20.04 only)
* add custom wait events for existing contrib modules (ex. postgres_fdw)
* add regression code (but, it seems to be difficult)
* others? (Please let me know)
[1]: /messages/by-id/81290a48-b25c-22a5-31a6-3feff5864fe3@gmail.com
/messages/by-id/81290a48-b25c-22a5-31a6-3feff5864fe3@gmail.com
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
Attachments:
0001-Support-to-define-custom-wait-events-for-extensions.patchtext/x-diff; name=0001-Support-to-define-custom-wait-events-for-extensions.patchDownload+345-3
0003-Add-docs-to-define-custom-wait-events.patchtext/x-diff; name=0003-Add-docs-to-define-custom-wait-events.patchDownload+26-4
0002-Add-a-extension-to-test-custom-wait-event.patchtext/x-diff; name=0002-Add-a-extension-to-test-custom-wait-event.patchDownload+160-1
On Thu, Jun 15, 2023 at 03:06:01PM +0900, Masahiro Ikeda wrote:
Currently, only one PG_WAIT_EXTENSION event can be used as a
wait event for extensions. Therefore, in environments with multiple
extensions are installed, it could take time to identify which
extension is the bottleneck.
Thanks for taking the time to implement a patch to do that.
I want to know your feedbacks. Please feel free to comment.
I think that's been cruelly missed.
In the core, the requested wait events are dynamically registered in shared
memory. The extension then obtains the wait event information with
GetNamedExtensionWaitEventTranche() and uses the value to notify the core
that it is waiting.When a string representing of the wait event is requested,
it returns the name defined by calling
RequestNamedExtensionWaitEventTranche().
So this implements the equivalent of RequestNamedLWLockTranche()
followed by GetNamedLWLockTranche() to get the wait event number,
which can be used only during postmaster startup. Do you think that
we could focus on implementing something more flexible instead, that
can be used dynamically as well as statically? That would be similar
to LWLockNewTrancheId() and LWLockRegisterTranche(), actually, where
we would get one or more tranche IDs, then do initialization actions
in shmem based on the tranche ID(s).
4. check the custom wait event
You can see the following results of psql-1.query | wait_event_type | wait_event
-----------------------------+-----------------+-------------------
SELECT inject_wait_event(); | Extension | custom_wait_event #
requested wait event by the extension!
(1 row)(..snip..)
A problem with this approach is that it is expensive as a test. Do we
really need one? There are three places that set PG_WAIT_EXTENSION in
src/test/modules/, more in /contrib, and there are modules like
pg_stat_statements that could gain from events for I/O operations, for
example.
# TODOs
* tests on windows (since I tested on Ubuntu 20.04 only)
* add custom wait events for existing contrib modules (ex. postgres_fdw)
* add regression code (but, it seems to be difficult)
* others? (Please let me know)
Hmm. You would need to maintain a state in a rather stable manner,
and SQL queries can make that difficult in the TAP tests as the wait
event information is reset each time a query finishes. One area where
I think this gets easier is with a background worker loaded with
shared_preload_libraries that has a configurable naptime. Looking at
what's available in the tree, the TAP tests of pg_prewarm could use
one test on pg_stat_activity with a custom wait event name
(pg_prewarm.autoprewarm_interval is 0 hence the bgworker waits
infinitely). Note that in this case, you would need to be careful of
the case where the wait event is loaded dynamically, but like LWLocks
this should be able to work as well?
--
Michael
Hi,
On 6/15/23 10:00 AM, Michael Paquier wrote:
On Thu, Jun 15, 2023 at 03:06:01PM +0900, Masahiro Ikeda wrote:
Currently, only one PG_WAIT_EXTENSION event can be used as a
wait event for extensions. Therefore, in environments with multiple
extensions are installed, it could take time to identify which
extension is the bottleneck.Thanks for taking the time to implement a patch to do that.
+1 thanks for it!
I want to know your feedbacks. Please feel free to comment.
I think that's been cruelly missed.
Yeah, that would clearly help to diagnose which extension(s) is/are causing the waits (if any).
I did not look at the code yet (will do) but just wanted to chime in to support the idea.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
We had this on our list of things to do at Neon, so it is a nice
surprise that you brought up an initial patchset :). It was also my
first time looking up the word tranche.
From 59a118402e5e59685fb9e0fb086872e25a405736 Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda <masahiro.ikeda.us@hco.ntt.co.jp>
Date: Thu, 15 Jun 2023 12:57:29 +0900
Subject: [PATCH 2/3] Support to define custom wait events for extensions.
Currently, only one PG_WAIT_EXTENSION event can be used as a
wait event for extensions. Therefore, in environments with multiple
extensions are installed, it could take time to identify bottlenecks.
"extensions are installed" should be "extensions installed".
+#define NUM_BUILDIN_WAIT_EVENT_EXTENSION \ + (WAIT_EVENT_EXTENSION_FIRST_USER_DEFINED - WAIT_EVENT_EXTENSION)
Should that be NUM_BUILTIN_WAIT_EVENT_EXTENSION?
+ NamedExtensionWaitEventTrancheRequestArray = (NamedExtensionWaitEventTrancheRequest *) + MemoryContextAlloc(TopMemoryContext, + NamedExtensionWaitEventTrancheRequestsAllocated + * sizeof(NamedExtensionWaitEventTrancheRequest));
I can't tell from reading other Postgres code when one should cast the
return value of MemoryContextAlloc(). Seems unnecessary to me.
+ if (NamedExtensionWaitEventTrancheRequestArray == NULL) + { + NamedExtensionWaitEventTrancheRequestsAllocated = 16; + NamedExtensionWaitEventTrancheRequestArray = (NamedExtensionWaitEventTrancheRequest *) + MemoryContextAlloc(TopMemoryContext, + NamedExtensionWaitEventTrancheRequestsAllocated + * sizeof(NamedExtensionWaitEventTrancheRequest)); + } + + if (NamedExtensionWaitEventTrancheRequests >= NamedExtensionWaitEventTrancheRequestsAllocated) + { + int i = pg_nextpower2_32(NamedExtensionWaitEventTrancheRequests + 1); + + NamedExtensionWaitEventTrancheRequestArray = (NamedExtensionWaitEventTrancheRequest *) + repalloc(NamedExtensionWaitEventTrancheRequestArray, + i * sizeof(NamedExtensionWaitEventTrancheRequest)); + NamedExtensionWaitEventTrancheRequestsAllocated = i; + }
Do you think this code would look better in an if/else if?
+ int i = pg_nextpower2_32(NamedExtensionWaitEventTrancheRequests + 1);
In the Postgres codebase, is an int always guaranteed to be at least 32
bits? I feel like a fixed-width type would be better for tracking the
length of the array, unless Postgres prefers the Size type.
+ Assert(strlen(tranche_name) + 1 <= NAMEDATALEN); + strlcpy(request->tranche_name, tranche_name, NAMEDATALEN);
A sizeof(request->tranche_name) would keep this code more in-sync if
size of tranche_name were to ever change, though I see sizeof
expressions in the codebase are not too common. Maybe just remove the +1
and make it less than rather than a less than or equal? Seems like it
might be worth noting in the docs of the function that the event name
has to be less than NAMEDATALEN, but maybe this is something extension
authors are inherently aware of?
---
What's the Postgres policy on the following?
for (int i = 0; ...)
for (i = 0; ...)
You are using 2 different patterns in WaitEventShmemInit() and
InitializeExtensionWaitEventTranches().
+ /* + * Copy the info about any named tranches into shared memory (so that + * other processes can see it), and initialize the requested wait events. + */ + if (NamedExtensionWaitEventTrancheRequests > 0)
Removing this if would allow one less indentation level. Nothing would
have to change about the containing code either since the for loop will
then not run
+ ExtensionWaitEventCounter = (int *) ((char *) NamedExtensionWaitEventTrancheArray - sizeof(int));
From 65e25d4b27bbb6d0934872310c24ee19f89e9631 Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda <masahiro.ikeda.us@hco.ntt.co.jp>
Date: Thu, 15 Jun 2023 13:16:00 +0900
Subject: [PATCH 3/3] Add docs to define custom wait events
+ <para> + wait events are reserved by calling: +<programlisting> +void RequestNamedExtensionWaitEventTranche(const char *tranche_name) +</programlisting> + from your <literal>shmem_request_hook</literal>. This will ensure that + wait event is available under the name <literal>tranche_name</literal>, + which the wait event type is <literal>Extension</literal>. + Use <function>GetNamedExtensionWaitEventTranche</function> + to get a wait event information. + </para> + <para> + To avoid possible race-conditions, each backend should use the LWLock + <function>AddinShmemInitLock</function> when connecting to and initializing + its allocation of shared memory, same as LWLocks reservations above. + </para>
Should "wait" be capitalized in the first sentence?
"This will ensure that wait event is available" should have an "a"
before "wait".
Nice patch.
--
Tristan Partin
Neon (https://neon.tech)
On Thu, Jun 15, 2023 at 11:13:57AM -0500, Tristan Partin wrote:
What's the Postgres policy on the following?
for (int i = 0; ...)
for (i = 0; ...)You are using 2 different patterns in WaitEventShmemInit() and
InitializeExtensionWaitEventTranches().
C99 style is OK since v12, so the style of the patch is fine. The
older style has no urgent need to change, either. One argument to let
the code as-is is that it could generate backpatching conflicts, while
it does not hurt as it stands. This also means that bug fixes that
need to be applied down to 12 would be able to use C99 declarations
freely without some of the buildfarm animals running REL_11_STABLE
complaining. I have fallen into this trap recently, actually. See
dbd25dd.
--
Michael
Thanks for replying and your kind advice!
On 2023-06-15 17:00, Michael Paquier wrote:
On Thu, Jun 15, 2023 at 03:06:01PM +0900, Masahiro Ikeda wrote:
In the core, the requested wait events are dynamically registered in
shared
memory. The extension then obtains the wait event information with
GetNamedExtensionWaitEventTranche() and uses the value to notify the
core
that it is waiting.When a string representing of the wait event is requested,
it returns the name defined by calling
RequestNamedExtensionWaitEventTranche().So this implements the equivalent of RequestNamedLWLockTranche()
followed by GetNamedLWLockTranche() to get the wait event number,
which can be used only during postmaster startup. Do you think that
we could focus on implementing something more flexible instead, that
can be used dynamically as well as statically? That would be similar
to LWLockNewTrancheId() and LWLockRegisterTranche(), actually, where
we would get one or more tranche IDs, then do initialization actions
in shmem based on the tranche ID(s).
OK, I agree. I'll make a patch to only support
ExtensionWaitEventNewTrancheId() and ExtensionWaitEventRegisterTranche()
similar to LWLockNewTrancheId() and LWLockRegisterTranche().
4. check the custom wait event
You can see the following results of psql-1.query | wait_event_type | wait_event
-----------------------------+-----------------+-------------------
SELECT inject_wait_event(); | Extension | custom_wait_event
#
requested wait event by the extension!
(1 row)(..snip..)
A problem with this approach is that it is expensive as a test. Do we
really need one? There are three places that set PG_WAIT_EXTENSION in
src/test/modules/, more in /contrib, and there are modules like
pg_stat_statements that could gain from events for I/O operations, for
example.
Yes. Since it's hard to test, I thought the PoC extension
should not be committed. But, I couldn't figure out the best
way to test yet.
# TODOs
* tests on windows (since I tested on Ubuntu 20.04 only)
* add custom wait events for existing contrib modules (ex.
postgres_fdw)
* add regression code (but, it seems to be difficult)
* others? (Please let me know)Hmm. You would need to maintain a state in a rather stable manner,
and SQL queries can make that difficult in the TAP tests as the wait
event information is reset each time a query finishes. One area where
I think this gets easier is with a background worker loaded with
shared_preload_libraries that has a configurable naptime. Looking at
what's available in the tree, the TAP tests of pg_prewarm could use
one test on pg_stat_activity with a custom wait event name
(pg_prewarm.autoprewarm_interval is 0 hence the bgworker waits
infinitely). Note that in this case, you would need to be careful of
the case where the wait event is loaded dynamically, but like LWLocks
this should be able to work as well?
Thanks for your advice!
I tried to query on pg_stat_activity to check the background worker
invoked by pg_prewarm. But, I found that pg_stat_activity doesn't show
it although I may be missing something...
So, I tried to implement TAP tests. But I have a problem with it.
I couldn't find the way to check the status of another backend
while the another backend wait with custom wait events.
```
# TAP test I've implemented.
# wait forever with custom wait events in session1
$session1->query_safe("SELECT test_custom_wait_events_wait()");
# I want to check the wait event from another backend process
# But, the following code is never reached because the above
# query is waiting forever.
$session2->poll_query_until('postgres',
qq[SELECT
(SELECT count(*) FROM pg_stat_activity
WHERE query ~ '^SELECT test_custom_wait_events_wait'
AND wait_event_type = 'Extension'
AND wait_event = 'custom_wait_event'
) > 0;]);
```
If I'm missing something or you have any idea,
please let me know.
Now, I plan to
* find out more the existing tests to check wait events and locks
(though I have already checked a little, but I couldn't find it)
* find another way to check wait event of the background worker invoked
by extension
* look up the reason why pg_stat_activity doesn't show the background
worker
* find a way to implement async queries in TAP tests
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
On 2023-06-15 22:21, Drouvot, Bertrand wrote:
Hi,
On 6/15/23 10:00 AM, Michael Paquier wrote:
On Thu, Jun 15, 2023 at 03:06:01PM +0900, Masahiro Ikeda wrote:
Currently, only one PG_WAIT_EXTENSION event can be used as a
wait event for extensions. Therefore, in environments with multiple
extensions are installed, it could take time to identify which
extension is the bottleneck.Thanks for taking the time to implement a patch to do that.
+1 thanks for it!
I want to know your feedbacks. Please feel free to comment.
I think that's been cruelly missed.
Yeah, that would clearly help to diagnose which extension(s) is/are
causing the waits (if any).I did not look at the code yet (will do) but just wanted to chime in
to support the idea.
Great! Thanks.
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
On 2023-06-16 01:13, Tristan Partin wrote:
We had this on our list of things to do at Neon, so it is a nice
surprise that you brought up an initial patchset :). It was also my
first time looking up the word tranche.
What a coincidence! I came up with the idea when I used Neon with
postgres_fdw. As a Neon user, I also feel the feature is important.
Same as you. Thanks to Michael and Drouvot, I got to know the word
tranche
and the related existing code.
From 59a118402e5e59685fb9e0fb086872e25a405736 Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda <masahiro.ikeda.us@hco.ntt.co.jp>
Date: Thu, 15 Jun 2023 12:57:29 +0900
Subject: [PATCH 2/3] Support to define custom wait events for
extensions.Currently, only one PG_WAIT_EXTENSION event can be used as a
wait event for extensions. Therefore, in environments with multiple
extensions are installed, it could take time to identify bottlenecks."extensions are installed" should be "extensions installed".
+#define NUM_BUILDIN_WAIT_EVENT_EXTENSION \ + (WAIT_EVENT_EXTENSION_FIRST_USER_DEFINED - WAIT_EVENT_EXTENSION)Should that be NUM_BUILTIN_WAIT_EVENT_EXTENSION?
Thanks for your comments.
Yes, I'll fix it.
+ NamedExtensionWaitEventTrancheRequestArray = (NamedExtensionWaitEventTrancheRequest *) + MemoryContextAlloc(TopMemoryContext, + NamedExtensionWaitEventTrancheRequestsAllocated + * sizeof(NamedExtensionWaitEventTrancheRequest));I can't tell from reading other Postgres code when one should cast the
return value of MemoryContextAlloc(). Seems unnecessary to me.
I referenced RequestNamedLWLockTranche() and it looks ok to me.
```
void
RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks)
NamedLWLockTrancheRequestArray = (NamedLWLockTrancheRequest *)
MemoryContextAlloc(TopMemoryContext,
NamedLWLockTrancheRequestsAllocated
* sizeof(NamedLWLockTrancheRequest));
```
+ if (NamedExtensionWaitEventTrancheRequestArray == NULL) + { + NamedExtensionWaitEventTrancheRequestsAllocated = 16; + NamedExtensionWaitEventTrancheRequestArray = (NamedExtensionWaitEventTrancheRequest *) + MemoryContextAlloc(TopMemoryContext, + NamedExtensionWaitEventTrancheRequestsAllocated + * sizeof(NamedExtensionWaitEventTrancheRequest)); + } + + if (NamedExtensionWaitEventTrancheRequests >= NamedExtensionWaitEventTrancheRequestsAllocated) + { + int i = pg_nextpower2_32(NamedExtensionWaitEventTrancheRequests + 1); + + NamedExtensionWaitEventTrancheRequestArray = (NamedExtensionWaitEventTrancheRequest *) + repalloc(NamedExtensionWaitEventTrancheRequestArray, + i * sizeof(NamedExtensionWaitEventTrancheRequest)); + NamedExtensionWaitEventTrancheRequestsAllocated = i; + }Do you think this code would look better in an if/else if?
Same as above. I referenced RequestNamedLWLockTranche().
I don't know if it's a good idea, but it's better to refactor the
existing code separately from this patch.
But I plan to remove the code to focus implementing dynamic allocation
similar to LWLockNewTrancheId() and LWLockRegisterTranche() as
Michael's suggestion. I think it's good idea as a first step. Is it ok
for you?
+ int i =
pg_nextpower2_32(NamedExtensionWaitEventTrancheRequests + 1);In the Postgres codebase, is an int always guaranteed to be at least 32
bits? I feel like a fixed-width type would be better for tracking the
length of the array, unless Postgres prefers the Size type.
Same as above.
+ Assert(strlen(tranche_name) + 1 <= NAMEDATALEN); + strlcpy(request->tranche_name, tranche_name, NAMEDATALEN);A sizeof(request->tranche_name) would keep this code more in-sync if
size of tranche_name were to ever change, though I see sizeof
expressions in the codebase are not too common. Maybe just remove the
+1
and make it less than rather than a less than or equal? Seems like it
might be worth noting in the docs of the function that the event name
has to be less than NAMEDATALEN, but maybe this is something extension
authors are inherently aware of?
Same as above.
---
What's the Postgres policy on the following?
for (int i = 0; ...)
for (i = 0; ...)
You are using 2 different patterns in WaitEventShmemInit() and
InitializeExtensionWaitEventTranches().
I didn't care it. I'll unify it.
Michael's replay is interesting.
+ /* + * Copy the info about any named tranches into shared memory (so that + * other processes can see it), and initialize the requested wait events. + */ + if (NamedExtensionWaitEventTrancheRequests > 0)Removing this if would allow one less indentation level. Nothing would
have to change about the containing code either since the for loop will
then not run
Thanks, but I plan to remove to focus implementing dynamic allocation.
+ ExtensionWaitEventCounter = (int *) ((char *)
NamedExtensionWaitEventTrancheArray - sizeof(int));From 65e25d4b27bbb6d0934872310c24ee19f89e9631 Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda <masahiro.ikeda.us@hco.ntt.co.jp>
Date: Thu, 15 Jun 2023 13:16:00 +0900
Subject: [PATCH 3/3] Add docs to define custom wait events+ <para> + wait events are reserved by calling: +<programlisting> +void RequestNamedExtensionWaitEventTranche(const char *tranche_name) +</programlisting> + from your <literal>shmem_request_hook</literal>. This will ensure that + wait event is available under the name <literal>tranche_name</literal>, + which the wait event type is <literal>Extension</literal>. + Use <function>GetNamedExtensionWaitEventTranche</function> + to get a wait event information. + </para> + <para> + To avoid possible race-conditions, each backend should use the LWLock + <function>AddinShmemInitLock</function> when connecting to and initializing + its allocation of shared memory, same as LWLocks reservations above. + </para>Should "wait" be capitalized in the first sentence?
Yes, I'll fix it
"This will ensure that wait event is available" should have an "a"
before "wait".
Yes, I'll fix it
Nice patch.
Thanks for your comments too.
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
On Fri, Jun 16, 2023 at 11:14:05AM +0900, Masahiro Ikeda wrote:
I tried to query on pg_stat_activity to check the background worker
invoked by pg_prewarm. But, I found that pg_stat_activity doesn't show
it although I may be missing something...So, I tried to implement TAP tests. But I have a problem with it.
I couldn't find the way to check the status of another backend
while the another backend wait with custom wait events.
Hmm. Right. It seems to me that BGWORKER_BACKEND_DATABASE_CONNECTION
is required in this case, with BackgroundWorkerInitializeConnection()
to connect to a database (or not, like the logical replication
launcher if only access to shared catalogs is wanted).
I have missed that the leader process of pg_prewarm does not use that,
because it has no need to connect to a database, but its workers do.
So it is not going to show up in pg_stat_activity.
--
Michael
On 2023-06-16 16:46, Michael Paquier wrote:
On Fri, Jun 16, 2023 at 11:14:05AM +0900, Masahiro Ikeda wrote:
I tried to query on pg_stat_activity to check the background worker
invoked by pg_prewarm. But, I found that pg_stat_activity doesn't show
it although I may be missing something...So, I tried to implement TAP tests. But I have a problem with it.
I couldn't find the way to check the status of another backend
while the another backend wait with custom wait events.Hmm. Right. It seems to me that BGWORKER_BACKEND_DATABASE_CONNECTION
is required in this case, with BackgroundWorkerInitializeConnection()
to connect to a database (or not, like the logical replication
launcher if only access to shared catalogs is wanted).I have missed that the leader process of pg_prewarm does not use that,
because it has no need to connect to a database, but its workers do.
So it is not going to show up in pg_stat_activity.
Yes. Thanks to your advice, I understood that
BGWORKER_BACKEND_DATABASE_CONNECTION is the reason.
I could make the TAP test that invokes a background worker waiting
forever
and checks its custom wait event in pg_stat_activity. So, I'll make
patches
including test codes next week.
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
I will take a look at your V2 when it is ready! I will also pass along
that this is wanted by Neon customers :).
--
Tristan Partin
Neon (https://neon.tech)
On 2023/06/17 1:16, Tristan Partin wrote:
I will take a look at your V2 when it is ready! I will also pass along
that this is wanted by Neon customers :).
Thanks!
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
Hi,
I updated the patches. The main changes are
* to support only dynamic wait event allocation
* to add a regression test
I appreciate any feedback.
The followings are TODO items.
* to check that meson.build works since I tested with old command `make`
now
* to make documents
* to add custom wait events for existing contrib modules (ex.
postgres_fdw)
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
On 2023-06-20 18:26, Masahiro Ikeda wrote:
The followings are TODO items.
* to check that meson.build works since I tested with old command
`make` now
I test with meson and I updated the patches to work with it.
My test procedure is the following.
```
export builddir=/mnt/tmp/build
export prefix=/mnt/tmp/master
# setup
meson setup $builddir --prefix=$prefix -Ddebug=true -Dcassert=true
-Dtap_tests=enabled
# build and install with src/test/modules
ninja -C $builddir install install-test-files
# test
meson test -v -C $builddir
meson test -v -C $builddir --suite test_custom_wait_events # run the
test only
```
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
Hi,
I updated the patches to handle the warning mentioned
by PostgreSQL Patch Tester, and removed unnecessary spaces.
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
On Fri, Jun 23, 2023 at 05:56:26PM +0900, Masahiro Ikeda wrote:
I updated the patches to handle the warning mentioned
by PostgreSQL Patch Tester, and removed unnecessary spaces.
I have begun hacking on that, and the API layer inspired from the
LWLocks is sound. I have been playing with it in my own extensions
and it is nice to be able to plug in custom wait events into
pg_stat_activity, particularly for bgworkers. Finally.
The patch needed a rebase after the recent commit that introduced the
automatic generation of docs and code for wait events. It requires
two tweaks in generate-wait_event_types.pl, feel free to double-check
them.
Some of the new structures and routine names don't quite reflect the
fact that we have wait events for extensions, so I have taken a stab
at that.
Note that the test module test_custom_wait_events would crash if
attempting to launch a worker when not loaded in
shared_preload_libraries, so we'd better have some protection in
wait_worker_launch() (this function should be renamed as well).
Attached is a rebased patch that I have begun tweaking here and
there. For now, the patch is moved as waiting on author. I have
merged the test module with the main patch for the moment, for
simplicity. A split is straight-forward as the code paths touched are
different.
Another and *very* important thing is that we are going to require
some documentation in xfunc.sgml to explain how to use these routines
and what to expect from them. Ikeda-san, could you write some? You
could look at the part about shmem and LWLock to get some
inspiration.
--
Michael
Attachments:
v5-0001-Support-custom-wait-events-for-extensions.patchtext/x-diff; charset=us-asciiDownload+484-16
From bf06b8100cb747031959fe81a2d19baabc4838cf Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda <masahiro.ikeda.us@hco.ntt.co.jp>
Date: Fri, 16 Jun 2023 11:53:29 +0900
Subject: [PATCH 1/2] Support custom wait events for extensions.
+ * This is indexed by event ID minus NUM_BUILTIN_WAIT_EVENT_EXTENSION, and + * stores the names of all dynamically-created event ID known to the current + * process. Any unused entries in the array will contain NULL.
The second ID should be plural.
+ /* If necessary, create or enlarge array. */ + if (eventId >= ExtensionWaitEventTrancheNamesAllocated) + { + int newalloc; + + newalloc = pg_nextpower2_32(Max(8, eventId + 1));
Given the context of our last conversation, I assume this code was
copied from somewhere else. Since this is new code, I think it would
make more sense if newalloc was a uint16 or size_t.
From what I undersatnd, Neon differs from upstream in some way related
to this patch. I am trying to ascertain how that is. I hope to provide
more feedback when I learn more about it.
--
Tristan Partin
Neon (https://neon.tech)
On Tue, Jul 11, 2023 at 12:39:52PM -0500, Tristan Partin wrote:
Given the context of our last conversation, I assume this code was
copied from somewhere else. Since this is new code, I think it would
make more sense if newalloc was a uint16 or size_t.
This style comes from LWLockRegisterTranche() in lwlock.c. Do you
think that it would be more adapted to change that to
pg_nextpower2_size_t() with a Size? We could do that for the existing
code on HEAD as an improvement.
From what I understand, Neon differs from upstream in some way related
to this patch. I am trying to ascertain how that is. I hope to provide
more feedback when I learn more about it.
Hmm, okay, that would nice to hear about to make sure that the
approach taken on this thread is able to cover what you are looking
for. So you mean that Neon has been using something similar to
register wait events in the backend? Last time I looked at the Neon
repo, I did not get the impression that there was a custom patch for
Postgres in this area. All the in-core code paths using
WAIT_EVENT_EXTENSION would gain from the APIs added here, FWIW.
--
Michael
Hi,
On 2023-07-11 16:45:26 +0900, Michael Paquier wrote:
+/* ---------- + * Wait Events - Extension + * + * Use this category when the server process is waiting for some condition + * defined by an extension module. + * + * Extensions can define custom wait events. First, call + * WaitEventExtensionNewTranche() just once to obtain a new wait event + * tranche. The ID is allocated from a shared counter. Next, each + * individual process using the tranche should call + * WaitEventExtensionRegisterTranche() to associate that wait event with + * a name.
What does "tranche" mean here? For LWLocks it makes some sense, it's used for
a set of lwlocks, not an individual one. But here that doesn't really seem to
apply?
+ * It may seem strange that each process using the tranche must register it + * separately, but dynamic shared memory segments aren't guaranteed to be + * mapped at the same address in all coordinating backends, so storing the + * registration in the main shared memory segment wouldn't work for that case. + */
I don't really see how this applies to wait events? There's no pointers
here...
+typedef enum +{ + WAIT_EVENT_EXTENSION = PG_WAIT_EXTENSION, + WAIT_EVENT_EXTENSION_FIRST_USER_DEFINED +} WaitEventExtension; + +extern void WaitEventExtensionShmemInit(void); +extern Size WaitEventExtensionShmemSize(void); + +extern uint32 WaitEventExtensionNewTranche(void); +extern void WaitEventExtensionRegisterTranche(uint32 wait_event_info,
-slock_t *ShmemLock; /* spinlock for shared memory and LWLock +slock_t *ShmemLock; /* spinlock for shared memory, LWLock + * allocation, and named extension wait event * allocation */
I'm doubtful that it's a good idea to reuse the spinlock further. Given that
the patch adds WaitEventExtensionShmemInit(), why not just have a lock in
there?
+/* + * Allocate a new event ID and return the wait event info. + */ +uint32 +WaitEventExtensionNewTranche(void) +{ + uint16 eventId; + + SpinLockAcquire(ShmemLock); + eventId = (*WaitEventExtensionCounter)++; + SpinLockRelease(ShmemLock); + + return PG_WAIT_EXTENSION | eventId; +}
It seems quite possible to run out space in WaitEventExtensionCounter, so this
should error out in that case.
+/* + * Register a dynamic tranche name in the lookup table of the current process. + * + * This routine will save a pointer to the wait event tranche name passed + * as an argument, so the name should be allocated in a backend-lifetime context + * (shared memory, TopMemoryContext, static constant, or similar). + * + * The "wait_event_name" will be user-visible as a wait event name, so try to + * use a name that fits the style for those. + */ +void +WaitEventExtensionRegisterTranche(uint32 wait_event_info, + const char *wait_event_name) +{ + uint16 eventId; + + /* Check wait event class. */ + Assert((wait_event_info & 0xFF000000) == PG_WAIT_EXTENSION); + + eventId = wait_event_info & 0x0000FFFF; + + /* This should only be called for user-defined tranches. */ + if (eventId < NUM_BUILTIN_WAIT_EVENT_EXTENSION) + return;
Why not assert out in that case then?
+/* + * Return the name of an Extension wait event ID. + */ +static const char * +GetWaitEventExtensionIdentifier(uint16 eventId) +{ + /* Build-in tranche? */
*built
+ if (eventId < NUM_BUILTIN_WAIT_EVENT_EXTENSION) + return "Extension"; + + /* + * It's an extension tranche, so look in WaitEventExtensionTrancheNames[]. + * However, it's possible that the tranche has never been registered by + * calling WaitEventExtensionRegisterTranche() in the current process, in + * which case give up and return "Extension". + */ + eventId -= NUM_BUILTIN_WAIT_EVENT_EXTENSION; + + if (eventId >= WaitEventExtensionTrancheNamesAllocated || + WaitEventExtensionTrancheNames[eventId] == NULL) + return "Extension";
I'd return something different here, otherwise something that's effectively a
bug is not distinguishable from the built
+++ b/src/test/modules/test_custom_wait_events/t/001_basic.pl @@ -0,0 +1,34 @@ +# Copyright (c) 2023, PostgreSQL Global Development Group + +use strict; +use warnings; + +use PostgreSQL::Test::Cluster; +use PostgreSQL::Test::Utils; +use Test::More; + +my $node = PostgreSQL::Test::Cluster->new('main'); + +$node->init; +$node->append_conf( + 'postgresql.conf', + "shared_preload_libraries = 'test_custom_wait_events'" +); +$node->start;
I think this should also test registering a wait event later.
@@ -0,0 +1,188 @@ +/*-------------------------------------------------------------------------- + * + * test_custom_wait_events.c + * Code for testing custom wait events + * + * This code initializes a custom wait event in shmem_request_hook() and + * provide a function to launch a background worker waiting forever + * with the custom wait event.
Isn't this vast overkill? Why not just have a simple C function that waits
with a custom wait event, until cancelled? That'd maybe 1/10th of the LOC.
Greetings,
Andres Freund
On Tue, Jul 11, 2023 at 05:36:47PM -0700, Andres Freund wrote:
On 2023-07-11 16:45:26 +0900, Michael Paquier wrote:
+$node->init; +$node->append_conf( + 'postgresql.conf', + "shared_preload_libraries = 'test_custom_wait_events'" +); +$node->start;I think this should also test registering a wait event later.
Yup, agreed that the coverage is not sufficient.
@@ -0,0 +1,188 @@ +/*-------------------------------------------------------------------------- + * + * test_custom_wait_events.c + * Code for testing custom wait events + * + * This code initializes a custom wait event in shmem_request_hook() and + * provide a function to launch a background worker waiting forever + * with the custom wait event.Isn't this vast overkill? Why not just have a simple C function that waits
with a custom wait event, until cancelled? That'd maybe 1/10th of the LOC.
Hmm. You mean in the shape of a TAP test where a backend registers a
wait event by itself in a SQL function that waits for a certain amount
of time with a WaitLatch(), then we use a second poll_query_until()
that checks if the a wait event is stored in pg_stat_activity? With
something like what's done at the end of 001_stream_rep.pl, that
should be stable, I guess..
--
Michael