Improve LWLock tranche name visibility across backends

Started by Sami Imseih · 8 months ago · 107 messages
#1 Sami Imseih
samimseih@gmail.com

Hi,

This is a follow-up to a discussion started in [0].

LWLocks in PostgreSQL are categorized into tranches, and the tranche name
appears as the wait_event in pg_stat_activity. There are both built-in
tranche names and tranche names that can be registered by extensions using
RequestNamedLWLockTranche() or LWLockRegisterTranche().

Tranche names are stored in process-local memory when registered. If a
tranche is registered during postmaster startup, such as with built-in
tranches or those registered via RequestNamedLWLockTranche(), its name is
inherited by backend processes via fork(). However, if a tranche is
registered dynamically by a backend using LWLockRegisterTranche(), other
backends will not be aware of it unless they explicitly register it as well.

Consider a case in which an extension allows a backend to attach a new
dshash via the GetNamedDSHash API and supplies a tranche name like
"MyUsefulExtension". The first backend to call GetNamedDSHash will
initialize an LWLock using the extension-defined tranche name and associate
it with a tranche ID in local memory. Other backends that later attach to
the same dshash will also learn about the tranche name and ID. Backends
that do not attach to the dshash will not know this tranche name. This
results in differences in how wait events are reported in pg_stat_activity.

When querying pg_stat_activity, the function pgstat_get_wait_event is
called, which internally uses GetLWLockIdentifier and GetLWTrancheName
to map the LWLock to its tranche name. If the backend does not recognize
the tranche ID, a fallback name "extension" is used. Therefore, backends
that have registered the tranche will report the correct extension-defined
tranche name, while others will report the generic fallback of "extension".

i.e.
```
postgres=# select wait_event, wait_event_type from pg_stat_activity;
-[ RECORD 1 ]---+--------------------
wait_event | extension
wait_event_type | LWLock
```
instead of
```
postgres=# select wait_event, wait_event_type from pg_stat_activity;
-[ RECORD 1 ]---+--------------------
wait_event | MyUsefulExtension
wait_event_type | LWLock
```

This is the current design, but I think we can do better to avoid the
inconsistencies this may cause for monitoring tools and diagnostics.

To improve this, we could store tranche names registered by a normal backend
in shared memory, for example in a dshash, allowing tranche names to be
resolved even by backends that have not explicitly registered them. This
would lead to more consistent behavior, particularly as more extensions
adopt APIs like GetNamedDSHash, where tranche names are registered by the
backend rather than the postmaster.

Attached is a proof of concept that does not alter the
LWLockRegisterTranche API. Instead, it detects when a registration is
performed by a normal backend and stores the tranche name in shared memory,
using a dshash keyed by tranche ID. Tranche name lookup now proceeds in
the order of built-in names, the local list, and finally the shared memory.
The fallback name "extension" can still be returned if an extension does
not register a tranche.

An exclusive lock is taken when adding a new tranche, which should be a rare
occurrence. A shared lock is taken when looking up a tranche name via
GetLWTrancheName.

There are still some open questions I have:

1/ There is currently no mechanism for deleting entries. I am not sure whether
this is a concern, since the size of the table would grow only with the
number of extensions and the number of LWLocks they initialize, which is
typically small. That said, others may have different thoughts on this.

2/ What is the appropriate size limit for a tranche name? The work done
in [0] caps the tranche name to 128 bytes for the dshash tranche, and
128 bytes + length of " DSA" suffix for the dsa tranche. Also, the
existing RequestNamedLWLockTranche caps the name to NAMEDATALEN. Currently,
LWLockRegisterTranche does not have a limit on the tranche name. I wonder
if we also need to take care of this and implement some common limit that
applies to tranche names regardless of how they're created?

[0]: /messages/by-id/aEiTzmndOVPmA6Mm@nathan

--

Sami Imseih
Amazon Web Services (AWS)

Attachments:

0001-Improve-LWLock-tranche-name-visibility-across-backen.patch (application/octet-stream, +190 −26)
#2 Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Sami Imseih (#1)
Re: Improve LWLock tranche name visibility across backends

Hi,

On Wed, Jul 09, 2025 at 04:39:48PM -0500, Sami Imseih wrote:

Hi,

When querying pg_stat_activity, the function pgstat_get_wait_event is
called, which internally uses GetLWLockIdentifier and GetLWTrancheName
to map the LWLock to its tranche name. If the backend does not recognize
the tranche ID, a fallback name "extension" is used. Therefore, backends
that have registered the tranche will report the correct extension-defined
tranche name, while others will report the generic fallback of "extension".

i.e.
```
postgres=# select wait_event, wait_event_type from pg_stat_activity;
-[ RECORD 1 ]---+--------------------
wait_event | extension
wait_event_type | LWLock
```
instead of
```
postgres=# select wait_event, wait_event_type from pg_stat_activity;
-[ RECORD 1 ]---+--------------------
wait_event | MyUsefulExtension
wait_event_type | LWLock
```

This is the current design, but I think we can do better to avoid the
inconsistencies this may cause for monitoring tools and diagnostics.

+1 on finding a way to improve this, thanks for looking at it.

Attached is a proof of concept that does not alter the
LWLockRegisterTranche API. Instead, it detects when a registration is
performed by a normal backend and stores the tranche name in shared memory,
using a dshash keyed by tranche ID. Tranche name lookup now proceeds in
the order of built-in names, the local list, and finally the shared memory.
The fallback name "extension" can still be returned if an extension does
not register a tranche.

I did not look into the details, but do you think we could make use of
WaitEventCustomNew()?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#3 Sami Imseih
samimseih@gmail.com
In reply to: Bertrand Drouvot (#2)
Re: Improve LWLock tranche name visibility across backends

Thanks for the feedback!

Attached is a proof of concept that does not alter the
LWLockRegisterTranche API. Instead, it detects when a registration is
performed by a normal backend and stores the tranche name in shared memory,
using a dshash keyed by tranche ID. Tranche name lookup now proceeds in
the order of built-in names, the local list, and finally the shared memory.
The fallback name "extension" can still be returned if an extension does
not register a tranche.

I did not look into the details, but do you think we could make use of
WaitEventCustomNew()?

It looks like I overlooked the custom wait event, so I didn’t take it into
account initially. That said, I do think it’s reasonable to consider
piggybacking on this infrastructure.

After all, LWLockRegisterTranche is already creating a custom wait event
defined by the extension. The advantage here is that we can avoid creating
new shared memory and instead reuse the existing static hash table, which is
capped at 128 custom wait events:

```
#define WAIT_EVENT_CUSTOM_HASH_MAX_SIZE 128
```

However, WaitEventCustomNew as it currently stands won’t work for our use
case, since it assigns an eventId automatically. The API currently takes a
classId and wait_event_name, but in our case, we’d actually want to pass in a
trancheId.

So, we might need a new API, something like:
```
WaitEventCustomNewWithEventId(uint32 classId, uint16 eventId,
const char *wait_event_name);
```
eventId in the LWLock case will be a trancheId that was generated
by the user in some earlier step, like LWLockInitialize.

This would behave the same as the existing WaitEventCustomNew API,
except that it uses the provided eventId.

or maybe we can just allow WaitEventCustomNew to take in the eventId, and
if it's > 0, then use the passed in value, otherwise generate the next eventId.

I do like the latter approach more, what do you think?

With this API, we can then teach LWLockRegisterTranche to register the
custom wait event.

--
Sami

#4 Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Sami Imseih (#3)
Re: Improve LWLock tranche name visibility across backends

Hi,

On Thu, Jul 10, 2025 at 04:34:34PM -0500, Sami Imseih wrote:

Thanks for the feedback!

Attached is a proof of concept that does not alter the
LWLockRegisterTranche API. Instead, it detects when a registration is
performed by a normal backend and stores the tranche name in shared memory,
using a dshash keyed by tranche ID. Tranche name lookup now proceeds in
the order of built-in names, the local list, and finally the shared memory.
The fallback name "extension" can still be returned if an extension does
not register a tranche.

I did not look into the details, but do you think we could make use of
WaitEventCustomNew()?

It looks like I overlooked the custom wait event, so I didn’t take it into
account initially. That said, I do think it’s reasonable to consider
piggybacking on this infrastructure.

After all, LWLockRegisterTranche is already creating a custom wait event
defined by the extension.

Right, the tranche is nothing but an eventID (from a wait_event_info point of
view).

The advantage here is that we can avoid creating
new shared memory

Right, I think it's good to rely on this existing machinery.

and instead reuse the existing static hash table, which is
capped at 128 custom wait events:

```
#define WAIT_EVENT_CUSTOM_HASH_MAX_SIZE 128
```

That's probably still high enough, thoughts?

However, WaitEventCustomNew as it currently stands won’t work for our use
case, since it assigns an eventId automatically. The API currently takes a
classId and wait_event_name, but in our case, we’d actually want to pass in a
trancheId.

So, we might need a new API, something like:
```
WaitEventCustomNewWithEventId(uint32 classId, uint16 eventId,
const char *wait_event_name);
```
eventId in the LWLock case will be a trancheId that was generated
by the user in some earlier step, like LWLockInitialize.

This would behave the same as the existing WaitEventCustomNew API,
except that it uses the provided eventId.

or maybe we can just allow WaitEventCustomNew to take in the eventId, and
if it's > 0, then use the passed in value, otherwise generate the next eventId.

I do like the latter approach more, what do you think?

I think I do prefer it too, but in both cases we'll have to make sure there
is no collision on the eventID (LWTRANCHE_FIRST_USER_DEFINED is currently
95).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#5 Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#1)
Re: Improve LWLock tranche name visibility across backends

On Wed, Jul 09, 2025 at 04:39:48PM -0500, Sami Imseih wrote:

Attached is a proof of concept that does not alter the
LWLockRegisterTranche API.

IMHO we should consider modifying the API, because right now you have to
call LWLockRegisterTranche() in each backend. Why not accept the name as
an argument for LWLockNewTrancheId() and only require it to be called once
during your shared memory initialization? In any case, a lot of existing
code will continue to call it in each backend unless the API changes.

Instead, it detects when a registration is
performed by a normal backend and stores the tranche name in shared memory,
using a dshash keyed by tranche ID. Tranche name lookup now proceeds in
the order of built-in names, the local list, and finally the shared memory.
The fallback name "extension" can still be returned if an extension does
not register a tranche.

Why do we need three different places for the lock names? Is there a
reason we can't put it all in shared memory?

1/ There is currently no mechanism for deleting entries. I am not sure whether
this is a concern, since the size of the table would grow only with the
number of extensions and the number of LWLocks they initialize, which is
typically small. That said, others may have different thoughts on this.

I don't see any strong reason to allow deletion unless we started to
reclaim tranche IDs, and I don't see any strong reason for that, either.

2/ What is the appropriate size limit for a tranche name. The work done
in [0] caps the tranche name to 128 bytes for the dshash tranche, and
128 bytes + length of " DSA" suffix for the dsa tranche. Also, the
existing RequestNamedLWLockTranche caps the name to NAMEDATALEN. Currently,
LWLockRegisterTranche does not have a limit on the tranche name. I wonder
if we also need to take care of this and implement some common limit that
applies to tranche names regardless of how they're created?

Do we need to set a limit? If we're using a DSA and dshash, we could let
folks use arbitrary long tranche names, right? The reason for the limit in
the DSM registry is because the name is used as the key for the dshash
table.

--
nathan

#6 Sami Imseih
samimseih@gmail.com
In reply to: Bertrand Drouvot (#4)
Re: Improve LWLock tranche name visibility across backends

and instead reuse the existing static hash table, which is
capped at 128 custom wait events:

```
#define WAIT_EVENT_CUSTOM_HASH_MAX_SIZE 128
```

That's probably still high enough, thoughts?

I have no reason to believe that this number could be too low.
I am not aware of an extension that will initialize more than a
couple of LWLocks.

or maybe we can just allow WaitEventCustomNew to take in the eventId, and
if it's > 0, then use the passed in value, otherwise generate the next eventId.

I do like the latter approach more, what do you think?

I think I do prefer it too, but in both cases we'll have to make sure there
is no collision on the eventID (LWTRANCHE_FIRST_USER_DEFINED is currently
95).

As far as collisions are concerned, the key of the hash is the wait_event_info,
which is a bitwise OR of classId and eventId
```
wait_event_info = classId | eventId;
```
Do you think collision can still be possible?

Now, what I think will be a good API is to provide an alternative to
LWLockRegisterTranche, which currently takes in both a tranche ID and a
tranche_name. The new API I propose is LWLockRegisterTrancheWaitEventCustom,
which takes only a tranche_name and internally calls WaitEventCustomNew to
add a new wait_event_info to the hash table. The wait_event_info is made up
of classId = PG_WAIT_LWLOCK and LWLockNewTrancheId().

I prefer we implement a new API for this to make it explicit that this API
will both register a tranche and create a custom wait event. I do not think
we should get rid of LWLockRegisterTranche, because it is used by
CreateLWLocks during startup and I don't see a reason to change that.
See the attached v1.

What is missing from the attached v1 is:

1/ the documentation changes required in [0].

2/ in dsm_registry.c, we are going to have to modify GetNamedDSHash and
GetNamedDSA to use the new API. Here is an example of how that looks like

```
@@ -389,14 +385,12 @@ GetNamedDSHash(const char *name, const dshash_parameters *params, bool *found)
                entry->type = DSMR_ENTRY_TYPE_DSH;

                /* Initialize the LWLock tranche for the DSA. */
-               dsa_state->tranche = LWLockNewTrancheId();
                sprintf(dsa_state->tranche_name, "%s%s", name, DSMR_DSA_TRANCHE_SUFFIX);
-               LWLockRegisterTranche(dsa_state->tranche, dsa_state->tranche_name);
+               dsa_state->tranche = LWLockRegisterTrancheWaitEventCustom(dsa_state->tranche_name);
```

Is there a concern with a custom wait event being created implicitly
via the GetNamed* APIs?

3/ I still did not work out the max length requirement of the tranche
name, but I think it would have to be as large as what the GetNamed* APIs
from the dsm registry support.

I hope with v1, we can agree to the general direction of this change.

[0]: https://www.postgresql.org/docs/current/xfunc-c.html

--
Sami

Attachments:

v1-0001-Create-LWLock-tranche-in-shared-memory.patch (application/octet-stream, +66 −19)
#7 Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#6)
Re: Improve LWLock tranche name visibility across backends

On Fri, Jul 11, 2025 at 04:32:13PM -0500, Sami Imseih wrote:

Now, what I think will be a good API is to provide an alternative to
LWLockRegisterTranche, which currently takes in both a tranche ID and a
tranche_name. The new API I propose is LWLockRegisterTrancheWaitEventCustom,
which takes only a tranche_name and internally calls WaitEventCustomNew to
add a new wait_event_info to the hash table. The wait_event_info is made up
of classId = PG_WAIT_LWLOCK and LWLockNewTrancheId().

I prefer we implement a new API for this to make it explicit that this API
will both register a tranche and create a custom wait event. I do not think
we should get rid of LWLockRegisterTranche, because it is used by
CreateLWLocks during startup and I don't see a reason to change that.
See the attached v1.

Hm. I was thinking we could have LWLockNewTrancheId() take care of
registering the name. The CreateLWLocks() case strikes me as a special
path. IMHO LWLockRegisterTranche() should go away.

Is there a concern with a custom wait event being created implicitly
via the GetNamed* APIs?

I'm not sure I see any particular advantage to using custom wait events
versus a dedicated LWLock tranche name table. If anything, the limits on
the number of tranches and the lengths of the names gives me pause.

--
nathan

#8 Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#7)
Re: Improve LWLock tranche name visibility across backends

Attached is a proof of concept that does not alter the
LWLockRegisterTranche API.

IMHO we should consider modifying the API, because right now you have to
call LWLockRegisterTranche() in each backend. Why not accept the name as
an argument for LWLockNewTrancheId() and only require it to be called once
during your shared memory initialization?

Yes, we could do that, and this will simplify the tranche registration from the
current two-step process of LWLockNewTrancheId() followed by
LWLockRegisterTranche() to simply LWLockNewTrancheId("my tranche").
I agree.

Instead, it detects when a registration is
performed by a normal backend and stores the tranche name in shared memory,
using a dshash keyed by tranche ID. Tranche name lookup now proceeds in
the order of built-in names, the local list, and finally the shared memory.
The fallback name "extension" can still be returned if an extension does
not register a tranche.

Why do we need three different places for the lock names? Is there a
reason we can't put it all in shared memory?

The real reason I felt it was better to keep three separate locations is that
it allows for a clear separation between user-defined tranches registered
during postmaster startup and those registered during a normal backend. The
tranches registered during postmaster are inherited by the backend via
fork() (or EXEC_BACKEND), and therefore, the dshash table will only be used
by a normal backend.

Since DSM is not available during postmaster, if we were to create a DSA
segment in place, similar to what's done in StatsShmemInit(), we would also
need to ensure that the initial shared memory is sized appropriately. This is
because it would need to be large enough to accommodate all user-defined
tranches registered during postmaster, without having to rely on new
dsm segments.
From my experimentation, this sizing is not as straightforward as simply
calculating # of tranches * size of a tranche entry.

I still think we should create the dsa during postmaster, as we do with
StatsShmemInit, but it would be better if postmaster keeps its hands off this
dshash and only normal backends can use them.

Thoughts?

2/ What is the appropriate size limit for a tranche name. The work done
in [0] caps the tranche name to 128 bytes for the dshash tranche, and
128 bytes + length of " DSA" suffix for the dsa tranche. Also, the
existing RequestNamedLWLockTranche caps the name to NAMEDATALEN. Currently,
LWLockRegisterTranche does not have a limit on the tranche name. I wonder
if we also need to take care of this and implement some common limit that
applies to tranche names regardless of how they're created?

Do we need to set a limit? If we're using a DSA and dshash, we could let
folks use arbitrary long tranche names, right? The reason for the limit in
the DSM registry is because the name is used as the key for the dshash
table.

Sure that is a good point. The dshash entry could be like below, without a limit
on the tranche_name.

```
typedef struct LWLockTrancheNamesEntry
{
	int			trancheId;
	const char *tranche_name;
} LWLockTrancheNamesEntry;
```

Is there a concern with a custom wait event being created implicitly
via the GetNamed* APIs?

I'm not sure I see any particular advantage to using custom wait events
versus a dedicated LWLock tranche name table. If anything, the limits on
the number of tranches and the lengths of the names gives me pause.

Sure, after contemplating this a bit, I prefer a separate shared memory
as well. Custom wait events, while they could work, would also be a bit of a
confusing user experience.

--
Sami

#9 Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#8)
Re: Improve LWLock tranche name visibility across backends

On Mon, Jul 14, 2025 at 02:34:00PM -0500, Sami Imseih wrote:

Why do we need three different places for the lock names? Is there a
reason we can't put it all in shared memory?

The real reason I felt it was better to keep three separate locations is that
it allows for a clear separation between user-defined tranches registered
during postmaster startup and those registered during a normal backend. The
tranches registered during postmaster are inherited by the backend via
fork() (or EXEC_BACKEND), and therefore, the dshash table will only be used
by a normal backend.

Since DSM is not available during postmaster, if we were to create a DSA
segment in place, similar to what's done in StatsShmemInit(), we would also
need to ensure that the initial shared memory is sized appropriately. This is
because it would need to be large enough to accommodate all user-defined
tranches registered during postmaster, without having to rely on new
dsm segments.
From my experimentation, this sizing is not as straightforward as simply
calculating # of tranches * size of a tranche entry.

I still think we should create the dsa during postmaster, as we do with
StatsShmemInit, but it would be better if postmaster keeps its hands off this
dshash and only normal backends can use them.

Ah, I missed the problem with postmaster. Could we have the first backend
that needs to access the table be responsible for creating it and
populating it with the built-in/requested-at-startup entries? Also, is
there any chance that postmaster might need to access the tranche names?

--
nathan

#10 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#9)
Re: Improve LWLock tranche name visibility across backends

Nathan Bossart <nathandbossart@gmail.com> writes:

Ah, I missed the problem with postmaster. Could we have the first backend
that needs to access the table be responsible for creating it and
populating it with the built-in/requested-at-startup entries? Also, is
there any chance that postmaster might need to access the tranche names?

Seems quite hazardous to let the postmaster get involved with such
a data structure. If it seems to need to, we'd better rethink
where to put the functionality that needs the access.

regards, tom lane

#11 Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#9)
Re: Improve LWLock tranche name visibility across backends

Ah, I missed the problem with postmaster. Could we have the first backend
that needs to access the table be responsible for creating it and
populating it with the built-in/requested-at-startup entries?

We can certainly maintain a flag in the shared state that is set once
the first backend loads all the tranches in shared memory. That did not
cross my mind, but it feels wrong to offload such responsibility to a
normal backend.

Also, is there any chance that postmaster might need to access the
tranche names?

A postmaster does not currently have a reason to look up
a tranche name, afaict. This only occurs when looking up wait events
or if lwlock tracing is enabled.

--
Sami

#12 Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#11)
Re: Improve LWLock tranche name visibility across backends

On Mon, Jul 14, 2025 at 03:45:02PM -0500, Sami Imseih wrote:

Ah, I missed the problem with postmaster. Could we have the first backend
that needs to access the table be responsible for creating it and
populating it with the built-in/requested-at-startup entries?

We can certainly maintain a flag in the shared state that is set once
the first backend loads all the tranches in shared memory. That did not
cross my mind, but it feels wrong to offload such responsibility to a
normal backend.

Well, we already need each backend to either initialize or attach to the
dshash table, and the initialization would only ever happen once on a
running server. Adding a new initialization step to bootstrap the built-in
and registered-at-startup tranche names doesn't seem like that much of a
leap to me.

Another random thought: I worry that the dshash approach might be quite a
bit slower, and IIUC we just need to map an integer to a string. Maybe we
should just use a DSA for LWLockTrancheNames. IOW we'd leave it as a char
** but put it in shared memory.

--
nathan

#13 Rahila Syed
rahilasyed90@gmail.com
In reply to: Sami Imseih (#8)
Re: Improve LWLock tranche name visibility across backends

Hi,

Since DSM is not available during postmaster, if we were to create a DSA
segment in place, similar to what's done in StatsShmemInit(), we would also
need to ensure that the initial shared memory is sized appropriately. This is
because it would need to be large enough to accommodate all user-defined
tranches registered during postmaster, without having to rely on new
dsm segments.
From my experimentation, this sizing is not as straightforward as simply
calculating # of tranches * size of a tranche entry.

I still think we should create the dsa during postmaster, as we do with
StatsShmemInit, but it would be better if postmaster keeps its hands off this
dshash and only normal backends can use them.

Thoughts?

Since creating a DSA segment in place during Postmaster startup still
requires calculating the size of the tranche names table, including the
user-defined tranches, why not use static shared memory to pre-allocate a
fixed-size table and arrays of size NAMEDATALEN to store the names?

If a dshash table is used to store tranche names and IDs, where would the
tranche name for this table be registered?

Thank you,
Rahila Syed

#14 Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Sami Imseih (#6)
Re: Improve LWLock tranche name visibility across backends

Hi,

On Fri, Jul 11, 2025 at 04:32:13PM -0500, Sami Imseih wrote:

and instead reuse the existing static hash table, which is
capped at 128 custom wait events:

```
#define WAIT_EVENT_CUSTOM_HASH_MAX_SIZE 128
```

That's probably still high enough, thoughts?

I have no reason to believe that this number could be too low.
I am not aware of an extension that will initialize more than a
couple of LWLocks.

or maybe we can just allow WaitEventCustomNew to take in the eventId, and
if it's > 0, then use the passed in value, otherwise generate the next eventId.

I do like the latter approach more, what do you think?

I think I do prefer it too, but in both cases we'll have to make sure there
is no collision on the eventID (LWTRANCHE_FIRST_USER_DEFINED is currently
95).

As far as collisions are concerned, the key of the hash is the wait_event_info,
which is a bitwise OR of classId and eventId
```
wait_event_info = classId | eventId;
```
Do you think collision can still be possible?

I meant to say collision between the trancheID and WaitEventCustomCounter->nextId

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#15 Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#12)
Re: Improve LWLock tranche name visibility across backends

Another random thought: I worry that the dshash approach might be quite a
bit slower, and IIUC we just need to map an integer to a string. Maybe we
should just use a DSA for LWLockTrancheNames. IOW we'd leave it as a char**
but put it in shared memory.

To use DSA just for this purpose, we would need to maintain an array of
dsa_pointers that reference the string(s), right? I am not clear on what you
mean by using a DSA to hold the char**

--
Sami

#16 Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#15)
Re: Improve LWLock tranche name visibility across backends

On Tue, Jul 15, 2025 at 11:52:19AM -0500, Sami Imseih wrote:

Another random thought: I worry that the dshash approach might be quite a
bit slower, and IIUC we just need to map an integer to a string. Maybe we
should just use a DSA for LWLockTrancheNames. IOW we'd leave it as a char**
but put it in shared memory.

To use DSA just for this purpose, we would need to maintain an array of
dsa_pointers that reference the string(s), right? I am not clear what you
mean by using dsa to put the char**

I was imagining putting the array in one big DSA allocation instead of
carting around a pointer for each tranche name. (Sorry, I realize I am
hand-waving over some of the details.)

--
nathan

#17 Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#16)
Re: Improve LWLock tranche name visibility across backends

On Tue, Jul 15, 2025 at 11:57 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:

On Tue, Jul 15, 2025 at 11:52:19AM -0500, Sami Imseih wrote:

Another random thought: I worry that the dshash approach might be quite a
bit slower, and IIUC we just need to map an integer to a string. Maybe we
should just use a DSA for LWLockTrancheNames. IOW we'd leave it as a char**
but put it in shared memory.

To use DSA just for this purpose, we would need to maintain an array of
dsa_pointers that reference the string(s), right? I am not clear what you
mean by using dsa to put the char**

I was imagining putting the array in one big DSA allocation instead of
carting around a pointer for each tranche name. (Sorry, I realize I am
hand-waving over some of the details.)

I understood it like this. Here is a sketch:

```
dsa_pointer p;

dsa = dsa_create(....)

p = dsa_allocate(dsa, LWLockTranchesInitialSize());
tranche_names = (char **) dsa_get_address(dsa, p);
tranche_names[0] = "my tranche";
tranche_names[1] = "my tranche";
```

We will need to track the size and resize if needed.

Is this what you mean, from a high level?

--
Sami

#18 Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#17)
Re: Improve LWLock tranche name visibility across backends

On Tue, Jul 15, 2025 at 12:06:00PM -0500, Sami Imseih wrote:
> On Tue, Jul 15, 2025 at 11:57 AM Nathan Bossart
> <nathandbossart@gmail.com> wrote:
>> I was imagining putting the array in one big DSA allocation instead of
>> carting around a pointer for each tranche name. (Sorry, I realize I am
>> hand-waving over some of the details.)
>
> I understood it like this. Here is a sketch:
>
> ```
> dsa_area   *dsa;
> dsa_pointer p;
> char      **tranche_names;
>
> dsa = dsa_create(....);
>
> p = dsa_allocate(dsa, LWLockTranchesInitialSize());
> tranche_names = (char **) dsa_get_address(dsa, p);
> tranche_names[0] = "my tranche";
> tranche_names[1] = "another tranche";
> ```
>
> We will need to track the size and resize if needed.
>
> Is this what you mean, from a high level?

Yes, that's roughly what I had in mind. We might need to employ some
tricks to avoid a limit on tranche name length, but maybe that's not worth
the energy.

--
nathan

#19Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Rahila Syed (#13)
Re: Improve LWLock tranche name visibility across backends

Hi,

On Tue, Jul 15, 2025 at 12:59:04PM +0530, Rahila Syed wrote:
> Hi,
>
> If a dshash table is used to store tranche names and IDs, where would the
> tranche name for this table be registered?

I guess it could be a new BuiltinTrancheId for this dsa, but I am not sure what
Nathan and Sami have in mind.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#20Sami Imseih
samimseih@gmail.com
In reply to: Bertrand Drouvot (#19)
Re: Improve LWLock tranche name visibility across backends

Hi,

>> If a dshash table is used to store tranche names and IDs, where would the
>> tranche name for this table be registered?
>
> I guess it could be a new BuiltinTrancheId for this dsa, but I am not sure what
> Nathan and Sami have in mind.

Yes, it will be a BuiltinTrancheId for a shared memory area that is allocated
during postmaster startup for tracking tranches. The shared memory will then
only be used by normal backends to register tranches. Any tranche
registered during postmaster startup is inherited by the backends.

Regards,

Sami

#21Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#18)
#22Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#21)
#23Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Nathan Bossart (#22)
#24Sami Imseih
samimseih@gmail.com
In reply to: Bertrand Drouvot (#23)
#25Sami Imseih
samimseih@gmail.com
In reply to: Bertrand Drouvot (#23)
#26Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Sami Imseih (#24)
#27Sami Imseih
samimseih@gmail.com
In reply to: Bertrand Drouvot (#26)
#28Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#27)
#29Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#28)
#30Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#29)
#31Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#30)
#32Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Sami Imseih (#31)
#33Sami Imseih
samimseih@gmail.com
In reply to: Bertrand Drouvot (#32)
#34Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#33)
#35Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Sami Imseih (#34)
#36Rahila Syed
rahilasyed90@gmail.com
In reply to: Bertrand Drouvot (#35)
#37Sami Imseih
samimseih@gmail.com
In reply to: Bertrand Drouvot (#35)
#38Sami Imseih
samimseih@gmail.com
In reply to: Rahila Syed (#36)
#39Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#38)
#40Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#39)
#41Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#40)
#42Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#41)
#43Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Sami Imseih (#42)
#44Sami Imseih
samimseih@gmail.com
In reply to: Bertrand Drouvot (#43)
#45Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#44)
#46Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Sami Imseih (#45)
#47Sami Imseih
samimseih@gmail.com
In reply to: Bertrand Drouvot (#46)
#48Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#47)
#49Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#48)
#50Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#49)
#51Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Sami Imseih (#50)
#52Nathan Bossart
nathandbossart@gmail.com
In reply to: Bertrand Drouvot (#51)
#53Andres Freund
andres@anarazel.de
In reply to: Nathan Bossart (#52)
#54Nathan Bossart
nathandbossart@gmail.com
In reply to: Andres Freund (#53)
#55Andres Freund
andres@anarazel.de
In reply to: Nathan Bossart (#54)
#56Nathan Bossart
nathandbossart@gmail.com
In reply to: Andres Freund (#55)
#57Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#56)
#58Sami Imseih
samimseih@gmail.com
In reply to: Tom Lane (#57)
#59Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#58)
#60Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#59)
#61Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#60)
#62Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#61)
#63Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#62)
#64Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sami Imseih (#63)
#65Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#64)
#66Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#62)
#67Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#66)
#68Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#67)
#69Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#68)
#70Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#69)
#71Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#70)
#72Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#71)
#73Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#72)
#74Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#73)
#75Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Sami Imseih (#74)
#76Sami Imseih
samimseih@gmail.com
In reply to: Bertrand Drouvot (#75)
#77Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Sami Imseih (#76)
#78Sami Imseih
samimseih@gmail.com
In reply to: Bertrand Drouvot (#77)
#79Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#78)
#80Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#79)
#81Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Sami Imseih (#80)
#82Sami Imseih
samimseih@gmail.com
In reply to: Bertrand Drouvot (#81)
#83Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#80)
#84Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#83)
#85Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#84)
#86Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#85)
#87Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Nathan Bossart (#85)
#88Rahila Syed
rahilasyed90@gmail.com
In reply to: Nathan Bossart (#84)
#89Nathan Bossart
nathandbossart@gmail.com
In reply to: Bertrand Drouvot (#87)
#90Nathan Bossart
nathandbossart@gmail.com
In reply to: Rahila Syed (#88)
#91Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#90)
#92Rahila Syed
rahilasyed90@gmail.com
In reply to: Nathan Bossart (#90)
#93Nathan Bossart
nathandbossart@gmail.com
In reply to: Rahila Syed (#92)
#94Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#91)
#95Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#94)
#96Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#95)
#97Alexander Lakhin
exclusion@gmail.com
In reply to: Nathan Bossart (#96)
#98Sami Imseih
samimseih@gmail.com
In reply to: Alexander Lakhin (#97)
#99Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#98)
#100Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#99)
#101Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#100)
#102Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#101)
#103Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#102)
#104Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#103)
#105Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#104)
#106Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#105)
#107Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#106)