Add memory_limit_hits to pg_stat_replication_slots

Started by Bertrand Drouvot · 8 months ago · 28 messages · pgsql-hackers
#1 Bertrand Drouvot
bertranddrouvot.pg@gmail.com

Hi hackers,

I think that it's currently not always possible to determine how many times
logical_decoding_work_mem has been reached.

For example, say a transaction is made of 40 subtransactions, and I get:

slot_name | spill_txns | spill_count | total_txns
--------------+------------+-------------+------------
logical_slot | 40 | 41 | 1
(1 row)

Then I know that logical_decoding_work_mem has been reached one time (total_txns).

But as soon as another transaction is decoded (that does not involve spilling):

slot_name | spill_txns | spill_count | total_txns
--------------+------------+-------------+------------
logical_slot | 40 | 41 | 2
(1 row)

Then we don't know if logical_decoding_work_mem has been reached one or two
times.

Please find attached a patch to $SUBJECT, to report the number of times the
logical_decoding_work_mem has been reached.

With such a counter one could get a ratio like total_txns/memory_limit_hits.

That could help to see whether reaching logical_decoding_work_mem is rare or
frequent. If it is frequent, then maybe there is a need to adjust
logical_decoding_work_mem.

Based on my simple example above, one could say that it might be possible to get
the same with:

(spill_count - spill_txns) + (stream_count - stream_txns)

but that doesn't appear to be the case with a more complicated example (277 vs 247):

slot_name | spill_txns | spill_count | total_txns | stream_txns | stream_count | memory_limit_hits | (spc-spct)+(strc-strt)
--------------+------------+-------------+------------+-------------+--------------+-------------------+------------------------
logical_slot | 405 | 552 | 19 | 5 | 105 | 277 | 247
(1 row)

Not sure I like memory_limit_hits that much, maybe work_mem_exceeded is better?

Looking forward to your feedback,

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Add-memory_limit_hits-to-pg_stat_replication_slot.patch (text/x-diff; charset=us-ascii; +79 -44)
#2 Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#1)
Re: Add memory_limit_hits to pg_stat_replication_slots

On Wed, Aug 27, 2025 at 12:26 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Hi hackers,

I think that it's currently not always possible to determine how many times
logical_decoding_work_mem has been reached.

For example, say a transaction is made of 40 subtransactions, and I get:

slot_name | spill_txns | spill_count | total_txns
--------------+------------+-------------+------------
logical_slot | 40 | 41 | 1
(1 row)

Then I know that logical_decoding_work_mem has been reached one time (total_txns).

But as soon as another transaction is decoded (that does not involve spilling):

slot_name | spill_txns | spill_count | total_txns
--------------+------------+-------------+------------
logical_slot | 40 | 41 | 2
(1 row)

Then we don't know if logical_decoding_work_mem has been reached one or two
times.

Please find attached a patch to $SUBJECT, to report the number of times the
logical_decoding_work_mem has been reached.

With such a counter one could get a ratio like total_txns/memory_limit_hits.

That could help to see whether reaching logical_decoding_work_mem is rare or
frequent. If it is frequent, then maybe there is a need to adjust
logical_decoding_work_mem.

Based on my simple example above, one could say that it might be possible to get
the same with:

(spill_count - spill_txns) + (stream_count - stream_txns)

but that doesn't appear to be the case with a more complicated example (277 vs 247):

slot_name | spill_txns | spill_count | total_txns | stream_txns | stream_count | memory_limit_hits | (spc-spct)+(strc-strt)
--------------+------------+-------------+------------+-------------+--------------+-------------------+------------------------
logical_slot | 405 | 552 | 19 | 5 | 105 | 277 | 247
(1 row)

Not sure I like memory_limit_hits that much, maybe work_mem_exceeded is better?

Looking forward to your feedback,

Yes, it's quite a different situation in the two cases: spilling 100
transactions in one ReorderBufferCheckMemoryLimit() call versus spilling
1 transaction in each of 100 ReorderBufferCheckMemoryLimit() calls, even
though spill_txns is 100 in both cases. And we don't have any
statistics to distinguish between these cases. I agree with adding this
statistic.

One minor comment is:

@@ -1977,6 +1978,7 @@ UpdateDecodingStats(LogicalDecodingContext *ctx)
  repSlotStat.stream_bytes = rb->streamBytes;
  repSlotStat.total_txns = rb->totalTxns;
  repSlotStat.total_bytes = rb->totalBytes;
+ repSlotStat.memory_limit_hits = rb->memory_limit_hits;

Since the other statistics counter names are camel case, I think it's
better to follow that for the new counter.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#3 Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Masahiko Sawada (#2)
Re: Add memory_limit_hits to pg_stat_replication_slots

Hi,

On Thu, Sep 11, 2025 at 03:24:54PM -0700, Masahiko Sawada wrote:

On Wed, Aug 27, 2025 at 12:26 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Looking forward to your feedback,

Yes,

Thanks for looking at it!

it's a quite different situation in two cases: spilling 100
transactions in one ReorderBufferCheckMemoryLimit() call and spilling
1 transaction in each 100 ReorderBufferCheckMemoryLimit() calls, even
though spill_txn is 100 in both cases. And we don't have any
statistics to distinguish between these cases.

Right.

One minor comment is:

@@ -1977,6 +1978,7 @@ UpdateDecodingStats(LogicalDecodingContext *ctx)
repSlotStat.stream_bytes = rb->streamBytes;
repSlotStat.total_txns = rb->totalTxns;
repSlotStat.total_bytes = rb->totalBytes;
+ repSlotStat.memory_limit_hits = rb->memory_limit_hits;

Since the other statistics counter names are camel case, I think it's
better to follow that for the new counter.

Makes sense, done with memoryLimitHits in v2 attached (that's the only change
as compared with v1).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v2-0001-Add-memory_limit_hits-to-pg_stat_replication_slot.patch (text/x-diff; charset=us-ascii; +79 -44)
#4 Amit Kapila
amit.kapila16@gmail.com
In reply to: Bertrand Drouvot (#3)
Re: Add memory_limit_hits to pg_stat_replication_slots

On Mon, Sep 22, 2025 at 1:41 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Since the other statistics counter names are camel case, I think it's
better to follow that for the new counter.

Makes sense, done with memoryLimitHits in v2 attached (that's the only change
as compared with v1).

The name memory_limit_hits doesn't go well with the other names in the
view. Can we consider memory_exceeded_count? I find
memory_exceeded_count (or memory_exceeds_count) clearer and more
consistent with the existing counters. Also, how about keeping it
immediately after slot_name in the view? Keeping it at the end after
total_bytes seems out of place.

--
With Regards,
Amit Kapila.

#5 shveta malik
shveta.malik@gmail.com
In reply to: Amit Kapila (#4)
Re: Add memory_limit_hits to pg_stat_replication_slots

On Mon, Sep 22, 2025 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Sep 22, 2025 at 1:41 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Since the other statistics counter names are camel case, I think it's
better to follow that for the new counter.

Makes sense, done with memoryLimitHits in v2 attached (that's the only change
as compared with v1).

The memory_limit_hits doesn't go well with the other names in the
view. Can we consider memory_exceeded_count? I find
memory_exceeded_count (or memory_exceeds_count) more clear and
matching with the existing counters. Also, how about keeping it
immediately after slot_name in the view? Keeping it in the end after
total_bytes seems out of place.

Since fields like spill_txns, spill_bytes, and stream_txns also talk
about exceeding 'logical_decoding_work_mem', my preference would be to
place this new field immediately after these spill and stream fields
(and before total_bytes). If not this, then as Amit suggested,
immediately before all these fields.
Other options for the name could be 'mem_limit_exceeded_count' or
'mem_limit_hit_count'.

thanks
Shveta

#6 Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: shveta malik (#5)
Re: Add memory_limit_hits to pg_stat_replication_slots

On Mon, Sep 22, 2025 at 05:21:35PM +0530, shveta malik wrote:

On Mon, Sep 22, 2025 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Sep 22, 2025 at 1:41 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Since the other statistics counter names are camel case, I think it's
better to follow that for the new counter.

Makes sense, done with memoryLimitHits in v2 attached (that's the only change
as compared with v1).

The memory_limit_hits doesn't go well with the other names in the
view. Can we consider memory_exceeded_count? I find
memory_exceeded_count (or memory_exceeds_count) more clear and
matching with the existing counters. Also, how about keeping it
immediately after slot_name in the view? Keeping it in the end after
total_bytes seems out of place.

Since fields like spill_txns, spill_bytes, and stream_txns also talk
about exceeding 'logical_decoding_work_mem', my preference would be to
place this new field immediately after these spill and stream fields
(and before total_bytes). If not this, then as Amit suggested,
immediately before all these fields.
Other options for name could be 'mem_limit_exceeded_count' or
'mem_limit_hit_count'

Thank you, Shveta and Amit, for looking at it. Since we already use "txns" as
an abbreviation for transactions, I think it's OK to use "mem". So I'm using
a mix of your proposals, "mem_exceeded_count", in v3 attached. Regarding the
field position, I like Shveta's proposal and did it that way.

However, technically speaking, "exceeded" is not the perfect wording since
the code was doing (and is still doing something similar to):

        if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_BUFFERED &&
-               rb->size < logical_decoding_work_mem * (Size) 1024)
+               !memory_limit_reached)
                return;

as the comment describes correctly using "reached":

/*
* Check whether the logical_decoding_work_mem limit was reached, and if yes
* pick the largest (sub)transaction at-a-time to evict and spill its changes to
* disk or send to the output plugin until we reach under the memory limit.

So I think that "reached" or "hit" would be better wording. However, the
documentation for spill_txns and stream_txns already uses "exceeded" (and not
"reached"), so I went with "exceeded" for consistency. I think that's fine;
if not, we may want to use "reached" in those three stats descriptions.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v3-0001-Add-mem_exceeded_count-to-pg_stat_replication_slo.patch (text/x-diff; charset=us-ascii; +82 -47)
#7 Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#6)
Re: Add memory_limit_hits to pg_stat_replication_slots

On Tue, Sep 23, 2025 at 1:52 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

On Mon, Sep 22, 2025 at 05:21:35PM +0530, shveta malik wrote:

On Mon, Sep 22, 2025 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Sep 22, 2025 at 1:41 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Since the other statistics counter names are camel case, I think it's
better to follow that for the new counter.

Makes sense, done with memoryLimitHits in v2 attached (that's the only change
as compared with v1).

The memory_limit_hits doesn't go well with the other names in the
view. Can we consider memory_exceeded_count? I find
memory_exceeded_count (or memory_exceeds_count) more clear and
matching with the existing counters. Also, how about keeping it
immediately after slot_name in the view? Keeping it in the end after
total_bytes seems out of place.

Since fields like spill_txns, spill_bytes, and stream_txns also talk
about exceeding 'logical_decoding_work_mem', my preference would be to
place this new field immediately after these spill and stream fields
(and before total_bytes). If not this, then as Amit suggested,
immediately before all these fields.
Other options for name could be 'mem_limit_exceeded_count' or
'mem_limit_hit_count'

Thank you, Shveta and Amit, for looking at it. Since we already use txns as
abbreviation for transactions then I think it's ok to use "mem". Then I'm using
a mix of your proposals with "mem_exceeded_count" in v3 attached. Regarding the
field position, I like Shveta's proposal and did it that way.

However, technically speaking, "exceeded" is not the perfect wording since
the code was doing (and is still doing something similar to):

if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_BUFFERED &&
-               rb->size < logical_decoding_work_mem * (Size) 1024)
+               !memory_limit_reached)
return;

as the comment describes correctly using "reached":

/*
* Check whether the logical_decoding_work_mem limit was reached, and if yes
* pick the largest (sub)transaction at-a-time to evict and spill its changes to
* disk or send to the output plugin until we reach under the memory limit.

So I think that "reached" or "hit" would be better wording. However, the
documentation for spill_txns, stream_txns already use "exceeded" (and not "reached")
so I went with "exceeded" for consistency. I think that's fine, if not we may want
to use "reached" for those 3 stats descriptions.

I find "exceeded" fine, as the documentation for logical decoding
also uses it [1]:

Similar to spill-to-disk behavior, streaming is triggered when the
total amount of changes decoded from the WAL (for all in-progress
transactions) exceeds the limit defined by logical_decoding_work_mem
setting.

One comment for the v3 patch:

+       <para>
+        Number of times <literal>logical_decoding_work_mem</literal> has been
+        exceeded while decoding changes from WAL for this slot.
+       </para>

How about rewording it to something like:

Number of times the memory used by logical decoding has exceeded
logical_decoding_work_mem.

Regards,

[1]: https://www.postgresql.org/docs/devel/logicaldecoding-streaming.html

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#8 Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Masahiko Sawada (#7)
Re: Add memory_limit_hits to pg_stat_replication_slots

Hi,

On Tue, Sep 23, 2025 at 11:39:22AM -0700, Masahiko Sawada wrote:

On Tue, Sep 23, 2025 at 1:52 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

However, technically speaking, "exceeded" is not the perfect wording since
the code was doing (and is still doing something similar to):

if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_BUFFERED &&
-               rb->size < logical_decoding_work_mem * (Size) 1024)
+               !memory_limit_reached)
return;

as the comment describes correctly using "reached":

/*
* Check whether the logical_decoding_work_mem limit was reached, and if yes
* pick the largest (sub)transaction at-a-time to evict and spill its changes to
* disk or send to the output plugin until we reach under the memory limit.

So I think that "reached" or "hit" would be better wording. However, the
documentation for spill_txns, stream_txns already use "exceeded" (and not "reached")
so I went with "exceeded" for consistency. I think that's fine, if not we may want
to use "reached" for those 3 stats descriptions.

I find "exceeded" is fine as the documentation for logical decoding
also uses it[1]:

Similar to spill-to-disk behavior, streaming is triggered when the
total amount of changes decoded from the WAL (for all in-progress
transactions) exceeds the limit defined by logical_decoding_work_mem
setting.

Yes, it also uses "exceeds", but I think it's not 100% accurate. It would be
if, in ReorderBufferCheckMemoryLimit, we were using "<=" instead of "<" in:

if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_BUFFERED &&
rb->size < logical_decoding_work_mem * (Size) 1024)

I think an accurate wording would be "reaches or exceeds" in all those places,
but just using "exceeds" looks good enough.

One comment for the v3 patch:

+       <para>
+        Number of times <literal>logical_decoding_work_mem</literal> has been
+        exceeded while decoding changes from WAL for this slot.
+       </para>

How about rewording it to like:

Number of times the memory used by logical decoding has exceeded
logical_decoding_work_mem.

That sounds better, thanks! Used this wording in v4 attached (that's the only
change as compared to v3).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v4-0001-Add-mem_exceeded_count-to-pg_stat_replication_slo.patch (text/x-diff; charset=us-ascii; +82 -47)
#9 Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#8)
Re: Add memory_limit_hits to pg_stat_replication_slots

On Tue, Sep 23, 2025 at 11:31 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Hi,

On Tue, Sep 23, 2025 at 11:39:22AM -0700, Masahiko Sawada wrote:

On Tue, Sep 23, 2025 at 1:52 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

However, technically speaking, "exceeded" is not the perfect wording since
the code was doing (and is still doing something similar to):

if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_BUFFERED &&
-               rb->size < logical_decoding_work_mem * (Size) 1024)
+               !memory_limit_reached)
return;

as the comment describes correctly using "reached":

/*
* Check whether the logical_decoding_work_mem limit was reached, and if yes
* pick the largest (sub)transaction at-a-time to evict and spill its changes to
* disk or send to the output plugin until we reach under the memory limit.

So I think that "reached" or "hit" would be better wording. However, the
documentation for spill_txns, stream_txns already use "exceeded" (and not "reached")
so I went with "exceeded" for consistency. I think that's fine, if not we may want
to use "reached" for those 3 stats descriptions.

I find "exceeded" is fine as the documentation for logical decoding
also uses it[1]:

Similar to spill-to-disk behavior, streaming is triggered when the
total amount of changes decoded from the WAL (for all in-progress
transactions) exceeds the limit defined by logical_decoding_work_mem
setting.

Yes it also uses "exceeds" but I think it's not 100% accurate. It would be
if, in ReorderBufferCheckMemoryLimit, we were using "<=" instead of "<" in:

if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_BUFFERED &&
rb->size < logical_decoding_work_mem * (Size) 1024)

I think an accurate wording would be "reaches or exceeds" in all those places,
but just using "exceeds" looks good enough.

One comment for the v3 patch:

+       <para>
+        Number of times <literal>logical_decoding_work_mem</literal> has been
+        exceeded while decoding changes from WAL for this slot.
+       </para>

How about rewording it to like:

Number of times the memory used by logical decoding has exceeded
logical_decoding_work_mem.

That sounds better, thanks! Used this wording in v4 attached (that's the only
change as compared to v3).

Thank you for updating the patch! Here are some comments:

---
+   bool        memory_limit_reached = (rb->size >=
logical_decoding_work_mem * (Size) 1024);
+
+   if (memory_limit_reached)
+       rb->memExceededCount += 1;

Do we want to use 'exceeded' for the variable too for better consistency?

---
One thing I want to clarify is that even if the memory usage exceeds
the logical_decoding_work_mem it doesn't necessarily mean we serialize
or stream transactions because of
ReorderBufferCheckAndTruncateAbortedTXN(). For example, in a situation
where many large already-aborted transactions are truncated by
ReorderBufferCheckAndTruncateAbortedTXN(), users would see
a high number in the mem_exceeded_count column but it might not actually
require any adjustment for logical_decoding_work_mem. One idea is to
increment that counter only when exceeding the memory limit causes us to
serialize or stream transactions. On the other hand, it might make
sense and be straightforward too to show a pure statistic that the
memory usage exceeded the logical_decoding_work_mem. What do you
think?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#10 Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Masahiko Sawada (#9)
Re: Add memory_limit_hits to pg_stat_replication_slots

Hi,

On Wed, Sep 24, 2025 at 10:11:20AM -0700, Masahiko Sawada wrote:

On Tue, Sep 23, 2025 at 11:31 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Thank you for updating the patch! Here are some comments:

---
+   bool        memory_limit_reached = (rb->size >=
logical_decoding_work_mem * (Size) 1024);
+
+   if (memory_limit_reached)
+       rb->memExceededCount += 1;

Do we want to use 'exceeded' for the variable too for better consistency?

I thought about it, but since we use ">=" I think that "reached" is more
accurate. So I went for "reached" for this one and "exceeded" for the "user
facing" ones. That said, I don't have a strong opinion about it, and I'd be
OK to use "exceeded" if you feel strongly about it.

---
One thing I want to clarify is that even if the memory usage exceeds
the logical_decoding_work_mem it doesn't necessarily mean we serialize
or stream transactions because of
ReorderBufferCheckAndTruncateAbortedTXN().

Right.

For example, in a situation
where many large already-aborted transactions are truncated by
ReorderBufferCheckAndTruncateAbortedTXN(), users would see
a high number in mem_exceeded_count column but it might not actually
require any adjustment for logical_decoding_work_mem.

Yes, but in that case mem_exceeded_count would be high compared to spill_txns,
stream_txns, no?

One idea is to
increment that counter only when exceeding the memory limit causes us to
serialize or stream transactions. On the other hand, it might make
sense and be straightforward too to show a pure statistic that the
memory usage exceeded the logical_decoding_work_mem. What do you
think?

The new counter, as it is proposed, helps to see if the workload hits the
logical_decoding_work_mem frequently or not. I think it's valuable information
to have on its own.

Now to check if logical_decoding_work_mem needs adjustment, one could compare
mem_exceeded_count with the existing spill_txns and stream_txns.

For example, if I abort 20 transactions that exceeded
logical_decoding_work_mem, I'd get:

postgres=# select spill_txns,stream_txns,mem_exceeded_count from pg_stat_replication_slots ;
spill_txns | stream_txns | mem_exceeded_count
------------+-------------+--------------------
0 | 0 | 20
(1 row)

That way I could figure out that logical_decoding_work_mem has been reached
for aborted transactions.

OTOH, if one sees spill_txns + stream_txns close to mem_exceeded_count, like:

postgres=# select spill_txns,stream_txns,mem_exceeded_count from pg_stat_replication_slots ;
spill_txns | stream_txns | mem_exceeded_count
------------+-------------+--------------------
38 | 20 | 58
(1 row)

That probably means that logical_decoding_work_mem would need to be increased.

What do you think?

BTW, while doing some tests for the above examples, I noticed that the patch
was missing a check on memExceededCount in UpdateDecodingStats() (that produced
mem_exceeded_count being 0 for one of the new tests in test_decoding): fixed in
v5 attached.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v5-0001-Add-mem_exceeded_count-to-pg_stat_replication_slo.patch (text/x-diff; charset=us-ascii; +84 -48)
#11 Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#10)
Re: Add memory_limit_hits to pg_stat_replication_slots

On Thu, Sep 25, 2025 at 3:17 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Hi,

On Wed, Sep 24, 2025 at 10:11:20AM -0700, Masahiko Sawada wrote:

On Tue, Sep 23, 2025 at 11:31 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Thank you for updating the patch! Here are some comments:

---
+   bool        memory_limit_reached = (rb->size >=
logical_decoding_work_mem * (Size) 1024);
+
+   if (memory_limit_reached)
+       rb->memExceededCount += 1;

Do we want to use 'exceeded' for the variable too for better consistency?

I thought about it, but since we use ">=" I think that "reached" is more
accurate. So I went for "reached" for this one and "exceeded" for "user facing"
ones. That said I don't have a strong opinion about it, and I'd be ok to use
"exceeded" if you feel strong about it.

Agreed with the current style. Thank you for the explanation.

---
One thing I want to clarify is that even if the memory usage exceeds
the logical_decoding_work_mem it doesn't necessarily mean we serialize
or stream transactions because of
ReorderBufferCheckAndTruncateAbortedTXN().

Right.

For example, in a situation
where many large already-aborted transactions are truncated by
ReorderBufferCheckAndTruncateAbortedTXN(), users would see
a high number in mem_exceeded_count column but it might not actually
require any adjustment for logical_decoding_work_mem.

Yes, but in that case mem_exceeded_count would be high compared to spill_txns,
stream_txns, no?

Right. I think only mem_exceeded_count has a high number while
spill_txns and stream_txns have lower numbers in this case (as you
showed in your first example below).

One idea is to
increment that counter only when exceeding the memory limit causes us to
serialize or stream transactions. On the other hand, it might make
sense and be straightforward too to show a pure statistic that the
memory usage exceeded the logical_decoding_work_mem. What do you
think?

The new counter, as it is proposed, helps to see if the workload hits the
logical_decoding_work_mem frequently or not. I think it's valuable information
to have on its own.

Now to check if logical_decoding_work_mem needs adjustment, one could compare
mem_exceeded_count with the existing spill_txns and stream_txns.

For example, if I abort 20 transactions that exceeded
logical_decoding_work_mem, I'd get:

postgres=# select spill_txns,stream_txns,mem_exceeded_count from pg_stat_replication_slots ;
spill_txns | stream_txns | mem_exceeded_count
------------+-------------+--------------------
0 | 0 | 20
(1 row)

That way I could figure out that logical_decoding_work_mem has been reached
for aborted transactions.

OTOH, if one sees spill_txns + stream_txns close to mem_exceeded_count, like:

postgres=# select spill_txns,stream_txns,mem_exceeded_count from pg_stat_replication_slots ;
spill_txns | stream_txns | mem_exceeded_count
------------+-------------+--------------------
38 | 20 | 58
(1 row)

That probably means that logical_decoding_work_mem would need to be increased.

What do you think?

Right. But one might argue that if we increment mem_exceeded_count
only when serializing or streaming is actually performed,
mem_exceeded_count would be 0 in the first example and therefore users
would be able to simply check mem_exceeded_count without any
computation.

BTW, while doing some tests for the above examples, I noticed that the patch
was missing a check on memExceededCount in UpdateDecodingStats() (that produced
mem_exceeded_count being 0 for one of the new test in test_decoding): Fixed in
v5 attached.

Thank you for updating the patch!

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#12 Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Masahiko Sawada (#11)
Re: Add memory_limit_hits to pg_stat_replication_slots

Hi,

On Thu, Sep 25, 2025 at 12:14:04PM -0700, Masahiko Sawada wrote:

On Thu, Sep 25, 2025 at 3:17 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

That probably means that logical_decoding_work_mem would need to be increased.

What do you think?

Right. But one might argue that if we increment mem_exceeded_count
only when serializing or streaming is actually performed,
mem_exceeded_count would be 0 in the first example and therefore users
would be able to simply check mem_exceeded_count without any
computation.

Right, but then we'd not be able to see all the cases where the memory limit
has been reached (that would hide the aborted-transactions case). I think that
with the current approach we have the best of both worlds (even if it requires
some computation).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#13 Chao Li
li.evan.chao@gmail.com
In reply to: Bertrand Drouvot (#10)
Re: Add memory_limit_hits to pg_stat_replication_slots

Hi Bertrand,

Thanks for the patch. The patch overall looks good to me. Just a few small comments:

On Sep 25, 2025, at 18:17, Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:

<v5-0001-Add-mem_exceeded_count-to-pg_stat_replication_slo.patch>

1.
```
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -690,6 +690,9 @@ struct ReorderBuffer
 	int64		streamCount;	/* streaming invocation counter */
 	int64		streamBytes;	/* amount of data decoded */
+	/* Number of times logical_decoding_work_mem has been reached */
+	int64		memExceededCount;
```

For the other metrics, there is a "Statistics about xxx" comment above the group, and a line comment after every metric. Maybe use the same style, so that it would be easy to add new metrics in the future.

2.
```
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2100,7 +2100,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 Datum
 pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 {
-#define PG_STAT_GET_REPLICATION_SLOT_COLS 10
+#define PG_STAT_GET_REPLICATION_SLOT_COLS 11
 	text	   *slotname_text = PG_GETARG_TEXT_P(0);
 	NameData	slotname;
 	TupleDesc	tupdesc;
@@ -2125,11 +2125,13 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stream_bytes",
 					   INT8OID, -1, 0);
-	TupleDescInitEntry(tupdesc, (AttrNumber) 8, "total_txns",
+	TupleDescInitEntry(tupdesc, (AttrNumber) 8, "mem_exceeded_count",
 					   INT8OID, -1, 0);
-	TupleDescInitEntry(tupdesc, (AttrNumber) 9, "total_bytes",
+	TupleDescInitEntry(tupdesc, (AttrNumber) 9, "total_txns",
 					   INT8OID, -1, 0);
-	TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+	TupleDescInitEntry(tupdesc, (AttrNumber) 10, "total_bytes",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 11, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 	BlessTupleDesc(tupdesc);
```

Would it be better to add the new field in the last position?

Say a client does "select * from pg_stat_get_replication_slot()"; it would then just get an extra column instead of mis-ordered columns.

3.
```
+       <para>
+        Number of times the memory used by logical decoding has exceeded
+        <literal>logical_decoding_work_mem</literal>.
+       </para>
```

Feels like “has” is not needed.

Maybe the wording can be simplified as:

Number of times logical decoding exceeded <literal>logical_decoding_work_mem</literal>.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

#14 Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Chao Li (#13)
Re: Add memory_limit_hits to pg_stat_replication_slots

Hi Evan,

On Fri, Sep 26, 2025 at 02:34:58PM +0800, Chao Li wrote:

Hi Bertrand,

Thanks for the patch. The patch overall looks good to me. Just a few small comments:

Thanks for looking at it!

On Sep 25, 2025, at 18:17, Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:

<v5-0001-Add-mem_exceeded_count-to-pg_stat_replication_slo.patch>

1.
```
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -690,6 +690,9 @@ struct ReorderBuffer
int64		streamCount;	/* streaming invocation counter */
int64		streamBytes;	/* amount of data decoded */
+	/* Number of times logical_decoding_work_mem has been reached */
+	int64		memExceededCount;
```

The other metrics have a "Statistics about xxx" comment above the group, plus a line comment after every metric. Maybe use the same style, so that it would be easy to add new metrics in the future.

I'm not sure: for the moment this is the only stat related to
logical_decoding_work_mem / memory usage. If we add other stats in this area
later, we could add a comment "section" as you suggest.

2.
```
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2100,7 +2100,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
Datum
pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
{
-#define PG_STAT_GET_REPLICATION_SLOT_COLS 10
+#define PG_STAT_GET_REPLICATION_SLOT_COLS 11
text	   *slotname_text = PG_GETARG_TEXT_P(0);
NameData	slotname;
TupleDesc	tupdesc;
@@ -2125,11 +2125,13 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
INT8OID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stream_bytes",
INT8OID, -1, 0);
-	TupleDescInitEntry(tupdesc, (AttrNumber) 8, "total_txns",
+	TupleDescInitEntry(tupdesc, (AttrNumber) 8, "mem_exceeded_count",
INT8OID, -1, 0);
-	TupleDescInitEntry(tupdesc, (AttrNumber) 9, "total_bytes",
+	TupleDescInitEntry(tupdesc, (AttrNumber) 9, "total_txns",
INT8OID, -1, 0);
-	TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+	TupleDescInitEntry(tupdesc, (AttrNumber) 10, "total_bytes",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 11, "stats_reset",
TIMESTAMPTZOID, -1, 0);
BlessTupleDesc(tupdesc);
```

Is it better to add the new field in the last place?

That way, if a client does “select * from pg_stat_get_replication_slot()”, it just gets an extra column instead of mis-ordered columns.

I think it's good to have the function fields "ordering" matching the view
fields ordering. FWIW, the ordering has been discussed in [1].
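
For context, the view keeps the same ordering because it is defined directly
over the function's output record. A simplified sketch of what the
system_views.sql definition looks like with the patch applied (column
placement as in the patch; the actual definition may differ in details):

```
CREATE VIEW pg_stat_replication_slots AS
    SELECT s.slot_name,
           s.spill_txns,
           s.spill_count,
           s.spill_bytes,
           s.stream_txns,
           s.stream_count,
           s.stream_bytes,
           s.mem_exceeded_count,
           s.total_txns,
           s.total_bytes,
           s.stats_reset
    FROM pg_replication_slots AS r,
         LATERAL pg_stat_get_replication_slot(r.slot_name) AS s
    WHERE r.datoid IS NOT NULL;
```

Since the view selects columns by name, appending the function column at the
end would work too; keeping the two orderings aligned mainly helps readers of
the catalog definition.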

3.
```
+       <para>
+        Number of times the memory used by logical decoding has exceeded
+        <literal>logical_decoding_work_mem</literal>.
+       </para>
```

Feels like “has” is not needed.

It's already done that way in other parts of the documentation:

$ git grep "has exceeded" "*sgml"
doc/src/sgml/maintenance.sgml: vacuum has exceeded the defined insert threshold, which is defined as:
doc/src/sgml/monitoring.sgml: logical decoding to decode changes from WAL has exceeded
doc/src/sgml/monitoring.sgml: from WAL for this slot has exceeded
doc/src/sgml/monitoring.sgml: Number of times the memory used by logical decoding has exceeded
doc/src/sgml/ref/create_subscription.sgml: retention duration has exceeded the

So that looks ok to me (I'm not a native English speaker though).

[1]: /messages/by-id/CAJpy0uBskXMq65rvWm8a-KR7cSb_sZH9CPRCnWAQrTOF5fciGw@mail.gmail.com

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#15Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#12)
Re: Add memory_limit_hits to pg_stat_replication_slots

On Thu, Sep 25, 2025 at 10:26 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Hi,

On Thu, Sep 25, 2025 at 12:14:04PM -0700, Masahiko Sawada wrote:

On Thu, Sep 25, 2025 at 3:17 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

That probably means that mem_exceeded_count would need to be increased.

What do you think?

Right. But one might argue that if we increment mem_exceeded_count
only when serializing or streaming is actually performed,
mem_exceeded_count would be 0 in the first example and therefore users
would be able to simply check mem_exceeded_count without any
computation.

Right but we'd not be able to see when the memory limit has been reached for all
the cases (that would hide the aborted transactions case). I think that with
the current approach we have the best of both worlds (even if it requires some
computations).

Agreed. It would be better to show a raw statistic so that users can
use the number as they want.

I've made a small comment change and added the commit message to the
v5 patch. I'm going to push the attached patch barring any objection
or review comments.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v6-0001-Add-mem_exceeded_count-column-to-pg_stat_replicat.patch (+84 -48)
#16Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Masahiko Sawada (#15)
Re: Add memory_limit_hits to pg_stat_replication_slots

Hi,

On Thu, Oct 02, 2025 at 04:39:40PM -0700, Masahiko Sawada wrote:

On Thu, Sep 25, 2025 at 10:26 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Agreed. It would be better to show a raw statistic so that users can
use the number as they want.

I've made a small comment change

Thanks!

Comparing v5 and v6:

-       /* Number of times logical_decoding_work_mem has been reached */
+       /* Number of times the logical_decoding_work_mem limit has been reached */

LGTM.

and added the commit message to the
v5 patch. I'm going to push the attached patch barring any objection
or review comments.

The commit message LGTM.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#17Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Masahiko Sawada (#15)
Re: Add memory_limit_hits to pg_stat_replication_slots

On Fri, Oct 3, 2025 at 5:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Sep 25, 2025 at 10:26 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Hi,

On Thu, Sep 25, 2025 at 12:14:04PM -0700, Masahiko Sawada wrote:

On Thu, Sep 25, 2025 at 3:17 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

That probably means that mem_exceeded_count would need to be increased.

What do you think?

Right. But one might argue that if we increment mem_exceeded_count
only when serializing or streaming is actually performed,
mem_exceeded_count would be 0 in the first example and therefore users
would be able to simply check mem_exceeded_count without any
computation.

Right but we'd not be able to see when the memory limit has been reached for all
the cases (that would hide the aborted transactions case). I think that with
the current approach we have the best of both worlds (even if it requires some
computations).

Agreed. It would be better to show a raw statistic so that users can
use the number as they want.

I've made a small comment change and added the commit message to the
v5 patch. I'm going to push the attached patch barring any objection
or review comments.

+	bool		memory_limit_reached = (rb->size >= logical_decoding_work_mem * (Size) 1024);
+
+	if (memory_limit_reached)
+		rb->memExceededCount += 1;

If the memory limit is hit but no transaction was serialized, the
stats won't be updated since UpdateDecodingStats() won't be called. We
need to call UpdateDecodingStats() in ReorderBufferCheckMemoryLimit()
if no transaction was streamed or spilled.

-SELECT slot_name, spill_txns, spill_count FROM pg_stat_replication_slots WHERE slot_name = 'regression_slot_stats4_twophase';
-            slot_name            | spill_txns | spill_count
----------------------------------+------------+-------------
- regression_slot_stats4_twophase |          0 |           0
+SELECT slot_name, spill_txns, spill_count, mem_exceeded_count FROM pg_stat_replication_slots WHERE slot_name = 'regression_slot_stats4_twophase';
+            slot_name            | spill_txns | spill_count | mem_exceeded_count
+---------------------------------+------------+-------------+--------------------
+ regression_slot_stats4_twophase |          0 |           0 |                  1
 (1 row)

Are we sure that mem_exceeded_count will always be 1 in this case? Can
it be 2 or more because of background activity?

--
Best Wishes,
Ashutosh Bapat

#18Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Ashutosh Bapat (#17)
Re: Add memory_limit_hits to pg_stat_replication_slots

Hi,

On Fri, Oct 03, 2025 at 05:19:42PM +0530, Ashutosh Bapat wrote:

+	bool		memory_limit_reached = (rb->size >= logical_decoding_work_mem * (Size) 1024);
+
+	if (memory_limit_reached)
+		rb->memExceededCount += 1;

Thanks for looking at it!

If the memory limit is hit but no transaction was serialized, the
stats won't be updated since UpdateDecodingStats() won't be called. We
need to call UpdateDecodingStats() in ReorderBufferCheckMemoryLimit()
if no transaction was streamed or spilled.

I did some testing and the stats are reported because UpdateDecodingStats() is
also called in DecodeCommit(), DecodeAbort() and DecodePrepare() (in addition
to ReorderBufferSerializeTXN() and ReorderBufferStreamTXN()). That's also why,
for example, total_txns is reported even if no transaction was streamed or
spilled.

-SELECT slot_name, spill_txns, spill_count FROM pg_stat_replication_slots WHERE slot_name = 'regression_slot_stats4_twophase';
-            slot_name            | spill_txns | spill_count
----------------------------------+------------+-------------
- regression_slot_stats4_twophase |          0 |           0
+SELECT slot_name, spill_txns, spill_count, mem_exceeded_count FROM pg_stat_replication_slots WHERE slot_name = 'regression_slot_stats4_twophase';
+            slot_name            | spill_txns | spill_count | mem_exceeded_count
+---------------------------------+------------+-------------+--------------------
+ regression_slot_stats4_twophase |          0 |           0 |                  1
(1 row)

Are we sure that mem_exceeded_count will always be 1 in this case? Can
it be 2 or more because of background activity?

I think that the question could be the same for spill_txns and spill_count. It
seems to have been working fine (that way) since this test was added (in
072ee847ad4c), but I think that you raised a good point.

Sawada-San, what do you think about this particular test, is it safe to rely
on the exact values here?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#19Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Bertrand Drouvot (#18)
Re: Add memory_limit_hits to pg_stat_replication_slots

On Fri, Oct 3, 2025 at 6:45 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Hi,

On Fri, Oct 03, 2025 at 05:19:42PM +0530, Ashutosh Bapat wrote:

+	bool		memory_limit_reached = (rb->size >= logical_decoding_work_mem * (Size) 1024);
+
+	if (memory_limit_reached)
+		rb->memExceededCount += 1;

Thanks for looking at it!

If the memory limit is hit but no transaction was serialized, the
stats won't be updated since UpdateDecodingStats() won't be called. We
need to call UpdateDecodingStats() in ReorderBufferCheckMemoryLimit()
if no transaction was streamed or spilled.

I did some testing and the stats are reported because UpdateDecodingStats() is
also called in DecodeCommit(), DecodeAbort() and DecodePrepare() (in addition
to ReorderBufferSerializeTXN() and ReorderBufferStreamTXN()). That's also why
,for example, total_txns is reported even if no transaction was streamed or
spilled.

In a very pathological case, where all transactions happen to be
aborted while decoding and yet the memory limit is hit many times, nothing
will be reported until the first committed transaction is decoded,
which may never happen. I didn't find a call stack whereby
UpdateDecodingStats() could be reached from
ReorderBufferCheckAndTruncateAbortedTXN().

--
Best Wishes,
Ashutosh Bapat

#20Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#18)
Re: Add memory_limit_hits to pg_stat_replication_slots

On Fri, Oct 3, 2025 at 6:15 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Hi,

On Fri, Oct 03, 2025 at 05:19:42PM +0530, Ashutosh Bapat wrote:

+	bool		memory_limit_reached = (rb->size >= logical_decoding_work_mem * (Size) 1024);
+
+	if (memory_limit_reached)
+		rb->memExceededCount += 1;

Thanks for looking at it!

+1

-SELECT slot_name, spill_txns, spill_count FROM pg_stat_replication_slots WHERE slot_name = 'regression_slot_stats4_twophase';
-            slot_name            | spill_txns | spill_count
----------------------------------+------------+-------------
- regression_slot_stats4_twophase |          0 |           0
+SELECT slot_name, spill_txns, spill_count, mem_exceeded_count FROM pg_stat_replication_slots WHERE slot_name = 'regression_slot_stats4_twophase';
+            slot_name            | spill_txns | spill_count | mem_exceeded_count
+---------------------------------+------------+-------------+--------------------
+ regression_slot_stats4_twophase |          0 |           0 |                  1
(1 row)

Are we sure that mem_exceeded_count will always be 1 in this case? Can
it be 2 or more because of background activity?

I think that the question could be the same for spill_txns and spill_count. It
seems to have been working fine (that way) since this test exists (added in
072ee847ad4c) but I think that you raised a good point.

Sawada-San, what do you think about this particular test, is it safe to rely
on the exact values here?

In short, I'm fine with the change proposed by Ashutosh. I believe that
in this case it's safe to rely on the exact values in principle, since
once we reach the memory limit we truncate all changes in the
transaction and skip further changes. This test with the patch would fail
only if there were enough other activity to reach the memory limit while
those transactions are aborted, which is unlikely to happen, I guess.
That being said, there is no downside to checking
'mem_exceeded_count > 0' instead of the exact number, and it
seems more stable against future changes.
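
A sketch of what that more stable check could look like in the regression
test (the surrounding test file layout is assumed; only the projected
expression changes):

```
-- Check only that the limit was reached, not how many times, so that
-- unrelated background activity cannot change the expected output.
SELECT slot_name, spill_txns, spill_count, mem_exceeded_count > 0 AS mem_exceeded
FROM pg_stat_replication_slots
WHERE slot_name = 'regression_slot_stats4_twophase';
```

The expected output would then show "t" regardless of whether the limit was
hit once or several times.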

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#21Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Ashutosh Bapat (#19)
#22Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Masahiko Sawada (#21)
#23Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Ashutosh Bapat (#22)
#24Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#23)
#25Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Masahiko Sawada (#24)
#26Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#25)
#27Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Masahiko Sawada (#26)
#28Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#27)