BUG #19029: Replication Slot size keeps increasing while logical subscription works fine
The following bug has been logged on the website:
Bug reference: 19029
Logged by: Thadeus Anand
Email address: thadeus@rmkv.com
PostgreSQL version: 17.6
Operating system: Ubuntu 24.04.2
Description:
We have seven servers distributed geographically. We publish about 62 tables
through logical replication, with six subscribers. We were hugely affected
by the memory allocation bug after upgrading to 17.5. We had to suspend the
replication and resort to manually updating the tables.
So we updated to 17.6 almost immediately. Now the logical replication works
perfectly, but we notice the replication slot folders increasing in size
quite dramatically, even when there is no lag found in the subscribers.
We also have a replica set up for streaming replication, and that works fine
and that replication slot folder size is normal. So the issue is only with
logical replication.
Our wal_keep_size is set to 0, and our max_slot_wal_keep_size is set to 20GB.
I'm convinced this must be a bug, as it started happening only after the
minor version upgrade which was done three days ago. But if you have any
other suggestion, please help.
Restarting the subscriber does not help. Dropping the subscription clears
the replication slot, but that isn't a solution.
Now we are dropping and recreating the subscriptions every night, hoping we
will get a patch/solution soon.
If you need any further data, please ask and I will provide.
On Sun, Aug 24, 2025 at 3:14 AM PG Bug reporting form
<noreply@postgresql.org> wrote:
> So we updated to 17.6 almost immediately. Now the logical replication works
> perfectly, but we notice the replication slot folders increasing in size
> quite dramatically, even when there is no lag found in the subscribers.
What do you mean by this? Do you mean that the restart_lsn of the logical
slots is not moving forward, or that the number of slots is growing in
$PGDATA/pg_replslot, or something else? Specifically, I want to know
what you mean by: 'but we notice the replication slot folders
increasing in size'?
--
With Regards,
Amit Kapila.
Hi,
The number of replication slots does not increase at all. But the size of
each folder keeps increasing simultaneously.
When I restart a subscriber, the size of the respective slot resets, then
immediately grows up to the size of the slots of other subscribers.
The table data is properly synchronized across the locations, and the
status of the replication remains as 'streaming'. So, effectively, the
logical replication works (which it wasn't in version 17.5), but the
replication slot isn't resetting.
Thadeus Anand.
On Mon, Aug 25, 2025 at 10:39 AM Thadeus Anand <thadeus@rmkv.com> wrote:
> Hi,
> The number of replication slots does not increase at all. But the size of each folder keeps increasing simultaneously.
Can you share an example of what you mean by the folder size
increasing? Is the slot size on disk increasing?
> When I restart a subscriber, the size of the respective slot resets, then immediately grows up to the size of the slots of other subscribers.
To understand the problem you are facing, we need to see some real
data on the size. Can you share the output of pg_replication_slots
both before and after the folder's size increase?
> The table data is properly synchronized across the locations, and the status of the replication remains as 'streaming'. So, effectively, the logical replication works (which it wasn't in version 17.5), but the replication slot isn't resetting.
Is it possible to share a reproducer or some steps to reproduce the problem?
--
With Regards,
Amit Kapila.
Hi,
> Can you share an example of what you mean by the folder size
> increasing? Is the slot size on disk increasing?
The size on the disk keeps increasing. Last week, it went up to 35 GB per
slot, then I had to put a 20GB limit on max_slot_wal_keep_size. After
reaching the size limit, the subscriptions went inactive. Then I dropped
the subscriptions and freed up the replication slots.
> To understand the problem you are facing, we need to see some real
> data on the size. Can you share the output of pg_replication_slots
> both before and after the folder's size increase?
I cannot share the real data right now because I have dropped the
subscriptions and publications altogether. I can create them again tonight
(I am in India) and share the details tomorrow.
> Is it possible to share a reproducer or some steps to reproduce the
> problem?
What I had was a straightforward logical replication setup. We had two
publications and six subscribing servers. Please also note that this setup
was working fine with PostgreSQL 15. During May we upgraded to 17.5, and
ran into a bug which is described in the following release notes section of
17.6:
> Avoid re-distributing cache invalidation messages from other
> transactions during logical replication (vignesh C)
> <https://postgr.es/c/45c357e0e>
> Our previous round of minor releases included a bug fix to ensure that
> replication receiver processes would respond to cross-process cache
> invalidation messages, preventing them from using stale catalog data while
> performing replication updates. However, the fix unintentionally made them
> also redistribute those messages again, leading to an exponential increase
> in the number of invalidation messages, which would often end in a memory
> allocation failure. Fix by not redistributing received messages.
Due to the bug, we stopped using logical replication and started it again
after updating to 17.6. Now we are having this issue.
Thadeus Anand.
Dear Thadeus,
> The size on the disk keeps increasing. Last week, it went up to 35 GB per slot,
> then I had to put a 20GB limit on max_slot_wal_keep_size. After reaching the size,
> the subscriptions went inactive. Then I dropped the subscriptions and freed up
> the replication slots.
Can you clarify which directory occupied the disk? Is it `pg_wal`?
You said "up to 35 GB per slot", and 6 replication slots exist on your system,
so did the disk usage increase by 210 GB in total?
> I cannot share the real data right now because I have dropped the subscriptions
> and publications altogether. I can create them again tonight (I am in India) and
> share the details tomorrow.
Thanks. If possible, can you share a script that emulates the system architecture and
settings? It would be very helpful for fully understanding the shape of the system and
its definitions.
> What I had was a straightforward logical replication setup. We had two publications
> and six subscribing servers. Please also note that this setup was working fine
> with PostgreSQL 15. During May we upgraded to 17.5, and ran into a bug which is
> described in the following release notes section of 17.6
So you ran both DDL and DML on the publisher side, right? Can you also
provide the SQL commands you run on the system? It would be very helpful if all the SQL
were written as executable scripts or something similar.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
Hi,
> Can you clarify which directory occupied the disk? Is it `pg_wal`?
> You said "up to 35 GB per slot", and 6 replication slots exist on your
> system, so did the disk usage increase by 210 GB in total?
Under the pg_replslot folder, each replication slot's folder increased
simultaneously. There were 12 folders (two publications, six subscribers),
and the total size increase was about 420 GB.
> Thanks. If possible, can you share a script that emulates the system
> architecture and settings? It would be very helpful for fully understanding
> the shape of the system and its definitions.
I will share as much data as I can.
> So you ran both DDL and DML on the publisher side, right? Can you also
> provide the SQL commands you run on the system? It would be very helpful
> if all the SQL were written as executable scripts or something similar.
This is a complex ERP setup and I may not be able to give you all the SQL
from the publisher side. All of these 62 tables are either master tables or
configuration tables, so only INSERTs, DELETEs and UPDATEs happen, nothing
else.
Thank you for your help. I sincerely hope this is a real issue and not a
stupid configuration error at my end effectively wasting your valuable time.
Thadeus Anand.
Dear Thadeus,
> Under the pg_replslot folder, each replication slot's folder increased
> simultaneously. There were 12 folders (two publications, six subscribers), and
> the total size increase was about 420 GB.
Oh, I misunderstood and thought that the number of WAL files had increased.
Let me ask some more questions to diagnose your system.
Can you share the value of logical_decoding_work_mem on the publisher side? You can obtain it via:
```
SHOW logical_decoding_work_mem;
```
Also, when you succeed in reproducing the issue, can you run the command below to see the
contents of the directory? ${DATA_PUB} should be adjusted for your environment.
```
du -sh ${DATA_PUB}/pg_replslot/*/*
```
This shows which files use so much disk space.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
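
As a complement to the `du -sh` command above, here is a minimal sketch (not part of the original message) that totals the size of each slot directory and lists the largest first. The function name `slot_dir_sizes` and the data directory argument are assumptions made for illustration.

```shell
#!/bin/sh
# Print the on-disk size (in KiB) of every slot directory under pg_replslot,
# largest first. The data directory path is an assumption; pass your own.
slot_dir_sizes() {
    pgdata=$1
    # du -sk reports one total per directory; sort numerically, descending.
    du -sk "$pgdata"/pg_replslot/*/ 2>/dev/null | sort -rn
}
```

It could be called as `slot_dir_sizes "${DATA_PUB}"` on the publisher, and re-run periodically to see which slot directories are growing.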
Hi,
The logical_decoding_work_mem at the publisher is currently set at 1 GB.
I remember setting this a while ago as part of my struggle to get rid of
the memory allocation issue.
Thadeus Anand.
On Mon, 25 Aug 2025 at 12:59, Thadeus Anand <thadeus@rmkv.com> wrote:
> What I had was a straightforward logical replication setup. We had two publications and six subscribing servers. Please also note that this setup was working fine with PostgreSQL 15. During May we upgraded to 17.5, and ran into a bug which is described in the following release notes section of 17.6
Have you tested this on PG16 or any other PG17 versions where it works
successfully? I’m trying to determine which specific version has this
issue.
Regards,
Vignesh
Hi,
In case my previous emails didn't make it clear, we started having this
issue after upgrading to 17.6 only.
Thadeus Anand.
On Mon, Aug 25, 2025 at 3:34 PM Thadeus Anand <thadeus@rmkv.com> wrote:
> The logical_decoding_work_mem at the publisher is currently set at 1 GB.
> I remember setting this a while ago as part of my struggle to get rid of the memory allocation issue.
Hmm, this means that your transactions contain a large number of
changes, which leads to the spilling of changes. This makes
Kuroda-San's other question more important, which is to show the size and
contents of pg_replslot. I have tried to check the fixes done in 17.5
and 17.6 but I don't see any obvious change which could lead to such a
problem. I could be missing something, which we can try to find with
more information and, ideally, a test if you can share one.
--
With Regards,
Amit Kapila.
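
To see how much each slot has actually spilled, one option (a sketch, not something proposed in the thread) is the `pg_stat_replication_slots` view, available in PostgreSQL 14 and later, which exposes cumulative `spill_txns`, `spill_count` and `spill_bytes` counters per logical slot. The helper name `spill_stats_sql` is hypothetical; it only prints the query so that it can be passed to `psql`.

```shell
#!/bin/sh
# Print a query that lists, per logical slot, how many transactions were
# spilled to disk and how many bytes were written, largest first.
# The counters are cumulative since the last statistics reset.
spill_stats_sql() {
    cat <<'SQL'
SELECT slot_name, spill_txns, spill_count, spill_bytes
FROM pg_stat_replication_slots
ORDER BY spill_bytes DESC;
SQL
}
```

On the publisher this might be run as `psql -X -d postgres -c "$(spill_stats_sql)"`, where the database name and connection settings are assumptions about the local environment.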
Hi,
In the meanwhile, I'm sharing my configuration file here, just in case you
need to know any particular setting.
Again, thank you very much for taking the time to help.
Thadeus Anand.
Hi,
We created the publications and subscriptions again last night after
restarting the publisher and all the subscribers.
So far, everything seems to be working fine, and the replication slot
folders remain at 4 KB each.
I did not change anything, and didn't do anything differently this time. In
any case, I will monitor this closely and let you know if any
malfunction occurs.
Thanks and regards,
Thadeus Anand.
On Tue, Aug 26, 2025 at 1:45 PM Thadeus Anand <thadeus@rmkv.com> wrote:
> We created the publications and subscriptions again last night after restarting the publisher and all the subscribers.
> So far, everything seems to be working fine, and the replication slot folders remain at 4 KB each.
> I did not change anything, and didn't do anything differently this time. In any case, I will monitor this closely and let you know if any malfunction occurs.
Thanks for the update.
--
With Regards,
Amit Kapila.
Hi,
Since this afternoon, the replication slot folders have started to grow in
size again. I have attached a couple of screenshots showing the size and
some contents of a folder. Hope you find them useful.
I checked the database log at the publisher, and could find nothing
suspicious around that time when the spill files started appearing.
Thadeus Anand.
Hi,
Here is the result of "select * from pg_replication_slots"
slot_name         | plugin   | slot_type | datoid | database | temporary | active | active_pid | xmin   | catalog_xmin | restart_lsn  | confirmed_flush_lsn | wal_status | safe_wal_size  | two_phase | inactive_since | conflicting | invalidation_reason | failover | synced
------------------+----------+-----------+--------+----------+-----------+--------+------------+--------+--------------+--------------+---------------------+------------+----------------+-----------+----------------+-------------+---------------------+----------+-------
rmkv_crm_sub_twn  | pgoutput | logical   | 5      | postgres | FALSE     | TRUE   | 240,409    | [NULL] | 10861325     | 2E9/296801D0 | 2EA/834D54B8        | reserved   | 15,681,629,000 | FALSE     | [NULL]         | FALSE       | [NULL]              | FALSE    | FALSE
rmkv_main_sub_twn | pgoutput | logical   | 5      | postgres | FALSE     | TRUE   | 240,410    | [NULL] | 10861325     | 2E9/296801D0 | 2EA/834D54B8        | reserved   | 15,681,629,000 | FALSE     | [NULL]         | FALSE       | [NULL]              | FALSE    | FALSE
standby_slot      | [NULL]   | physical  | [NULL] | [NULL]   | FALSE     | TRUE   | 240,458    | [NULL] | [NULL]       | 2EA/834D54B8 | [NULL]              | reserved   | 21,486,545,736 | FALSE     | [NULL]         | [NULL]      | [NULL]              | FALSE    | FALSE
rmkv_crm_sub_cbe  | pgoutput | logical   | 5      | postgres | FALSE     | TRUE   | 240,404    | [NULL] | 10861325     | 2E9/296801D0 | 2EA/834D54B8        | reserved   | 15,681,629,000 | FALSE     | [NULL]         | FALSE       | [NULL]              | FALSE    | FALSE
rmkv_main_sub_cbe | pgoutput | logical   | 5      | postgres | FALSE     | TRUE   | 240,406    | [NULL] | 10861325     | 2E9/296801D0 | 2EA/834D54B8        | reserved   | 15,681,629,000 | FALSE     | [NULL]         | FALSE       | [NULL]              | FALSE    | FALSE
rmkv_crm_sub_tvl  | pgoutput | logical   | 5      | postgres | FALSE     | TRUE   | 240,408    | [NULL] | 10861325     | 2E9/296801D0 | 2EA/834D54B8        | reserved   | 15,681,629,000 | FALSE     | [NULL]         | FALSE       | [NULL]              | FALSE    | FALSE
rmkv_main_sub_tvl | pgoutput | logical   | 5      | postgres | FALSE     | TRUE   | 240,411    | [NULL] | 10861325     | 2E9/296801D0 | 2EA/834D54B8        | reserved   | 15,681,629,000 | FALSE     | [NULL]         | FALSE       | [NULL]              | FALSE    | FALSE
rmkv_crm_sub_blr  | pgoutput | logical   | 5      | postgres | FALSE     | TRUE   | 240,405    | [NULL] | 10861325     | 2E9/296801D0 | 2EA/834D54B8        | reserved   | 15,681,629,000 | FALSE     | [NULL]         | FALSE       | [NULL]              | FALSE    | FALSE
rmkv_main_sub_blr | pgoutput | logical   | 5      | postgres | FALSE     | TRUE   | 240,407    | [NULL] | 10861325     | 2E9/296801D0 | 2EA/834D54B8        | reserved   | 15,681,629,000 | FALSE     | [NULL]         | FALSE       | [NULL]              | FALSE    | FALSE
rmkv_crm_sub_vch  | pgoutput | logical   | 5      | postgres | FALSE     | TRUE   | 240,403    | [NULL] | 10861325     | 2E9/296801D0 | 2EA/834D54B8        | reserved   | 15,681,629,000 | FALSE     | [NULL]         | FALSE       | [NULL]              | FALSE    | FALSE
rmkv_main_sub_vch | pgoutput | logical   | 5      | postgres | FALSE     | TRUE   | 240,401    | [NULL] | 10861325     | 2E9/296801D0 | 2EA/834D54B8        | reserved   | 15,681,629,000 | FALSE     | [NULL]         | FALSE       | [NULL]              | FALSE    | FALSE
rmkv_crm_sub_vdp  | pgoutput | logical   | 5      | postgres | FALSE     | TRUE   | 240,400    | [NULL] | 10861325     | 2E9/296801D0 | 2EA/834D54B8        | reserved   | 15,681,629,000 | FALSE     | [NULL]         | FALSE       | [NULL]              | FALSE    | FALSE
rmkv_main_sub_vdp | pgoutput | logical   | 5      | postgres | FALSE     | TRUE   | 419,107    | [NULL] | 10861325     | 2E9/296801D0 | 2EA/1E788780        | reserved   | 15,681,629,000 | FALSE     | [NULL]         | FALSE       | [NULL]              | FALSE    | FALSE
Thadeus Anand.
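
The slot listing above shows `restart_lsn` and `confirmed_flush_lsn` as `HI/LO` hexadecimal pairs. As a rough aid for reading such output (a sketch, not part of the original message), the distance between two LSNs in bytes can be computed offline as below. On a live server the built-in `pg_wal_lsn_diff()` function does the same arithmetic; the helper names `lsn_to_bytes` and `lsn_diff` are made up for this illustration.

```shell
#!/bin/sh
# Convert an LSN written as HI/LO (two hexadecimal numbers) to a byte position.
lsn_to_bytes() {
    hi=${1%%/*}   # part before the slash: upper 32 bits
    lo=${1##*/}   # part after the slash: lower 32 bits
    echo $(( (0x$hi << 32) + 0x$lo ))
}

# Byte distance from the LSN in $1 to the LSN in $2.
lsn_diff() {
    echo $(( $(lsn_to_bytes "$2") - $(lsn_to_bytes "$1") ))
}

# WAL bytes between restart_lsn and confirmed_flush_lsn of the logical slots.
lsn_diff 2E9/296801D0 2EA/834D54B8
```

A large gap here means a slot has to keep a lot of WAL behind its confirmed position, which is what an old, still-open transaction would be expected to cause.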
Hi Thadeus Anand,
Recently, on our PostgreSQL 16.4 system, we encountered a similar case of
the pg_replslot folder growing abnormally. The files in this folder are
just WAL spill-over files caused by long-running or complex transactions.
There are commands to check the number of transactions a slot is holding
in relation to this spill; I can share them if required. So I suggest
monitoring the transactions and subtransactions that are executing. In our
case, one earlier transaction triggered the issue.
I am not sure whether it is a bug that has to be addressed in 17.5.
Hi Nantha,
Thank you for your valuable input. So you mean it's really not a bug, but
existing behavior.
I do not know or understand what a "spill" is. I will look it up. But the
tables that are part of the publication are not updated as part of any huge
transaction. They may be part of some other long-running procedures, though.
If that can create such spills, we can look into them.
Can someone please advise on how to avoid such spills?
Thanks and regards,
Thadeus Anand.
Dear Thadeus,
Thanks for sharing the info. From your screenshot, I can see that there is a
transaction, 10861356, which modifies many tuples. Logical decoding has a mechanism
to spill part of the changes to disk to avoid using a large amount of memory, and
it seems to be in use here.
I think that as the next step we can clarify which process started the transaction. Can
you run the query below when you reproduce the issue?
```
SELECT * FROM pg_stat_activity WHERE backend_xid = '${XID}';
```
Files under pg_replslot/${slot_name} are named in the format `xid-${XID}-lsn-${LSN_UPPER}-${LSN_LOWER}.spill`,
so ${XID} can be filled in when it happens. In your attached case, it is 10861356.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
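
Building on the file-name format described above, the following is a minimal sketch (an illustration, not a tool from the thread) that adds up spill-file sizes per transaction ID inside one slot directory, so the XID to look up in `pg_stat_activity` stands out. The function name `spill_bytes_by_xid` is hypothetical, and the script assumes every spill file follows the `xid-<XID>-lsn-<HI>-<LO>.spill` pattern.

```shell
#!/bin/sh
# Sum the sizes of spill files per transaction ID in one slot directory.
# Output: one "XID BYTES" line per transaction, largest total first.
spill_bytes_by_xid() {
    slot_dir=$1
    for f in "$slot_dir"/xid-*-lsn-*.spill; do
        [ -e "$f" ] || continue      # the glob matched nothing
        name=${f##*/}                # strip the directory part
        xid=${name#xid-}             # drop the leading "xid-"
        xid=${xid%%-*}               # keep the digits up to the next "-"
        size=$(wc -c < "$f")         # file size in bytes
        echo "$xid $size"
    done |
        awk '{ total[$1] += $2 }
             END { for (x in total) printf "%s %.0f\n", x, total[x] }' |
        sort -k2,2nr
}
```

It could be invoked as `spill_bytes_by_xid "${DATA_PUB}/pg_replslot/${slot_name}"`; the XID on the first output line is the one to substitute for `${XID}` in the query above.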