BUG #16327: Too many files in pg_replslot folder

Started by PG Bug reporting formabout 6 years ago3 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 16327
Logged by: Dmitry Romanenko
Email address: romdn@inbox.ru
PostgreSQL version: 12.2
Operating system: Windows
Description:

Hello!
4.5 million rows are inserted from the temporary table. For each of the
three subscriptions, 712,000 xid-*-lsn-*-*.spill files of 3 GB are
generated. The following lines appear in the work log: ERROR: replication
slot is busy processing with PID. Replication stops working.
What to do?

PostgreSQL 12.2, compiled by Visual C++ build 1914, 64-bit
postgresql.auto.conf
max_connections = '30'
shared_buffers = '512MB'
effective_cache_size = '6GB'
maintenance_work_mem = '512MB'
checkpoint_completion_target = '0.9'
wal_buffers = '16MB'
default_statistics_target = '100'
work_mem = '21845kB'
min_wal_size = '2GB'
max_wal_size = '8GB'
max_worker_processes = '4'
max_parallel_workers_per_gather = '2'
max_parallel_workers = '4'
max_parallel_maintenance_workers = '2'
random_page_cost = '2'

P.S. Replication on PostgreSql 11.1 (with default settings) worked fine with
inserting 4.5 million rows. The xid - * - lsn - * - *. Snap files were
smaller and larger in size. But if there were several inserts in a row of
4.5 million lines, an error appeared: ERROR: out of memory.

Best regards, Dmitry

#2Euler Taveira
euler.taveira@2ndquadrant.com
In reply to: PG Bug reporting form (#1)
Re: BUG #16327: Too many files in pg_replslot folder

On Sun, 29 Mar 2020 at 11:49, PG Bug reporting form <noreply@postgresql.org>
wrote:

P.S. Replication on PostgreSql 11.1 (with default settings) worked fine
with
inserting 4.5 million rows. The xid - * - lsn - * - *. Snap files were
smaller and larger in size. But if there were several inserts in a row of
4.5 million lines, an error appeared: ERROR: out of memory.

Could you provide a test case? Are you using subtransactions?

logical_decoding_work_mem [1]https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=cec2edfa7859279f36d2374770ca920c59c73dd8 was added in v13 to ameliorate some high
memory usage while decoding transactions. It is weird that it does not 'out
of memory' in v11 too (v11 introduced Generational memory allocator to
reduce logical decoding memory usage); I expect a high memory usage in v10
in a big transaction scenario.

If you are looking for a workaround, you should try to disable all
subscriptions and then enable only one at a time.

[1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=cec2edfa7859279f36d2374770ca920c59c73dd8
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=cec2edfa7859279f36d2374770ca920c59c73dd8

Regards,

--
Euler Taveira http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#3Dmitry
romdn@inbox.ru
In reply to: Euler Taveira (#2)
Re: BUG #16327: Too many files in pg_replslot folder

29.03.2020 22:52, Euler Taveira пишет:

On Sun, 29 Mar 2020 at 11:49, PG Bug reporting form
<noreply@postgresql.org <mailto:noreply@postgresql.org>> wrote:

P.S. Replication on PostgreSql 11.1 (with default settings) worked
fine with
inserting 4.5 million rows. The xid - * - lsn - * - *. Snap files were
smaller and larger in size. But if there were several inserts in a
row of
4.5 million lines, an error appeared: ERROR: out of memory.

Could you provide a test case? Are you using subtransactions?
logical_decoding_work_mem [1] was added in v13 to ameliorate some high
memory usage while decoding transactions. It is weird that it does not
'out of memory' in v11 too (v11 introduced Generational memory
allocator to reduce logical decoding memory usage); I expect a high
memory usage in v10 in a big transaction scenario.

If you are looking for a workaround, you should try to disable all
subscriptions and then enable only one at a time.

[1]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=cec2edfa7859279f36d2374770ca920c59c73dd8

Regards,

--
Euler Taveira http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Thanks for the answer!

Could you provide a test case?

I will try to do a test in the coming days.

Are you using subtransactions?

No. Functions are called sequentially from the windws scheduler. A large
file is loaded into a temporary table. In the cursor, entries are
processed line by line and placed in another temporary table. From which
they are already transferred to the main table.

If you are looking for a workaround, you should try to disable all

subscriptions and then enable only one at a time.
Good, but it will be inconvenient for the administrator. Or back to v11.1.

Today noticed in the repetition process.
Select from pg_stat_replication after a while gives:
pid  application_name client_port backend_start       state    
sent_lsn   write_lsn write_lag     replay_lag    reply_time
4916 sub_255590       2020-03-30 20:38:20 streaming 5/3344C598
5/3344A7D0 00:00:00.0058 00:00:00.0058 2020-03-30 21:42:50
4724 sub_3            558282020-03-30 20:38:34 streaming 5/3344C598
5/3344A7D0 00:00:00.0015 00:00:00.0015 2020-03-30 21:42:50
928  sub_4            59003       2020-03-30 20:38:56 streaming
5/3344C598 5/334442F0 2020-03-30 21:42:49
3328 sub_2            61992       2020-03-30 21:47:44 startup

And on server 2:
ERROR: failed to start broadcast WAL: ERROR: replication slot "sub_2" is
busy processing with PID 4916

And even later:
pid  application_name client_port backend_start       state    
sent_lsn   write_lsn write_lag     replay_lag    reply_time
4536 sub_250299       2020-03-30 22:33:33 startup
528  sub_3            53671      2020-03-30 22:31:28 startup
364  sub_4            51630 2020-03-30 22:28:29 startup

Best regards, Dmitry