pgbackrest unable to perform backup after a network outage [always fails]
list,
After a network link outage, pgBackRest archiving from my DB server to a
remote repo server was down for a few days. Once the link was re-established,
my diff and full backups always fail: the backup starts and then fails with
the error "unable to archive before 600000ms timeout".
I have copied the already existing archive to a safe location (another
folder) on the repo server, then stopped the stanza from the repo server and
ran a stanza-delete --force there.
Then I recreated the stanza with the same stanza name and ran the info /
check commands, but they also fail with the 60000ms timeout.
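For reference, the sequence I ran on the repo server was roughly the following
(a sketch from memory, so the exact invocations may differ slightly; I believe
a start is also needed between the delete and the recreate, since other
commands are blocked while the stanza is stopped):
$ sudo -u pgbackrest pgbackrest --stanza=TM_Repo stop
$ sudo -u pgbackrest pgbackrest --stanza=TM_Repo stanza-delete --force
$ sudo -u pgbackrest pgbackrest --stanza=TM_Repo start
$ sudo -u pgbackrest pgbackrest --stanza=TM_Repo stanza-create
$ sudo -u pgbackrest pgbackrest --stanza=TM_Repo info
$ sudo -u pgbackrest pgbackrest --stanza=TM_Repo check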
When I check the archive-push-async log on the DB server, it says:
[root@db1 ~]# tail -f /var/log/pgbackrest/TM_Repo-archive-push-async.log
2026-02-24 12:29:37.826 P00 WARN: local-2 process terminated unexpectedly
on signal 11
2026-02-24 12:29:37.827 P00 WARN: unable to wait on child process: [10]
No child processes
2026-02-24 12:29:37.827 P00 WARN: unable to wait on child process: [10]
No child processes
2026-02-24 12:29:37.827 P00 WARN: local-4 process terminated unexpectedly
on signal 6
2026-02-24 12:29:37.827 P00 WARN: local-5 process terminated unexpectedly
on signal 11
2026-02-24 12:29:37.827 P00 WARN: local-6 process terminated unexpectedly
on signal 11
-------------------PROCESS START-------------------
2026-02-24 12:43:59.302 P00 INFO: archive-push:async command begin
2.52.1: [/data/postgres/data/pg_wal] --archive-async --compress-type=zst
--exec-id=2537881-b2a35ac0 --log-level-console=off --log-level-stderr=off
--pg1-path= /data/postgres/data --pg-version-force=16 --process-max=6
--repo1-host=10.25.0.202 --repo1-host-user=pgbackrest
--spool-path=/var/spool/pgbackrest --stanza=TM_Repo
2026-02-24 12:43:59.325 P00 INFO: push 10141 WAL file(s) to archive:
0000000100000BD9000000F9...0000000100000C0100000097
This has been running for hours now and is not yet finished. Is this normal
behaviour? (My bandwidth between the DB server and the repo server is only 20 Mbps.)
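(A rough back-of-the-envelope estimate, assuming the default 16 MB WAL segment
size: 10141 segments x 16 MB is roughly 160 GB, and 20 Mbps is about 2.5 MB/s,
so that is on the order of 160,000 MB / 2.5 MB/s ≈ 64,000 seconds, or about 18
hours of transfer before zstd compression is taken into account.)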
How can I avoid this copying of all the old piled-up WAL files to the repo
server? (It takes long hours, maybe a day or two, and in that time new
transactional WAL keeps being generated.)
My goal is to initiate a fresh full backup on the repo server, so the old
piled-up WAL files should not need to be pushed asynchronously to the repo at
all, right? [I know I am giving up the ability to recover across that gap by
doing this. Is there any other way?]
But even before a full backup, when I run the check command
$ sudo -u pgbackrest pgbackrest --stanza=TM_Repo --log-level-console=info check
it does not succeed; it always fails with the 60000 ms timeout error [82].
Any hints on how to solve this would be much appreciated.
Thank you,
Krishane
More info below:
[root@db1 data]# cat /etc/pgbackrest/pgbackrest.conf
[TM_Repo]
pg1-path=/data/postgres/data
pg1-port=5444
pg1-user=postgres
pg-version-force=16
pg1-database=postgres
[global]
repo1-host=10.25.0.202
repo1-host-user=pgbackrest
archive-async=y
spool-path=/var/spool/pgbackrest
log-level-console=info
#log-level-file=debug
log-level-stderr=info
delta=y
compress-type=zst
[global:archive-get]
process-max= 4
[global:archive-push]
process-max= 6
[root@db1 data]#
------------
pgBackRest 2.52.1
OS RHEL 9.4
Postgres 16
On Tue, Feb 24, 2026 at 5:18 AM KK CHN <kkchn.in@gmail.com> wrote:
This has been running for hours now and is not yet finished. Is this normal behaviour?
Yes, if there is a lot of WAL.
My goal is to initiate a fresh full backup on the repo server, so the old
piled-up WAL files should not need to be pushed at all
You will need to (carefully!) disable pgbackrest archiving, clean up the
old WAL, then start it up again. Basic sequence (rough commands are sketched
after the list):
1. Set archive_command to '/bin/true'
2. Kill any existing pgbackrest processes, empty out the spool directory
3. Wait for Postgres to clean up / recycle the WAL (speed this up with a manual
CHECKPOINT)
4. Restore your archive_command to the pgbackrest version
5. Run pgbackrest check to verify WALs are being archived again
6. Run a full backup
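Something like the following, as a rough, untested sketch (stanza name, port,
and spool path are taken from the config you posted; the psql invocations, the
pkill pattern, and the exact archive_command string to restore are assumptions
on my part, so adjust them to whatever you actually have, and double-check the
spool path before deleting anything):
# 1. on the db server, make archive_command a no-op and reload
$ sudo -u postgres psql -p 5444 -d postgres -c "ALTER SYSTEM SET archive_command = '/bin/true';"
$ sudo -u postgres psql -p 5444 -d postgres -c "SELECT pg_reload_conf();"
# 2. kill any stuck archive-push processes and empty the async spool
$ pkill -f 'pgbackrest.*archive-push' || true
$ rm -rf /var/spool/pgbackrest/*
# 3. force a checkpoint, then watch the old WAL get recycled
$ sudo -u postgres psql -p 5444 -d postgres -c "CHECKPOINT;"
$ sudo -u postgres psql -p 5444 -d postgres -c "SELECT last_archived_wal, last_archived_time FROM pg_stat_archiver;"
# 4. restore whatever archive_command you had before (typically the pgbackrest form below) and reload
$ sudo -u postgres psql -p 5444 -d postgres -c "ALTER SYSTEM SET archive_command = 'pgbackrest --stanza=TM_Repo archive-push %p';"
$ sudo -u postgres psql -p 5444 -d postgres -c "SELECT pg_reload_conf();"
# 5. verify WAL is being archived end to end again
$ sudo -u pgbackrest pgbackrest --stanza=TM_Repo check
# 6. take the fresh full backup (run it wherever you normally run backups; with a dedicated repo host that is usually the repo server)
$ sudo -u pgbackrest pgbackrest --stanza=TM_Repo --type=full backup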
Ideally, test these steps on a dev system first, and understand what each step
does and why the steps are in that order. :)
Cheers,
Greg
--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support
On Tue, Feb 24, 2026 at 9:20 PM Greg Sabino Mullane <htamfids@gmail.com>
wrote:
Thank you, Greg.