pg_basebackup && long time storage of wal_archive/ content

Started by Matthias Apitzalmost 6 years ago5 messagesgeneral
Jump to latest
#1Matthias Apitz
guru@unixarea.de

Hello,

The run (as user 'postgres') on the server of the cmd:

pg_basebackup -U ${DBSUSER} -Ft -z -D /data/postgresql11/backup-wal_archive-20200414-1/

produces correctly the following situtation on the server:

srap32dxr1:/data/postgresql11 # ls -ltrh backup-20200414-1
total 1.7G
-rw------- 1 postgres postgres 1.7G Apr 14 16:20 base.tar.gz
-rw------- 1 postgres postgres 17K Apr 14 16:20 pg_wal.tar.gz
srap32dxr1:/data/postgresql11 # ls -ltrh data/pg_wal
total 81M
-rw------- 1 postgres postgres 16M Apr 14 15:11 000000010000000C0000009F
-rw------- 1 postgres postgres 16M Apr 14 15:20 000000010000000C000000A0
-rw------- 1 postgres postgres 16M Apr 14 16:10 000000010000000C000000A1
-rw------- 1 postgres postgres 16M Apr 14 16:20 000000010000000C000000A2
-rw------- 1 postgres postgres 342 Apr 14 16:20 000000010000000C0000009D.00000028.backup
drwx------ 2 postgres postgres 36K Apr 14 16:25 archive_status
-rw------- 1 postgres postgres 16M Apr 14 16:50 000000010000000C0000009E
srap32dxr1:/data/postgresql11 # ls -ltrh wal_archive/
total 49M
-rw------- 1 postgres postgres 16M Apr 14 15:20 000000010000000C0000009B
-rw------- 1 postgres postgres 16M Apr 14 16:10 000000010000000C0000009C
-rw------- 1 postgres postgres 16M Apr 14 16:20 000000010000000C0000009D
-rw------- 1 postgres postgres 342 Apr 14 16:20 000000010000000C0000009D.00000028.backup

The content of the dir wal_archive/ is growing and growing over the
time.

The pg_wat.tar.gz contains:

srap32dxr1:/data/postgresql11 # tar tzf backup-20200414-1/pg_wal.tar.gz
000000010000000C0000009D

i..e. the files 000000010000000C0000009B ... 000000010000000C0000009D
could be removed in wal_archive/

Correct? Or how is the long time storage of this dir?

matthias

--
Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub

#2Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Matthias Apitz (#1)
Re: pg_basebackup && long time storage of wal_archive/ content

On 4/14/20 8:00 AM, Matthias Apitz wrote:

Hello,

The run (as user 'postgres') on the server of the cmd:

pg_basebackup -U ${DBSUSER} -Ft -z -D /data/postgresql11/backup-wal_archive-20200414-1/

produces correctly the following situtation on the server:

srap32dxr1:/data/postgresql11 # ls -ltrh backup-20200414-1
total 1.7G
-rw------- 1 postgres postgres 1.7G Apr 14 16:20 base.tar.gz
-rw------- 1 postgres postgres 17K Apr 14 16:20 pg_wal.tar.gz
srap32dxr1:/data/postgresql11 # ls -ltrh data/pg_wal
total 81M
-rw------- 1 postgres postgres 16M Apr 14 15:11 000000010000000C0000009F
-rw------- 1 postgres postgres 16M Apr 14 15:20 000000010000000C000000A0
-rw------- 1 postgres postgres 16M Apr 14 16:10 000000010000000C000000A1
-rw------- 1 postgres postgres 16M Apr 14 16:20 000000010000000C000000A2
-rw------- 1 postgres postgres 342 Apr 14 16:20 000000010000000C0000009D.00000028.backup
drwx------ 2 postgres postgres 36K Apr 14 16:25 archive_status
-rw------- 1 postgres postgres 16M Apr 14 16:50 000000010000000C0000009E
srap32dxr1:/data/postgresql11 # ls -ltrh wal_archive/
total 49M
-rw------- 1 postgres postgres 16M Apr 14 15:20 000000010000000C0000009B
-rw------- 1 postgres postgres 16M Apr 14 16:10 000000010000000C0000009C
-rw------- 1 postgres postgres 16M Apr 14 16:20 000000010000000C0000009D
-rw------- 1 postgres postgres 342 Apr 14 16:20 000000010000000C0000009D.00000028.backup

The content of the dir wal_archive/ is growing and growing over the
time.

The pg_wat.tar.gz contains:

srap32dxr1:/data/postgresql11 # tar tzf backup-20200414-1/pg_wal.tar.gz
000000010000000C0000009D

i..e. the files 000000010000000C0000009B ... 000000010000000C0000009D
could be removed in wal_archive/

Correct? Or how is the long time storage of this dir?

pg_basebackup is a point in time backup tool. If you want something that
continuously archives and prunes as it goes then you probably want to
look at the tools below:

https://pgbackrest.org/

https://www.pgbarman.org/

https://postgrespro.github.io/pg_probackup/

matthias

--
Adrian Klaver
adrian.klaver@aklaver.com

#3Matthias Apitz
guru@unixarea.de
In reply to: Adrian Klaver (#2)
Re: pg_basebackup && long time storage of wal_archive/ content

El día Dienstag, April 14, 2020 a las 08:28:35 -0700, Adrian Klaver escribió:

On 4/14/20 8:00 AM, Matthias Apitz wrote:

Hello,

The run (as user 'postgres') on the server of the cmd:

pg_basebackup -U ${DBSUSER} -Ft -z -D /data/postgresql11/backup-wal_archive-20200414-1/

produces correctly the following situtation on the server:

srap32dxr1:/data/postgresql11 # ls -ltrh backup-20200414-1
total 1.7G
-rw------- 1 postgres postgres 1.7G Apr 14 16:20 base.tar.gz
-rw------- 1 postgres postgres 17K Apr 14 16:20 pg_wal.tar.gz
srap32dxr1:/data/postgresql11 # ls -ltrh data/pg_wal
total 81M
-rw------- 1 postgres postgres 16M Apr 14 15:11 000000010000000C0000009F
-rw------- 1 postgres postgres 16M Apr 14 15:20 000000010000000C000000A0
-rw------- 1 postgres postgres 16M Apr 14 16:10 000000010000000C000000A1
-rw------- 1 postgres postgres 16M Apr 14 16:20 000000010000000C000000A2
-rw------- 1 postgres postgres 342 Apr 14 16:20 000000010000000C0000009D.00000028.backup
drwx------ 2 postgres postgres 36K Apr 14 16:25 archive_status
-rw------- 1 postgres postgres 16M Apr 14 16:50 000000010000000C0000009E
srap32dxr1:/data/postgresql11 # ls -ltrh wal_archive/
total 49M
-rw------- 1 postgres postgres 16M Apr 14 15:20 000000010000000C0000009B
-rw------- 1 postgres postgres 16M Apr 14 16:10 000000010000000C0000009C
-rw------- 1 postgres postgres 16M Apr 14 16:20 000000010000000C0000009D
-rw------- 1 postgres postgres 342 Apr 14 16:20 000000010000000C0000009D.00000028.backup

The content of the dir wal_archive/ is growing and growing over the
time.

The pg_wal.tar.gz contains:

srap32dxr1:/data/postgresql11 # tar tzf backup-20200414-1/pg_wal.tar.gz
000000010000000C0000009D

i..e. the files 000000010000000C0000009B ... 000000010000000C0000009D
could be removed in wal_archive/

Correct? Or how is the long time storage of this dir?

pg_basebackup is a point in time backup tool.

Adrian,

Thanks for the answer. What do you mean by 'point in time backup tool'?
And, coming back to my question, can older WAL in wal_archive/ be
removed without any risks if the daily

pg_wal.tar.gz
base.tar.gz

are moved to a safe place?

If you want something that
continuously archives and prunes as it goes then you probably want to look
at the tools below:

https://pgbackrest.org/

https://www.pgbarman.org/

https://postgrespro.github.io/pg_probackup/

Thanks, I will have a look into the tools.

matthias

--
Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub

#4Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Matthias Apitz (#3)
Re: pg_basebackup && long time storage of wal_archive/ content

On Wed, 2020-04-15 at 08:16 +0200, Matthias Apitz wrote:

The run (as user 'postgres') on the server of the cmd:

pg_basebackup -U ${DBSUSER} -Ft -z -D /data/postgresql11/backup-wal_archive-20200414-1/

produces correctly the following situtation on the server:

srap32dxr1:/data/postgresql11 # ls -ltrh backup-20200414-1
total 1.7G
-rw------- 1 postgres postgres 1.7G Apr 14 16:20 base.tar.gz
-rw------- 1 postgres postgres 17K Apr 14 16:20 pg_wal.tar.gz
srap32dxr1:/data/postgresql11 # ls -ltrh data/pg_wal
total 81M
-rw------- 1 postgres postgres 16M Apr 14 15:11 000000010000000C0000009F
-rw------- 1 postgres postgres 16M Apr 14 15:20 000000010000000C000000A0
-rw------- 1 postgres postgres 16M Apr 14 16:10 000000010000000C000000A1
-rw------- 1 postgres postgres 16M Apr 14 16:20 000000010000000C000000A2
-rw------- 1 postgres postgres 342 Apr 14 16:20 000000010000000C0000009D.00000028.backup
drwx------ 2 postgres postgres 36K Apr 14 16:25 archive_status
-rw------- 1 postgres postgres 16M Apr 14 16:50 000000010000000C0000009E
srap32dxr1:/data/postgresql11 # ls -ltrh wal_archive/
total 49M
-rw------- 1 postgres postgres 16M Apr 14 15:20 000000010000000C0000009B
-rw------- 1 postgres postgres 16M Apr 14 16:10 000000010000000C0000009C
-rw------- 1 postgres postgres 16M Apr 14 16:20 000000010000000C0000009D
-rw------- 1 postgres postgres 342 Apr 14 16:20 000000010000000C0000009D.00000028.backup

The content of the dir wal_archive/ is growing and growing over the
time.

The pg_wal.tar.gz contains:

srap32dxr1:/data/postgresql11 # tar tzf backup-20200414-1/pg_wal.tar.gz
000000010000000C0000009D

i..e. the files 000000010000000C0000009B ... 000000010000000C0000009D
could be removed in wal_archive/

Correct? Or how is the long time storage of this dir?

The best you can do is to use "pg_archivecleanup" with one of these *.backup
files. If you run

pg_archivecleanup wal_archive 000000010000000C0000009D.00000028.backup

it will delete all WAL archives older than that backup.

So that backup can still be restored, and you can recover to any later point
in time, but all older WAL is removed.

pg_basebackup is a point in time backup tool.

Thanks for the answer. What do you mean by 'point in time backup tool'?

I think he meant "online file system-level backup tool" - you need
point-in-time-recovery to restore such a backup.

And, coming back to my question, can older WAL in wal_archive/ be
removed without any risks if the daily

pg_wal.tar.gz
base.tar.gz

are moved to a safe place?

Yes, because "-X stream" is the default for "pg_basebackup".

"pg_wal.tar.gz" will contain all the WAL necessary to recover the backup, so
if you unpack its contents into the "pg_wal" subdirectory of the unpacked
base backup, you can simply start the server, and it will recover to the end
of the backup.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

#5Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Matthias Apitz (#3)
Re: pg_basebackup && long time storage of wal_archive/ content

On 4/14/20 11:16 PM, Matthias Apitz wrote:

El día Dienstag, April 14, 2020 a las 08:28:35 -0700, Adrian Klaver escribió:

On 4/14/20 8:00 AM, Matthias Apitz wrote:

Hello,

The run (as user 'postgres') on the server of the cmd:

pg_basebackup -U ${DBSUSER} -Ft -z -D /data/postgresql11/backup-wal_archive-20200414-1/

produces correctly the following situtation on the server:

srap32dxr1:/data/postgresql11 # ls -ltrh backup-20200414-1
total 1.7G
-rw------- 1 postgres postgres 1.7G Apr 14 16:20 base.tar.gz
-rw------- 1 postgres postgres 17K Apr 14 16:20 pg_wal.tar.gz
srap32dxr1:/data/postgresql11 # ls -ltrh data/pg_wal
total 81M
-rw------- 1 postgres postgres 16M Apr 14 15:11 000000010000000C0000009F
-rw------- 1 postgres postgres 16M Apr 14 15:20 000000010000000C000000A0
-rw------- 1 postgres postgres 16M Apr 14 16:10 000000010000000C000000A1
-rw------- 1 postgres postgres 16M Apr 14 16:20 000000010000000C000000A2
-rw------- 1 postgres postgres 342 Apr 14 16:20 000000010000000C0000009D.00000028.backup
drwx------ 2 postgres postgres 36K Apr 14 16:25 archive_status
-rw------- 1 postgres postgres 16M Apr 14 16:50 000000010000000C0000009E
srap32dxr1:/data/postgresql11 # ls -ltrh wal_archive/
total 49M
-rw------- 1 postgres postgres 16M Apr 14 15:20 000000010000000C0000009B
-rw------- 1 postgres postgres 16M Apr 14 16:10 000000010000000C0000009C
-rw------- 1 postgres postgres 16M Apr 14 16:20 000000010000000C0000009D
-rw------- 1 postgres postgres 342 Apr 14 16:20 000000010000000C0000009D.00000028.backup

The content of the dir wal_archive/ is growing and growing over the
time.

The pg_wal.tar.gz contains:

srap32dxr1:/data/postgresql11 # tar tzf backup-20200414-1/pg_wal.tar.gz
000000010000000C0000009D

i..e. the files 000000010000000C0000009B ... 000000010000000C0000009D
could be removed in wal_archive/

Correct? Or how is the long time storage of this dir?

pg_basebackup is a point in time backup tool.

Adrian,

Thanks for the answer. What do you mean by 'point in time backup tool'?

That pg_basebackup captures the state of the cluster at some point in
time. As you have found out keeping up with the current state of the
cluster requires further work. Laurenz provided one built in tool for
doing that. The other tools I posted earlier also provide means of
keeping backups consistent.

And, coming back to my question, can older WAL in wal_archive/ be
removed without any risks if the daily

pg_wal.tar.gz
base.tar.gz

are moved to a safe place?

If you want something that
continuously archives and prunes as it goes then you probably want to look
at the tools below:

https://pgbackrest.org/

https://www.pgbarman.org/

https://postgrespro.github.io/pg_probackup/

Thanks, I will have a look into the tools.

matthias

--
Adrian Klaver
adrian.klaver@aklaver.com