Hot backup in PostgreSQL

Started by W.P.over 5 years ago9 messagesgeneral
Jump to latest
#1W.P.
laurentp@wp.pl

Hi there,

how to do "hot backup" (copying files) while database running?

Not using pg_dump.

Is there some equivalent of Oracle (I knew v8) "ALTER DATABASE BEGIN
BACKUP", which froze writes to database files, pushing everything to
redo files?

Laurent.

#2Paul Förster
paul.foerster@gmail.com
In reply to: W.P. (#1)
Re: Hot backup in PostgreSQL

Hi Laurent,

On 22. Oct, 2020, at 08:42, W.P. <laurentp@wp.pl> wrote:

Hi there,

how to do "hot backup" (copying files) while database running?

Not using pg_dump.

Is there some equivalent of Oracle (I knew v8) "ALTER DATABASE BEGIN BACKUP", which froze writes to database files, pushing everything to redo files?

yes, there is. Please read the chapter: 25.3.3.1. Making A Non-Exclusive Low-Level Backup

https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP

Cheers,
Paul

In reply to: W.P. (#1)
Re: Hot backup in PostgreSQL

On Thu, Oct 22, 2020 at 08:42:08AM +0200, W.P. wrote:

Hi there,

how to do "hot backup" (copying files) while database running?

Not using pg_dump.

Is there some equivalent of Oracle (I knew v8) "ALTER DATABASE BEGIN
BACKUP", which froze writes to database files, pushing everything to redo
files?

There are many ways to do it. To be able to suggest proper solution we'd
need to know:
1. what is the problem with pg_dump?
2. what is the exact problem you're solving (clearly it's not only
"having backup", as this is done using pg_dump without any problem).

depesz

In reply to: W.P. (#1)
Re: Hot backup in PostgreSQL

On Thu, Oct 22, 2020 at 09:45:36AM +0200, W.P. wrote:

There are many ways to do it. To be able to suggest proper solution we'd
need to know:
1. what is the problem with pg_dump?

Time (I guess a bit, but copying files could be done using rsync, so much
faster).

Is it *really* too slow for you? Please note that you can easily make it
much faster by doing -Fd -j $( nproc ).

2. what is the exact problem you're solving (clearly it's not only
"having backup", as this is done using pg_dump without any problem).

Maybe this is old way, but at some point of time I was doing Oracle 8
backups just by copying files.
Also I guess, restore using copy files should be much faster than using psql
/ pg_restore.

You might want to read this:
https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP

Best regards,

depesz

#5Andreas Joseph Krogh
andreas@visena.com
In reply to: hubert depesz lubaczewski (#4)
Re: Hot backup in PostgreSQL

På torsdag 22. oktober 2020 kl. 10:18:12, skrev hubert depesz lubaczewski <
depesz@depesz.com <mailto:depesz@depesz.com>>:
On Thu, Oct 22, 2020 at 09:45:36AM +0200, W.P. wrote:

There are many ways to do it. To be able to suggest proper solution we'd
need to know:
1. what is the problem with pg_dump?

Time (I guess a bit, but copying files could be done using rsync, so much
faster).

Is it *really* too slow for you? Please note that you can easily make it
much faster by doing -Fd -j $( nproc ).

I got curious and tried with this DB:

andreak@[local]:5433 13.0 visena=# select
pg_size_pretty(pg_database_size(current_database()));
┌────────────────┐
│ pg_size_pretty │
├────────────────┤
│ 47 GB │
└────────────────┘
(1 row)

nproc=16

Regular pg_dump:

$ time pg_dump -O -d visena > ~/data/visena/visena.dmp

real 2m43,904s
user 0m10,135s
sys 0m24,260s

Parallell pg_dump:

$ time pg_dump -OFd -j $(nproc) -f ~/data/visena/pg_backup -d visena

real 3m43,726s
user 12m36,620s
sys 0m9,537s

pg_dump with pbzip2

$ time pg_dump -O -d visena | pbzip2 -c > ~/data/visena/visena.dmp.bz2

real 6m58,741s
user 92m4,833s
sys 2m18,565s

Here are the sizes of all:

7,4G pg_backup (directory with -Fd)
32G visena.dmp
5,8G visena.dmp.bz2

--

Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com <mailto:andreas@visena.com>
www.visena.com <https://www.visena.com&gt;
<https://www.visena.com&gt;

#6Sushant Pawar
sushantxp@gmail.com
In reply to: W.P. (#1)
Re: Hot backup in PostgreSQL

Hi,

If the requirement is to take online backup(hot) backup that can be used to
do point in time recovery, you can rely on low-level API functionality
mentioned earlier by Paul. If you to be care free about not missing any
specific mount point other than default mount point, you can use
pg_basebackup for online backup.

Pg_basebackup will take care of backing up all database files including
custom tablespace and has the option to compress the backup.You can refer
to below link for more information.

https://www.postgresql.org/docs/11/app-pgbasebackup.html

On Thu, Oct 22, 2020 at 12:12 PM W.P. <laurentp@wp.pl> wrote:

Show quoted text

Hi there,

how to do "hot backup" (copying files) while database running?

Not using pg_dump.

Is there some equivalent of Oracle (I knew v8) "ALTER DATABASE BEGIN
BACKUP", which froze writes to database files, pushing everything to
redo files?

Laurent.

#7Adam Brusselback
adambrusselback@gmail.com
In reply to: W.P. (#1)
Re: Hot backup in PostgreSQL

how to do "hot backup" (copying files) while database running?

As others have shown, there are ways to do this with PG's internal tooling
(pg_basebackup).

However, I would highly recommend you use an external backup tool like
pgbackrest [1] to save yourself the pain of implementing things incorrectly
and ending up with non-viable backups when you need them most. I'm not
affiliated with them at all, but have just used pgbackrest in production
for years now with great results. It takes care of PITR, and manages
backup retention (and associated WAL retention). Those can be a bit of a
pain to do manually otherwise.

Just my $0.02, hope it helps!

1. https://pgbackrest.org/

#8Mark Johnson
remi9898@gmail.com
In reply to: Adam Brusselback (#7)
Re: Hot backup in PostgreSQL

User managed backups in PostgreSQL work very similar to what you know from
Oracle. You first place the cluster in backup mode, then copy the database
files, and lastly take the cluster out of backup mode. The first and last
steps are done using functions pg_start_backup('label',false,false) and
pg_stop_backup(false, false). [1]https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-BACKUP..

If you use a utility supplied with PostgreSQL such as pg_basebackup, it
does these steps for you. If you are using a specific non-PostgreSQL
utility (i.e., Dell Networker or IBM Tivoli) see their documentation for
specifics.

[1]: https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-BACKUP.
https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-BACKUP.

On Thu, Oct 22, 2020 at 11:14 AM Adam Brusselback <adambrusselback@gmail.com>
wrote:

Show quoted text

how to do "hot backup" (copying files) while database running?

As others have shown, there are ways to do this with PG's internal tooling
(pg_basebackup).

However, I would highly recommend you use an external backup tool like
pgbackrest [1] to save yourself the pain of implementing things incorrectly
and ending up with non-viable backups when you need them most. I'm not
affiliated with them at all, but have just used pgbackrest in production
for years now with great results. It takes care of PITR, and manages
backup retention (and associated WAL retention). Those can be a bit of a
pain to do manually otherwise.

Just my $0.02, hope it helps!

1. https://pgbackrest.org/

#9Stephen Frost
sfrost@snowman.net
In reply to: Mark Johnson (#8)
Re: Hot backup in PostgreSQL

Greetings,

* Mark Johnson (remi9898@gmail.com) wrote:

User managed backups in PostgreSQL work very similar to what you know from
Oracle. You first place the cluster in backup mode, then copy the database
files, and lastly take the cluster out of backup mode. The first and last
steps are done using functions pg_start_backup('label',false,false) and
pg_stop_backup(false, false). [1].

Just to be clear for the archives, the above is *not* sufficient to have
a consistent and valid backup- you must also ensure that archive_command
(or some other system) is configured to capture all of the WAL produced
from the start to the end of the backup, and you need to make sure that
a backup_label file is created in the backup (using the results of
pg_stop_backup). Some of this is discussed in the 'low level API' part
of the documentation, as linked before:

https://www.postgresql.org/docs/13/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP

Though it's really best, as mentioned, to use an existing tool that's
been written to ensure all of this is done correctly and not to try and
build your own.

Thanks,

Stephen

Show quoted text

If you use a utility supplied with PostgreSQL such as pg_basebackup, it
does these steps for you. If you are using a specific non-PostgreSQL
utility (i.e., Dell Networker or IBM Tivoli) see their documentation for
specifics.

[1]
https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-BACKUP.

On Thu, Oct 22, 2020 at 11:14 AM Adam Brusselback <adambrusselback@gmail.com>
wrote:

how to do "hot backup" (copying files) while database running?

As others have shown, there are ways to do this with PG's internal tooling
(pg_basebackup).

However, I would highly recommend you use an external backup tool like
pgbackrest [1] to save yourself the pain of implementing things incorrectly
and ending up with non-viable backups when you need them most. I'm not
affiliated with them at all, but have just used pgbackrest in production
for years now with great results. It takes care of PITR, and manages
backup retention (and associated WAL retention). Those can be a bit of a
pain to do manually otherwise.

Just my $0.02, hope it helps!

1. https://pgbackrest.org/