Backup Copy of a Production server.

Started by KK CHNover 2 years ago5 messagesgeneral
Jump to latest
#1KK CHN
kkchn.in@gmail.com

List,

I am in need to copy a production PostgreSQL server data( 1 TB) to an
external storage( Say USB Hard Drive) and need to set up a backup server
with this data dir.

What is the trivial method to achieve this ??

1. Is Sqldump an option at a production server ?? ( Will this affect the
server performance and possible slowdown of the production server ? This
server has a high IOPS). This much size 1.2 TB will the Sqldump support ?
Any bottlenecks ?

2. Is copying the data directory from the production server to an external
storage and replace the data dir at a backup server with same postgres
version and replace it's data directory with this data dir copy is a viable
option ?

# cp -r ./data /media/mydb_backup ( Does this affect the Production
database server performance ??) due to the copy command overhead ?

OR doing a WAL Replication Configuration to a standby is the right method
to achieve this ??

This is to take out the database backup outside the Datacenter and our DC
policy won't allow us to establish a network connection outside the DC to a
remote location for WAL replication .

Any hints most welcome ..

Thank you
Krishane

#2Ron
ronljohnsonjr@gmail.com
In reply to: KK CHN (#1)
Re: Backup Copy of a Production server.

On 8/7/23 00:02, KK CHN wrote:

List,

I am in need to copy a production PostgreSQL server  data( 1 TB)  to  an
external storage( Say USB Hard Drive) and need to set up a backup server
with this data dir.

What is the trivial method to achieve this ??

1. Is Sqldump an option at a production server ?? (  Will this affect the
server performance  and possible slowdown of the production server ? This
server has a high IOPS). This much size 1.2 TB will the Sqldump support ?
Any bottlenecks ?

Whether or not there will be bottlenecks depends on how busy (CPU and disk
load) the current server is.

2. Is copying the data directory from the production server to an external
storage and replace the data dir  at a  backup server with same postgres
version and replace it's data directory with this data dir copy is a
viable option ?

# cp  -r   ./data  /media/mydb_backup  ( Does this affect the Production
database server performance ??)   due to the copy command overhead ?

OR  doing a WAL Replication Configuration to a standby is the right method
to achieve this ??

But you say you can't establish a network connection outside the DC.

 This is to take out the database backup outside the Datacenter and our DC
policy won't allow us to establish a network connection outside the DC to
a remote location for WAL replication .

If you're unsure of what Linux distro & version and Postgresql version that
you'll be restoring the database to, then the solution is:
DB=the_database_you_want_to_backup
THREADS=<some_number>
cd $PGDATA
cp -v pg_hba.conf postgresql.conf /media/mydb_backup
cd /media/mydb_backup
pg_dumpall --globals-only > globals.sql
pg_dump --format=d --verbose --jobs=$THREADS $DB &> ${DB}.log

If you're 100% positive that the system you might someday restore to is
*exactly* the same distro & version, and Postgresql major version, then I'd
use PgBackRest.

--
Born in Arizona, moved to Babylonia.

#3KK CHN
kkchn.in@gmail.com
In reply to: Ron (#2)
Re: Backup Copy of a Production server.

On Mon, Aug 7, 2023 at 10:49 AM Ron <ronljohnsonjr@gmail.com> wrote:

On 8/7/23 00:02, KK CHN wrote:

List,

I am in need to copy a production PostgreSQL server data( 1 TB) to an
external storage( Say USB Hard Drive) and need to set up a backup server
with this data dir.

What is the trivial method to achieve this ??

1. Is Sqldump an option at a production server ?? ( Will this affect the
server performance and possible slowdown of the production server ? This
server has a high IOPS). This much size 1.2 TB will the Sqldump support ?
Any bottlenecks ?

Whether or not there will be bottlenecks depends on how busy (CPU and disk
load) the current server is.

2. Is copying the data directory from the production server to an external
storage and replace the data dir at a backup server with same postgres
version and replace it's data directory with this data dir copy is a viable
option ?

# cp -r ./data /media/mydb_backup ( Does this affect the Production
database server performance ??) due to the copy command overhead ?

OR doing a WAL Replication Configuration to a standby is the right method
to achieve this ??

But you say you can't establish a network connection outside the DC. ( I
can't do for a remote machine .. But I can do a WAL replication to another
host in the same network inside the DC. So that If I do a sqldump or Copy
of Data dir of the standby server it won't affect the production server, is
this sounds good ? )

This is to take out the database backup outside the Datacenter and our DC
policy won't allow us to establish a network connection outside the DC to a
remote location for WAL replication .

If you're unsure of what Linux distro & version and Postgresql version
that you'll be restoring the database to, then the solution is:
DB=the_database_you_want_to_backup
THREADS=<some_number>
cd $PGDATA
cp -v pg_hba.conf postgresql.conf /media/mydb_backup
cd /media/mydb_backup
pg_dumpall --globals-only > globals.sql

What is the relevance of globals-only and what this will do ${DB}.log
// or is it ${DB}.sql ?

pg_dump --format=d --verbose --jobs=$THREADS $DB &> ${DB}.log // .log

Show quoted text

couldn't get an idea what it mean

If you're 100% positive that the system you might someday restore to is
*exactly* the same distro & version, and Postgresql major version, then
I'd use PgBackRest.

--
Born in Arizona, moved to Babylonia.

#4Chris Travers
chris.travers@gmail.com
In reply to: KK CHN (#3)
Re: Backup Copy of a Production server.

On Mon, Aug 7, 2023 at 7:00 PM KK CHN <kkchn.in@gmail.com> wrote:

On Mon, Aug 7, 2023 at 10:49 AM Ron <ronljohnsonjr@gmail.com> wrote:

On 8/7/23 00:02, KK CHN wrote:

List,

I am in need to copy a production PostgreSQL server data( 1 TB) to an
external storage( Say USB Hard Drive) and need to set up a backup server
with this data dir.

What is the trivial method to achieve this ??

pg_basebackup backs up the data dir.

Alternatively you could use another tool like pgbackrest which also has
some nice features regarding incremental backups, wal management etc and
parallel backup and restore.

1. Is Sqldump an option at a production server ?? ( Will this affect the
server performance and possible slowdown of the production server ? This
server has a high IOPS). This much size 1.2 TB will the Sqldump support ?
Any bottlenecks ?

A sql dump can be had with pg_dumpall, but this is a different backup
strategy. I would recommend in that case pg_dumpall -g to dump only
globals (roles and tablespaces) and then pg_dump on the databases
individually with format selection to either custom (if parallelism is not
required) or tar (if it is). See the pg_dump man page for details.

The file size will depend on file format selected etc. My naive guess for
custom format would be maybe 200-400GB. For tar format probably more
(double or more) but total size depends on many factors and cannot be
reliably estimated. In rare cases, it could even be larger than your data
directory.

Whether or not there will be bottlenecks depends on how busy (CPU and
disk load) the current server is.

pg_basebackup is limited by the fact that it is single threaded on both
sides (aside from wal), and this also limits disk I/O as well as network
throughput (if you have a long fat pipe).

pg_dump is also limited by having to interpret and serialize the output,
and also, if you have large text or binary fields, having to retrieve these
one at a time. Additionally you could have lock contention.

2. Is copying the data directory from the production server to an
external storage and replace the data dir at a backup server with same
postgres version and replace it's data directory with this data dir copy is
a viable option ?

There is a lot of complexity to doing that right. If you want to do that,
look at using pgbackrest.

# cp -r ./data /media/mydb_backup ( Does this affect the Production
database server performance ??) due to the copy command overhead ?

OR doing a WAL Replication Configuration to a standby is the right
method to achieve this ??

That is often also used, but you need to define what you want out of a
backup. A standby will protect you from hardware failure for the most
part. It will not, without a lot of other thought and configuration,
protect you from an administrator accidently dropping an important table or
database. WAL archiving and backups can help there though (and hence my
recommendation for pgbackrest, which can also restore the data directory
and/or wals to a standby).

But you say you can't establish a network connection outside the DC. ( I
can't do for a remote machine .. But I can do a WAL replication to another
host in the same network inside the DC. So that If I do a sqldump or Copy
of Data dir of the standby server it won't affect the production server, is
this sounds good ? )

With a good backup archive accessible from only the places it needs to be
accessed, this problem goes away.

This is to take out the database backup outside the Datacenter and our
DC policy won't allow us to establish a network connection outside the DC
to a remote location for WAL replication .

If you're unsure of what Linux distro & version and Postgresql version
that you'll be restoring the database to, then the solution is:
DB=the_database_you_want_to_backup
THREADS=<some_number>
cd $PGDATA
cp -v pg_hba.conf postgresql.conf /media/mydb_backup
cd /media/mydb_backup
pg_dumpall --globals-only > globals.sql

What is the relevance of globals-only and what this will do ${DB}.log
// or is it ${DB}.sql ?

pg_dump --format=d --verbose --jobs=$THREADS $DB &> ${DB}.log // .log

couldn't get an idea what it mean

If you're 100% positive that the system you might someday restore to is
*exactly* the same distro & version, and Postgresql major version, then
I'd use PgBackRest.

--
Born in Arizona, moved to Babylonia.

--
Best Wishes,
Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor
lock-in.
http://www.efficito.com/learn_more

#5Ron
ronljohnsonjr@gmail.com
In reply to: KK CHN (#3)
Re: Backup Copy of a Production server.

On 8/7/23 07:05, KK CHN wrote:

On Mon, Aug 7, 2023 at 10:49 AM Ron <ronljohnsonjr@gmail.com> wrote:

On 8/7/23 00:02, KK CHN wrote:

List,

I am in need to copy a production PostgreSQL server  data( 1 TB)  to 
an external storage( Say USB Hard Drive) and need to set up a backup
server with this data dir.

What is the trivial method to achieve this ??

1. Is Sqldump an option at a production server ?? (  Will this affect
the server performance  and possible slowdown of the production
server ? This server has a high IOPS). This much size 1.2 TB will the
Sqldump support ? Any bottlenecks ?

Whether or not there will be bottlenecks depends on how busy (CPU and
disk load) the current server is.

2. Is copying the data directory from the production server to an
external storage and replace the data dir  at a  backup server with
same postgres version and replace it's data directory with this data
dir copy is a viable option ?

# cp  -r   ./data  /media/mydb_backup  ( Does this affect the
Production database server performance ??)   due to the copy command
overhead ?

OR  doing a WAL Replication Configuration to a standby is the right
method to achieve this ??

But you say you can't establish a network connection outside the DC.

( I can't do for a remote machine .. But I can do  a WAL replication to
another host in the same network inside the DC. So that If I  do a sqldump 
or Copy of Data dir of the standby server it won't affect the production
server, is this sounds good  ?  )

"WAL replication" is streaming replication.  Yes, you can do that.

But to what end?  See my prior comments about when you should use PgBackRest.

 This is to take out the database backup outside the Datacenter and
our DC policy won't allow us to establish a network connection
outside the DC to a remote location for WAL replication .

If you're unsure of what Linux distro & version and Postgresql version
that you'll be restoring the database to, then the solution is:
DB=the_database_you_want_to_backup
THREADS=<some_number>
cd $PGDATA
cp -v pg_hba.conf postgresql.conf /media/mydb_backup
cd /media/mydb_backup
pg_dumpall --globals-only > globals.sql

What is the relevance of  globals-only

It's all spelled out in the pg_dumpall man page.

and  what this will do  ${DB}.log  // or is it ${DB}.sql  ?

pg_dump --format=d --verbose --jobs=$THREADS $DB &> ${DB}.log      //
.log couldn't get an idea what it mean

I get the impression that you don't have any experience with the bash shell.

"&> ${DB}.log" redirects stdout and stderr to the file named ${DB}.log.

If you're 100% positive that the system you might someday restore to
is *exactly* the same distro & version, and Postgresql major version,
then I'd use PgBackRest.

--
Born in Arizona, moved to Babylonia.