Backups / replication

Started by Oliver Kohll - Mailing Listsalmost 16 years ago4 messagesgeneral
Jump to latest
#1Oliver Kohll - Mailing Lists
oliver.lists@gtwm.co.uk

Hello,

I'm interested in using WAL shipping / replication for backup purposes but have no interest in failover. Currently my situation is:

I have two servers, live and backup, which are in different cities. The backup server is also a test/development machine.

Backups of my most important database are made hourly with pg_dump, excluding some larger tables with non-critical logging data. Even so, as the database grows, backups are taking longer and it looks as though they may start to impact performance. A full backup is made nightly and transferred to the backup machine, along with all of the day's hourly backups.

I'm looking into using replication by WAL shipping - after all, there's no use to backing up data which hasn't changed since last time - only a small percentage of records are created/updated. However, I need

a) to be able to restore to a point in time easily, which I can do to within an hour at the moment by restoring the correct dump. Sometimes users ask for a restore having accidentally updated/deleted records.
b) to carry on running a test server database, that means one that's read and writeable.

I obviously can't use a replication slave as a read/write test server at the same time. At the moment I've thought of a couple of options, I don't know if either are possible - I have a bit of a hazy idea of WAL replication.

1) Continuously ship the WAL records to somewhere on the test server unknown to Postgres but run the test machine as a normal database completely separately. If a backup is needed, delete the test database, restore to the last full backup (a filesystem backup?) and copy all WAL records into Postgres' directory so it can see them. Start it up configured to replay them, up to a certain time.

2) Run two instances of Postgres on the test/backup server on different ports, one configured as a replication slave, one normal. I'm not sure if this is possible with the RPM builds I'm using.

Are either of those two likely? Any other suggestions? Another question is will the replication coming in v9.0 change things and would it be worth holding off until then? In particular Command Prompt's PITR tools look useful for restoring to a particular point in time, will these still work or will there be equivalents?

Regards
Oliver Kohll

oliver@agilebase.co.uk / +44(0)7814 828608 / skype:okohll

#2Alan Hodgson
ahodgson@simkin.ca
In reply to: Oliver Kohll - Mailing Lists (#1)
Re: Backups / replication

On Tuesday, June 15, 2010, "Oliver Kohll - Mailing Lists"
<oliver.lists@gtwm.co.uk> wrote:

Are either of those two likely? Any other suggestions? Another question
is will the replication coming in v9.0 change things and would it be
worth holding off until then? In particular Command Prompt's PITR tools
look useful for restoring to a particular point in time, will these
still work or will there be equivalents?

PITR in recent versions allows restoration to any point in time after the
base backup was created, assuming you have the WAL logs from that point
forward.

--
"No animals were harmed in the recording of this episode. We tried but that
damn monkey was just too fast."

#3Adrian von Bidder
avbidder@fortytwo.ch
In reply to: Oliver Kohll - Mailing Lists (#1)
Re: Backups / replication

[continuous backup]

On Tuesday 15 June 2010 21.42:52 Oliver Kohll - Mailing Lists wrote:

1) Continuously ship the WAL records to somewhere on the test server
unknown to Postgres but run the test machine as a normal database
completely separately. If a backup is needed, delete the test database,
restore to the last full backup (a filesystem backup?) and copy all WAL
records into Postgres' directory so it can see them. Start it up
configured to replay them, up to a certain time.

2) Run two instances of Postgres on the test/backup server on different
ports, one configured as a replication slave, one normal. I'm not sure
if this is possible with the RPM builds I'm using.

Both scenarious are possible. I don't know the rpm builds you're using; the
Debian packages allow configuring two instances on two different ports
AFAIK. Possibly the rpm installation do, too. Even if not: hacking up a
2nd start script which runs postgres against a different data directory /
config file should be quite trivial.

Keeping the base backup plus all the WAL files for the case you need to
restore will need quite a bit of diskspace if your database is reasonably
big (on some database I administrated, I scheduled weekly base backups and
kept a week of WAL - since we sometimes had quite a lot changes in the db,
WAL was quickly 10 times as big as the base backup. So depending on your DB
load, keeping a 2nd installation of postgres running and continuously
reading the WAL files might be cheaper in terms of disk space.

(and with 9.0, you even have a near real-time read-only copy of the db for
free gratis...)

cheers
-- vbi

--
90% of the people do not understand copyright,
the other 10% simply ignore it.
-- Aigars Mahinovs

#4Greg Smith
gsmith@gregsmith.com
In reply to: Adrian von Bidder (#3)
Re: Backups / replication

Adrian von Bidder wrote:

I don't know the rpm builds you're using; the
Debian packages allow configuring two instances on two different ports
AFAIK. Possibly the rpm installation do, too. Even if not: hacking up a
2nd start script which runs postgres against a different data directory /
config file should be quite trivial.

The situation is midway between here: you do have to hack up the
startup scripts a bit to get more than one server running with an RPM
install, but the changes are not too terrible. There's a sample and
article about it at
http://blog.2ndquadrant.com/en/2010/05/install-multiple-postgresql-servers-redhat-linux.html

You are correct that this is much easier on Debian.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us