[FEATURE REQUEST] Streaming online backup (maybe off-topic)
Hi Hackers,
In my project I have to handle a database of 600 GB of text only, distributed over 4 tablespaces on multiple hard disks and remote SANs connected via Gigabit Ethernet to the remote SAN storage.
I need more flexibility in backing up my big database, but the built-in online backup system does not work well enough for my setup. I cannot accept 16 MB WAL files as the unit for securing data on tape: losing 16 MB of data in a crash situation is fatal and not helpful (1 MB too). I wish for a continuous backup without any data loss.
My idea:
1. A complete database backup from scratch (implemented right now).
2. An online streaming backup that updates my base backup continuously every time changes are made (half of this functionality is already implemented).
3. The ability to redirect the online streaming backup files to a remote server machine (FTP). (The archive parameters in postgresql.conf can already do that, but the method is not 100% continuous; big holes of data loss can occur.)
4. Version management of multiple backup lines by timestamp (not yet implemented).
5. A recovery option inside the psql client for performing disaster recovery (not yet implemented).
Benefits:
All users of huge databases (mission-critical and always online) could bring their databases back up with the same data as of 1-5 seconds before the crash occurred!
PS:
EMC has a tool that can do that for Oracle and MSSQL, but there is no such option available for Postgres :(
Sorry for my bad English; I hope someone can understand the problem.
Apoc
apoc9009 wrote:
Hi Hackers,
In my project I have to handle a database of 600 GB of text only, distributed over 4 tablespaces on multiple hard disks and remote SANs connected via Gigabit Ethernet to the remote SAN storage.
I need more flexibility in backing up my big database, but the built-in online backup system does not work well enough for my setup. I cannot accept 16 MB WAL files as the unit for securing data on tape: losing 16 MB of data in a crash situation is fatal and not helpful (1 MB too). I wish for a continuous backup without any data loss.
http://www.postgresql.org/docs/8.2/static/warm-standby.html
Particularly section 23.4.4
That can get you to 1 second or less.
--
Richard Huxton
Archonet Ltd
http://www.postgresql.org/docs/8.2/static/warm-standby.html
Particularly section 23.4.4
That is replication, NOT backup.
http://www.postgresql.org/docs/8.2/static/warm-standby.html
Particularly section 23.4.4
Section 23.4.4 is what I am using right now, but it is not enough for me!
No versioning, no way to prevent data loss.
You have to wait until a WAL file is written (the default size for WAL files is 16 MB).
I need an EXACT copy of the database, with only the latest changes in it, to update my backup repository. If the system crashes, you can lose over 16 MB of data; that is fatal and not acceptable! Losing 1 MB of ASCII data is also not acceptable!
apoc9009 wrote:
http://www.postgresql.org/docs/8.2/static/warm-standby.html
Particularly section 23.4.4
Section 23.4.4 is what I am using right now, but it is not enough for me! No versioning, no way to prevent data loss. You have to wait until a WAL file is written (the default size for WAL files is 16 MB).
You've either not read 23.4.4 or haven't understood it. If the text is
unclear, documentation additions/changes are always welcome.
--
Richard Huxton
Archonet Ltd
You've either not read 23.4.4 or haven't understood it. If the text is
unclear, documentation additions/changes are always welcome.
I have read this:
PostgreSQL directly supports file-based log shipping as described above.
It is also possible to implement record-based log shipping, though this
requires custom development.
But that is not what I am looking for!
File-based log shipping has a big problem for continuous backups: you have to wait until the postmaster has finished writing the WAL file, and only then can you save it to the backup server. But one WAL file is 16 MB by default! (That is a big data hole in your "online backup"!)
It makes no sense to reduce the file size. If WAL files were smaller than 16 MB, you would still have the same problem; there would still be data loss, and that is the center of the problem.
I wish for a solution that backs up my database without data loss, without locking tables, without a shutdown, and without forcing any user to log out (a backup of a production system, online, without data loss).
apoc9009 wrote:
I wish for a solution that backs up my database without data loss, without locking tables, without a shutdown, and without forcing any user to log out (a backup of a production system, online, without data loss).
"Without data loss" is utopian. For that, you'd need something like
synchronous replication, otherwise you're always going to have a window
where you have something committed in the server, but not yet in the
backup. So it's just a question of how wide that window is, how much
data loss can you live with.
With file-based log shipping, you can get down to 1 second, by using the
archive_timeout setting. It will produce a lot of log files with very
little content in them, but they will compress well.
The record-based log shipping will give you a very narrow window, down
to < 1 second or even less if you're willing to poll the server that
often, but requires some custom development.
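For illustration only, the record-based shipping Heikki describes could be sketched as below. This is a hedged sketch, not an existing tool: the function name and all paths are assumptions, and a production version would track byte offsets inside the segment (for example with rsync --append over the network) rather than recopying whole files.

```shell
#!/bin/sh
# Sketch of record-based log shipping: copy the in-progress WAL segment
# once per call. Run it from a tight loop to approximate a <1 second
# data-loss window. All paths are assumptions for illustration.
ship_current_wal() {
    src_dir=$1    # the cluster's pg_xlog directory (assumption)
    dst_dir=$2    # staging area that is shipped to the standby (assumption)
    # The most recently modified file is the segment the server is writing.
    current=$(ls -t "$src_dir" | head -n 1)
    [ -n "$current" ] || return 1
    # Plain cp keeps the sketch simple; over a network, rsync --append
    # would transfer only the bytes appended since the previous call.
    cp "$src_dir/$current" "$dst_dir/"
}
```

A wrapper such as `while true; do ship_current_wal /var/lib/pgsql/data/pg_xlog /mnt/standby; sleep 1; done` would then poll once per second, matching the "if you're willing to poll the server that often" caveat above.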
I wonder, do you really need such a super-real-time backup solution
when you have the remote SAN? Don't you trust the SAN hardware?
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Hi,
On 9/6/07, apoc9009 <apoc9009@yahoo.de> wrote:
File-based log shipping has a big problem for continuous backups: you have to wait until the postmaster has finished writing the WAL file, and only then can you save it to the backup server. But one WAL file is 16 MB by default! (That is a big data hole in your "online backup"!)
You should read the documentation more carefully:
"archive_timeout (integer)
The archive_command is only invoked on completed WAL segments.
Hence, if your server generates little WAL traffic (or has slack
periods where it does so), there could be a long delay between the
completion of a transaction and its safe recording in archive storage.
To put a limit on how old unarchived data can be, you can set
archive_timeout to force the server to switch to a new WAL segment
file periodically. When this parameter is greater than zero, the
server will switch to a new segment file whenever this many seconds
have elapsed since the last segment file switch. Note that archived
files that are closed early due to a forced switch are still the same
length as completely full files. Therefore, it is unwise to use a very
short archive_timeout — it will bloat your archive storage.
archive_timeout settings of a minute or so are usually reasonable.
This parameter can only be set in the postgresql.conf file or on the
server command line. "
(Taken from http://www.postgresql.org/docs/8.2/static/runtime-config-wal.html#GUC-ARCHIVE-TIMEOUT)
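Concretely, the setting quoted above amounts to something like the following postgresql.conf fragment (8.2-era syntax; the archive destination path is an example assumption):

```
archive_command = 'cp %p /mnt/backup/wal/%f'  # %p = segment path, %f = file name
archive_timeout = 60                          # force a segment switch at least every 60 s
```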
I wish for a solution that backs up my database without data loss, without locking tables, without a shutdown, and without forcing any user to log out (a backup of a production system, online, without data loss).
Well, there's what you want and there's what is possible. Warm standby
is definitely the best solution for your problem, even if not perfect.
--
Guillaume
apoc9009 wrote:
You've either not read 23.4.4 or haven't understood it. If the text is unclear, documentation additions/changes are always welcome.
I have read this:
PostgreSQL directly supports file-based log shipping as described above. It is also possible to implement record-based log shipping, though this requires custom development.
But that is not what I am looking for!
File-based log shipping has a big problem for continuous backups: you have to wait until the postmaster has finished writing the WAL file, and only then can you save it to the backup server. But one WAL file is 16 MB by default! (That is a big data hole in your "online backup"!)
Which is why that entire section is about copying just the changed parts
of WAL files.
It makes no sense to reduce the file size. If WAL files were smaller than 16 MB, you would still have the same problem; there would still be data loss, and that is the center of the problem.
But in your original email you said:
All users of huge databases (mission-critical and always online) could bring their databases back up with the same data as of 1-5 seconds before the crash occurred!
That suggested to me that you didn't want per-transaction backup, just
one backup every second. OK, what you actually want is a continuous
backup with one copy made per transaction.
I wish for a solution that backs up my database without data loss, without locking tables, without a shutdown, and without forcing any user to log out (a backup of a production system, online, without data loss).
So, if I understand, you want one of:
1. External RAID array. If main machine dies, turn backup machine on.
Both share the same disks.
2. Something like DRBD to copy individual disk blocks between machines.
You could do this just for WAL.
3. Replication.
--
Richard Huxton
Archonet Ltd
On one fine day, Thu, 2007-09-06 at 12:53, apoc9009 wrote:
You've either not read 23.4.4 or haven't understood it. If the text is unclear, documentation additions/changes are always welcome.
I have read this:
PostgreSQL directly supports file-based log shipping as described above. It is also possible to implement record-based log shipping, though this requires custom development.
Check out SkyTools; it seems to use the approach from 23.4.4 to achieve a sub-second failure window.
-----
Hannu
On Thu, 2007-09-06 at 12:21 +0200, apoc9009 wrote:
If the system crashes, you can lose over 16 MB of data; that is fatal and not acceptable! Losing 1 MB of ASCII data is also not acceptable!
Is any data loss acceptable in the case of a disaster? How much?
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
On one fine day, Thu, 2007-09-06 at 12:53, apoc9009 wrote:
I wish for a solution that backs up my database without data loss,
Then you need the backup record to be completed on the backup machine before the transaction commit returns on the master.
This is guaranteed to be really slow for any significant transaction rate, but it can be done using DRBD.
without locking Tables,
No backup method I know of needs to lock tables.
without Shutdown
This one I just don't understand. Shut down what?
without forcing any user to log out (a backup of a production system, online, without data loss).
So you want the user to still be connected to the failed machine, but at the same time be connected to the new live failover machine?
-----
Hannu
Heikki Linnakangas wrote:
apoc9009 wrote:
"Without data loss" is utopian. For that, you'd need something like synchronous replication, otherwise you're always going to have a window where you have something committed in the server, but not yet in the backup. So it's just a question of how wide that window is, how much data loss you can live with.
With file-based log shipping, you can get down to 1 second, by using the archive_timeout setting. It will produce a lot of log files with very little content in them, but they will compress well.
Yes, it is possible, but not recommended. My backup server's filesystem would explode :D
The record-based log shipping will give you a very narrow window, down
to < 1 second or even less if you're willing to poll the server that
often, but requires some custom development.
Correct, but this is not good enough, and I think there are a lot of people having the same problem.
It would be desirable to have an online streaming backup subsystem that can produce portable backup databases, to spare users from millions of archive files on the backup FTP server.
My idea is the following folder structure:
/pg/backup/

Backup 12/24/2007, version 1:
/pg/backup/12_24_2007/base/rcvry.rcv <--- base backup
/pg/backup/12_24_2007/changes/0001.chg <--- changed data
/pg/backup/12_24_2007/changes/0002.chg <--- changed data
/pg/backup/12_24_2007/changes/0003.chg <--- changed data

Backup 12/24/2008, version 2:
/pg/backup/12_24_2008/base/rcvry.rcv <--- base backup
/pg/backup/12_24_2008/changes/0001.chg <--- changed data
/pg/backup/12_24_2008/changes/0002.chg <--- changed data
/pg/backup/12_24_2008/changes/0003.chg <--- changed data
....
/pg/backup/12_24_2008/changes/0010.chg <--- changed data
/pg/backup/12_24_2008/changes/0001.rsf <--- recovery stripe-set file (10 MB), an addon to the base backup; then delete the *.chg files

If a stripe set of 10 *.chg files exists, they should be converted or merged into one larger recovery stripe file (*.rsf).
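No such merge step exists in PostgreSQL; purely as a hypothetical sketch of what is proposed above (the function name is invented, and the 10-file stripe size and the *.chg/*.rsf extensions come from the proposal):

```shell
#!/bin/sh
# Hypothetical merge of a full stripe set of *.chg files into one *.rsf
# recovery stripe file, as proposed above. Illustration only.
merge_stripeset() {
    changes_dir=$1   # e.g. /pg/backup/12_24_2008/changes (assumption)
    out_file=$2      # e.g. /pg/backup/12_24_2008/changes/0001.rsf
    set -- "$changes_dir"/*.chg
    # Require a complete stripe set of 10 change files before merging.
    [ -e "$1" ] && [ $# -ge 10 ] || return 1
    cat "$@" > "$out_file"    # merge into one recovery stripe file
    rm -f "$@"                # then delete the merged *.chg files
}
```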
Things to do:
1. An integrated FTP client in the postmaster process.
2. Online streaming backup logic inside the postmaster.
Apoc
So you want the user to still be connected to the failed machine, but at the same time be connected to the new live failover machine?
-----
Hannu
No.
The user should be connected to the running DB without restrictions while the backup is in progress.
Apoc
On one fine day, Thu, 2007-09-06 at 16:15, apoc9009 wrote:
So you want the user to still be connected to the failed machine, but at the same time be connected to the new live failover machine?
-----
Hannu
No.
The user should be connected to the running DB without restrictions while the backup is in progress.
And what's wrong with WAL-based backup then?
Especially the variant described in 23.4.4, which keeps the WAL copied to the backup machine up to the last second?
---------------
Hannu
On Thu, 2007-09-06 at 12:53 +0200, apoc9009 wrote:
You've either not read 23.4.4 or haven't understood it. If the text is unclear, documentation additions/changes are always welcome.
I have read this:
PostgreSQL directly supports file-based log shipping as described above. It is also possible to implement record-based log shipping, though this requires custom development.
We may expect that feature in later releases, but definitely not in 8.3.
I wish for a solution that backs up my database without data loss, without locking tables, without a shutdown, and without forcing any user to log out (a backup of a production system, online, without data loss).
I'm not clear whether you are looking for backup or high-availability replication.
There is no data loss with the online backup technique described in the
manual.
If you require HA replication with zero data-loss then you are currently
restricted to non-database techniques, which you already mentioned, so
you have your answer already.
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
Simon Riggs wrote:
On Thu, 2007-09-06 at 12:53 +0200, apoc9009 wrote:
You've either not read 23.4.4 or haven't understood it. If the text is unclear, documentation additions/changes are always welcome.
I have read this:
PostgreSQL directly supports file-based log shipping as described above. It is also possible to implement record-based log shipping, though this requires custom development.
We may expect that feature in later releases, but definitely not in 8.3.
I wish for a solution that backs up my database without data loss, without locking tables, without a shutdown, and without forcing any user to log out (a backup of a production system, online, without data loss).
I'm not clear whether you are looking for backup or high-availability replication.
There is no data loss with the online backup technique described in the manual.
No, there is a loss of data.
The WAL file must be generated by the postmaster, and this file must then be copied to the safe remote backup server. If the machine where the database is running crashes, the last WAL file is lost and the backup isn't complete; this is the center of the problem.
If you require HA replication with zero data-loss then you are currently
restricted to non-database techniques, which you already mentioned, so
you have your answer already.
We are talking about backup; this isn't the same as replication.
On 9/6/07, apoc9009 <apoc9009@yahoo.de> wrote:
Backup 12/24/2008, version 2:
/pg/backup/12_24_2008/base/rcvry.rcv <--- base backup
/pg/backup/12_24_2008/changes/0001.chg <--- changed data
/pg/backup/12_24_2008/changes/0002.chg <--- changed data
/pg/backup/12_24_2008/changes/0003.chg <--- changed data
....
/pg/backup/12_24_2008/changes/0010.chg <--- changed data
/pg/backup/12_24_2008/changes/0001.rsf <--- recovery stripe-set file (10 MB), an addon to the base backup; then delete the *.chg files
If a stripe set of 10 *.chg files exists, they should be converted or merged into one larger recovery stripe file (*.rsf).
Why? What does this actually do?
On Thu, Sep 06, 2007 at 04:08:10PM +0200, apoc9009 wrote:
archive_timeout setting. It will produce a lot of log files with very little content in them, but they will compress well.
Yes, it is possible, but not recommended. My backup server's filesystem would explode :D
. . .
Correct, but this is not good enough, and I think there are a lot of people having the same problem.
It would be desirable to have an online streaming backup subsystem that can produce portable backup databases, to spare users from millions of archive files on the backup FTP server.
It seems that what you want is near-real-time online backups with _no
cost_, which is not a feature that I think anyone will ever work on.
A
--
Andrew Sullivan | ajs@crankycanuck.ca
Users never remark, "Wow, this software may be buggy and hard
to use, but at least there is a lot of code underneath."
--Damien Katz
On Thu, 2007-09-06 at 17:53 +0300, Hannu Krosing wrote:
On one fine day, Thu, 2007-09-06 at 16:15, apoc9009 wrote:
So you want the user to still be connected to the failed machine, but at the same time be connected to the new live failover machine?
-----
Hannu
No.
The user should be connected to the running db without restrictions while the backup is in progress.
And what's wrong with WAL-based backup then?
Especially the variant described in 23.4.4, which keeps the WAL copied to the backup machine up to the last second?
I think the OP means
- synchronous replication
- ability to access the standby node for queries
Yes, both requests are planned for later releases.
Bruce, can we edit the TODO to include a section called "High
Availability"?
We already have a few scattered references to such things, but in Admin
and WAL. We need a specific section.
We currently have these items, that can be moved to the new section:
- Allow a warm standby system to also allow read-only statements
- Allow WAL traffic to be streamed to another server for stand-by
replication
The new section would look like this:
High Availability
-----------------
- Allow a warm standby system to also allow read-only statements
- Allow WAL traffic to be streamed to another server for stand-by
replication (synchronous/asynchronous options)
- Improve scalability of recovery for large I/O bound databases
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com