Incremental backup

Started by Martín Marqués, over 23 years ago, 25 messages, hackers

martin@bugs.unl.edu.ar

over 23 years ago

How is this issue progressing in the 7.4 development tree?
I saw it on the TODO list, but didn't find much in the archives of this
mailing list.

--
Why use just any relational database,
when you can use PostgreSQL?
-----------------------------------------------------------------
Martín Marqués | mmarques@unl.edu.ar
Programmer, Administrator, DBA | Centro de Telematica
Universidad Nacional
del Litoral
-----------------------------------------------------------------

Bruce Momjian

bruce@momjian.us

over 23 years ago

In reply to: Martín Marqués (#1)

Re: Incremental backup

Someone at Red Hat is working on point-in-time recovery, also known as
incremental backups. It will be in 7.4.


---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Christopher Kings-Lynne

chriskl@familyhealth.com.au

over 23 years ago

In reply to: Bruce Momjian (#2)

Re: Incremental backup

Someone at Red Hat is working on point-in-time recovery, also known as
incremental backups. It will be in 7.4.

Does that mean that the poor guy/gal is implementing redo for all the index
types?

Chris

Bruce Momjian

bruce@momjian.us

over 23 years ago

In reply to: Christopher Kings-Lynne (#3)

Re: Incremental backup

Christopher Kings-Lynne wrote:

Someone at Red Hat is working on point-in-time recovery, also known as
incremental backups. It will be in 7.4.

Does that mean that the poor guy/gal is implementing redo for all the index
types?

No idea.


Patrick Macdonald

patrickm@redhat.com

over 23 years ago

In reply to: Bruce Momjian (#2)

Re: Incremental backup

Bruce Momjian wrote:

Someone at Red Hat is working on point-in-time recovery, also known as
incremental backups.

PITR and incremental backup are different beasts. PITR deals with a backup
+ logs. Incremental backup deals with a full backup + X smaller/incremental
backups.

So... it doesn't look like anyone is working on incremental backup at the
moment.

Cheers,
Patrick


Bruce Momjian

bruce@momjian.us

over 23 years ago

In reply to: Patrick Macdonald (#5)

Re: Incremental backup

Patrick Macdonald wrote:

PITR and incremental backup are different beasts. PITR deals with a backup
+ logs. Incremental backup deals with a full backup + X smaller/incremental
backups.

So... it doesn't look like anyone is working on incremental backup at the
moment.

But why would someone want incremental backups compared to PITR? The
backup would be a mixture of INSERTs, UPDATEs, and DELETEs, right? Seems
pretty weird. :-)


Patrick Macdonald

patrickm@redhat.com

over 23 years ago

In reply to: Bruce Momjian (#6)

Re: Incremental backup

Bruce Momjian wrote:

But why would someone want incremental backups compared to PITR? The
backup would be mixture of INSERTS, UPDATES, and DELETES, right? Seems
pretty weird. :-)

Yeah, it's a different method of producing a similar outcome. However, many
companies do not want to be concerned with the management (and space)
of archived logs. Incremental backup allows them the option of performing
a full backup and then only backing up the modifications on a regular basis.
When it's time to restore, they'll restore the full backup and then the
proper sequence of incremental backups.
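The restore order described above can be sketched as follows. This is only an illustration, not how any particular database implements it: the `Backup` class, the page-number keys, and the logical `taken_at` timestamps are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Backup:
    """A hypothetical backup image: page number -> page contents."""
    taken_at: int                                  # logical time of the backup
    pages: Dict[int, bytes] = field(default_factory=dict)


def restore(full: Backup, incrementals: List[Backup]) -> Dict[int, bytes]:
    """Restore the full backup, then overlay the incrementals oldest first."""
    database = dict(full.pages)
    for inc in sorted(incrementals, key=lambda b: b.taken_at):
        if inc.taken_at <= full.taken_at:
            raise ValueError("incremental predates the full backup")
        database.update(inc.pages)                 # newer pages replace older ones
    return database


full = Backup(taken_at=0, pages={0: b"a0", 1: b"b0", 2: b"c0"})
monday = Backup(taken_at=1, pages={1: b"b1"})
tuesday = Backup(taken_at=2, pages={1: b"b2", 2: b"c2"})

# The incrementals may be handed over in any order; restore() sorts them.
restored = restore(full, [tuesday, monday])
```

Applying the incrementals out of order would leave stale pages behind, which is why the "proper sequence" matters.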

Cheers,
Patrick

Bruce Momjian

bruce@momjian.us

over 23 years ago

In reply to: Patrick Macdonald (#7)

Re: Incremental backup

Patrick Macdonald wrote:

Yeah, it's a different method of producing a similar outcome. However, many
companies do not want to be concerned with the management (and space)
of archived logs. Incremental backup allows them the option of performing
a full backup and then only backing up the modifications on a regular basis.
When it's time to restore, they'll restore the full backup and then the
proper sequence of incremental backups.

Wow, I never even thought that was possible. Do other db's support that
feature?


Rod Taylor

rbt@rbt.ca

over 23 years ago

In reply to: Bruce Momjian (#8)

Re: Incremental backup

Wow, I never even thought that was possible. Do other db's support that
feature?

Isn't that basically what the current replication kits for PostgreSQL do,
via triggers and log tables?

--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc

#10

Patrick Macdonald

patrickm@redhat.com

over 23 years ago

In reply to: Bruce Momjian (#8)

Re: Incremental backup

Bruce Momjian wrote:

Wow, I never even thought that was possible. Do other db's support that
feature?

I know Oracle and DB2 have incremental backup in their arsenal (and iirc,
SQL Server has something called "differential backup"). Whatever the name,
it's a win at the enterprise level.

Cheers,
Patrick

#11

Manfred Koizar

mkoi-pg@aon.at

over 23 years ago

In reply to: Patrick Macdonald (#10)

Re: Incremental backup

On Thu, 13 Feb 2003 19:24:13 -0500, Patrick Macdonald
<patrickm@redhat.com> wrote:

I know Oracle and DB2 have incremental backup in their arsenal (and iirc,
SQL Server has something called "differential backup"). Whatever the name,
it's a win at the enterprise level.

"A differential backup copies only the database pages that have been
modified after the last full database backup."

This could be doable using XLogRecPtr pd_lsn in the page headers, but
I don't see an easy way to do it on a live database.
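A rough sketch of the page-selection idea, assuming each page carries the log position of its last modification (standing in for `pd_lsn`). The `Page` type and `differential_backup` function are hypothetical names for illustration only, and the sketch deliberately ignores the hard part raised above: pages changing while a live database is being scanned.

```python
from typing import Dict, NamedTuple


class Page(NamedTuple):
    lsn: int        # stand-in for pd_lsn: log position of the page's last change
    data: bytes


def differential_backup(pages: Dict[int, Page], last_full_lsn: int) -> Dict[int, Page]:
    """Return only the pages modified after the last full backup."""
    return {number: page for number, page in pages.items()
            if page.lsn > last_full_lsn}


pages = {
    0: Page(lsn=100, data=b"x"),   # untouched since the full backup
    1: Page(lsn=250, data=b"y"),   # modified afterwards
    2: Page(lsn=300, data=b"z"),   # modified afterwards
}
changed = differential_backup(pages, last_full_lsn=200)
```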

Servus
Manfred

#12

Martín Marqués

martin@bugs.unl.edu.ar

over 23 years ago

In reply to: Bruce Momjian (#6)

Re: Incremental backup

On Thu 13 Feb 2003 16:38, Bruce Momjian wrote:

But why would someone want incremental backups compared to PITR? The
backup would be mixture of INSERTS, UPDATES, and DELETES, right? Seems
pretty weird. :-)

Good backup systems, such as Informix (the one I used), don't do a query
backup but a page backup. What I mean is that it looks for pages in the
system that have changed since the last full backup and backs them up.

That's how an incremental backup works. PITR is another thing, which is even
more important. :-)

#13

Bruce Momjian

bruce@momjian.us

over 23 years ago

In reply to: Martín Marqués (#12)

Re: Incremental backup

OK, once we have PITR, will anyone want incremental backups?


#14

Martín Marqués

martin@bugs.unl.edu.ar

over 23 years ago

In reply to: Bruce Momjian (#13)

Re: Incremental backup

On Fri 14 Feb 2003 09:52, Bruce Momjian wrote:

OK, once we have PITR, will anyone want incremental backups?

I will probably not need it, but I know of people who have databases that
produce dumps of more than 20GB.
They are interested in live incremental backups.

#15

Greg Copeland

greg@CopelandConsulting.Net

over 23 years ago

In reply to: Bruce Momjian (#13)

Re: Incremental backup

On Fri, 2003-02-14 at 06:52, Bruce Momjian wrote:

OK, once we have PITR, will anyone want incremental backups?

I do imagine for some people it will register high on their list.

--
Greg Copeland <greg@copelandconsulting.net>
Copeland Computer Consulting

#16

Kevin Brown

kevin@sysexperts.com

over 23 years ago

In reply to: Bruce Momjian (#13)

Re: Incremental backup

Bruce Momjian wrote:

OK, once we have PITR, will anyone want incremental backups?

None of my database references (Date's "Introduction to Database
Systems" and Garcia-Molina's "Database Systems - The Complete Book",
in particular) seem to talk about PITR at all. At least, there's no
index entry for it. And a google search for "point in time recovery"
yields mostly marketing fluff.

Is there a good reference for this that someone can point me to? I'm
interested in exactly how it'll work, especially in terms of how logs
are stored versus the main data store, effects on performance, etc.

Thanks, and sorry for the newbie question. :-(

--
Kevin Brown kevin@sysexperts.com

#17

Chris Browne

cbbrowne@acm.org

over 23 years ago

In reply to: Kevin Brown (#16)

Re: Incremental backup

A long time ago, in a galaxy far, far away, kevin@sysexperts.com (Kevin Brown) wrote:

Bruce Momjian wrote:

OK, once we have PITR, will anyone want incremental backups?

None of my database references (Date's "Introduction to Database
Systems" and Garcia-Molina's "Database Systems - The Complete Book",
in particular) seem to talk about PITR at all. At least, there's no
index entry for it. And a google search for "point in time recovery"
yields mostly marketing fluff.

Well, from an "academic DBMS" standpoint, it isn't terribly
interesting, since it involves assumptions of messy imperfection that
academics prefer to avoid.

And that's not intended to insult the academics; it is often
reasonable to leave that "out of scope" much as an academic OS
researcher might prefer to try to avoid putting attention on things
like binary linkers, text file editors, and SCM systems like CVS,
which, while terribly important from a practical standpoint, don't
make for interesting OS research.

Is there a good reference for this that someone can point me to?
I'm interested in exactly how it'll work, especially in terms of how
logs are stored versus the main data store, effects on performance,
etc.

Thanks, and sorry for the newbie question. :-(

Unfortunately, the best sources I can think of are in the "O-Word"
literature, and the /practical/ answers require digging into really
messy bits of the documentation.

What it amounts to is that anyone who isn't a "near-O*****-guru" would be
strongly advised not to engage in PITR activity.

It doesn't surprise me overly that the documentation is poor: those
that can't figure it out despite the challenges almost surely
shouldn't be using the functionality...

What PITR generally consists of is the notion that you want to recover
to the state at a particular moment in time.

In O*****-nomenclature, this means that you recover as of some earlier
moment for which you have a good backup, and then re-apply changes,
which, in their terms, are kept in "archive logs," which are somewhat
analogous to WAL files.
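As a toy illustration of "restore a good backup, then re-apply changes up to a chosen moment", the sketch below replays a log onto a base state and stops at a target time. Everything here (`LogRecord`, `recover_to`, the key/value model of a database) is a simplifying assumption for the example, not a description of any real archive-log or WAL format.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class LogRecord(NamedTuple):
    timestamp: int            # logical commit time of the change
    key: str
    value: Optional[int]      # None models a delete


def recover_to(base: Dict[str, int], log: Iterable[LogRecord],
               target_time: int) -> Dict[str, int]:
    """Start from the base backup and re-apply changes up to target_time."""
    state = dict(base)
    for record in sorted(log, key=lambda r: r.timestamp):
        if record.timestamp > target_time:
            break                              # stop just before the unwanted changes
        if record.value is None:
            state.pop(record.key, None)
        else:
            state[record.key] = record.value
    return state


base = {"accounts": 10}
log = [
    LogRecord(1, "accounts", 11),
    LogRecord(2, "orders", 5),
    LogRecord(3, "accounts", None),            # the change we want to undo
]
before_mistake = recover_to(base, log, target_time=2)
```

Replaying the whole log instead reproduces the latest state, delete included.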
--
If this was helpful, <http://svcs.affero.net/rm.php?r=cbbrowne> rate me
http://www3.sympatico.ca/cbbrowne/x.html
"We blew it -- too big, too slow..." - Bill Gates talking about NT, as
noted by Steven McGeady of Intel during a meeting with Gates

#18

Kevin Brown

kevin@sysexperts.com

over 23 years ago

In reply to: Chris Browne (#17)

Re: Incremental backup

Christopher Browne wrote:

What PITR generally consists of is the notion that you want to recover
to the state at a particular moment in time.

In O*****-nomenclature, this means that you recover as of some earlier
moment for which you have a good backup, and then re-apply changes,
which, in their terms, are kept in "archive logs," which are somewhat
analogous to WAL files.

Yeah, that's pretty much what I figured.

Oracle has something they call "rollback segments" which I assume are
separate bits of data that have enough information to reverse changes
that were made to the database during a transaction, and I figured
PITR would (or could) apply particular saved rollback segments to the
current state in order to "roll back" a table, tablespace, or database
to the state it was in at a particular point in time.

As it is, it sounds like PITR is a bit less refined than I expected.

So the relevant question is: how is *our* PITR going to work? In
particular, how is it going to interact with our WAL files and the
table store? If I'm not mistaken, right now (well, as of 7.2 anyway)
we round robin through a fixed set of WAL files. For PITR, I assume
we'd need an archivelog function that would copy the WAL files as
they're checkpointed to some other location (with destination names
that reflect their order in time), just for starters.
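The archiving step guessed at above might look roughly like this. It is a sketch under stated assumptions: `archive_segment`, the sequence-number prefix, and the file names are invented for illustration and do not correspond to any existing PostgreSQL function or naming scheme.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def archive_segment(segment: Path, archive_dir: Path, sequence: int) -> Path:
    """Copy a finished log segment into the archive under a name that sorts by age."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{sequence:010d}_{segment.name}"
    if target.exists():
        raise FileExistsError(f"{target} is already archived")
    shutil.copy2(segment, target)
    return target


def archived_in_order(archive_dir: Path) -> List[str]:
    """List archived segments oldest first (zero-padded prefixes sort correctly)."""
    return sorted(p.name for p in archive_dir.iterdir())


with tempfile.TemporaryDirectory() as tmp:
    wal_dir = Path(tmp) / "wal"
    archive = Path(tmp) / "archive"
    wal_dir.mkdir()
    segment = wal_dir / "segment_a"            # one slot, reused round-robin
    for seq, payload in enumerate([b"first", b"second"], start=1):
        segment.write_bytes(payload)           # the slot is overwritten each cycle
        archive_segment(segment, archive, seq)
    archived = archived_in_order(archive)
```

Because the live file is recycled, the archive copy has to be taken before the slot is overwritten, and the sequence prefix is what preserves the replay order.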

It'd be *awfully* nice if you could issue a command to roll a table
(or, perhaps, a tablespace, if you've got a bunch of foreign keys and
such) back to a particular point in time, from the command line, with
no significant advance preparation (so long as the required files are
still around, and if they're not then abort the operation with the
appropriate error message). But it doesn't sound like that's what
we're talking about when we talk about PITR...

I wouldn't expect the O***** docs to be particularly revealing about
how the database manages PITR at the file level, but if it does, would
you happen to know where so I can look at it? What I've seen so far
is very basic and not very revealing at all...

--
Kevin Brown kevin@sysexperts.com

#19

Curt Sampson

cjs@cynic.net

over 23 years ago

In reply to: Bruce Momjian (#13)

Re: Incremental backup

On Fri, 14 Feb 2003, Bruce Momjian wrote:

OK, once we have PITR, will anyone want incremental backups?

Well, I'm not entirely clear on how PITR will work, so I may be off-base
here, but it seems to me that offering incremental backups that back
up only changed pages might not be all that big a win, given how
postgres writes its pages. On DBMSs that don't use MVCC, if you change a
particular item in a row ten times, one page is changed. If you do it in
postgres, you could well be changing ten pages, as the system writes the
two copies of the entire row wherever it can find space. So in databases
where a lot of rows are changed, where an incremental backup would
normally be a win because it would be much smaller than the logs over a
given period, it isn't going to be with postgres.

But you know, if we could get rid of redundant changes in the logs we're
using for backup, that could save a lot of space in a situation like
the one I described above. If a particular row and column is changed
fifty times over the course of a month, it's going to be recorded fifty
times in the log. But there's really no need for all fifty of those,
if you don't mind not being able to restore to any time before the
current time. You can reduce the size of the logs you need to store
for backup by throwing away the first forty-nine of those changes, and
keeping only the most recent version. There shouldn't be any worries
about referential integrity, because when you do a restore, you start
with a full backup that is ok, and once you've successfully applied
all the transactions in the log, you know it will be ok again, so any
intermediate states during the restore where integrity is not maintained
are not a problem.
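The log-trimming idea can be sketched as keeping only the last recorded change per row and column. The `Change` record and `compact` function are hypothetical, and the sketch assumes each log entry simply overwrites one column value. As the paragraph itself notes, a log compacted this way can only restore to its end point, not to any earlier moment.

```python
from typing import Dict, List, NamedTuple, Tuple


class Change(NamedTuple):
    seq: int        # position in the log
    row: int
    column: str
    value: int


def compact(log: List[Change]) -> List[Change]:
    """Keep only the most recent change for each (row, column) pair."""
    latest: Dict[Tuple[int, str], Change] = {}
    for change in sorted(log, key=lambda c: c.seq):
        latest[(change.row, change.column)] = change   # later entries win
    return sorted(latest.values(), key=lambda c: c.seq)


log = [
    Change(1, 7, "balance", 10),
    Change(2, 7, "balance", 20),
    Change(3, 8, "balance", 5),
    Change(4, 7, "balance", 30),
]
compacted = compact(log)
```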

cjs
--
Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC

#20

Tom Lane

tgl@sss.pgh.pa.us

over 23 years ago

In reply to: Curt Sampson (#19)

Re: Incremental backup

Curt Sampson <cjs@cynic.net> writes:

... But there's really no need for all fifty of those,
if you don't mind not being able to restore to any time before the
current time.

Which, of course, is exactly the point of PITR designs.

When you know that your assistant trainee DBA deleted most of your
database with a mistyped command last Tuesday evening around 8pm,
it is cold comfort to know that your database has faithfully preserved
his committed changes. You want to get back to where you were Tuesday
afternoon, or preferably Tuesday evening 7:59pm. This is what PITR
setups can do for you.

If you don't feel you need PITR capability, fine ... but don't tell
the people who want it that they have no need for it.

regards, tom lane

#21

Curt Sampson

cjs@cynic.net

over 23 years ago

In reply to: Kevin Brown (#18)

#22

Curt Sampson

cjs@cynic.net

over 23 years ago

In reply to: Tom Lane (#20)