restore a specific schema from physical backup

Started by Rakesh Kumarover 9 years ago6 messagesgeneral
Jump to latest
#1Rakesh Kumar
rakeshkumar464a3@gmail.com

If a cluster is backed up physically using pg_basebackup, how can we
restore only a particular schema from it. Is it even possible?

Thanks

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Rakesh Kumar (#1)
Re: restore a specific schema from physical backup

On 07/29/2016 02:16 PM, Rakesh Kumar wrote:

If a cluster is backed up physically using pg_basebackup, how can we
restore only a particular schema from it. Is it even possible?

Are you saying that?:

1) You ran pg_basebackup against a live cluster and sent the output to
another location.

2) At the other location the cluster is not in use.

3) You want to grab the contents of the inactive cluster directly off
the disk.

If that is the case, then no it is not possible without making the
cluster live.

If you mean something else then more details are needed.

Thanks

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Rakesh Kumar
rakeshkumar464a3@gmail.com
In reply to: Adrian Klaver (#2)
Re: restore a specific schema from physical backup

Are you saying that?:

1) You ran pg_basebackup against a live cluster and sent the output to
another location.

2) At the other location the cluster is not in use.

3) You want to grab the contents of the inactive cluster directly off the
disk.

If that is the case, then no it is not possible without making the cluster
live.

If you mean something else then more details are needed.

Sure.

1 - You ran pg_basebackup on node-1 against a live cluster and store
it on NFS or tape.
2 - Do a restore on node-2 from the backup taken on (1), but only for
a subset of the database
(schema/database)
3- Put the cluster live on node-2 after (2) completes. Essentially the
cluster will now be a small
subset of cluster on node-1.

Benefit: If I have to restore only 5% of entire db, it should be lot faster.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Rakesh Kumar (#3)
Re: restore a specific schema from physical backup

On 07/29/2016 02:31 PM, Rakesh Kumar wrote:

Are you saying that?:

1) You ran pg_basebackup against a live cluster and sent the output to
another location.

2) At the other location the cluster is not in use.

3) You want to grab the contents of the inactive cluster directly off the
disk.

If that is the case, then no it is not possible without making the cluster
live.

If you mean something else then more details are needed.

Sure.

1 - You ran pg_basebackup on node-1 against a live cluster and store
it on NFS or tape.
2 - Do a restore on node-2 from the backup taken on (1), but only for
a subset of the database
(schema/database)
3- Put the cluster live on node-2 after (2) completes. Essentially the
cluster will now be a small
subset of cluster on node-1.

Benefit: If I have to restore only 5% of entire db, it should be lot faster.

So no:

https://www.postgresql.org/docs/9.5/static/app-pgbasebackup.html
"pg_basebackup makes a binary copy of the database cluster files, while
making sure the system is put in and out of backup mode automatically.
Backups are always taken of the entire database cluster; it is not
possible to back up individual databases or database objects. For
individual database backups, a tool such as pg_dump must be used."

If you want to do that with built in tools then you will need to use
pg_dump.

I would suggest pg_dump -Fc as you have more options of what to do when
restoring objects. Remember pg_dump only works on individual databases
within a cluster.

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5David Steele
david@pgmasters.net
In reply to: Rakesh Kumar (#3)
Re: restore a specific schema from physical backup

On 7/29/16 5:31 PM, Rakesh Kumar wrote:

Are you saying that?:

1) You ran pg_basebackup against a live cluster and sent the output to
another location.

2) At the other location the cluster is not in use.

3) You want to grab the contents of the inactive cluster directly off the
disk.

If that is the case, then no it is not possible without making the cluster
live.

If you mean something else then more details are needed.

Sure.

1 - You ran pg_basebackup on node-1 against a live cluster and store
it on NFS or tape.
2 - Do a restore on node-2 from the backup taken on (1), but only for
a subset of the database
(schema/database)
3- Put the cluster live on node-2 after (2) completes. Essentially the
cluster will now be a small
subset of cluster on node-1.

Benefit: If I have to restore only 5% of entire db, it should be lot faster.

pgBackRest allows specified databases to be restored from a cluster backup:

http://www.pgbackrest.org/user-guide.html#restore/option-db-include

I know you are interested in schema-level restores but this is the
closest thing that I know of.

--
-David
david@pgmasters.net

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#6Stephen Frost
sfrost@snowman.net
In reply to: David Steele (#5)
Re: restore a specific schema from physical backup

* David Steele (david@pgmasters.net) wrote:

On 7/29/16 5:31 PM, Rakesh Kumar wrote:

Sure.

1 - You ran pg_basebackup on node-1 against a live cluster and store
it on NFS or tape.
2 - Do a restore on node-2 from the backup taken on (1), but only for
a subset of the database
(schema/database)
3- Put the cluster live on node-2 after (2) completes. Essentially the
cluster will now be a small
subset of cluster on node-1.

Benefit: If I have to restore only 5% of entire db, it should be lot faster.

pgBackRest allows specified databases to be restored from a cluster backup:

http://www.pgbackrest.org/user-guide.html#restore/option-db-include

I know you are interested in schema-level restores but this is the
closest thing that I know of.

We have discussed providing the ability to restore a subset of a
database from a physical backup, but it's far from trivial. Working out
what files contain the catalog requires first reading through
pg_filenode.map and then understanding the structures of the relevant
catalogs. Only then will you know what schemas and tables exist and
what their relfilenode's are, which is necessary to perform the restore
of those objects.

Of course, WAL replay still has to be performed also, to reach a
consistent backup point. We've worked out how to get that to work,
though if you have a lot of WAL then that can still take a bit of time
and disk space.

With sufficient interest and resources, we might be able to make it
happen, but I wouldn't expect it near-term. Until then, at least the
database-level option, as David mentioned, can be used.

Thanks!

Stephen