Re: Methods to quickly spin up copies of an existing databases
if you are able/willing to use ZFS (rather than ext4, xfs, ...) to store
your database, then it might work for you. ZFS is copy-on-write so it
can very quickly clone a database.
kevin
Show quoted text
On 3/1/2019 12:08 PM, Arjun Ranade wrote:
I'm working on a project that requires on-demand creation of a fresh
database as quick as possible (seconds). Essentially, this is a build
server that will require a cloned instance of Postgres to run unit
tests on. So the pattern of use would be:* Build is triggered
* New postgres instance is created and connection details are
provided to build server
* Build runs unit tests that connect to the database
* Build completes, database is no longer needed and can be thrown awayThe "template" database will be approximately 50gb in size so my
question is what is the fastest way to spin up new instances of this
database? I've thought of using "CREATE DATABASE WITH TEMPLATE..."
but that takes too long. I've also thought about cloning the $PGDATA
directory, changing the port number in postgresql.conf, and starting a
new instance of postgres pointing to the cloned data directory.Both of these methods take longer than I'd like, so I'm wondering if
anyone else has this use case and what methods they've used to solve this.Thanks,
Arjun
Import Notes
Reply to msg id not found: CANrrCRxLYcoXw_VWD10jEYCAwYb-wX8AKSFxe8vaFDUHpnYS_Q@mail.gmail.comReference msg id not found: CANrrCRxLYcoXw_VWD10jEYCAwYb-wX8AKSFxe8vaFDUHpnYS_Q@mail.gmail.com
On Fri, Mar 01, 2019 at 11:57:30AM -0800, Kevin Wilkinson wrote:
if you are able/willing to use ZFS (rather than ext4, xfs, ...) to
store your database, then it might work for you. ZFS is
copy-on-write so it can very quickly clone a database.kevin
Hi Arjun
Redhat 7 does have LVM snapshots that does something similar. Kevin is
correct, COW is the secret.
Regards,
Ken
Kenneth Marshall <ktm@rice.edu> writes:
On Fri, Mar 01, 2019 at 11:57:30AM -0800, Kevin Wilkinson wrote:
if you are able/willing to use ZFS (rather than ext4, xfs, ...) to
store your database, then it might work for you. ZFS is
copy-on-write so it can very quickly clone a database.kevin
Hi Arjun
Redhat 7 does have LVM snapshots that does something similar. Kevin is
correct, COW is the secret.
Going a bit further...
Any sort of storage backend that can support *atomic* snapshots across
*all* volumes (in case multiple tablespaces ar involved), can be used to
permit $instantaneous cloning where instantaneous relates to the actual
snapshot time and crash recovery.
Inability to make *atomic* snaps but perhaps seperate snaps very
quickly, combined with PITR can result in clones of high-churn systems
sized in TBs (as in our use case) to be provisioned in about 1 minute.
Nothing but the most trivial system can be cloned rapidly and perhaps
any number of times in succession without employment of
thin-provisioning, copy-on-write (as mentioned already), etc.
Virtual copy is more and more compelling as physical
size, or more precisely, *physical* copy time grow.
HTH
Regards,
Ken
--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
Apologies for the low tech suggestion, but if this really is a clone of a
previously existing template, could the clone operation just be done ahead
of time? I.e., have the build server keep X copies ready for use and
generate additional copies as those are consumed, so that the cloning is no
longer on the critical path?
On Fri, Mar 1, 2019 at 11:09 AM Jerry Sievers <gsievers19@comcast.net>
wrote:
Show quoted text
Kenneth Marshall <ktm@rice.edu> writes:
On Fri, Mar 01, 2019 at 11:57:30AM -0800, Kevin Wilkinson wrote:
if you are able/willing to use ZFS (rather than ext4, xfs, ...) to
store your database, then it might work for you. ZFS is
copy-on-write so it can very quickly clone a database.kevin
Hi Arjun
Redhat 7 does have LVM snapshots that does something similar. Kevin is
correct, COW is the secret.Going a bit further...
Any sort of storage backend that can support *atomic* snapshots across
*all* volumes (in case multiple tablespaces ar involved), can be used to
permit $instantaneous cloning where instantaneous relates to the actual
snapshot time and crash recovery.Inability to make *atomic* snaps but perhaps seperate snaps very
quickly, combined with PITR can result in clones of high-churn systems
sized in TBs (as in our use case) to be provisioned in about 1 minute.Nothing but the most trivial system can be cloned rapidly and perhaps
any number of times in succession without employment of
thin-provisioning, copy-on-write (as mentioned already), etc.Virtual copy is more and more compelling as physical
size, or more precisely, *physical* copy time grow.HTH
Regards,
Ken--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
Pre-copying is not really an option since we could potentially need 1-X
instances so it needs to be scalable. XFS also allows for "cp --reflink"
which I could do on a PGDATA directory and then change the port number.
That's probably the method I'll try first.
We do use barman, but again a barman recover operation is pretty much just
an rsync.
I will first try XFS, then ZFS, and if those don't work, I'll look into a
SAN option.
On Fri, Mar 1, 2019 at 4:29 PM Bruce Klein <brucek@gmail.com> wrote:
Show quoted text
Apologies for the low tech suggestion, but if this really is a clone of a
previously existing template, could the clone operation just be done ahead
of time? I.e., have the build server keep X copies ready for use and
generate additional copies as those are consumed, so that the cloning is no
longer on the critical path?On Fri, Mar 1, 2019 at 11:09 AM Jerry Sievers <gsievers19@comcast.net>
wrote:Kenneth Marshall <ktm@rice.edu> writes:
On Fri, Mar 01, 2019 at 11:57:30AM -0800, Kevin Wilkinson wrote:
if you are able/willing to use ZFS (rather than ext4, xfs, ...) to
store your database, then it might work for you. ZFS is
copy-on-write so it can very quickly clone a database.kevin
Hi Arjun
Redhat 7 does have LVM snapshots that does something similar. Kevin is
correct, COW is the secret.Going a bit further...
Any sort of storage backend that can support *atomic* snapshots across
*all* volumes (in case multiple tablespaces ar involved), can be used to
permit $instantaneous cloning where instantaneous relates to the actual
snapshot time and crash recovery.Inability to make *atomic* snaps but perhaps seperate snaps very
quickly, combined with PITR can result in clones of high-churn systems
sized in TBs (as in our use case) to be provisioned in about 1 minute.Nothing but the most trivial system can be cloned rapidly and perhaps
any number of times in succession without employment of
thin-provisioning, copy-on-write (as mentioned already), etc.Virtual copy is more and more compelling as physical
size, or more precisely, *physical* copy time grow.HTH
Regards,
Ken--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
On Fri, Mar 01, 2019 at 04:51:32PM -0500, Arjun Ranade wrote:
Pre-copying is not really an option since we could potentially need 1-X
instances so it needs to be scalable.
Would it work to have a PG instance with a ready-to-use
cluster/DB inside a VM, and make a copy of that as needed ?
The copies then get thrown away.
Same here, pre-copying a bunch of VMs would help. Disk space
is (apparently) cheaper than time (for your use case).
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B