Methods to quickly spin up copies of an existing databases

Started by Arjun Ranadeabout 7 years ago2 messagesgeneral
Jump to latest
#1Arjun Ranade
ranade@nodalexchange.com

I'm working on a project that requires on-demand creation of a fresh
database as quick as possible (seconds). Essentially, this is a build
server that will require a cloned instance of Postgres to run unit tests
on. So the pattern of use would be:

- Build is triggered
- New postgres instance is created and connection details are provided
to build server
- Build runs unit tests that connect to the database
- Build completes, database is no longer needed and can be thrown away

The "template" database will be approximately 50gb in size so my question
is what is the fastest way to spin up new instances of this database? I've
thought of using "CREATE DATABASE WITH TEMPLATE..." but that takes too
long. I've also thought about cloning the $PGDATA directory, changing the
port number in postgresql.conf, and starting a new instance of postgres
pointing to the cloned data directory.

Both of these methods take longer than I'd like, so I'm wondering if anyone
else has this use case and what methods they've used to solve this.

Thanks,
Arjun

#2Kevin Wilkinson
w.kevin.wilkinson@gmail.com
In reply to: Arjun Ranade (#1)
how best to specify table constraints with "create table as <query>"

i want to recluster an immutable table without locking the table and
then add the table as a partition of a parent table. my plan was:

create table tbl_cpy as select * from tbl order by c1 asc;
alter table tbl_cpy add constraint c1 not null, c1>=lo c1<hi;
alter table tbl_parent attach partition tbl_cpy for values from (lo)
to (hi);

i want this to run quickly. so, i populate the table with a parallel
query (using create table as ...). i do NOT want attach partition to
rescan the table (which it does to validate the value range) so i have
to explicitly add constraints to tbl_cpy. but adding these constraints
itself causes a scan of the table.

so, i want to specify the constraints when the table is created. but
that seems not possible. if i first create an empty table with
constraints, then i cannot populate it with a parallel query (since
insert into is not parallel).

am i missing something? any ideas?

thanks,

kevin