BDR Cluster vs DB Config

Started by Jonathan Eastgateover 9 years ago5 messagesgeneral

jonathan.eastgate@simpro.co

over 9 years ago

Hi everyone.

We've been testing BDR on and off for the last 2 years and are keen to
start looking at implementing it in production as it seems 0.93 has
resolved most of the issues we faced with it in the early days.

However there is still one item that makes it a difficult proposition...

DSN config per database.

Is there any way to configure BDR on a cluster wide basis so that all DB's
on a cluster are replicated via BDR instead of having to configure a
connection for each DB we want to replicate?

The problem we have is over 20 clusters with about 200 DB's per cluster and
growing constantly so this would make deploying BDR a painful process - if
we had to add a connection for each existing DB and then every new DB.

Is there a way around this or are there plans to make this type of config
available?

Thanks in advance.

*Jonathan J. Eastgate*
Chief Technology Officer | simPRO Software Group
Ph: 1300 139 467 +61 7 3147 8777

<http://simprogroup.com/email-signature-promo/>

Keep up to date with simPRO at: http://simprogroup.com/blog
The contents of this email are subject to our email disclaimer
<http://simprogroup.com/au/legal/email-confidentiality-notice>.

--
--

Alvaro Aguayo Garcia-Rada

aaguayo@opensysperu.com

over 9 years ago

In reply to: Jonathan Eastgate (#1)

Re: BDR Cluster vs DB Config

Hello. BDR works on a per-database basis, so there's nothing like what you are looking for. However, if you initialize a BDR custer with bdr_init_copy, you will get all existing databases added to replication. Then, as part of the creation of new databases, you can use bdr_group_join function, which will allow you to add the new database(even empty) to the cluster.

Regards,

Alvaro Aguayo
Jefe de Operaciones
Open Comb Systems E.I.R.L.

Oficina: (+51-1) 3377813 | RPM: #034252 / (+51) 995540103 | RPC: (+51) 954183248
Website: www.ocs.pe

Sent from my Sony Xperia™ smartphone

---- Jonathan Eastgate wrote ----

Hi everyone.

However there is still one item that makes it a difficult proposition...

DSN config per database.

Is there any way to configure BDR on a cluster wide basis so that all DB's
on a cluster are replicated via BDR instead of having to configure a
connection for each DB we want to replicate?

Is there a way around this or are there plans to make this type of config
available?

Thanks in advance.

*Jonathan J. Eastgate*
Chief Technology Officer | simPRO Software Group
Ph: 1300 139 467 +61 7 3147 8777

<http://simprogroup.com/email-signature-promo/>

Keep up to date with simPRO at: http://simprogroup.com/blog
The contents of this email are subject to our email disclaimer
<http://simprogroup.com/au/legal/email-confidentiality-notice>.

--
--

Craig Ringer

craig@2ndquadrant.com

over 9 years ago

In reply to: Jonathan Eastgate (#1)

Re: BDR Cluster vs DB Config

On 20 July 2016 at 13:22, Jonathan Eastgate <jonathan.eastgate@simpro.co>
wrote:

Hi everyone.

We've been testing BDR on and off for the last 2 years and are keen to
start looking at implementing it in production as it seems 0.93 has
resolved most of the issues we faced with it in the early days.

However there is still one item that makes it a difficult proposition...

DSN config per database.

Is there any way to configure BDR on a cluster wide basis so that all DB's
on a cluster are replicated via BDR instead of having to configure a
connection for each DB we want to replicate?

No.

Not only that, but if you're replicating lots of databases between
PostgreSQL instances you're likely to start facing some performance
problems around the sheer number of background workers required, the way
WAL needs to be processed multiple times, etc.

If you're using this for multi-tenancy or similar, see if you can isolate
by schema not by database.

The problem we have is over 20 clusters with about 200 DB's per cluster
and growing constantly so this would make deploying BDR a painful process -
if we had to add a connection for each existing DB and then every new DB.

Yeah. That's going to cause you pain even aside from the management of it.

Is there a way around this or are there plans to make this type of config
available?

There are no plans to automate this configuration. BDR works at a database
level, with the exception of bdr_init_copy bringing up all BDR-enabled
databases on the join target node as a one-time operation at setup.

Maybe once we eventually have some kind of answer for how to replicate
instance-global DDL that affects shared catalogs, like database
creation/drop, user creation/drop, etc, then it might make sense to extend
BDR or its successor to do this. But not at the moment.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Jonathan Eastgate

jonathan.eastgate@simpro.co

over 9 years ago

In reply to: Craig Ringer (#3)

Re: BDR Cluster vs DB Config

Thanks guys.

Very helpful - I was thinking we may need to look at moving to schemas
instead of individual db's.

I assume that once BDR is enabled on a database that any additional schemas
added post config are automatically included in BDR replication?

And so you see any issues having potentially 200 schemas within the DB -
performance or replication wise?

Thanks in advance.

*Jonathan J. Eastgate*
Chief Technology Officer | simPRO Software Group
Ph: 1300 139 467 +61 7 3147 8777

<http://simprogroup.com/email-signature-promo/>

Keep up to date with simPRO at: http://simprogroup.com/blog
The contents of this email are subject to our email disclaimer
<http://simprogroup.com/au/legal/email-confidentiality-notice>.

On Wed, Jul 20, 2016 at 5:18 PM, Craig Ringer <craig@2ndquadrant.com> wrote:

On 20 July 2016 at 13:22, Jonathan Eastgate <jonathan.eastgate@simpro.co>
wrote:

Hi everyone.

We've been testing BDR on and off for the last 2 years and are keen to
start looking at implementing it in production as it seems 0.93 has
resolved most of the issues we faced with it in the early days.

However there is still one item that makes it a difficult proposition...

DSN config per database.

Is there any way to configure BDR on a cluster wide basis so that all
DB's on a cluster are replicated via BDR instead of having to configure a
connection for each DB we want to replicate?

No.

Not only that, but if you're replicating lots of databases between
PostgreSQL instances you're likely to start facing some performance
problems around the sheer number of background workers required, the way
WAL needs to be processed multiple times, etc.

If you're using this for multi-tenancy or similar, see if you can isolate
by schema not by database.

The problem we have is over 20 clusters with about 200 DB's per cluster
and growing constantly so this would make deploying BDR a painful process -
if we had to add a connection for each existing DB and then every new DB.

Yeah. That's going to cause you pain even aside from the management of it.

Is there a way around this or are there plans to make this type of config
available?

There are no plans to automate this configuration. BDR works at a database
level, with the exception of bdr_init_copy bringing up all BDR-enabled
databases on the join target node as a one-time operation at setup.

Maybe once we eventually have some kind of answer for how to replicate
instance-global DDL that affects shared catalogs, like database
creation/drop, user creation/drop, etc, then it might make sense to extend
BDR or its successor to do this. But not at the moment.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
--

Martín Marqués

martin@2ndquadrant.com

over 9 years ago

In reply to: Jonathan Eastgate (#4)

Re: BDR Cluster vs DB Config

El 20/07/16 a las 20:06, Jonathan Eastgate escribió:

I assume that once BDR is enabled on a database that any additional
schemas added post config are automatically included in BDR replication?

All DDLs (CREATE SCHEMA ...) will be replicated to the other nodes, but
if you are asking if the data from newly created tables will
automatically start replicating, that depends on how you initially
configured your replication sets.

The short answer is, the default is to replicate all tables, new ones
included.

But if you plan on doing selective replication by creating a replication
set and adding tables there, then it's on you to add or not the new
tables to that set.

And so you see any issues having potentially 200 schemas within the DB -
performance or replication wise?

I don't see any possible performance degradation just for having the
tables spread in different schemata.

There will be one walsender for each downstream node, and one apply
worker for each upstream node, so you literally cut down to 1/200 the
amount of replication slots and replication connections on each node.
That's a huge performance win, particularly on 9.4.

Regards,

--
Martín Marqués http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general