PostgreSQL HA config recommendations

Started by Alex Gregoryalmost 11 years ago9 messagesgeneral

alex@c2company.com

almost 11 years ago

Hello-

I have been doing lots of reading and I really want to make sure that I get this HA architecture I am working on correct. I figured the best way would be to reach out to the community for advice.

I am installing Cisco Jabber and want to use Postgres for the back end. The Postgres servers will be running on CentOS.

The requirement is to have two servers in HA using a database stored on shared NetApp Filer storage in California. A third server will be in Ireland for DR purposes. There only really needs to be one writeable server in California if it keeps things simple. Automatic conflict resolution should be easier this way causing lower overhead.

I was thinking that I could use Slony but then I read that it does not like WAN replication. I have also read about streaming replication native to Postgres but was not sure how that would work over the WAN. Bucardo seems better for Data Warehousing or multimaster situations which this is not. That leaves pgpool ii which seems like it would add an extra layer of complexity.

When it comes down to to there are so many choices I am not sure if I need one or a combination of two. Any help you could provide could be greatly appreciated.

Thanks for the assistance,

Alex

Jim Nasby

Jim.Nasby@BlueTreble.com

almost 11 years ago

In reply to: Alex Gregory (#1)

Re: PostgreSQL HA config recommendations

On 4/29/15 1:13 PM, Alex Gregory wrote:

I was thinking that I could use Slony but then I read that it does not like WAN replication. I have also read about streaming replication native to Postgres but was not sure how that would work over the WAN. Bucardo seems better for Data Warehousing or multimaster situations which this is not. That leaves pgpool ii which seems like it would add an extra layer of complexity.

WAN delays can cause problems for any replication system; you just have
to be aware of that and not push things too hard (or try and violate the
laws of physics). For example, streaming replication set to be
synchronous crossing the planet is something you'd probably be rather
unhappy with. :)

I haven't played with Slony in forever, but when I did it loved to lock
things. That would not play well with high latency.

I have run londiste between sites within the same city, and that worked
well.

Bucardo and pg_pool are both based on the idea of replaying SQL
statements instead of replicating actual data. They have their uses, but
I personally distrust that idea, especially for DR.

When it comes down to to there are so many choices I am not sure if I need one or a combination of two. Any help you could provide could be greatly appreciated.

If you want to replicate within a data center then streaming replication
is pretty nice, and as a bonus you might be able to do synchronous as
well. The downside to streaming rep is that it's binary, so if you ever
suffer data corruption you're practically guaranteed that corruption
will end up on the replica. Logical replication like londiste or Slony
are much more robust against that. You also can't use temporary tables
with streaming rep, and you have to replicate the details of ALL
activity, including maintenance like VACUUM. In some environments that
might be slower than logical replication.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Fabio Ugo Venchiarutti

fabio@vuole.me

almost 11 years ago

In reply to: Jim Nasby (#2)

Re: PostgreSQL HA config recommendations

WAN delays can cause problems for any replication system; you just have
to be aware of that and not push things too hard (or try and violate the
laws of physics). For example, streaming replication set to be
synchronous crossing the planet is something you'd probably be rather
unhappy with. :)

In my experience streaming replication fits most use cases due to
inherent its simplicity and robustness, but you might need to adjust
your software design to get the best out of it.

More specifically, latency issues can be heavily mitigated by having
application software commit no more than one transaction per user
operation, provided 1 x "master<->sync_slave round trip time" is
acceptable delay when they submit forms or the like.

It can get much worse if the application server is on a different
geographical node than the DB master. In such case it is realistically
beneficial to batch multiple write operations in a single STATEMENT instead.
If the replication synchronous slave is on yet another node, the best
case (single statement) scenario would be 2 x round trip time. This
configuration is more common than you might think as some setups feature
remote app servers reading off synchronous slaves at their own physical
location but committing against a master that is somewhere else.

Cheers

On 30/04/15 11:06, Jim Nasby wrote:

On 4/29/15 1:13 PM, Alex Gregory wrote:

I was thinking that I could use Slony but then I read that it does not
like WAN replication. I have also read about streaming replication
native to Postgres but was not sure how that would work over the WAN.
Bucardo seems better for Data Warehousing or multimaster situations
which this is not. That leaves pgpool ii which seems like it would
add an extra layer of complexity.

WAN delays can cause problems for any replication system; you just have
to be aware of that and not push things too hard (or try and violate the
laws of physics). For example, streaming replication set to be
synchronous crossing the planet is something you'd probably be rather
unhappy with. :)

I haven't played with Slony in forever, but when I did it loved to lock
things. That would not play well with high latency.

I have run londiste between sites within the same city, and that worked
well.

Bucardo and pg_pool are both based on the idea of replaying SQL
statements instead of replicating actual data. They have their uses, but
I personally distrust that idea, especially for DR.

When it comes down to to there are so many choices I am not sure if I
need one or a combination of two. Any help you could provide could
be greatly appreciated.

If you want to replicate within a data center then streaming replication
is pretty nice, and as a bonus you might be able to do synchronous as
well. The downside to streaming rep is that it's binary, so if you ever
suffer data corruption you're practically guaranteed that corruption
will end up on the replica. Logical replication like londiste or Slony
are much more robust against that. You also can't use temporary tables
with streaming rep, and you have to replicate the details of ALL
activity, including maintenance like VACUUM. In some environments that
might be slower than logical replication.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Greg Sabino Mullane

greg@turnstep.com

almost 11 years ago

In reply to: Jim Nasby (#2)

Re: PostgreSQL HA config recommendations

-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

Bucardo and pg_pool are both based on the idea of replaying SQL
statements instead of replicating actual data. They have their uses, but
I personally distrust that idea, especially for DR.

Actually, Bucardo is very data-based, not statement at all. It stores up
a list of affected rows, then COPYs those from the source databases
to the target ones. It's particularly strong when you have large
delays and many updates to the same row.

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201504301422
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAlVCcwIACgkQvJuQZxSWSshEUQCg0EXckq3oTtR81I6Dv1nE1pgI
hQgAoILoHlNNw5+oE7bPR/CNyF32T5Qh
=vp93
-----END PGP SIGNATURE-----

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

William Dunn

dunnwjr@gmail.com

almost 11 years ago

In reply to: Fabio Ugo Venchiarutti (#3)

Re: PostgreSQL HA config recommendations

Alex,
Note that you should be weary of suggestions to make your replication
synchronous. Synchronous replication is rarely used for this kind of use
case (Cisco Jabber) where the most complete durability of the standby is
not of the utmost concern (as it would be in a banking application). Not
only will it decrease performance, but since you expect to have only one
local standby it could actually decrease your availability because if your
standby went down no transactions would be able to commit on the master.
See the Synchronous Replication section of the docs for more details (
http://www.postgresql.org/docs/devel/static/warm-standby.html).

Also note that the suggestion provided by Fabio that you should not have
your application commit more than one transaction per user operation is
only applicable in synchronous replication (though since this is for a Cisco
Jabber, where you neither have control over nor much information regarding
the number of commits sent by the transaction per user operation, that
suggestion is not applicable anyway...). In the case of asynchronous
master-slave replication the typical issue with streaming replication
latency is that you have transactions going to the master and then the
application sends a read only transaction to the slave before the slave
receives the transaction. So long as you don't have the application
consider the user operation completed before all the transactions are
committed I don't think having multiple transactions would make your
replication latency issue any less.

For example, if you had a calendar application where a user enters event
details and creates an event for the calendar. The application may be set
up to execute 2 transactions, 1) Add the event and details to the calendar
events table and 2) once the event creation transaction returns add the
current user as an attendee for that event. In this case both transactions
would be going against the master, so how far the slave is behind wouldn't
be a factor. Of course it would be faster overall to send the inserts as a
single database procedure, but that all goes against the master database so
the streaming replication is not a factor in that consideration.

*William J. Dunn*
*P* 978-844-4427 | *dunnwjr@gmail.com <dunnwjr@gmail.com>*
*dunnw@bu.edu <dunnw@bu.edu>*

On Thu, Apr 30, 2015 at 9:02 AM, Fabio Ugo Venchiarutti <fabio@vuole.me>
wrote:

Show quoted text

WAN delays can cause problems for any replication system; you just have
to be aware of that and not push things too hard (or try and violate the
laws of physics). For example, streaming replication set to be
synchronous crossing the planet is something you'd probably be rather
unhappy with. :)

In my experience streaming replication fits most use cases due to inherent
its simplicity and robustness, but you might need to adjust your software
design to get the best out of it.

More specifically, latency issues can be heavily mitigated by having
application software commit no more than one transaction per user
operation, provided 1 x "master<->sync_slave round trip time" is acceptable
delay when they submit forms or the like.

It can get much worse if the application server is on a different
geographical node than the DB master. In such case it is realistically
beneficial to batch multiple write operations in a single STATEMENT instead.
If the replication synchronous slave is on yet another node, the best case
(single statement) scenario would be 2 x round trip time. This
configuration is more common than you might think as some setups feature
remote app servers reading off synchronous slaves at their own physical
location but committing against a master that is somewhere else.

Cheers

On 30/04/15 11:06, Jim Nasby wrote:

On 4/29/15 1:13 PM, Alex Gregory wrote:

I was thinking that I could use Slony but then I read that it does not
like WAN replication. I have also read about streaming replication
native to Postgres but was not sure how that would work over the WAN.
Bucardo seems better for Data Warehousing or multimaster situations
which this is not. That leaves pgpool ii which seems like it would
add an extra layer of complexity.

WAN delays can cause problems for any replication system; you just have
to be aware of that and not push things too hard (or try and violate the
laws of physics). For example, streaming replication set to be
synchronous crossing the planet is something you'd probably be rather
unhappy with. :)

I haven't played with Slony in forever, but when I did it loved to lock
things. That would not play well with high latency.

I have run londiste between sites within the same city, and that worked
well.

Bucardo and pg_pool are both based on the idea of replaying SQL
statements instead of replicating actual data. They have their uses, but
I personally distrust that idea, especially for DR.

When it comes down to to there are so many choices I am not sure if I

need one or a combination of two. Any help you could provide could
be greatly appreciated.

If you want to replicate within a data center then streaming replication
is pretty nice, and as a bonus you might be able to do synchronous as
well. The downside to streaming rep is that it's binary, so if you ever
suffer data corruption you're practically guaranteed that corruption
will end up on the replica. Logical replication like londiste or Slony
are much more robust against that. You also can't use temporary tables
with streaming rep, and you have to replicate the details of ALL
activity, including maintenance like VACUUM. In some environments that
might be slower than logical replication.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

William Dunn

dunnwjr@gmail.com

almost 11 years ago

In reply to: Fabio Ugo Venchiarutti (#3)

Re: PostgreSQL HA config recommendations

*William J. Dunn*
*willjdunn.com <http://willjdunn.com>*

*William J. Dunn*
*P* 978-844-4427 | *dunnwjr@gmail.com <dunnwjr@gmail.com>*
*dunnw@bu.edu <dunnw@bu.edu>*

On Thu, Apr 30, 2015 at 9:02 AM, Fabio Ugo Venchiarutti <fabio@vuole.me>
wrote:

Show quoted text

WAN delays can cause problems for any replication system; you just have
to be aware of that and not push things too hard (or try and violate the
laws of physics). For example, streaming replication set to be
synchronous crossing the planet is something you'd probably be rather
unhappy with. :)

In my experience streaming replication fits most use cases due to inherent
its simplicity and robustness, but you might need to adjust your software
design to get the best out of it.

More specifically, latency issues can be heavily mitigated by having
application software commit no more than one transaction per user
operation, provided 1 x "master<->sync_slave round trip time" is acceptable
delay when they submit forms or the like.

It can get much worse if the application server is on a different
geographical node than the DB master. In such case it is realistically
beneficial to batch multiple write operations in a single STATEMENT instead.
If the replication synchronous slave is on yet another node, the best case
(single statement) scenario would be 2 x round trip time. This
configuration is more common than you might think as some setups feature
remote app servers reading off synchronous slaves at their own physical
location but committing against a master that is somewhere else.

Cheers

On 30/04/15 11:06, Jim Nasby wrote:

On 4/29/15 1:13 PM, Alex Gregory wrote:

I was thinking that I could use Slony but then I read that it does not
like WAN replication. I have also read about streaming replication
native to Postgres but was not sure how that would work over the WAN.
Bucardo seems better for Data Warehousing or multimaster situations
which this is not. That leaves pgpool ii which seems like it would
add an extra layer of complexity.

WAN delays can cause problems for any replication system; you just have
to be aware of that and not push things too hard (or try and violate the
laws of physics). For example, streaming replication set to be
synchronous crossing the planet is something you'd probably be rather
unhappy with. :)

I haven't played with Slony in forever, but when I did it loved to lock
things. That would not play well with high latency.

I have run londiste between sites within the same city, and that worked
well.

Bucardo and pg_pool are both based on the idea of replaying SQL
statements instead of replicating actual data. They have their uses, but
I personally distrust that idea, especially for DR.

When it comes down to to there are so many choices I am not sure if I

need one or a combination of two. Any help you could provide could
be greatly appreciated.

If you want to replicate within a data center then streaming replication
is pretty nice, and as a bonus you might be able to do synchronous as
well. The downside to streaming rep is that it's binary, so if you ever
suffer data corruption you're practically guaranteed that corruption
will end up on the replica. Logical replication like londiste or Slony
are much more robust against that. You also can't use temporary tables
with streaming rep, and you have to replicate the details of ALL
activity, including maintenance like VACUUM. In some environments that
might be slower than logical replication.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Fabio Ugo Venchiarutti

fabio@vuole.me

almost 11 years ago

In reply to: William Dunn (#6)

Re: PostgreSQL HA config recommendations

Point taken. William is right.

My recommendations were unusually pessimistic as I didn't take enough
time to assess global+instantaneous data changes visibility requirements
on your part.

If the cluster is ONLY HA and you don't need to read fresh data off
secondary nodes (E.G.: HA+read load balancing), asynchronous is good
enough in most cases.

On 01/05/15 06:37, William Dunn wrote:

Alex,
Note that you should be weary of suggestions to make your replication
synchronous. Synchronous replication is rarely used for this kind of use
case (Cisco Jabber) where the most complete durability of the standby is
not of the utmost concern (as it would be in a banking application). Not
only will it decrease performance, but since you expect to have only one
local standby it could actually decrease your availability because if
your standby went down no transactions would be able to commit on the
master. See the Synchronous Replication section of the docs for more
details (http://www.postgresql.org/docs/devel/static/warm-standby.html).

Also note that the suggestion provided by Fabio that you should not have
your application commit more than one transaction per user operation is
only applicable in synchronous replication (though since this is for a
Cisco Jabber, where you neither have control over nor much information
regarding the number of commits sent by the transaction per user
operation, that suggestion is not applicable anyway...). In the case of
asynchronous master-slave replication the typical issue with streaming
replication latency is that you have transactions going to the master
and then the application sends a read only transaction to the slave
before the slave receives the transaction. So long as you don't have the
application consider the user operation completed before all the
transactions are committed I don't think having multiple transactions
would make your replication latency issue any less.

For example, if you had a calendar application where a user enters
event details and creates an event for the calendar. The application
may be set up to execute 2 transactions, 1) Add the event and
details to the calendar events table and 2) once the event creation
transaction returns add the current user as an attendee for that
event. In this case both transactions would be going against the
master, so how far the slave is behind wouldn't be a factor. Of
course it would be faster overall to send the inserts as a single
database procedure, but that all goes against the master database so
the streaming replication is not a factor in that consideration.

*William J. Dunn*
_willjdunn.com <http://willjdunn.com>_

*William J. Dunn*
*P* 978-844-4427 | _dunnwjr@gmail.com <mailto:dunnwjr@gmail.com>_
_dunnw@bu.edu <mailto:dunnw@bu.edu>_

On Thu, Apr 30, 2015 at 9:02 AM, Fabio Ugo Venchiarutti <fabio@vuole.me
<mailto:fabio@vuole.me>> wrote:

WAN delays can cause problems for any replication system; you just have
to be aware of that and not push things too hard (or try and violate the
laws of physics). For example, streaming replication set to be
synchronous crossing the planet is something you'd probably be rather
unhappy with. :)

In my experience streaming replication fits most use cases due to
inherent its simplicity and robustness, but you might need to adjust
your software design to get the best out of it.

More specifically, latency issues can be heavily mitigated by having
application software commit no more than one transaction per user
operation, provided 1 x "master<->sync_slave round trip time" is
acceptable delay when they submit forms or the like.

It can get much worse if the application server is on a different
geographical node than the DB master. In such case it is
realistically beneficial to batch multiple write operations in a
single STATEMENT instead.
If the replication synchronous slave is on yet another node, the
best case (single statement) scenario would be 2 x round trip time.
This configuration is more common than you might think as some
setups feature remote app servers reading off synchronous slaves at
their own physical location but committing against a master that is
somewhere else.

Cheers

On 30/04/15 11:06, Jim Nasby wrote:

On 4/29/15 1:13 PM, Alex Gregory wrote:

I was thinking that I could use Slony but then I read that
it does not
like WAN replication. I have also read about streaming
replication
native to Postgres but was not sure how that would work over
the WAN.
Bucardo seems better for Data Warehousing or multimaster
situations
which this is not. That leaves pgpool ii which seems like
it would
add an extra layer of complexity.

WAN delays can cause problems for any replication system; you
just have
to be aware of that and not push things too hard (or try and
violate the
laws of physics). For example, streaming replication set to be
synchronous crossing the planet is something you'd probably be
rather
unhappy with. :)

I haven't played with Slony in forever, but when I did it loved
to lock
things. That would not play well with high latency.

I have run londiste between sites within the same city, and that
worked
well.

Bucardo and pg_pool are both based on the idea of replaying SQL
statements instead of replicating actual data. They have their
uses, but
I personally distrust that idea, especially for DR.

When it comes down to to there are so many choices I am not
sure if I
need one or a combination of two. Any help you could
provide could
be greatly appreciated.

If you want to replicate within a data center then streaming
replication
is pretty nice, and as a bonus you might be able to do
synchronous as
well. The downside to streaming rep is that it's binary, so if
you ever
suffer data corruption you're practically guaranteed that corruption
will end up on the replica. Logical replication like londiste or
Slony
are much more robust against that. You also can't use temporary
tables
with streaming rep, and you have to replicate the details of ALL
activity, including maintenance like VACUUM. In some
environments that
might be slower than logical replication.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org
<mailto:pgsql-general@postgresql.org>)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Alex Gregory

alex@c2company.com

almost 11 years ago

In reply to: Fabio Ugo Venchiarutti (#7)

Re: PostgreSQL HA config recommendations

This is great. Thank you both for thinking about my situation deeper.

The secondary node in Ireland is purely for DR purposes only. It will be a read only copy that can be referred to only in the event that its absolutely needed.

I will investigate the asynchronous methods more deeply. I came across this matrix with some good information to go from:

https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling#Features_in_the_Core_of_PostgreSQL

Thanks,

Alex

Show quoted text

On Apr 30, 2015, at 6:03 PM, Fabio Ugo Venchiarutti <fabio@vuole.me> wrote:

Point taken. William is right.

My recommendations were unusually pessimistic as I didn't take enough time to assess global+instantaneous data changes visibility requirements on your part.

If the cluster is ONLY HA and you don't need to read fresh data off secondary nodes (E.G.: HA+read load balancing), asynchronous is good enough in most cases.

On 01/05/15 06:37, William Dunn wrote:

Alex,
Note that you should be weary of suggestions to make your replication
synchronous. Synchronous replication is rarely used for this kind of use
case (Cisco Jabber) where the most complete durability of the standby is
not of the utmost concern (as it would be in a banking application). Not
only will it decrease performance, but since you expect to have only one
local standby it could actually decrease your availability because if
your standby went down no transactions would be able to commit on the
master. See the Synchronous Replication section of the docs for more
details (http://www.postgresql.org/docs/devel/static/warm-standby.html).

Also note that the suggestion provided by Fabio that you should not have
your application commit more than one transaction per user operation is
only applicable in synchronous replication (though since this is for a
Cisco Jabber, where you neither have control over nor much information
regarding the number of commits sent by the transaction per user
operation, that suggestion is not applicable anyway...). In the case of
asynchronous master-slave replication the typical issue with streaming
replication latency is that you have transactions going to the master
and then the application sends a read only transaction to the slave
before the slave receives the transaction. So long as you don't have the
application consider the user operation completed before all the
transactions are committed I don't think having multiple transactions
would make your replication latency issue any less.

For example, if you had a calendar application where a user enters
event details and creates an event for the calendar. The application
may be set up to execute 2 transactions, 1) Add the event and
details to the calendar events table and 2) once the event creation
transaction returns add the current user as an attendee for that
event. In this case both transactions would be going against the
master, so how far the slave is behind wouldn't be a factor. Of
course it would be faster overall to send the inserts as a single
database procedure, but that all goes against the master database so
the streaming replication is not a factor in that consideration.

*William J. Dunn*
_willjdunn.com <http://willjdunn.com>_

*William J. Dunn*
*P* 978-844-4427 | _dunnwjr@gmail.com <mailto:dunnwjr@gmail.com>_
_dunnw@bu.edu <mailto:dunnw@bu.edu>_

On Thu, Apr 30, 2015 at 9:02 AM, Fabio Ugo Venchiarutti <fabio@vuole.me
<mailto:fabio@vuole.me>> wrote:

WAN delays can cause problems for any replication system; you just have
to be aware of that and not push things too hard (or try and violate the
laws of physics). For example, streaming replication set to be
synchronous crossing the planet is something you'd probably be rather
unhappy with. :)

In my experience streaming replication fits most use cases due to
inherent its simplicity and robustness, but you might need to adjust
your software design to get the best out of it.

More specifically, latency issues can be heavily mitigated by having
application software commit no more than one transaction per user
operation, provided 1 x "master<->sync_slave round trip time" is
acceptable delay when they submit forms or the like.

It can get much worse if the application server is on a different
geographical node than the DB master. In such case it is
realistically beneficial to batch multiple write operations in a
single STATEMENT instead.
If the replication synchronous slave is on yet another node, the
best case (single statement) scenario would be 2 x round trip time.
This configuration is more common than you might think as some
setups feature remote app servers reading off synchronous slaves at
their own physical location but committing against a master that is
somewhere else.

Cheers

On 30/04/15 11:06, Jim Nasby wrote:

On 4/29/15 1:13 PM, Alex Gregory wrote:

I was thinking that I could use Slony but then I read that
it does not
like WAN replication. I have also read about streaming
replication
native to Postgres but was not sure how that would work over
the WAN.
Bucardo seems better for Data Warehousing or multimaster
situations
which this is not. That leaves pgpool ii which seems like
it would
add an extra layer of complexity.

WAN delays can cause problems for any replication system; you
just have
to be aware of that and not push things too hard (or try and
violate the
laws of physics). For example, streaming replication set to be
synchronous crossing the planet is something you'd probably be
rather
unhappy with. :)

I haven't played with Slony in forever, but when I did it loved
to lock
things. That would not play well with high latency.

I have run londiste between sites within the same city, and that
worked
well.

Bucardo and pg_pool are both based on the idea of replaying SQL
statements instead of replicating actual data. They have their
uses, but
I personally distrust that idea, especially for DR.

When it comes down to to there are so many choices I am not
sure if I
need one or a combination of two. Any help you could
provide could
be greatly appreciated.

If you want to replicate within a data center then streaming
replication
is pretty nice, and as a bonus you might be able to do
synchronous as
well. The downside to streaming rep is that it's binary, so if
you ever
suffer data corruption you're practically guaranteed that corruption
will end up on the replica. Logical replication like londiste or
Slony
are much more robust against that. You also can't use temporary
tables
with streaming rep, and you have to replicate the details of ALL
activity, including maintenance like VACUUM. In some
environments that
might be slower than logical replication.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org
<mailto:pgsql-general@postgresql.org>)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Jim Nasby

Jim.Nasby@BlueTreble.com

almost 11 years ago

In reply to: Greg Sabino Mullane (#4)

Re: PostgreSQL HA config recommendations

On 4/30/15 1:23 PM, Greg Sabino Mullane wrote:

Bucardo and pg_pool are both based on the idea of replaying SQL
statements instead of replicating actual data. They have their uses, but
I personally distrust that idea, especially for DR.

Actually, Bucardo is very data-based, not statement at all. It stores up
a list of affected rows, then COPYs those from the source databases
to the target ones. It's particularly strong when you have large
delays and many updates to the same row.

Doh! My apologies.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general