Geographical redundancy

Started by Dennis · over 19 years ago · 10 messages · general
#1Dennis
aiwa_azca@yahoo.com

Is there any feasible way to achieve geographical redundancy of a PostgreSQL database?

Say you have a website which uses PG on the backend to read/write data, and you want to have the website running on 2 separate servers distributed geographically, with the data synchronizing somehow over the internet.

In case one data center fails, the website is still up and running from the 2nd geographical location (via, say, a 2nd DNS server).

Thank you.

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

#2Ben
bench@silentmedia.com
In reply to: Dennis (#1)
Re: Geographical redundancy

Sure, there are lots of ways. Factors that start constraining things
are:

- do both sites have to be online (making changes to the data) at the
same time?
- how tightly do both sites have to stay in sync?
- is data loss acceptable if one site suffers a disaster?
- what platform are you running on?
- how much throughput latency do you have between sites?
- how much downtime is acceptable in switching sites?


#3Dennis
aiwa_azca@yahoo.com
In reply to: Ben (#2)
Re: Geographical redundancy

- Yes, both sites have to be online and changing data at the same time.
- Data loss is unacceptable.
- Platform is Gentoo Linux.
- Downtime of up to 1 day is acceptable as long as there is no data loss.
- Throughput/latency: internet over a 10-megabit line.


#4Ben
bench@silentmedia.com
In reply to: Dennis (#3)
Re: Geographical redundancy

If you're sure that data loss is unacceptable no matter what happens
to either site, then I'm not aware of too many options. As I
understand it, pgpool can be configured to send data-altering queries
to multiple servers in order to simulate a multi-master cluster, but
it's never been clear to me how this works if you use any non-
deterministic functions like random(). I'm not aware of any other
multi-master synchronous replication solution.

The only option I can think of (and it's ugly so hopefully somebody
could suggest something more elegant) would be to make postgres sit
on top of something like DRBD, so that both sites can stay in sync
with every write to disk. Then pick a site at which to run your
database, having your application at both sites connect to the active
server. In case of a problem, start up postgres at the other site and
have your app reconnect.

This will be slow. It will eat up your bandwidth. It will add
complexity. But it won't lose data.
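The DRBD approach boils down to mirroring the block device under the Postgres data directory to the remote site, then promoting the secondary when disaster strikes. A rough sketch of the failover side; the resource name, device, and paths here are hypothetical, not a tested configuration:

```shell
# Sketch only -- resource/device names (r0, /dev/drbd0, /var/lib/postgresql)
# are placeholders; consult the DRBD documentation before relying on this.

# On the surviving site, after the old primary is confirmed dead:
drbdadm primary r0                      # promote the local DRBD resource
mount /dev/drbd0 /var/lib/postgresql    # mount the replicated data directory
/etc/init.d/postgresql start            # Gentoo init script; start Postgres
```

On startup Postgres performs normal crash recovery from its WAL, so any committed transaction whose writes reached the DRBD device should survive the switch.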


#5Dennis
aiwa_azca@yahoo.com
In reply to: Ben (#4)
Re: Geographical redundancy

I was thinking of maybe just having the 2nd location receive a PG dump (full or incremental) every so often (every hour to 6 hours), and if the main location fails majorly, restoring the PG cluster from the dump and switching DNS settings on the actual sites. I can make sure all website files are always in sync at both locations.
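That schedule could be as simple as a cron job piping pg_dump over SSH; a minimal sketch, assuming passwordless SSH to the standby and a database named mydb (host, database, and paths are all hypothetical):

```shell
# Hypothetical crontab entry: every 6 hours, dump in custom format and
# ship the result to the standby site.
0 */6 * * *  pg_dump -Fc mydb | ssh standby.example.com 'cat > /backups/mydb.dump'

# On the standby, after a failover, restore from the latest dump:
#   createdb mydb && pg_restore -d mydb /backups/mydb.dump
```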


#6Raymond O'Donnell
rod@iol.ie
In reply to: Dennis (#5)
Re: Geographical redundancy

On 30 Dec 2006 at 0:22, Dennis wrote:

I was thinking of maybe just having 2nd location receive a PG dump
(full or incremental) every so often (an hour to 6 hours) and if the

Just curious - how do you do an incremental dump? I had to do
something similar recently, and ended up effectively rolling my own
mini-pg_dump.

--Ray.

----------------------------------------------------------------------

Raymond O'Donnell
Director of Music, Galway Cathedral, Galway, Ireland
rod@iol.ie
----------------------------------------------------------------------
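A hand-rolled "incremental" dump of the kind Ray describes usually keys off a last-modified timestamp column; a minimal sketch, in which the table, column, and file names are all hypothetical:

```shell
# Export only rows changed since the last run, tracked in a local state file.
LAST=$(cat /var/lib/backup/last_run 2>/dev/null || echo '1970-01-01')
NOW=$(date '+%Y-%m-%d %H:%M:%S')
psql -d mydb -c "COPY (SELECT * FROM orders WHERE updated_at >= '$LAST') TO STDOUT" \
    > /backups/orders_incr.copy
echo "$NOW" > /var/lib/backup/last_run
# Caveats: this misses DELETEs, and updated_at must be maintained by a
# trigger or the application.  COPY (query) requires PostgreSQL 8.2+.
```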

#7Ben
bench@silentmedia.com
In reply to: Dennis (#5)
Re: Geographical redundancy

On Sat, 30 Dec 2006, Dennis wrote:

I was thinking of maybe just having 2nd location receive a PG dump (full
or incremental) every so often (an hour to 6 hours) and if the main
location fails majorly, restore the PG cluster from the dump and switch
DNS settings on the actual sites. I can make sure all website files are
always in sync on both locations.

Well, first off, you can just rsync your archived WAL files. That may be
easier than playing with pg_dump:

http://www.postgresql.org/docs/8.2/interactive/continuous-archiving.html

But second, and more important given your data-loss desires, if you do it
this way you have a window where you can experience data loss.
Specifically, after a transaction is committed, that commit will be at
risk until the next transfer has completed.
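The WAL-shipping setup above amounts to one postgresql.conf line plus a periodic base backup; host names and paths below are assumptions, not a tested configuration:

```shell
# postgresql.conf (8.x) -- ship each completed WAL segment off-site:
#   archive_command = 'rsync -a %p standby.example.com:/wal_archive/%f'

# Take a base backup periodically, bracketed by the backup functions:
psql -c "SELECT pg_start_backup('base')"
rsync -a /var/lib/postgresql/data/ standby.example.com:/base_backup/
psql -c "SELECT pg_stop_backup()"

# On failover, restore the base backup on the standby and replay the
# archived WAL via recovery.conf:
#   restore_command = 'cp /wal_archive/%f %p'
```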

#8Dennis
aiwa_azca@yahoo.com
In reply to: Ben (#7)
Re: Geographical redundancy

Well, I am mainly concerned with catastrophic failure. If the 1st (main) data center fails majorly (say fire, earthquake, the DB server dies, etc.), I need to be able to restore websites/data quickly in another location. If I get a data loss of, say, 6-12 hours during a major failure (which should never occur), I am OK with that.


#9Ben
bench@silentmedia.com
In reply to: Dennis (#8)
Re: Geographical redundancy

OK, well, accepting data loss (even if it is "just" 6-12 hours' worth)
really opens up a lot of possibilities... EXCEPT that you also said you
want both sites to be able to modify data. Again, there is no real
multi-master replication available for Postgres, so you'll have to have
both sites at least write to the same database server.


#10Markus Wanner
markus@bluegap.ch
In reply to: Dennis (#1)
Re: Geographical redundancy

Hello Dennis,

Dennis wrote:

Is there any feasible way to achieve geographical redundancy of postgresql database?

As nobody mentioned it up until now: please check the very nice
documentation about High Availability and Failover here:

http://www.postgresql.org/docs/8.2/static/high-availability.html

It explains most options and their pros and cons.

Regards

Markus