postgresql scalability, active-active cluster

Started by brian stone · about 19 years ago · 6 messages · general
#1brian stone
skye0507@yahoo.com

Are there any built-in or third-party tools for distributing a PostgreSQL database? I need an active-active configuration: master-master with failover. The project I am working on needs to support a very large number of transactions per second. It would eventually require a mainframe or some absurd hardware, so it makes much more sense to consider a clustered configuration. DB requests come in from a row of application servers, and it would be nice if those requests could be distributed.

thanks,
skye


#2Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: brian stone (#1)
Re: postgresql scalability, active-active cluster

On Sun, Jan 21, 2007 at 06:55:56AM -0800, brian stone wrote:

Are there any built-in or third-party tools for distributing a PostgreSQL database? I need an active-active configuration: master-master with failover. The project I am working on needs to support a very large number of transactions per second. It would eventually require a mainframe or some absurd hardware, so it makes much more sense to consider a clustered configuration. DB requests come in from a row of application servers, and it would be nice if those requests could be distributed.

Take a look at Postgres-R or Continuent (which is commercial, but they
have an open-source version).
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#3Jeff Davis
pgsql@j-davis.com
In reply to: brian stone (#1)
Re: postgresql scalability, active-active cluster

On Sun, 2007-01-21 at 06:55 -0800, brian stone wrote:

Are there any built-in or third-party tools for distributing a
PostgreSQL database? I need an active-active configuration: master-
master with failover. The project I am working on needs to support a
very large number of transactions per second. It would eventually
require a mainframe or some absurd hardware, so it makes much more
sense to consider a clustered configuration. DB requests come in from
a row of application servers, and it would be nice if those requests
could be distributed.

Consider pgpool, and look at the partitioning feature (which uses
different machines for different records, allowing writes to happen very
quickly).

Also, depending on what your needs are, a relational database might not
be right for you. You say you need transactions, but if you're comparing
to MySQL's master-master, I don't think that even supports ACID
transactions. Do you actually need transactions, or just many writes per
second (perhaps to memory rather than disk)?

Regards,
Jeff Davis
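[The partitioning idea Jeff mentions, routing different records to different machines so writes spread across the cluster, can be sketched at the application level. This is only an illustrative model, not pgpool's actual implementation; the node names are made up.]

```python
import zlib

# Hypothetical backend nodes; in a real setup these would be the
# connection targets for the partitioned PostgreSQL instances.
NODES = ["node1", "node2", "node3"]

def route(key: str) -> str:
    """Pick the backend node responsible for a record key.

    CRC32 gives a stable hash, so the same key always maps to the
    same node, which is what makes partitioned writes consistent.
    """
    return NODES[zlib.crc32(key.encode()) % len(NODES)]
```

[Because the mapping is deterministic, every application server in the row computes the same node for a given key without any coordination.]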

#4brian stone
skye0507@yahoo.com
In reply to: Jeff Davis (#3)
Re: postgresql scalability, active-active cluster

I never considered MySQL because I really DO need transactions. MySQL also lacks many enterprise features we need; they claim to have them, but in my testing they were a bit undercooked.

I need atomic actions across N application servers. The goal here is scalability, which is why I brought up clustering. Unfortunately, we really do need those transactions :(

We have put thought into solutions other than a relational database, but they all ended up in the same place: we started designing a database. We basically need the features of a relational database, so the other solutions started to look like one.

thanks for the help,
skye



#5Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: brian stone (#4)
Re: postgresql scalability, active-active cluster

brian stone wrote:

I never considered MySQL because I really DO need transactions. MySQL
also lacks many enterprise features we need; they claim to have them,
but in my testing they were a bit undercooked.

I need atomic actions across N application servers. The goal here is
scalability, which is why I brought up clustering. Unfortunately, we
really do need those transactions :(

Maybe two-phase commit can help you here? (see the COMMIT PREPARED
docs)

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
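[The two-phase commit Alvaro suggests is exposed in PostgreSQL as PREPARE TRANSACTION / COMMIT PREPARED. A minimal sketch of the flow, assuming a hypothetical `accounts` table and a coordinator that drives all nodes; note that `max_prepared_transactions` must be set above zero for this to work:]

```sql
-- Phase 1, run on each participating node inside its transaction:
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
PREPARE TRANSACTION 'txn_42';  -- transaction is now durable but not committed

-- Phase 2, once every node has prepared successfully:
COMMIT PREPARED 'txn_42';

-- If any node failed to prepare, abort everywhere instead:
-- ROLLBACK PREPARED 'txn_42';
```

[The coordinator only issues COMMIT PREPARED after all nodes report a successful prepare, which is what gives the atomicity across machines that brian is asking for.]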

#6Jeff Davis
pgsql@j-davis.com
In reply to: Alvaro Herrera (#5)
Re: postgresql scalability, active-active cluster

On Mon, 2007-01-22 at 17:37 -0300, Alvaro Herrera wrote:


Maybe two-phase commit can help you here? (see the COMMIT PREPARED
docs)

How much of a concern are deadlocks?

Is the correct way to do this to serially execute all the statements on
Node1, then Node2, ... NodeN (in order) to avoid deadlocks?

If you used async execution on all nodes at once, and then re-issued any
deadlocked transactions, how rare would deadlocks need to be for that to
be faster?

Regards,
Jeff Davis
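[Jeff's question about executing on Node1, then Node2, ... NodeN in order is the classic resource-ordering strategy: if every transaction acquires the nodes it needs in one global order, the circular wait a deadlock requires cannot form. A toy model with each "node" as a lock, purely illustrative:]

```python
import threading

# One lock per node; acquiring them in sorted index order models
# "always touch Node1 before Node2 before NodeN".
node_locks = [threading.Lock() for _ in range(3)]
results = []

def run_txn(name, nodes):
    # Sort so every transaction takes node locks in the same global
    # order, ruling out the circular wait needed for a deadlock.
    for i in sorted(nodes):
        node_locks[i].acquire()
    try:
        results.append(name)
    finally:
        for i in sorted(nodes, reverse=True):
            node_locks[i].release()

# Without the sort, these opposite request orders could deadlock.
t1 = threading.Thread(target=run_txn, args=("t1", [0, 2]))
t2 = threading.Thread(target=run_txn, args=("t2", [2, 0]))
t1.start(); t2.start()
t1.join(); t2.join()
```

[The trade-off Jeff raises still stands: serial ordered execution is deadlock-free but slower, while optimistic parallel execution with retry wins only when deadlocks are rare enough that the retry cost stays below the serialization cost.]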