Is there a peer-to-peer server solution with PG?

Started by Mike Nolanabout 21 years ago16 messagesgeneral
Jump to latest
#1Mike Nolan
nolan@gw.tssi.com

I have need to set up a 2nd database server for a client in their new
offices in another state this month. We will be shutting down the old
offices later this year but we really don't want to have 2-3 days of
downtime while we physically transfer equipment 800 miles.

We should have decent data connections between the two offices starting
next week, but I was wonding if there is a good peer-to-peer option for
PostgreSQL at this time.

As I understand Slony, it is master-slave only.
--
Mike Nolan

#2David Fetter
david@fetter.org
In reply to: Mike Nolan (#1)
Re: Is there a peer-to-peer server solution with PG?

On Thu, Feb 03, 2005 at 06:25:50PM -0600, Mike Nolan wrote:

I have need to set up a 2nd database server for a client in their
new offices in another state this month. We will be shutting down
the old offices later this year but we really don't want to have 2-3
days of downtime while we physically transfer equipment 800 miles.

We should have decent data connections between the two offices
starting next week, but I was wonding if there is a good
peer-to-peer option for PostgreSQL at this time.

As I understand Slony, it is master-slave only.

Slony-1 is perfectly capable of replicating to a slave database, then
letting you decide to promote it to master, which is just what you'd
need. Why are you asking about multi-master?

Cheers,
D
--
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100 mobile: +1 415 235 3778

Remember to vote!

#3Mike Nolan
nolan@gw.tssi.com
In reply to: David Fetter (#2)
Re: Is there a peer-to-peer server solution with PG?

Slony-1 is perfectly capable of replicating to a slave database, then
letting you decide to promote it to master, which is just what you'd
need. Why are you asking about multi-master?

I am concerned that if I have to support the traffic to keep the slave
unit in sync PLUS support general database use from the 'slaved' office
to the master one, on the same comm line, I might start running into
congestion issues.

We will have people actively working the database in both office for
a period of several weeks to several months, depending on how the final
transfer plan unfolds.

Master/Slave is probably an acceptable solution, I was just wondering if
there was a multi-master one available yet.
--
Mike Nolan

#4David Fetter
david@fetter.org
In reply to: Mike Nolan (#3)
Re: Is there a peer-to-peer server solution with PG?

On Thu, Feb 03, 2005 at 07:03:36PM -0600, Mike Nolan wrote:

Slony-1 is perfectly capable of replicating to a slave database,
then letting you decide to promote it to master, which is just
what you'd need. Why are you asking about multi-master?

I am concerned that if I have to support the traffic to keep the
slave unit in sync PLUS support general database use from the
'slaved' office to the master one, on the same comm line, I might
start running into congestion issues.

Slony-1 does its level best to ship transactions in a compact way.
Any write operations are done as the net result of the write
transaction, not necessarily all the steps in between. IOW, don't
worry too much :)

We will have people actively working the database in both office for
a period of several weeks to several months, depending on how the
final transfer plan unfolds.

Sounds like a fit for Slony-1. Just make sure that nobody tries to
write to a slave, as such writes will fail.

Master/Slave is probably an acceptable solution, I was just
wondering if there was a multi-master one available yet.

Not really. If you *must* have multi-master, you probably have to get
Oracle or DB2 and pay /mucho dinero/.

Cheers,
D
--
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100 mobile: +1 415 235 3778

Remember to vote!

#5Chris Browne
cbbrowne@acm.org
In reply to: David Fetter (#2)
Re: Is there a peer-to-peer server solution with PG?

Martha Stewart called it a Good Thing when nolan@gw.tssi.com (Mike Nolan) wrote:

Slony-1 is perfectly capable of replicating to a slave database,
then letting you decide to promote it to master, which is just what
you'd need. Why are you asking about multi-master?

I am concerned that if I have to support the traffic to keep the
slave unit in sync PLUS support general database use from the
'slaved' office to the master one, on the same comm line, I might
start running into congestion issues.

We will have people actively working the database in both office for
a period of several weeks to several months, depending on how the
final transfer plan unfolds.

Master/Slave is probably an acceptable solution, I was just
wondering if there was a multi-master one available yet.

There is an effort under way; in planning stages at this point. Don't
expect that to be "productized" next month...

Let me wag a finger at one of your assumptions...

You should re-examine assumptions with great care if you start
imagining that you'll get more throughput out of a general purpose
"multimaster" system. (Something designed specifically for your
application is quite another matter, particularly if your application
turns out to be, in some fashion "embarassingly parallelizable.")

Synchronization can't _conceivably_ come for free; it has _got_ to
have some cost in terms of decreasing overall performance.

If you have so much update load that one server cannot accomodate that
load, then you should wonder why you'd expect that causing every one
of these updates to be applied to (say) 3 servers would "diminish"
this burden.

Each of the 3 servers may only have to take on 1/3 of the updates from
the outside, but they surely have to accomodate the other 2/3 as well.

This not to say that there can't be some benefits from multimaster
replication; that's why such projects are proceeding.

But it's NOT a panacea; it's NOT an easy "general purpose solution."

I was in a room with The Thinkers; I got the sense that the lights
dimmed for blocks around when they put their thinking caps on :-). To
this group of Rather Smart Folk, perceiving the array of concurrency
and locking problems required great attention on their part. 'Easy'
is definitely not the right word...
--
(reverse (concatenate 'string "gro.mca" "@" "enworbbc"))
http://linuxdatabases.info/info/slony.html
Rules of the Evil Overlord #31. "All naive, busty tavern wenches in my
realm will be replaced with surly, world-weary waitresses who will
provide no unexpected reinforcement and/or romantic subplot for the
hero or his sidekick." <http://www.eviloverlord.com/&gt;

#6Mike Nolan
nolan@gw.tssi.com
In reply to: Chris Browne (#5)
Re: Is there a peer-to-peer server solution with PG?

If you have so much update load that one server cannot accomodate that
load, then you should wonder why you'd expect that causing every one
of these updates to be applied to (say) 3 servers would "diminish"
this burden.

The update/query load isn't the real issue here, it's that these two
servers will be 800 miles apart and there are some advantages in having
each office connect to its local database rather than having one of
them connect to the remote master.

The Slony-1 approach will work, assuming I've got suffient network
bandwidth to support it plus the traffic from the remote office plus
exixting outside traffic from our public website.

That's one of those things you just don't know will work until you
have it built, so I'm looking for other options now while I have time
to consider them. Once I get on-site in two weeks it'll a lot more hectic.
--
Mike Nolan

#7Chris Browne
cbbrowne@acm.org
In reply to: David Fetter (#2)
Re: Is there a peer-to-peer server solution with PG?

Martha Stewart called it a Good Thing when david@fetter.org (David Fetter) wrote:

On Thu, Feb 03, 2005 at 07:03:36PM -0600, Mike Nolan wrote:

Slony-1 is perfectly capable of replicating to a slave database,
then letting you decide to promote it to master, which is just
what you'd need. Why are you asking about multi-master?

I am concerned that if I have to support the traffic to keep the
slave unit in sync PLUS support general database use from the
'slaved' office to the master one, on the same comm line, I might
start running into congestion issues.

Slony-1 does its level best to ship transactions in a compact way.
Any write operations are done as the net result of the write
transaction, not necessarily all the steps in between. IOW, don't
worry too much :)

Sorta. If there were SQL queries involving in _preparing_ for the
writes, those queries do not need to be run again.

On the other hand, if you run a SQL query like:

delete from t1 where id in (select id from t1 limit
7000);

(Which is a query I have recently used for some testing...)

you'll discover that ultimately turns into somewhere around 7000
delete statements when it hits the replica.

-> An insert of 7000 rows becomes 7000 insert statements
-> A delete of 7000 rows becomes 7000 delete statements
-> An update to 7000 rows becomes 7000 update statements

Mass updates can therefore get fairly expensive, alas.

We will have people actively working the database in both office
for a period of several weeks to several months, depending on how
the final transfer plan unfolds.

Sounds like a fit for Slony-1. Just make sure that nobody tries to
write to a slave, as such writes will fail.

... Which is actually a Remarkably Good Feature. I once pointed a
report that wanted to update data to a wrong node, and would have been
Seriously Chagrined if it had silently gone along with the updates...
--
output = ("cbbrowne" "@" "acm.org")
http://linuxdatabases.info/info/slony.html
"...In my phone conversation with Microsoft's lawyer I copped to the
fact that just maybe his client might see me as having been in the
past just a bit critical of their products and business
practices. This was too bad, he said with a sigh, because they were
having a very hard time finding a reporter who both knew the industry
well enough to be called an expert and who hadn't written a negative
article about Microsoft." -- Robert X. Cringely

#8Marco Colombo
pgsql@esiway.net
In reply to: Mike Nolan (#6)
Re: Is there a peer-to-peer server solution with PG?

On Fri, 4 Feb 2005, Mike Nolan wrote:

If you have so much update load that one server cannot accomodate that
load, then you should wonder why you'd expect that causing every one
of these updates to be applied to (say) 3 servers would "diminish"
this burden.

The update/query load isn't the real issue here, it's that these two
servers will be 800 miles apart and there are some advantages in having
each office connect to its local database rather than having one of
them connect to the remote master.

Considering that the two masters need to be connected, I don't see the
advantage. Any write on the _local_ master will have to wait for the
_remote_ master to complete as well.

The Slony-1 approach will work, assuming I've got suffient network
bandwidth to support it plus the traffic from the remote office plus
exixting outside traffic from our public website.

Local read-only access won't travel on the network, both with multi-master
and with master-slave.

Write traffic _will_ be transmitted over the wire, both with multi-master
and with master-slave. With multi-master _every write operation_ will be
remotely duplicated, _both ways_, _synchronously_. That is, master-1 has
to wait for master-2 and vice versa. If you're concerned with network
performances, multi-master will only make it worse.

With master-slave, _only_ the write operations performed on
the slave side need to travel, since clients will perform them directly
on the master. On the master side writes are only local.
Of course, the _results_ of the writes will have to be propagated to the
slave (and thus they travel on the network as well), but that's another
matter (delay is usually acceptable, and fits MVCC - the semantics not
broken).

If you're willing to break semantics, you may run two splitted masters
and find a way to synchronize them. But that requires application level
knowledge, if not human intervention, to resolve conflicts.

That's one of those things you just don't know will work until you
have it built, so I'm looking for other options now while I have time
to consider them. Once I get on-site in two weeks it'll a lot more hectic.
--
Mike Nolan

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

.TM.
--
____/ ____/ /
/ / / Marco Colombo
___/ ___ / / Technical Manager
/ / / ESI s.r.l.
_____/ _____/ _/ Colombo@ESI.it

#9Tzahi Fadida
tzahi_ml@myrealbox.com
In reply to: Mike Nolan (#6)
Re: Is there a peer-to-peer server solution with PG?

I am just a newbie but logically:
Maybe the answer to that is much simpler.
Ask your network officer to tell you whats the bandwidth you
have on your current office and remote office.
whats the avg:
a. website bandwidth.
b. current postgress office bandwidth.

I never used replication but it seems to me you'll need
a+2*b bandwidth at your current office and 2*b at your remote office
for the period of transition.
If your db size is C then you'll need (C/b)/3600 hrs in transition time.
do the math and if it fits great. If not, well...

Regards,
tzahi.

Show quoted text

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Mike Nolan
Sent: Friday, February 04, 2005 12:57 PM
To: Christopher Browne
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Is there a peer-to-peer server
solution with PG?

If you have so much update load that one server cannot

accomodate that

load, then you should wonder why you'd expect that causing

every one

of these updates to be applied to (say) 3 servers would "diminish"
this burden.

The update/query load isn't the real issue here, it's that
these two servers will be 800 miles apart and there are some
advantages in having each office connect to its local
database rather than having one of them connect to the remote
master.

The Slony-1 approach will work, assuming I've got suffient
network bandwidth to support it plus the traffic from the
remote office plus
exixting outside traffic from our public website.

That's one of those things you just don't know will work
until you have it built, so I'm looking for other options now
while I have time to consider them. Once I get on-site in
two weeks it'll a lot more hectic.
--
Mike Nolan

---------------------------(end of
broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index
scan if your
joining column's datatypes do not match

#10NTPT
ntpt@centrum.cz
In reply to: Tzahi Fadida (#9)
How many connections now ?

Is there a way to determine how many connections to the database are active
?
some form of select ? ie how many client application are connected to
server ? I need this value for client application ...

Thank for help.

#11Devrim GÜNDÜZ
devrim@gunduz.org
In reply to: NTPT (#10)
Re: How many connections now ?

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

On Sat, 5 Feb 2005, NTPT wrote:

Is there a way to determine how many connections to the database are active
?
some form of select ? ie how many client application are connected to server
? I need this value for client application ...

Check pg_stat_activity table...

Regards,
- --
Devrim GUNDUZ
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.tdmsoft.com http://www.gunduz.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQFCBKz9tl86P3SPfQ4RAiTpAKDwUUuvju8xLhAjFzCKYsF4WQp7JACgrpQC
hhw0sDz2XDHElHCfBZFP8gs=
=pYgE
-----END PGP SIGNATURE-----

#12Jan Wieck
JanWieck@Yahoo.com
In reply to: Mike Nolan (#6)
Re: Is there a peer-to-peer server solution with PG?

On 2/4/2005 5:56 AM, Mike Nolan wrote:

If you have so much update load that one server cannot accomodate that
load, then you should wonder why you'd expect that causing every one
of these updates to be applied to (say) 3 servers would "diminish"
this burden.

The update/query load isn't the real issue here, it's that these two
servers will be 800 miles apart and there are some advantages in having
each office connect to its local database rather than having one of
them connect to the remote master.

You do realize that any multimaster replication system, that is designed
to avoind complex business process structure based conflict resolution
mechanisms, necessarily has to be based on 2 phase commit or similar? So
your global write transaction throughput will be limited by the latency
of your WAN, no matter what bandwidth you have. And as per RFC 1925: No
matter how hard you push and no matter what the priority, you can't
increase the speed of light.

I think what you are really looking for is an application internal
abstraction layer based multmaster replication approach.

Jan

The Slony-1 approach will work, assuming I've got suffient network
bandwidth to support it plus the traffic from the remote office plus
exixting outside traffic from our public website.

That's one of those things you just don't know will work until you
have it built, so I'm looking for other options now while I have time
to consider them. Once I get on-site in two weeks it'll a lot more hectic.
--
Mike Nolan

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#13Chris Browne
cbbrowne@acm.org
In reply to: Mike Nolan (#6)
Re: Is there a peer-to-peer server solution with PG?

Quoth JanWieck@Yahoo.com (Jan Wieck):

On 2/4/2005 5:56 AM, Mike Nolan wrote:

If you have so much update load that one server cannot accomodate that
load, then you should wonder why you'd expect that causing every one
of these updates to be applied to (say) 3 servers would "diminish"
this burden.

The update/query load isn't the real issue here, it's that these two
servers will be 800 miles apart and there are some advantages in having
each office connect to its local database rather than having one of
them connect to the remote master.

You do realize that any multimaster replication system, that is
designed to avoind complex business process structure based conflict
resolution mechanisms, necessarily has to be based on 2 phase commit
or similar? So your global write transaction throughput will be
limited by the latency of your WAN, no matter what bandwidth you
have. And as per RFC 1925: No matter how hard you push and no matter
what the priority, you can't increase the speed of light.

I think what you are really looking for is an application internal
abstraction layer based multmaster replication approach.

Note also that there can be some "embarassingly parallel" systems that
can scale _perfectly well_ given some reasonable 'global sequencing
system.' ("Global sequences" seemed to be the only 'surprise' that
popped up at the conference, which surprised me because it was one of
the few things I was entirely certain needed discussion ;-).)

Consider a general ledger transaction system for a retail operation
with 200 stores.

Each accounting transaction actually is pretty independent of the
others; with suitable application design, it's perfectly reasonable to
generate transactions locally at each site and [somehow; there lies
the grand detail] roll those together into the central G/L at the end.

Purely accounting transactions don't usually have any need to conflict
with anything at all.

Of course, as soon as you have _shared_ objects in your business, such
as account balances [that people can lay claim to], inventory [that we
can promise to customers], and 8 people that want to change prices,
then comes the trouble of conflict resolution.

You've got to think about which of these sorts of conflicts do and do
not exist before heading down any of the "multimaster" roads otherwise
trouble awaits...

It's quite likely for reality to need to be a mix. For account
balances, you more than likely need to go to only ONE source;
"conflict resolution" would be liable to 'break the bank.' For
inventory, it's probably not unreasonable for different sites to fight
over it :-).
--
let name="cbbrowne" and tld="acm.org" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/slony.html
What's another word for synonym?

In reply to: Jan Wieck (#12)
Re: Is there a peer-to-peer server solution with PG?

You do realize that any multimaster replication system, that is
designed to avoind complex business process structure based conflict
resolution mechanisms, necessarily has to be based on 2 phase commit
or similar? So your global write transaction throughput will be
limited by the latency of your WAN, no matter what bandwidth you have.
And as per RFC 1925: No matter how hard you push and no matter what
the priority, you can't increase the speed of light.

I think what you are really looking for is an application internal
abstraction layer based multmaster replication approach.

Hi,

I found a paper about Clustra DB
http://www.nuug.no/pub/dist/20011017-clustra.pdf

Clustra is a cluster database for high-availability, any node has one
other fail-over node designed. But the interesting idea is that the
data is distributed across the node.
A transaction implies many nodes, but is managed by a two phase commit
and the log is written on ONLY few (two) nodes.
I suppose (because I never used it) that there is a speed boost.

Cordialement,
Jean-Gérard Pailloncy

#15Chris Browne
cbbrowne@acm.org
In reply to: Mike Nolan (#6)
Re: Is there a peer-to-peer server solution with PG?

jg@rilk.com (Pailloncy Jean-Gerard) writes:

I suppose (because I never used it) that there is a speed boost.

I would suppose (because synchronization has substantial costs) that
there is more than likely _no_ material improvement in performance.

"Clustering" can only provide material performance improvements if the
problem turns out to be "embarrassingly parallelizable." That is a
rather difficult condition to satisfy, and it is very likely to be
FALSE whenever there is need for conflict resolution between nodes.
--
"cbbrowne","@","ca.afilias.info"
<http://linuxdatabases.info/info/slony.html&gt;
Christopher Browne
(416) 673-4124 (land)

#16Robert Treat
xzilla@users.sourceforge.net
In reply to: Jan Wieck (#12)
Re: Is there a peer-to-peer server solution with PG?

On Sat, 2005-02-05 at 15:03, Jan Wieck wrote:

On 2/4/2005 5:56 AM, Mike Nolan wrote:

If you have so much update load that one server cannot accomodate that
load, then you should wonder why you'd expect that causing every one
of these updates to be applied to (say) 3 servers would "diminish"
this burden.

The update/query load isn't the real issue here, it's that these two
servers will be 800 miles apart and there are some advantages in having
each office connect to its local database rather than having one of
them connect to the remote master.

You do realize that any multimaster replication system, that is designed
to avoind complex business process structure based conflict resolution
mechanisms, necessarily has to be based on 2 phase commit or similar? So
your global write transaction throughput will be limited by the latency
of your WAN, no matter what bandwidth you have. And as per RFC 1925: No
matter how hard you push and no matter what the priority, you can't
increase the speed of light.

I think the advantage Mike is looking for is to not have his READ
traffic have to travel 1600 miles for the remote office. If the read's
outnumber the writes by enough, he might have something to gain.

Mike, I've yet to see a thorough review of daffodil replicator but it
may be able to help get you to a little closer to what your looking for.
If you have time please check it out and see if it can be of any help,
I'm sure many of us would be interested in hearing some feedback on it.
http://www.daffodildb.com/dbreplicator.html

Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL