Re: replication from multiple "master" servers to a single read-only slave

Started by Omar Mehmoodabout 16 years ago4 messagesgeneral
Jump to latest
#1Omar Mehmood
omarmehmood@yahoo.com

Thanks for the suggestions.

I really don't want to use separate schemas for each master to logically partition the data. I ensure that the data on each master will not clash with each other (in terms of any DB level contraints such as PK), so I'd much prefer they all reside in a single schema. Also, my understanding is that Slony uses DB triggers to track changes (but I want to avoid using DB triggers).

Another additional constraint-- the master servers may not always have connectivity to the slave machine, so the chosen mechanism needs to be robust and not assume 100% uptime.

For truly simple, I could do a periodic data dump of the database on the server, ship to slave, and restore. However, this is way overkill since there won't be that many changes in the data for the period of time that it will run (e.g. every 15-20 minutes). Plus, I might end up in the situation where the slave will start to increasingly lag behind over time (depends on the number of master servers and amount of data).

I will check out Bucardo.

Omar

--- On Mon, 1/11/10, Ben Chobot <bench@silentmedia.com> wrote:
Show quoted text

From: Ben Chobot <bench@silentmedia.com>
Subject: Re: [GENERAL] replication from multiple "master" servers to a single read-only slave
To: "Omar Mehmood" <omarmehmood@yahoo.com>
Date: Monday, January 11, 2010, 6:58 PM
I'm pretty sure you can do this with
Bucardo, and I *think* you can do it via Slony, if you're
willing to use seperate schemas for each master. I know you
cannot do this with PITR.

On Jan 11, 2010, at 4:02 PM, Omar Mehmood wrote:

I'm wondering if it's possible to have a setup with

multiple "master" servers replicating to a single
slave.  I can guarantee that each server will generate
unique PK values for all tables and all the data is
partitioned (logically by server) across the servers. 
I would simply like to have a read-only slave that is a
picture of all the servers' data (relatively up to
date).  The individual "master" servers never need to
know about each other's data (i.e. they do not _need_ to
"sync" with each other, nor do I want them to be sync'd).

Would it be possible to use PostgreSQL PITR feature to

support this functionality ?  All of the data
created/updated/deleted per server is unique to that server,
so replaying the log to the slave should technically be safe
and the replaying logs from multiple servers should be safe
as well (as long as the relative order of replay is
preserved).  I'm just wondering how to get around the
numbering of the log (WAL) files and the slave's tracking of
the log files that it has already processed.

I can certainly write my own application log module

that runs on each server, ship over the log to the slave
machine and replay the logs to the slave (in the meanwhile
ensuring that the order of replay is preserved and all that
good stuff), but I'm trying to find a quick(er) solution for
the short term.

Please note that I'd like to avoid using PostgreSQL

data partitioning as well as any DB triggers (in case anyone
was going to go down that path as part of the solution).

Omar

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Omar Mehmood (#1)

Omar Mehmood <omarmehmood@yahoo.com> writes:

I really don't want to use separate schemas for each master to
logically partition the data. I ensure that the data on each master
will not clash with each other (in terms of any DB level contraints
such as PK), so I'd much prefer they all reside in a single schema.
Also, my understanding is that Slony uses DB triggers to track changes
(but I want to avoid using DB triggers).

If you want to avoid all PostgreSQL features… well I don't see that I
can help you here.

If you were to change your mind the following document talks about how
to federate data from several databases to the same central one, and
using either inheritance or triggers to move the incoming data from the
N origin schemas to the central one.

I guess how the data gets to being available for your central queries is
not solved by refusing to use a schema per origin server.

http://wiki.postgresql.org/wiki/Londiste_Tutorial#Federated_database

Another additional constraint-- the master servers may not always have
connectivity to the slave machine, so the chosen mechanism needs to be
robust and not assume 100% uptime.

Londiste fits this need.

Regards,
--
dim

#3Joshua Tolley
eggyknap@gmail.com
In reply to: Omar Mehmood (#1)
Re: replication from multiple "master" servers to a single read-only slave

On Mon, Jan 11, 2010 at 04:18:30PM -0800, Omar Mehmood wrote:

(but I want to avoid using DB triggers).

<snip>

I will check out Bucardo.

Bucardo uses triggers just like Slony does. That said, it seems strange that
you'd want to avoid them. Is there any particular reason you want to avoid
them?

Bucardo should handle the disconnection problems you described just fine.

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

#4Omar Mehmood
omarmehmood@yahoo.com
In reply to: Joshua Tolley (#3)

Thanks for note. I'm trying to avoid triggers based on past trauma from going into a client setup that uses a zillion of them and having to wait in order of seconds for a simple mod DML statement to execute :) I've been looking for an alternative that either sits above the database level (without building my own) or hoping there is a binary logging option. Anyway, I've been reading up on triggers, replication and synchronization options for PostgreSQL and it seems like the performance impact isn't too bad. In particular, I've been reading up on Bucardo and it seems to address all my needs-- I also fired off an email to Selena D on it and she's also affirmed that it looks to solve my problem. I'm going to setup an environment for testing and then post my questions directly to the bucardo-users mailing list.

Omar

--- On Tue, 1/12/10, Joshua Tolley <eggyknap@gmail.com> wrote:
Show quoted text

From: Joshua Tolley <eggyknap@gmail.com>
Subject: Re: [GENERAL] replication from multiple "master" servers to a single read-only slave
To: "Omar Mehmood" <omarmehmood@yahoo.com>
Cc: "Ben Chobot" <bench@silentmedia.com>, pgsql-general@postgresql.org
Date: Tuesday, January 12, 2010, 2:12 PM
On Mon, Jan 11, 2010 at 04:18:30PM
-0800, Omar Mehmood wrote:

(but I want to avoid using DB triggers).

<snip>

I will check out Bucardo.

Bucardo uses triggers just like Slony does. That said, it
seems strange that
you'd want to avoid them. Is there any particular reason
you want to avoid
them?

Bucardo should handle the disconnection problems you
described just fine.

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com