Replicating changes

Started by Alban Hertroysover 19 years ago6 messagesgeneral
Jump to latest
#1Alban Hertroys
alban@magproductions.nl

Hello list,

We're investigating a ways to replicate changes to the database to
several "outside" systems.

Some filtering will need to take place, as not all changes are allowed
to go to all systems. Whether the system receiving the changes stores
them in a database or not isn't particularly relevant to us. We're only
interested in getting the data out in a specified format.

The changes in the data that needs to be replicated can come in bursts
of several thousands of records. Performance is important, but
reliability as well. We need to get those changes to a number of 3rd
parties, in a format that we're allowed to specify.

So far we've looked into a few solutions:
1.) Adding triggers to the relevant tables that send a NOTIFY and store
the changes in a local table. probably some meta-data will be required,
like for example a time stamp.

2.) Use one of the existing replication systems. We're currently under
the impression that (we've looked at Slony-I particularly) the slave
system is supposed to be another (postgresql?) database. This wouldn't
fit our needs, but maybe we're overlooking something?

3.) Somehow directly monitoring the WAL files, interpreting and
detecting changes. So far we haven't found any documentation on how WAL
files are stored, so we're not sure this is feasible.

This is a call for advice, as we're surely not the first who are trying
to accomplish this. If any clarification is needed, please ask. We want
to get this right.

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede

// Integrate Your World //

#2Taras Kopets
tkopets@gmail.com
In reply to: Alban Hertroys (#1)
Re: Replicating changes

Hi!

I think you should try to use triggers with
dbi-link<http://pgfoundry.org/projects/dbi-link/&gt;to achieve this.
AFAIK there is no such replication with other RDBMS as you need.

Taras Kopets

Show quoted text

On 10/27/06, Alban Hertroys <alban@magproductions.nl> wrote:

Hello list,

We're investigating a ways to replicate changes to the database to
several "outside" systems.

Some filtering will need to take place, as not all changes are allowed
to go to all systems. Whether the system receiving the changes stores
them in a database or not isn't particularly relevant to us. We're only
interested in getting the data out in a specified format.

The changes in the data that needs to be replicated can come in bursts
of several thousands of records. Performance is important, but
reliability as well. We need to get those changes to a number of 3rd
parties, in a format that we're allowed to specify.

So far we've looked into a few solutions:
1.) Adding triggers to the relevant tables that send a NOTIFY and store
the changes in a local table. probably some meta-data will be required,
like for example a time stamp.

2.) Use one of the existing replication systems. We're currently under
the impression that (we've looked at Slony-I particularly) the slave
system is supposed to be another (postgresql?) database. This wouldn't
fit our needs, but maybe we're overlooking something?

3.) Somehow directly monitoring the WAL files, interpreting and
detecting changes. So far we haven't found any documentation on how WAL
files are stored, so we're not sure this is feasible.

This is a call for advice, as we're surely not the first who are trying
to accomplish this. If any clarification is needed, please ask. We want
to get this right.

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede

// Integrate Your World //

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

#3Robert Treat
xzilla@users.sourceforge.net
In reply to: Alban Hertroys (#1)
Re: Replicating changes

On Friday 27 October 2006 09:59, Alban Hertroys wrote:

Hello list,

We're investigating a ways to replicate changes to the database to
several "outside" systems.

Some filtering will need to take place, as not all changes are allowed
to go to all systems. Whether the system receiving the changes stores
them in a database or not isn't particularly relevant to us. We're only
interested in getting the data out in a specified format.

The changes in the data that needs to be replicated can come in bursts
of several thousands of records. Performance is important, but
reliability as well. We need to get those changes to a number of 3rd
parties, in a format that we're allowed to specify.

So far we've looked into a few solutions:
1.) Adding triggers to the relevant tables that send a NOTIFY and store
the changes in a local table. probably some meta-data will be required,
like for example a time stamp.

If I had to guess, this is what your going to end up doing given the
granularity you need to control data changes.

2.) Use one of the existing replication systems. We're currently under
the impression that (we've looked at Slony-I particularly) the slave
system is supposed to be another (postgresql?) database. This wouldn't
fit our needs, but maybe we're overlooking something?

For Slony this is correct... but there are some 3rd party replication
solutions that will do cross database replication (I think Continuent's will
do this, but I'm not sure).

3.) Somehow directly monitoring the WAL fil
detecting changes. So far we haven't found any documentation on how WAL
files are stored, so we're not sure this is feasible.

In theory this would seem possible; you can do WAL log shipping to send
changes between PostgreSQL servers; but sending those changes to a different
database server will mean having to transform it into some form the other
database can read, which sounds rather tricky (when compared to doing this at
a higher level anyway). I think your best bet for learning how WAL files
work is to read the backend code and maybe take a look at
http://pgfoundry.org/projects/xlogviewer/

This is a call for advice, as we're surely not the first who are trying
to accomplish this. If any clarification is needed, please ask. We want
to get this right.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

#4Andrew Sullivan
ajs@crankycanuck.ca
In reply to: Alban Hertroys (#1)
Re: Replicating changes

On Fri, Oct 27, 2006 at 03:59:02PM +0200, Alban Hertroys wrote:

2.) Use one of the existing replication systems. We're currently under
the impression that (we've looked at Slony-I particularly) the slave
system is supposed to be another (postgresql?) database. This wouldn't
fit our needs, but maybe we're overlooking something?

Slony doesn't do this now, but it always seemed to me that it ought
to be possible to do something like this, especially with the
log-shipping components in Slony. If you wanted to develop an
interface to another system in collaboration with the Slony
developers, I'm sure people would welcome it.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
If they don't do anything, we don't need their acronym.
--Josh Hamilton, on the US FEMA

#5Alban Hertroys
alban@magproductions.nl
In reply to: Taras Kopets (#2)
Re: Replicating changes

Taras Kopets wrote:

Hi!

I think you should try to use triggers with
dbi-link<http://pgfoundry.org/projects/dbi-link/&gt;to achieve this.
AFAIK there is no such replication with other RDBMS as you need.

Taras Kopets

Seems like I introduced a small misunderstanding here. The data doesn't
necessarilly go to a database; at least, none that we know about. The
goal here is to export (semi-)real time changes from our database to a
small number of interested parties[1]Those parties probably store our data in a database (they'd be nuts if they didn't), but that's something beyond our control.. It'll probably have to be in some
kind of XML format.

So far we've received some interesting suggestions. I particularly like
the idea of extending Slony-I with a "custom" frontend - though that'll
probably involve more work than a custom solution, it seems much more
powerful.

Thanks for the suggestions so far, everyone.

[1]: Those parties probably store our data in a database (they'd be nuts if they didn't), but that's something beyond our control.
if they didn't), but that's something beyond our control.

Regards,
--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede

// Integrate Your World //

#6Peter Wilson
petew@yellowhawk.co.uk
In reply to: Alban Hertroys (#5)
Re: Replicating changes

I'd start with something fairly straightforward. dbmirror
is very simple but does a lot. There are two parts:

A trigger function and set of database tables to collect
replication data. The trigger function is written is 'C' and
performance is good.

The second part is a Perl script that attaches to source
and target database and replicates all the changes. The
Perl script isn't too efficient but it should be pretty easy
to modify to send the data somewhere other than a
database

I had performance issues with dbmirror replicating big
BYTEA fields so I wrote a 'C++' version which is now
part of Whitebeam (http://www.whitebeam.org) so if
you're likely to be doing the same then you might want to
re-write the C++ version rather than the Perl version.

dbmirror is in the Postgres contrib directory for releases
up until 8.1 at least.

Pete

Alban Hertroys wrote:

Show quoted text

Taras Kopets wrote:

Hi!

I think you should try to use triggers with
dbi-link<http://pgfoundry.org/projects/dbi-link/&gt;to achieve this.
AFAIK there is no such replication with other RDBMS as you need.

Taras Kopets

Seems like I introduced a small misunderstanding here. The data
doesn't necessarilly go to a database; at least, none that we know
about. The goal here is to export (semi-)real time changes from our
database to a small number of interested parties[1]. It'll probably
have to be in some kind of XML format.

So far we've received some interesting suggestions. I particularly
like the idea of extending Slony-I with a "custom" frontend - though
that'll probably involve more work than a custom solution, it seems
much more powerful.

Thanks for the suggestions so far, everyone.

[1] Those parties probably store our data in a database (they'd be
nuts if they didn't), but that's something beyond our control.

Regards,