dbmirror revisions

Started by Ed L.almost 23 years ago8 messages
#1Ed L.
pgsql@bluepolka.net

I've been modifying dbmirror and wanted to offer my changes to anyone that
cared to experiment, FWIW. My effort is ongoing, the docs aren't perfect,
I make no claims of production readiness, and testing of this latest
version has been minimal, so I strongly advise you to conduct your own
thorough testing before considering a production deployment. That said,
it's a significantly improved solution for our async master-slave needs,
with a few caveats below, and shouldn't be too hard to setup.

There are enough changes that I would hardly consider this a patch, closer
to an overhaul, since I've removed files, renamed others, and added new
files. Among the changes I've made so far:

* Added script for easier setup of many tables/dbs/slaves;
* Added initial support for multiple master replicating distinct data to a
single slave;
* Added batching to minimize load on master and net traffic. You can grab
a configurable number of updates to replicate before hitting the master
again.

* Added port specification;
* Wrapped all replication in transactions;
* Bulletproofed against downed master or slave;
* Started modularization of DB access layer, added some error
handling;
* Added a number of config vars for sync delays, etc;
* Eliminated bug in transaction ordering for replay. Updates cannot
be replicated in the order of the transactions (see archives for discussion
of why).

* Eliminated need for clear_pending.pl by making dbmirror.pl
self-clearing;
* Collasped schema into 1 queue table for performance;
* Changed sequence ID column types to BIGINT for 64-bit sequence;
* Added reconnection handling for robustness;
* Added local tracking of last seq_id to help with recovery
robustness;
* Added master/slave compatibility checking;
* Enabled slave setup during production service so master does not
have to stop serving.
* Renamed tables to minimize namespace conflicts;
* Added lots of logging/debug messages;

* Maybe a few other things I've forgotten...

AFAICS, there are still at least a few major drawbacks to this approach:

* DML statements are not replicated (same for eRServer, AFAIK).

* SEQUENCE objects are not handled; nextval() will not be replicated, so
sequence objects (and serial columns) between master and slave can easily
get out of sync. I wonder if eRServer has this same issue?

* Mass updates/deletes/inserts of 5000 rows with a single SQL command on
the master will result in 5000 individual trigger-firings, and 5000
individual replication inserts on the slave. Rumor has it eRServer's
snapshot gets around this problem.

The code is here:

http://bluepolka.net/dbmirror/dbmirror-20030403-1605.tar.gz

Ed

#2Ed L.
pgsql@bluepolka.net
In reply to: Ed L. (#1)
Re: dbmirror revisions

On Thursday April 3 2003 4:37, Ed L. wrote:

AFAICS, there are still at least a few major drawbacks to this approach:

* SEQUENCE objects are not handled; nextval() will not be replicated,
so sequence objects (and serial columns) between master and slave can
easily get out of sync. I wonder if eRServer has this same issue?

Clarification: sequence objects for serial columns get out of sync easily,
but the replication maintains correct serial column values.

Ed

#3Ed L.
pgsql@bluepolka.net
In reply to: Ed L. (#1)
Re: dbmirror revisions

On Thursday April 3 2003 4:37, Ed L. wrote:

AFAICS, there are still at least a few major drawbacks to this approach:

* DML statements are not replicated (same for eRServer, AFAIK).

D'oh. I meant DDL, not DML.

Ed

#4Ed L.
pgsql@bluepolka.net
In reply to: Ed L. (#1)
Re: dbmirror revisions

On Thursday April 3 2003 4:37, Ed L. wrote:

I've been modifying dbmirror and wanted to offer my changes to anyone
that cared to experiment, FWIW. My effort is ongoing, the docs aren't
perfect, I make no claims of production readiness, and testing of this
latest version has been minimal, so I strongly advise you to conduct your
own thorough testing before considering a production deployment. That
said, it's a significantly improved solution for our async master-slave
needs, with a few caveats below, and shouldn't be too hard to setup.
...
AFAICS, there are still at least a few major drawbacks to this approach:

* SEQUENCE objects are not handled; nextval() will not be replicated,
so sequence objects (and serial columns) between master and slave can
easily get out of sync. I wonder if eRServer has this same issue?

I've added code for brute-force replication of sequences in the tgz ball
below. At each sync, the replicator contacts both master and slave and
compares every important aspect of every sequence object on the master with
that of the slave. It then replicates any new sequence object or sequence
object change. This causes dbmirror to hit both master and slave at least
N times on each sync, where N is the number of sequence objects; the
queries are quick. I hate to hit the master like that, but I haven't
thought of a better option short of WAL-log replays. It'd be a nice boost
to query for all sequence values in one SQL query, but I don't know how to
do it in a generalized manner.

http://bluepolka.net/dbmirror/dbmirror-20030404-1446.tar.gz

I think a consistent view on the slave during active replication is not
quite guaranteed with this approach. Sequence updates are not
transactional, we really don't know how to order them with respect to tuple
updates. So someone reading the slave DB might possibly not see sequence
changes appear in the order in which they occurred on the master. For our
warm spare/slave needs, it appears adequate.

Re DDL statement detection, I am thinking about incorporating a schema-only
pg_dump from both master and slave to compare schemas to alert to DDL
changes that could foul replication. Maybe run it only every so often in
dbmirror. None too elegant, but maybe better than nothing...

Ed

#5Ed L.
pgsql@bluepolka.net
In reply to: Ed L. (#1)
Re: dbmirror revisions

On Thursday April 3 2003 4:37, some yahoo wrote:

* Eliminated bug in transaction ordering for replay. Updates
cannot be replicated in the order of the transactions (see archives for
discussion of why).

Upon further review, this bug report was the result of a misunderstanding of
the replication ordering. Both replication orderings, old and new, seem to
work fine so far.

Ed

#6Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Ed L. (#1)
Re: [GENERAL] dbmirror revisions

Are any of these changes ready for CVS for 7.4?

---------------------------------------------------------------------------

Ed L. wrote:

I've been modifying dbmirror and wanted to offer my changes to anyone that
cared to experiment, FWIW. My effort is ongoing, the docs aren't perfect,
I make no claims of production readiness, and testing of this latest
version has been minimal, so I strongly advise you to conduct your own
thorough testing before considering a production deployment. That said,
it's a significantly improved solution for our async master-slave needs,
with a few caveats below, and shouldn't be too hard to setup.

There are enough changes that I would hardly consider this a patch, closer
to an overhaul, since I've removed files, renamed others, and added new
files. Among the changes I've made so far:

* Added script for easier setup of many tables/dbs/slaves;
* Added initial support for multiple master replicating distinct data to a
single slave;
* Added batching to minimize load on master and net traffic. You can grab
a configurable number of updates to replicate before hitting the master
again.

* Added port specification;
* Wrapped all replication in transactions;
* Bulletproofed against downed master or slave;
* Started modularization of DB access layer, added some error
handling;
* Added a number of config vars for sync delays, etc;
* Eliminated bug in transaction ordering for replay. Updates cannot
be replicated in the order of the transactions (see archives for discussion
of why).

* Eliminated need for clear_pending.pl by making dbmirror.pl
self-clearing;
* Collasped schema into 1 queue table for performance;
* Changed sequence ID column types to BIGINT for 64-bit sequence;
* Added reconnection handling for robustness;
* Added local tracking of last seq_id to help with recovery
robustness;
* Added master/slave compatibility checking;
* Enabled slave setup during production service so master does not
have to stop serving.
* Renamed tables to minimize namespace conflicts;
* Added lots of logging/debug messages;

* Maybe a few other things I've forgotten...

AFAICS, there are still at least a few major drawbacks to this approach:

* DML statements are not replicated (same for eRServer, AFAIK).

* SEQUENCE objects are not handled; nextval() will not be replicated, so
sequence objects (and serial columns) between master and slave can easily
get out of sync. I wonder if eRServer has this same issue?

* Mass updates/deletes/inserts of 5000 rows with a single SQL command on
the master will result in 5000 individual trigger-firings, and 5000
individual replication inserts on the slave. Rumor has it eRServer's
snapshot gets around this problem.

The code is here:

http://bluepolka.net/dbmirror/dbmirror-20030403-1605.tar.gz

Ed

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#7Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Ed L. (#1)
Re: dbmirror revisions

Ed, where are you on these dbmirror improvements?

---------------------------------------------------------------------------

Ed L. wrote:

I've been modifying dbmirror and wanted to offer my changes to anyone that
cared to experiment, FWIW. My effort is ongoing, the docs aren't perfect,
I make no claims of production readiness, and testing of this latest
version has been minimal, so I strongly advise you to conduct your own
thorough testing before considering a production deployment. That said,
it's a significantly improved solution for our async master-slave needs,
with a few caveats below, and shouldn't be too hard to setup.

There are enough changes that I would hardly consider this a patch, closer
to an overhaul, since I've removed files, renamed others, and added new
files. Among the changes I've made so far:

* Added script for easier setup of many tables/dbs/slaves;
* Added initial support for multiple master replicating distinct data to a
single slave;
* Added batching to minimize load on master and net traffic. You can grab
a configurable number of updates to replicate before hitting the master
again.

* Added port specification;
* Wrapped all replication in transactions;
* Bulletproofed against downed master or slave;
* Started modularization of DB access layer, added some error
handling;
* Added a number of config vars for sync delays, etc;
* Eliminated bug in transaction ordering for replay. Updates cannot
be replicated in the order of the transactions (see archives for discussion
of why).

* Eliminated need for clear_pending.pl by making dbmirror.pl
self-clearing;
* Collasped schema into 1 queue table for performance;
* Changed sequence ID column types to BIGINT for 64-bit sequence;
* Added reconnection handling for robustness;
* Added local tracking of last seq_id to help with recovery
robustness;
* Added master/slave compatibility checking;
* Enabled slave setup during production service so master does not
have to stop serving.
* Renamed tables to minimize namespace conflicts;
* Added lots of logging/debug messages;

* Maybe a few other things I've forgotten...

AFAICS, there are still at least a few major drawbacks to this approach:

* DML statements are not replicated (same for eRServer, AFAIK).

* SEQUENCE objects are not handled; nextval() will not be replicated, so
sequence objects (and serial columns) between master and slave can easily
get out of sync. I wonder if eRServer has this same issue?

* Mass updates/deletes/inserts of 5000 rows with a single SQL command on
the master will result in 5000 individual trigger-firings, and 5000
individual replication inserts on the slave. Rumor has it eRServer's
snapshot gets around this problem.

The code is here:

http://bluepolka.net/dbmirror/dbmirror-20030403-1605.tar.gz

Ed

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#8Ed L.
pgsql@bluepolka.net
In reply to: Bruce Momjian (#7)
Re: dbmirror revisions

On Saturday August 16 2003 5:20, Bruce Momjian wrote:

Ed, where are you on these dbmirror improvements?

My changes are complete as of late April, and I have not worked much on it
since. The changes are not in your CVS. Need to work with the dbmirror
author to see if that makes sense at this point.

Ed

Show quoted text

Ed L. wrote:

I've been modifying dbmirror and wanted to offer my changes to anyone
that cared to experiment, FWIW. My effort is ongoing, the docs aren't
perfect, I make no claims of production readiness, and testing of this
latest version has been minimal, so I strongly advise you to conduct
your own thorough testing before considering a production deployment.
That said, it's a significantly improved solution for our async
master-slave needs, with a few caveats below, and shouldn't be too hard
to setup.

There are enough changes that I would hardly consider this a patch,
closer to an overhaul, since I've removed files, renamed others, and
added new files. Among the changes I've made so far:

* Added script for easier setup of many tables/dbs/slaves;
* Added initial support for multiple master replicating distinct data
to a single slave;
* Added batching to minimize load on master and net traffic. You can
grab a configurable number of updates to replicate before hitting the
master again.

* Added port specification;
* Wrapped all replication in transactions;
* Bulletproofed against downed master or slave;
* Started modularization of DB access layer, added some error
handling;
* Added a number of config vars for sync delays, etc;
* Eliminated bug in transaction ordering for replay. Updates
cannot be replicated in the order of the transactions (see archives for
discussion of why).

* Eliminated need for clear_pending.pl by making dbmirror.pl
self-clearing;
* Collasped schema into 1 queue table for performance;
* Changed sequence ID column types to BIGINT for 64-bit
sequence; * Added reconnection handling for robustness;
* Added local tracking of last seq_id to help with recovery
robustness;
* Added master/slave compatibility checking;
* Enabled slave setup during production service so master does
not have to stop serving.
* Renamed tables to minimize namespace conflicts;
* Added lots of logging/debug messages;

* Maybe a few other things I've forgotten...

AFAICS, there are still at least a few major drawbacks to this
approach:

* DML statements are not replicated (same for eRServer, AFAIK).

* SEQUENCE objects are not handled; nextval() will not be
replicated, so sequence objects (and serial columns) between master and
slave can easily get out of sync. I wonder if eRServer has this same
issue?

* Mass updates/deletes/inserts of 5000 rows with a single SQL command
on the master will result in 5000 individual trigger-firings, and 5000
individual replication inserts on the slave. Rumor has it eRServer's
snapshot gets around this problem.

The code is here:

http://bluepolka.net/dbmirror/dbmirror-20030403-1605.tar.gz

Ed

---------------------------(end of
broadcast)--------------------------- TIP 3: if posting/reading through
Usenet, please send an appropriate subscribe-nomail command to
majordomo@postgresql.org so that your message can get through to the
mailing list cleanly