FAQ: Current state of replication ?

Started by Peter Galbavyalmost 25 years ago8 messages

peter.galbavy@knowledge.com

almost 25 years ago

Been reading
http://www.postgresql.org/docs/pgsql/doc/TODO.detail/replication with
interest as we are now approaching a real requirement for it on a project we
have finally resurrected for a bit of a dormant state.

What is the current state-of-the-art WRT replication of any sort ? If anyone
has homebrew solutions that they can share, we would welcome tyring too.

Our requirements, which seem sort of reasonable, are:

1. One "writer", many "reader" PostgreSQL servers. We will want to write
provisioning / configuration information centrally and can tolerate a
"writer" failuer for a time.
2. Consitency at the transaction level. All changes to the "writer" server
will be wrapped in transactions, and there will be foreign key consistency
checking in many tables.
3. Delays from "writer" through to consistent state on "readers" can be
tolerated to within a few minutes or even more. All read-servers must be in
the same state when answering requests.

Our objective is to acheive performance and some fault tolerance as the data
is going to be used for near-real time configuration of various other
backend systems in an almost traditional 'net environment.

As we are coding various other stuff for this project over the next few
months, any help we can be in developing for this part of PostgreSQL, just
let me know. While knowing very little about PostgreSQL internals, we learn
quick.

rgds,
--
Peter Galbavy
Knowledge Matters Ltd.
http://www.knowledge.com/

Thomas Lockhart

lockhart@alumni.caltech.edu

almost 25 years ago

In reply to: Peter Galbavy (#1)

Re: FAQ: Current state of replication ?

What is the current state-of-the-art WRT replication of any sort ? If anyone
has homebrew solutions that they can share, we would welcome tyring too.

There is some code in contrib/rserv for 7.1 which does table
replication. It has some restrictions, but does implement the basic
concept. I think a tarball to do the same for 7.0 and earlier is
available at www.pgsql.com (just Makefile differences).

We are currently working through the issues involved with multi-slave
replication and the ramifications for failover to (one of) the slaves.
It looks like the rserv code may assume too much independence between
slaves and replication sync information, and failover may be
not-quite-right in those cases.

Will be posting to the list when we know the answer (though
contributions and inputs are of course always welcome!). afaict changes
in rserv schema, if necessary, will not be available for 7.1, but we'll
be posting patches and updating the CVS tree.

btw, it looks like TODO.detail/replication predates the replication
implementation, and has no real relationship with the implementation.
There is some thought that WAL/BAR features can help support replication
at a different level than is done now, but that is work for the future
afaik.

- Thomas

oleg@sai.msu.su

almost 25 years ago

In reply to: Thomas Lockhart (#2)

Re: Re: FAQ: Current state of replication ?

I found interesting paper http://citeseer.nj.nec.com/330257.html
"Don't be lazy, be consistent: Postgres-R, A new way to implement Database Replication"

Abstract:
Database designers often point out that eager, update everywhere replication suffers from
high deadlock rates, message overhead and poor response times. In this paper, we show that these
limitations can be circumvented by using a combination of known and novel techniques. Moreover, we
show how the proposed solution can be incorporated into a real database system. The paper discusses
the new protocols and their implementation in PostgreSQL. It also provides experimental results proving that many of the dangers and limitations of
replication can be avoided by using the appropriate techniques. 1 Introduction Existing replication protocols can be divided into eager and lazy

Regards,

Oleg
On Tue, 20 Mar 2001, Thomas Lockhart wrote:

What is the current state-of-the-art WRT replication of any sort ? If anyone
has homebrew solutions that they can share, we would welcome tyring too.

There is some code in contrib/rserv for 7.1 which does table
replication. It has some restrictions, but does implement the basic
concept. I think a tarball to do the same for 7.0 and earlier is
available at www.pgsql.com (just Makefile differences).

We are currently working through the issues involved with multi-slave
replication and the ramifications for failover to (one of) the slaves.
It looks like the rserv code may assume too much independence between
slaves and replication sync information, and failover may be
not-quite-right in those cases.

Will be posting to the list when we know the answer (though
contributions and inputs are of course always welcome!). afaict changes
in rserv schema, if necessary, will not be available for 7.1, but we'll
be posting patches and updating the CVS tree.

btw, it looks like TODO.detail/replication predates the replication
implementation, and has no real relationship with the implementation.
There is some thought that WAL/BAR features can help support replication
at a different level than is done now, but that is work for the future
afaik.

- Thomas

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

joe@conway-family.com

almost 25 years ago

In reply to: Peter Galbavy (#1)

1 attachment(s)

Re: FAQ: Current state of replication ?

1. One "writer", many "reader" PostgreSQL servers. We will want to write
provisioning / configuration information centrally and can tolerate a
"writer" failuer for a time.
2. Consitency at the transaction level. All changes to the "writer" server
will be wrapped in transactions, and there will be foreign key consistency
checking in many tables.
3. Delays from "writer" through to consistent state on "readers" can be
tolerated to within a few minutes or even more. All read-servers must be

in

the same state when answering requests.

Our objective is to acheive performance and some fault tolerance as the

data

is going to be used for near-real time configuration of various other
backend systems in an almost traditional 'net environment.

As we are coding various other stuff for this project over the next few
months, any help we can be in developing for this part of PostgreSQL, just
let me know. While knowing very little about PostgreSQL internals, we

learn

quick.

Peter,

I've been mostly a lurker here (at least on the hackers list) for a couple
of years, but I thought I would "de-lurk" for long enough to reply to your
question ;)

Attached is the source for a replication solution I recently wrote for a
project I'm working on. I think it meets your criteria. I was considering
sending it to the list as a possible contrib after 7.1 was released (if
anyone is interested, and the code is worthy), but since you asked, here it
is. A few disclaimers are in order. First, I am *not* an experienced C
programmer. The code works in the limited testing I've done so far but needs
to be reviewed and scrubbed by someone with more experience. Second, I have
not yet used this in production, so use at your own risk. Third, I have only
tested it under Red Hat 6.2 and 7.0. Finally, it will only work with >=
PostgreSQL 7.1 beta3.

Basic installation instructions:
copy pg_lnk.tgz to contrib under the PostgreSQL source tree
tar -xzvf pg_lnk.tgz
cd pg_lnk
./install.sh

I'll be happy to answer any questions to help you get it installed and
working. I would appreciate any feedback, improvements, general guidance if
you decide to use it.

Thanks,

Joe

<lurking once again . . .>

Christopher Masto

chris@netmonger.net

almost 25 years ago

In reply to: Peter Galbavy (#1)

Re: FAQ: Current state of replication ?

On Mon, Mar 19, 2001 at 11:00:20AM -0000, Peter Galbavy wrote:

1. One "writer", many "reader" PostgreSQL servers. We will want to write
provisioning / configuration information centrally and can tolerate a
"writer" failuer for a time.
2. Consitency at the transaction level. All changes to the "writer" server
will be wrapped in transactions, and there will be foreign key consistency
checking in many tables.
3. Delays from "writer" through to consistent state on "readers" can be
tolerated to within a few minutes or even more. All read-servers must be in
the same state when answering requests.

Our objective is to acheive performance and some fault tolerance as the data
is going to be used for near-real time configuration of various other
backend systems in an almost traditional 'net environment.

Your application sounds like a perfect fit for LDAP.

In other words, keep your database in Postgres, but export views of it
through for clients to query through LDAP. Rely on LDAP replication,
since it has the model you need and works today.
--
Christopher Masto Senior Network Monkey NetMonger Communications
chris@netmonger.net info@netmonger.net http://www.netmonger.net

Free yourself, free your machine, free the daemon -- http://www.freebsd.org/

Marek PUBLICEWICZ

M.Publicewicz@elka.pw.edu.pl

almost 25 years ago

In reply to: Peter Galbavy (#1)

Strange results of CURRENT_TIMESTAMP

Hello,

During repopulation of the database (using the results of the pg_dump
program), I spot two strange things:

- fields defined as TIMESTAMP DEFAULT CURRENT_TIMESTAMP sometimes generate
invalid format of the date, for instance:

2001-02-10 13:11:60.00+01 - which follows the records
2001-02-10 13:10:59.00+01

Which means, that the proper timestamp should look like:
2001-02-10 13:11:00.00+01

- I have a float4 field, which contains the value 3e-40 (approximately).
I know it's there - the queries return it without any problem. Problem
occurs again when I try to repopulate the table. Having such a value
in a line generated by pg_dump (in form of COPY from stdin) I get
the error:

Bad float4 input format -- underflow.

When I redefine the field as a float8 everything works fine. But why
does it occur during repopulation - when in fact such a value did exist
in the table before the table was drop.

I'am running Postgresql 7.0.2

thanks for help

Mark

Thomas Lockhart

lockhart@alumni.caltech.edu

almost 25 years ago

In reply to: Marek PUBLICEWICZ (#6)

Re: Strange results of CURRENT_TIMESTAMP

- fields defined as TIMESTAMP DEFAULT CURRENT_TIMESTAMP sometimes generate
invalid format of the date, for instance:
2001-02-10 13:11:60.00+01

You are running the Mandrake RPMs? Or have otherwise compiled using the
-ffast-math compiler flag?

- Thomas

Marek PUBLICEWICZ

M.Publicewicz@elka.pw.edu.pl

almost 25 years ago

In reply to: Thomas Lockhart (#7)

Re: Re: Strange results of CURRENT_TIMESTAMP

Hi,

On Thu, 22 Mar 2001, Thomas Lockhart wrote:

- fields defined as TIMESTAMP DEFAULT CURRENT_TIMESTAMP sometimes generate
invalid format of the date, for instance:
2001-02-10 13:11:60.00+01

You are running the Mandrake RPMs? Or have otherwise compiled using the
-ffast-math compiler flag?

I'm on the Mandrake but I'm using the compiled (from the sources) version
of PostgreSQL. As a matter of fact I dont't remember, wheter or not I put
the -ffast-math flag. Is this the reason for the inaccuracy?

If so - is it responsible also for the 'underflow' error?

Mark