Replication documentation addition

Started by Bruce Momjian, over 19 years ago; 23 messages; lists: hackers, docs

Bruce Momjian

bruce@momjian.us

over 19 years ago

hackersdocs

Here is my first draft of a new replication section for our
documentation. I am looking for any comments.

---------------------------------------------------------------------------

Replication
===========

Database replication allows multiple computers to work together, making
them appear as a single computer to user applications. This might
involve allowing a backup server to take over if the primary server
fails, or it might involve allowing several computers to work together
at the same time.

It would be ideal if database servers could be combined seamlessly. Web
servers serving static web pages can be combined quite easily by merely
load-balancing web requests to multiple machines. In fact, most
read-only servers can be combined relatively easily.

Unfortunately, most database servers have a read/write mix of requests,
and read/write servers are much harder to combine. This is because,
though read-only data has to be placed on each server only once, a
write to any server has to be seen by all other servers so that future
read requests to those servers return consistent results.

This "sync problem" is the fundamental difficulty of doing database
replication. Because there is no single solution that limits the impact
of the sync problem for all workloads, there are multiple replication
solutions. Each solution addresses the sync problem in a different way,
and minimizes its impact for a specific workload.

This section first outlines two important replication capabilities, and
then outlines various replication solutions.

Synchronous vs. Asynchronous Replication
----------------------------------------

The term synchronous replication means that a transaction is not
considered committed until all servers have access to the committed
records. In that case, a failover to a backup server will lose no data
records. Asynchronous replication has a small delay between the time of
commit and its propagation to backup servers, opening the possibility
that some transactions might be lost in a switch to a backup server.
Asynchronous replication is used when synchronous replication would be
too slow.
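
To make the distinction concrete, here is a minimal Python sketch of the
two commit behaviors. It is a toy in-memory model, not PostgreSQL code:
the Primary class, its method names, and the list-based backups are all
assumptions made for illustration.

```python
# Toy model of a primary server and its backups (all names hypothetical).

class Primary:
    def __init__(self, n_backups):
        self.log = []                                  # records committed locally
        self.backups = [[] for _ in range(n_backups)]  # each backup's copy
        self.pending = []                              # committed, not yet propagated

    def commit_sync(self, record):
        # Synchronous: copy the record to every backup before
        # reporting the commit to the client.
        self.log.append(record)
        for backup in self.backups:
            backup.append(record)
        return "committed"

    def commit_async(self, record):
        # Asynchronous: report the commit at once and propagate later.
        self.log.append(record)
        self.pending.append(record)
        return "committed"

    def flush(self):
        # Background propagation of whatever is still pending.
        for record in self.pending:
            for backup in self.backups:
                backup.append(record)
        self.pending.clear()


def lost_on_failover(primary, backup_index=0):
    # Records the client saw as committed that the chosen backup never received.
    backup = primary.backups[backup_index]
    return [r for r in primary.log if r not in backup]
```

In this model, a record committed with commit_async() and not yet
flushed is exactly what lost_on_failover() reports, while records
committed with commit_sync() never appear there.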

Full vs. Partial Replication
----------------------------

The term full replication means only a full database cluster can be
replicated, while partial replication means more fine-grained control
over replicated objects is possible.

Shared Disk Failover
--------------------

This replication solution avoids the sync problem by having only one
copy of the database. This is possible because a single disk array is
shared by multiple servers. If the main database server fails, the
backup server is able to mount and start the database as though it were
restarting after a database crash. This shared hardware functionality
is common in network storage devices. This allows synchronous, full
replication.

Warm Standby Using Point-In-Time Recovery
-----------------------------------------

A warm standby server (add doc xref) can be kept current by reading a
stream of WAL records. If the main server fails, the warm standby
contains almost all of the data of the main server, and can be used as
the new database server. This allows asynchronous, full replication.

Point-In-Time Recovery [Asynchronous, Full]
-------------------------------------------

A Point-In-Time Recovery server is the same as a warm standby server
except that the standby server must go through a full restore and archive
recovery operation, delaying how quickly it can be used as the main
database server. This allows asynchronous, full replication.

Continuously Running Failover Server
------------------------------------

A continuously running failover server allows the backup server to
answer read-only queries while the master server is running. It
receives a continuous stream of write activity from the master server.
Because the failover server can be used for read-only database requests,
it is ideal for data warehouse queries. Slony offers this as
asynchronous, partial replication.

Data Partitioning
-----------------

Data partitioning partitions the database into data sets. To achieve
replication, each data set can only be modified by one server. For
example, data can be partitioned by main office, e.g. London and Paris.
While London and Paris servers have all data records, only London can
modify London records, and Paris can only modify Paris records. Such
partitioning is usually accomplished in application code, though rules
and triggers can help enforce such partitioning and keep the read-only
data sets current. Slony can also be used in such a setup. While Slony
replicates only entire tables, London and Paris can be placed in
separate tables, and inheritance can be used to pull from both tables at
the same time.
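
The ownership rule described above can be sketched in application code.
This is only an illustration under assumed names (PartitionedServer,
write, apply_remote); it is not how Slony or PostgreSQL rules and
triggers are configured.

```python
# Hypothetical sketch: each server holds every office's rows but may
# modify only the rows that belong to its own office.

class PartitionedServer:
    def __init__(self, office):
        self.office = office
        self.rows = {}  # (office, key) -> value, for all offices

    def write(self, office, key, value):
        # Reject writes to data sets owned by another server.
        if office != self.office:
            raise PermissionError(
                f"{self.office} server cannot modify {office} records")
        self.rows[(office, key)] = value
        return (office, key, value)  # change to ship to the other servers

    def apply_remote(self, change):
        # Keep the local read-only copy of another office's data current.
        office, key, value = change
        if office == self.office:
            raise ValueError("remote change targets a locally owned record")
        self.rows[(office, key)] = value
```

A London server would call write("London", ...) and pass the returned
change to the Paris server's apply_remote(), so both servers can read
all records while each record has exactly one writer.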

Query Broadcast Replication
---------------------------

This involves sending write queries to multiple servers. Read-only
queries can be sent to a single server because there is no need for all
servers to process them. This can be complex to set up because functions
like random() and CURRENT_TIMESTAMP will have different values on
different servers, and sequences should be consistent across servers.
Pgpool implements this type of replication.
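
The problem with non-deterministic functions can be shown with a small
Python sketch. The servers are plain lists and make_clock() is a
stand-in for something like CURRENT_TIMESTAMP; all names are
hypothetical, and evaluating the value once before broadcasting is just
one possible workaround, not a description of what Pgpool does.

```python
import itertools

def broadcast_expression(servers, evaluate):
    # Naive broadcast: every server evaluates the expression itself,
    # so each one may store a different value.
    for server in servers:
        server.append(evaluate())

def broadcast_literal(servers, evaluate):
    # Evaluate once in the middleware, then send the same literal
    # value to every server.
    value = evaluate()
    for server in servers:
        server.append(value)

def make_clock():
    # Stand-in for CURRENT_TIMESTAMP: returns a new value on every call.
    ticks = itertools.count()
    return lambda: next(ticks)
```

With broadcast_expression() two servers end up holding different
values for the same write, while broadcast_literal() keeps them
identical.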

Multi-Master Replication
------------------------

In multi-master replication, each server can accept write requests, and
these write requests are broadcast to all other servers before the
transaction commits. Under heavy load, this type of replication can
cause excessive locking and performance degradation. It is implemented
by Oracle in their RAC product. PostgreSQL does not offer this type of
replication, though PostgreSQL two-phase commit can be used to implement
this in application code.
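
As a rough sketch of what "two-phase commit in application code" means,
the following Python model coordinates a write across several servers.
The Participant class and two_phase_write() are hypothetical and run
entirely in memory; against real PostgreSQL servers the corresponding
steps would be issued with PREPARE TRANSACTION, COMMIT PREPARED, and
ROLLBACK PREPARED.

```python
# In-memory model of a two-phase commit coordinator (names hypothetical).

class Participant:
    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare  # False simulates a failing server
        self.state = "idle"
        self.data = []
        self._staged = None

    def prepare(self, record):
        # Phase 1: stage the write and promise that it can be committed.
        if not self.can_prepare:
            self.state = "aborted"
            return False
        self._staged = record
        self.state = "prepared"
        return True

    def commit(self):
        # Phase 2: make the staged write visible.
        self.data.append(self._staged)
        self._staged = None
        self.state = "committed"

    def rollback(self):
        # Discard the staged write.
        self._staged = None
        self.state = "aborted"


def two_phase_write(participants, record):
    prepared = []
    for p in participants:
        if p.prepare(record):
            prepared.append(p)
        else:
            # One server could not prepare: undo the ones that did.
            for q in prepared:
                q.rollback()
            return False
    for p in participants:
        p.commit()
    return True
```

The write becomes visible on every server or on none, which is the
property multi-master replication needs; the cost is that every write
waits for the slowest server.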

Performance
-----------
Performance must be considered in any replication choice. There is
usually a tradeoff between functionality and performance. For example,
full synchronous replication over a slow network might cut performance by
more than half, while asynchronous replication might have a minimal
performance impact.
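
A back-of-the-envelope model shows how such a slowdown can arise. The
functions and the numbers below are invented for illustration only (a
single session committing back to back, with a fixed network round trip
added to each synchronous commit); they are not measurements.

```python
def commits_per_second(local_commit_ms, round_trip_ms=0.0):
    # Throughput of one session when each commit costs the local commit
    # time plus, for synchronous replication, one network round trip.
    return 1000.0 / (local_commit_ms + round_trip_ms)

def slowdown(local_commit_ms, round_trip_ms):
    # Fraction of throughput lost by waiting for the round trip.
    base = commits_per_second(local_commit_ms)
    sync = commits_per_second(local_commit_ms, round_trip_ms)
    return 1.0 - sync / base
```

With an assumed 5 ms local commit and a 10 ms round trip, the model
gives 200 commits per second without waiting and about 67 with it, a
loss of roughly two thirds.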

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Bruce Momjian

bruce@momjian.us

over 19 years ago

In reply to: Bruce Momjian (#1)

hackersdocs

Re: Replication documentation addition

Please disregard. I am redoing it and will post a URL with the most
recent version.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Josh Berkus

josh@agliodbs.com

over 19 years ago

In reply to: Bruce Momjian (#1)

hackersdocs

Re: [HACKERS] Replication documentation addition

Bruce,

Here is my first draft of a new replication section for our
documentation. I am looking for any comments.

Hmmm ... while the primer on different types of replication is fine, I
think what users were really looking for is a listing of the different
replication solutions which are available for PostgreSQL and how to get
them.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

Markus Wanner

markus@bluegap.ch

over 19 years ago

In reply to: Josh Berkus (#3)

hackersdocs

Re: [HACKERS] Replication documentation addition

Hello Josh,

Josh Berkus wrote:

Hmmm ... while the primer on different types of replication is fine, I
think what users were really looking for is a listing of the different
replication solutions which are available for PostgreSQL and how to get
them.

Well, let's see what we have:

* Shared Disk Fail Over
* Warm Standby Using Point-In-Time Recovery
* Point-In-Time Recovery

These first three require quite some configuration; AFAIK there is no
tool or single solution you can download, install and be happy with. I
probably wouldn't even call them 'replication solutions'. For me those
are more like backups with fail-over capability.

* Continuously Running Fail-Over Server

(BTW, what is 'partial replication' supposed to mean here?)
Here we could link to Slony.

* Data Partitioning

Here we can't provide a link, it's just a way to handle the problem in
the application code.

* Query Broadcast Replication

Here we could link to PgPool.

* Multi-Master Replication
(or better: Distributed Shared Memory Replication)

No existing solution for PostgreSQL.

Looking at that, I'm a) missing PgCluster and b) arguing that we have to
admit that we simply can not 'list .. replication solutions ... and how
to get them' because all of the solutions mentioned need quite some
knowledge and require a more or less complex installation and configuration.

Regards

Markus

Joshua D. Drake

jd@commandprompt.com

over 19 years ago

In reply to: Markus Wanner (#4)

hackersdocs

Re: [HACKERS] Replication documentation addition

Looking at that, I'm a) missing PgCluster and b) arguing that we have to
admit that we simply can not 'list .. replication solutions ... and how
to get them' because all of the solutions mentioned need quite some
knowledge and require a more or less complex installation and
configuration.

There is also the question if we should have a sub section:

Closed Source replication solutions:

Mammoth Replicator
Continuent P/Cluster
ExtenDB
Greenplum MPP (although this is kind of horizontal partitioning)

Joshua D. Drake

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

Simon Riggs

simon@2ndQuadrant.com

over 19 years ago

In reply to: Joshua D. Drake (#5)

hackersdocs

Re: [HACKERS] Replication documentation addition

On Tue, 2006-10-24 at 12:34 -0700, Joshua D. Drake wrote:

Looking at that, I'm a) missing PgCluster and b) arguing that we have to
admit that we simply can not 'list .. replication solutions ... and how
to get them' because all of the solutions mentioned need quite some
knowledge and require a more or less complex installation and
configuration.

There is also the question if we should have a sub section:

Closed Source replication solutions:

Mammoth Replicator
Continuent P/Cluster
ExtenDB
Greenplum MPP (although this is kind of horizontal partitioning)

Where do you draw the line? You may be surprised about what other options
that includes. I'm happy to include a whole range of things, but please
be very careful and precise about what you wish for.

There are enough good solutions for open source PostgreSQL that it is easy
and straightforward to limit it to just that. New contributions welcome,
of course.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

Joshua D. Drake

jd@commandprompt.com

over 19 years ago

In reply to: Simon Riggs (#6)

hackersdocs

Re: [HACKERS] Replication documentation addition

Simon Riggs wrote:

On Tue, 2006-10-24 at 12:34 -0700, Joshua D. Drake wrote:

Looking at that, I'm a) missing PgCluster and b) arguing that we have to
admit that we simply can not 'list .. replication solutions ... and how
to get them' because all of the solutions mentioned need quite some
knowledge and require a more or less complex installation and
configuration.

There is also the question if we should have a sub section:

Closed Source replication solutions:

Mammoth Replicator
Continuent P/Cluster
ExtenDB
Greenplum MPP (although this is kind of horizontal partitioning)

Where do you draw the line?

Well that is certainly a good question but we do include links to some
of the more prominent closed source software on the website as well.

You may be surprised about what other options
that includes. I'm happy to include a whole range of things, but please
be very careful and precise about what you wish for.

If it were me, I would say that the replication option has to be
specific to PostgreSQL (e.g., cjdbc or synchronous jakarta pooling
doesn't go in).

Sincerely,

Joshua D. Drake

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

Simon Riggs

simon@2ndQuadrant.com

over 19 years ago

In reply to: Joshua D. Drake (#7)

hackersdocs

Re: [HACKERS] Replication documentation addition

On Tue, 2006-10-24 at 15:13 -0700, Joshua D. Drake wrote:

If it were me, I would say that the replication option has to be
specific to PostgreSQL (e.g., cjdbc or synchronous jakarta pooling
doesn't go in).

...and how do you define PostgreSQL exactly?

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

Joshua D. Drake

jd@commandprompt.com

over 19 years ago

In reply to: Simon Riggs (#8)

hackersdocs

Re: [HACKERS] Replication documentation addition

Simon Riggs wrote:

On Tue, 2006-10-24 at 15:13 -0700, Joshua D. Drake wrote:

If it were me, I would say that the replication option has to be
specific to PostgreSQL (e.g., cjdbc or synchronous jakarta pooling
doesn't go in).

...and how do you define PostgreSQL exactly?

A replication product or software defined to work with only PostgreSQL?

I know there are some other products out there that will work from one
db to another, but I am not sure if those would be considered HA
solutions or migration solutions (which we could certainly document).

Sincerely,

Joshua D. Drake

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

#10

Simon Riggs

simon@2ndQuadrant.com

over 19 years ago

In reply to: Joshua D. Drake (#9)

hackersdocs

Re: [HACKERS] Replication documentation addition

On Tue, 2006-10-24 at 15:33 -0700, Joshua D. Drake wrote:

Simon Riggs wrote:

On Tue, 2006-10-24 at 15:13 -0700, Joshua D. Drake wrote:

If it were me, I would say that the replication option has to be
specific to PostgreSQL (e.g., cjdbc or synchronous jakarta pooling
doesn't go in).

...and how do you define PostgreSQL exactly?

A replication product or software defined to work with only PostgreSQL?

(again)... how do you define PostgreSQL exactly?

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#11

Joshua D. Drake

jd@commandprompt.com

over 19 years ago

In reply to: Simon Riggs (#10)

hackersdocs

Re: [HACKERS] Replication documentation addition

Simon Riggs wrote:

On Tue, 2006-10-24 at 15:33 -0700, Joshua D. Drake wrote:

Simon Riggs wrote:

On Tue, 2006-10-24 at 15:13 -0700, Joshua D. Drake wrote:

If it were me, I would say that the replication option has to be
specific to PostgreSQL (e.g., cjdbc or synchronous jakarta pooling
doesn't go in).

...and how do you define PostgreSQL exactly?

A replication product or software defined to work with only PostgreSQL?

(again)... how do you define PostgreSQL exactly?

What about PostgreSQL is unclear? Is your question whether I consider
EnterpriseDB to be PostgreSQL? I have no comment on that matter.

Sincerely,

Joshua D. Drake

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

#12

Jim Nasby

Jim.Nasby@BlueTreble.com

over 19 years ago

In reply to: Bruce Momjian (#1)

hackersdocs

Re: [HACKERS] Replication documentation addition

On Mon, Oct 23, 2006 at 11:39:34PM -0400, Bruce Momjian wrote:

Query Broadcast Replication
---------------------------

This involves sending write queries to multiple servers. Read-only
queries can be sent to a single server because there is no need for all
servers to process it. This can be complex to setup because functions
like random() and CURRENT_TIMESTAMP will have different values on
different servers, and sequences should be consistent across servers.
Pgpool implements this type of replication.

Isn't there another active project that does this besides pgpool?

It's probably also worth mentioning the commercial replication schemes
that are out there.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#13

Jim Nasby

Jim.Nasby@BlueTreble.com

over 19 years ago

In reply to: Joshua D. Drake (#9)

hackersdocs

Re: [HACKERS] Replication documentation addition

On Tue, Oct 24, 2006 at 03:33:03PM -0700, Joshua D. Drake wrote:

Simon Riggs wrote:

On Tue, 2006-10-24 at 15:13 -0700, Joshua D. Drake wrote:

If it were me, I would say that the replication option has to be
specific to PostgreSQL (e.g., cjdbc or synchronous jakarta pooling
doesn't go in).

...and how do you define PostgreSQL exactly?

A replication product or software defined to work with only PostgreSQL?

AFAIK Continuent's product fails that test...

I don't see any reason to exclude things that work with databases other
than PostgreSQL, though I agree that replication that's actually in the
application space (i.e., it ties you to Tomcat or some other platform)
probably doesn't belong.

My feeling is that people reading this chapter are looking for solutions
and probably don't care as much about how exactly the solution works so
long as it meets their needs.

I know there are some other products out there that will work from one
db to another, but I am not sure if those would be considered HA
solutions or migration solutions (which we could certainly document).

--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#14

Joshua D. Drake

jd@commandprompt.com

over 19 years ago

In reply to: Jim Nasby (#13)

hackersdocs

Re: [HACKERS] Replication documentation addition

Jim C. Nasby wrote:

On Tue, Oct 24, 2006 at 03:33:03PM -0700, Joshua D. Drake wrote:

Simon Riggs wrote:

On Tue, 2006-10-24 at 15:13 -0700, Joshua D. Drake wrote:

If it were me, I would say that the replication option has to be
specific to PostgreSQL (e.g., cjdbc or synchronous jakarta pooling
doesn't go in).

...and how do you define PostgreSQL exactly?

A replication product or software defined to work with only PostgreSQL?

AFAIK Continuent's product fails that test...

To my knowledge, p/cluster only works with PostgreSQL but I could be wrong.

I don't see any reason to exclude things that work with databases other
than PostgreSQL, though I agree that replication that's actually in the
application space (i.e., it ties you to Tomcat or some other platform)
probably doesn't belong.

I was just trying to have defined criteria of some sort. We could fill
up pages and pages of possible replication solutions :)

Joshua D. Drake

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

#15

Jeff Frost

jeff@frostconsultingllc.com

over 19 years ago

In reply to: Joshua D. Drake (#14)

hackersdocs

Re: [HACKERS] Replication documentation addition

On Tue, 24 Oct 2006, Joshua D. Drake wrote:

AFAIK Continuent's product fails that test...

To my knowledge, p/cluster only works with PostgreSQL but I could be wrong.

p/cluster was the old name for the PostgreSQL specific version. It's been
rebranded as uni/cluster and they have versions for both PostgreSQL and MySQL.
One of my customers is trying it out currently.

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

#16

Luke Lonergan

llonergan@greenplum.com

over 19 years ago

In reply to: Jeff Frost (#15)

hackersdocs

Re: Replication documentation addition

Bruce,

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Bruce Momjian
Sent: Tuesday, October 24, 2006 5:16 PM
To: Hannu Krosing
Cc: PostgreSQL-documentation; PostgreSQL-development
Subject: Re: [HACKERS] Replication documentation addition

OK, I have updated the URL. Please let me know how you like it.

There's a typo on line 8, first paragraph:

"perhaps with only one server allowing write rwork together at the same
time."

Also, consider this wording of the last description:

"Single-Query Clustering..."

Replaced by:

"Shared Nothing Clustering
-----------------------

This allows multiple servers with separate disks to work together on
each query. In shared nothing clusters, the work of answering each query
is distributed among the servers to increase the performance through
parallelism. These systems will typically feature high availability by
using other forms of replication internally.

While there are no open source options for this type of clustering,
there are several commercial products available that implement this
approach, making PostgreSQL achieve very high performance for
multi-Terabyte business intelligence databases."

- Luke

#17

Bruce Momjian

bruce@momjian.us

over 19 years ago

In reply to: Markus Wanner (#4)

hackersdocs

Re: [HACKERS] Replication documentation addition

Markus Schiltknecht wrote:

Looking at that, I'm a) missing PgCluster and b) arguing that we have to
admit that we simply can not 'list .. replication solutions ... and how
to get them' because all of the solutions mentioned need quite some
knowledge and require a more or less complex installation and configuration.

Where is pgcluster in terms of usability? Should I mention it?

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#18

Bruce Momjian

bruce@momjian.us

over 19 years ago

In reply to: Luke Lonergan (#16)

hackersdocs

Re: [HACKERS] Replication documentation addition

I don't think the PostgreSQL documentation should be mentioning
commercial solutions.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#19

Bruce Momjian

bruce@momjian.us

over 19 years ago

In reply to: Joshua D. Drake (#5)

hackersdocs

Re: [HACKERS] Replication documentation addition

Joshua D. Drake wrote:

Looking at that, I'm a) missing PgCluster and b) arguing that we have to
admit that we simply can not 'list .. replication solutions ... and how
to get them' because all of the solutions mentioned need quite some
knowledge and require a more or less complex installation and
configuration.

There is also the question if we should have a sub section:

Closed Source replication solutions:

Mammoth Replicator
Continuent P/Cluster
ExtenDB
Greenplum MPP (although this is kind of horizontal partitioning)

I vote no.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#20

Hannu Krosing

hannu@tm.ee

over 19 years ago

In reply to: Bruce Momjian (#18)

hackersdocs

Re: Replication documentation addition

One fine day, Tue, 2006-10-24 at 22:57, Bruce Momjian wrote:

I don't think the PostgreSQL documentation should be mentioning
commercial solutions.

IMNSHO, having commercial solutions based on postgresql which extend
postgres in directions not (yet?) done by core postgres is nothing to be
ashamed of.

And we should at least mention the OSS version of Bizgres as a place
where quite a lot of initial development is done on performance
improvements considered too risky for mainline postgresql.

And if you need a more technical reason, you can use free libpq and psql
to connect to even Bizgres MPP ;)
