Clustering & Load Balancing & Replication
Hi,
I am currently investigating the preferred method of clustering a postgresql database on Redhat?
i would really appreciate some suggestions or experiences you guys have had.
note: performance & redundancy are both equally desirable and i have plenty of resources.
we already have licences for Redhat Enterprise Linux ES v4, i was wondering if Redhat's >Cluster Manager is a candidate? ~ or if either PGCluster or Slony1 have >advantages or drawbacks etc.
Couple of months ago, when I started looking at replication, and clustering I saw this as a fairly basic question, but the more I got into it the more I realized there's clustering, clustering and clustering, and replication, replication and replication.
I dont think anyone can give you a prefered method, not without a system spec.
I looked at Slony, even tried to help debug it on windows, and eventually realised its Master Slave arrangement would not work for me.
I looked at dBMirror, which I liked in concept but I didnt fancy the way the code was done in a project that seems to be dead.
Then I looked at some really weird and wonderful schemes that work on a core technology like a reliable message queue, not for me.
In the end I decided I wanted real time synchronous multi master replication, without conflict resolution, and that I didnt really want a dBCluster, I needed to cluster and load balance at web level, allowing for things like session management, multiple clusters, and remote sites.
Bad news is I couldnt find it, not for free anyway.
If you dont find a solution, try http://spar.orgfree.com/index.html
Please keep in mind, its new and written by a novice out of desparation, not one of the community super geeks, but it takes a very pragmatic approach to replication and maybe just what the doctor ordered.... Good Luck.
I assume you have read this new documentation for 8.2:
http://www.postgresql.org/docs/8.2/static/high-availability.html
---------------------------------------------------------------------------
org@kewlstuff.co.za wrote:
Hi,
I am currently investigating the preferred method of clustering a postgresql database on Redhat?
i would really appreciate some suggestions or experiences you guys have had.
note: performance & redundancy are both equally desirable and i have plenty of resources.
we already have licences for Redhat Enterprise Linux ES v4, i was wondering if Redhat's >Cluster Manager is a candidate? ~ or if either PGCluster or Slony1 have >advantages or drawbacks etc.
Couple of months ago, when I started looking at replication, and clustering I saw this as a fairly basic question, but the more I got into it the more I realized there's clustering, clustering and clustering, and replication, replication and replication.
I dont think anyone can give you a prefered method, not without a system spec.I looked at Slony, even tried to help debug it on windows, and eventually realised its Master Slave arrangement would not work for me.
I looked at dBMirror, which I liked in concept but I didnt fancy the way the code was done in a project that seems to be dead.
Then I looked at some really weird and wonderful schemes that work on a core technology like a reliable message queue, not for me.In the end I decided I wanted real time synchronous multi master replication, without conflict resolution, and that I didnt really want a dBCluster, I needed to cluster and load balance at web level, allowing for things like session management, multiple clusters, and remote sites.
Bad news is I couldnt find it, not for free anyway.
If you dont find a solution, try http://spar.orgfree.com/index.html
Please keep in mind, its new and written by a novice out of desparation, not one of the community super geeks, but it takes a very pragmatic approach to replication and maybe just what the doctor ordered.... Good Luck.
--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Thanks Bruce... just read it, still running 8.1 so didnt see 8.2
documentation.
Nice to know my work has not been duplicated... quote from
http://www.postgresql.org/docs/8.2/static/high-availability.html
"PostgreSQL does not offer this type of replication, though PostgreSQL
two-phase commit (PREPARE TRANSACTION and COMMIT PREPARED) can be used to
implement this in application code or middleware"
Suggest you download my little application and read the documentation,
you'll see its very different, maybe even interesting.
Maybe they should change that to.... Postgres DOES HAVE a free multi-master
replication system :)
One comment they make.... "Heavy write activity can cause excessive locking,
leading to poor performance. In fact, write performance is often worse than
that of a single server. Read requests can be sent to any server."
I'm not sure I agree with that... or maybe MVCC is just fantastic.... I
tested it.
The 2 phase commit locking is definitely happening at record level, so only
if the multimasters all hit the same record is there the potential for lock
conflict.
Why will dB's being randomly used, hit the same records, I think its a low
probability to begin with?
Not happy with that, I wrote a multithreaded routine and got them to all
smack the same record, it NEVER ROLLED BACK, and if there is performance
degradation, I didnt notice it... again probably a testament to the MVCC
design.
In any event if you look at the documentation, you'll see SPAR is not
multimaster or nothing. Can use say one server in an office and another to
pump data to a remote web site... not sure if you would even call that
multimaster, thats the point, I'm not sure SPAR fits any pure theory
category.
Anyway have a look, nice thing is you can play with it as well... becomes a
pragmatic discussion.
http://spar.orgfree.com/index.html
Now that I got replication going, I need to make a site mirror which I'll
piggy back on the replication... I'll dump it on the site as well when its
done... if you want it.
Then I'll probably be so "off topic"... I'll have to join another mailing
list :)
Before I go... just want to formally thank the Postgres community...
It may not be everything, and its easy to focus on what it doesnt do yet...
but what you have done is amazing, and what you have got, is fantastic.
I will use nothing else....
Thanks, Johnny
----- Original Message -----
From: "Bruce Momjian" <bruce@momjian.us>
To: <org@kewlstuff.co.za>
Cc: <pgsql-general@postgresql.org>
Sent: Saturday, December 23, 2006 6:34 PM
Subject: Re: [GENERAL] Clustering & Load Balancing & Replication
Show quoted text
I assume you have read this new documentation for 8.2:
http://www.postgresql.org/docs/8.2/static/high-availability.html
---------------------------------------------------------------------------
org@kewlstuff.co.za wrote:
Hi,
I am currently investigating the preferred method of clustering a
postgresql database on Redhat?i would really appreciate some suggestions or experiences you guys have
had.note: performance & redundancy are both equally desirable and i have
plenty of resources.we already have licences for Redhat Enterprise Linux ES v4, i was
wondering if Redhat's >Cluster Manager is a candidate? ~ or if either
PGCluster or Slony1 have >advantages or drawbacks etc.Couple of months ago, when I started looking at replication, and
clustering I saw this as a fairly basic question, but the more I got into
it the more I realized there's clustering, clustering and clustering, and
replication, replication and replication.
I dont think anyone can give you a prefered method, not without a system
spec.I looked at Slony, even tried to help debug it on windows, and eventually
realised its Master Slave arrangement would not work for me.
I looked at dBMirror, which I liked in concept but I didnt fancy the way
the code was done in a project that seems to be dead.
Then I looked at some really weird and wonderful schemes that work on a
core technology like a reliable message queue, not for me.In the end I decided I wanted real time synchronous multi master
replication, without conflict resolution, and that I didnt really want a
dBCluster, I needed to cluster and load balance at web level, allowing
for things like session management, multiple clusters, and remote sites.Bad news is I couldnt find it, not for free anyway.
If you dont find a solution, try http://spar.orgfree.com/index.html
Please keep in mind, its new and written by a novice out of desparation,
not one of the community super geeks, but it takes a very pragmatic
approach to replication and maybe just what the doctor ordered.... Good
Luck.--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com+ If your life is a hard drive, Christ can be your backup. +
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
Centuries ago, Nostradamus foresaw when org@kewlstuff.co.za would write:
Suggest you download my little application and read the documentation,
you'll see its very different, maybe even interesting.
Maybe they should change that to.... Postgres DOES HAVE a free multi-master
replication system :)
It isn't systematically usable as such, without a whole lot of
end-user assembly.
One comment they make.... "Heavy write activity can cause excessive locking,
leading to poor performance. In fact, write performance is often worse than
that of a single server. Read requests can be sent to any server."
I'm not sure I agree with that... or maybe MVCC is just fantastic.... I
tested it.
The 2 phase commit locking is definitely happening at record level, so only
if the multimasters all hit the same record is there the potential for lock
conflict.
Why will dB's being randomly used, hit the same records, I think its a low
probability to begin with?
That's only true if you are certain that the update pattern is NOT
involving a shared set of records. IN GENERAL, heavy write activity
can cause locking to become mighty expensive, which is certainly a
true statement.
Not happy with that, I wrote a multithreaded routine and got them to all
smack the same record, it NEVER ROLLED BACK, and if there is performance
degradation, I didnt notice it... again probably a testament to the MVCC
design.
It seems likely to me that this requires some careful validation of
testing.
An effect we see is that if a set of transactions are "fighting" over
a single "balance" record, they will essentially serialize over that.
On a system with a single CPU, it is not obvious that you'll see a
degradation there because, since you only have the single CPU, it
would be serializing the activity anyways.
Try it out on an 8-way SMP system and you may see things differently.
In any event if you look at the documentation, you'll see SPAR is not
multimaster or nothing. Can use say one server in an office and another to
pump data to a remote web site... not sure if you would even call that
multimaster, thats the point, I'm not sure SPAR fits any pure theory
category.
There are a few tests I could throw at it that tend to challenge
replication systems vis-a-vis "fidelity of results." I otta see if I
can find them in a readily deployable form.
There are two notable anomalies which have been known to break
replication systems:
1. Nondeterministic updates:
For instance, functions that are nondeterministic:
insert into rtable values (random(), now());
Or result sets that are nondeterministic:
insert into rtable2 (select * from mytable where some_attr='foo'
order by random() limit 5); -- Where there are 25 records with some_attr='foo'
2. Value swapping:
Consider the table:
create table t1 (mk integer primary key, val text unique not null);
insert into t1 (mk, val) values (1, 'chris');
insert into t1 (mk, val) values (2, 'dave');
insert into t1 (mk, val) values (3, 'brad');
begin;
update t1 set mk = 99 where mk = 1;
update t1 set mk = 1 where mk = 3;
update t1 set mk = 3 where mk = 99;
commit;
Is there a condition where a pause somewhere in there will cause
replication to break? Note that there have been replication systems
(erServer) that this set of updates can, intermittently, cause to fall
over.
--
let name="cbbrowne" and tld="linuxfinances.info" in String.concat "@" [name;tld];;
http://cbbrowne.com/info/slony.html
"Feel free to contact me (flames about my english and the useless of
this driver will be redirected to /dev/null, oh no, it's full...)"
-- Michael Beck, describing the PC-speaker sound device
Thanks Chris,
I see you a core member of Slony team and a replication guru so I'll look
into it.
I'm not slamming Slony I think its probably the right tool for type of work
your company Afilias does. Just wish you would make an official Windows
version of Slony as well.
Anyway thanks for the education, and I think it would be a good thing if
your site on replication, was also listed on Postgresql... good research.
Merry Xmas
----- Original Message -----
From: "Christopher Browne" <cbbrowne@acm.org>
To: <pgsql-general@postgresql.org>
Sent: Sunday, December 24, 2006 4:23 AM
Subject: Re: [GENERAL] Clustering & Load Balancing & Replication
Show quoted text
Centuries ago, Nostradamus foresaw when org@kewlstuff.co.za would write:
Suggest you download my little application and read the documentation,
you'll see its very different, maybe even interesting.
Maybe they should change that to.... Postgres DOES HAVE a free
multi-master
replication system :)It isn't systematically usable as such, without a whole lot of
end-user assembly.One comment they make.... "Heavy write activity can cause excessive
locking,
leading to poor performance. In fact, write performance is often worse
than
that of a single server. Read requests can be sent to any server."
I'm not sure I agree with that... or maybe MVCC is just fantastic.... I
tested it.
The 2 phase commit locking is definitely happening at record level, so
only
if the multimasters all hit the same record is there the potential for
lock
conflict.
Why will dB's being randomly used, hit the same records, I think its a
low
probability to begin with?That's only true if you are certain that the update pattern is NOT
involving a shared set of records. IN GENERAL, heavy write activity
can cause locking to become mighty expensive, which is certainly a
true statement.Not happy with that, I wrote a multithreaded routine and got them to all
smack the same record, it NEVER ROLLED BACK, and if there is performance
degradation, I didnt notice it... again probably a testament to the MVCC
design.It seems likely to me that this requires some careful validation of
testing.An effect we see is that if a set of transactions are "fighting" over
a single "balance" record, they will essentially serialize over that.On a system with a single CPU, it is not obvious that you'll see a
degradation there because, since you only have the single CPU, it
would be serializing the activity anyways.Try it out on an 8-way SMP system and you may see things differently.
In any event if you look at the documentation, you'll see SPAR is not
multimaster or nothing. Can use say one server in an office and another
to
pump data to a remote web site... not sure if you would even call that
multimaster, thats the point, I'm not sure SPAR fits any pure theory
category.There are a few tests I could throw at it that tend to challenge
replication systems vis-a-vis "fidelity of results." I otta see if I
can find them in a readily deployable form.There are two notable anomalies which have been known to break
replication systems:1. Nondeterministic updates:
For instance, functions that are nondeterministic:
insert into rtable values (random(), now());
Or result sets that are nondeterministic:
insert into rtable2 (select * from mytable where some_attr='foo'
order by random() limit 5); -- Where there are 25 records with
some_attr='foo'2. Value swapping:
Consider the table:
create table t1 (mk integer primary key, val text unique not null);
insert into t1 (mk, val) values (1, 'chris');
insert into t1 (mk, val) values (2, 'dave');
insert into t1 (mk, val) values (3, 'brad');begin;
update t1 set mk = 99 where mk = 1;
update t1 set mk = 1 where mk = 3;
update t1 set mk = 3 where mk = 99;
commit;Is there a condition where a pause somewhere in there will cause
replication to break? Note that there have been replication systems
(erServer) that this set of updates can, intermittently, cause to fall
over.
--
let name="cbbrowne" and tld="linuxfinances.info" in String.concat "@"
[name;tld];;
http://cbbrowne.com/info/slony.html
"Feel free to contact me (flames about my english and the useless of
this driver will be redirected to /dev/null, oh no, it's full...)"
-- Michael Beck, describing the PC-speaker sound device---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
I guess the latest 8.2 Windows PostgreSQL installer does come with a Slony
option and you can set it up easily using pgadmin too.
This link -->
http://people.planetpostgresql.org/xzilla/index.php?/archives/200-Alpha-testing-Slony-on-win32-Crib-Notes.htmlmight
help you as well.
-----------------
Shoaib Mir
EnterpriseDB (www.enterprisedb.com)
Show quoted text
On 12/25/06, org@kewlstuff.co.za <org@kewlstuff.co.za> wrote:
Thanks Chris,
I see you a core member of Slony team and a replication guru so I'll look
into it.
I'm not slamming Slony I think its probably the right tool for type of
work
your company Afilias does. Just wish you would make an official Windows
version of Slony as well.
Anyway thanks for the education, and I think it would be a good thing if
your site on replication, was also listed on Postgresql... good research.
Merry Xmas----- Original Message -----
From: "Christopher Browne" <cbbrowne@acm.org>
To: <pgsql-general@postgresql.org>
Sent: Sunday, December 24, 2006 4:23 AM
Subject: Re: [GENERAL] Clustering & Load Balancing & ReplicationCenturies ago, Nostradamus foresaw when org@kewlstuff.co.za would write:
Suggest you download my little application and read the documentation,
you'll see its very different, maybe even interesting.
Maybe they should change that to.... Postgres DOES HAVE a free
multi-master
replication system :)It isn't systematically usable as such, without a whole lot of
end-user assembly.One comment they make.... "Heavy write activity can cause excessive
locking,
leading to poor performance. In fact, write performance is often worse
than
that of a single server. Read requests can be sent to any server."
I'm not sure I agree with that... or maybe MVCC is just fantastic.... I
tested it.
The 2 phase commit locking is definitely happening at record level, so
only
if the multimasters all hit the same record is there the potential for
lock
conflict.
Why will dB's being randomly used, hit the same records, I think its a
low
probability to begin with?That's only true if you are certain that the update pattern is NOT
involving a shared set of records. IN GENERAL, heavy write activity
can cause locking to become mighty expensive, which is certainly a
true statement.Not happy with that, I wrote a multithreaded routine and got them to
all
smack the same record, it NEVER ROLLED BACK, and if there is
performance
degradation, I didnt notice it... again probably a testament to the
MVCC
design.
It seems likely to me that this requires some careful validation of
testing.An effect we see is that if a set of transactions are "fighting" over
a single "balance" record, they will essentially serialize over that.On a system with a single CPU, it is not obvious that you'll see a
degradation there because, since you only have the single CPU, it
would be serializing the activity anyways.Try it out on an 8-way SMP system and you may see things differently.
In any event if you look at the documentation, you'll see SPAR is not
multimaster or nothing. Can use say one server in an office and another
to
pump data to a remote web site... not sure if you would even call that
multimaster, thats the point, I'm not sure SPAR fits any pure theory
category.There are a few tests I could throw at it that tend to challenge
replication systems vis-a-vis "fidelity of results." I otta see if I
can find them in a readily deployable form.There are two notable anomalies which have been known to break
replication systems:1. Nondeterministic updates:
For instance, functions that are nondeterministic:
insert into rtable values (random(), now());
Or result sets that are nondeterministic:
insert into rtable2 (select * from mytable where some_attr='foo'
order by random() limit 5); -- Where there are 25 records with
some_attr='foo'2. Value swapping:
Consider the table:
create table t1 (mk integer primary key, val text unique not null);
insert into t1 (mk, val) values (1, 'chris');
insert into t1 (mk, val) values (2, 'dave');
insert into t1 (mk, val) values (3, 'brad');begin;
update t1 set mk = 99 where mk = 1;
update t1 set mk = 1 where mk = 3;
update t1 set mk = 3 where mk = 99;
commit;Is there a condition where a pause somewhere in there will cause
replication to break? Note that there have been replication systems
(erServer) that this set of updates can, intermittently, cause to fall
over.
--
let name="cbbrowne" and tld="linuxfinances.info" in String.concat "@"
[name;tld];;
http://cbbrowne.com/info/slony.html
"Feel free to contact me (flames about my english and the useless of
this driver will be redirected to /dev/null, oh no, it's full...)"
-- Michael Beck, describing the PC-speaker sound device---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
Hi,
I have just read the statement that Postgres does have (with end user
assembly) multi-master replication system. Is this just PGCluster or
something else ? if it is not PGCluster, then how can this be achieved ?
Cheers,
Andy
Show quoted text
On 24/12/06, Shoaib Mir <shoaibmir@gmail.com> wrote:
I guess the latest 8.2 Windows PostgreSQL installer does come with a Slony
option and you can set it up easily using pgadmin too.This link -->
http://people.planetpostgresql.org/xzilla/index.php?/archives/200-Alpha-testing-Slony-on-win32-Crib-Notes.htmlmight help you as well.-----------------
Shoaib Mir
EnterpriseDB (www.enterprisedb.com)On 12/25/06, org@kewlstuff.co.za <org@kewlstuff.co.za> wrote:
Thanks Chris,
I see you a core member of Slony team and a replication guru so I'll
look
into it.
I'm not slamming Slony I think its probably the right tool for type of
work
your company Afilias does. Just wish you would make an official Windows
version of Slony as well.
Anyway thanks for the education, and I think it would be a good thing if
your site on replication, was also listed on Postgresql... good
research.
Merry Xmas----- Original Message -----
From: "Christopher Browne" <cbbrowne@acm.org>
To: <pgsql-general@postgresql.org>
Sent: Sunday, December 24, 2006 4:23 AM
Subject: Re: [GENERAL] Clustering & Load Balancing & ReplicationCenturies ago, Nostradamus foresaw when org@kewlstuff.co.za would
write:
Suggest you download my little application and read the
documentation,
you'll see its very different, maybe even interesting.
Maybe they should change that to.... Postgres DOES HAVE a free
multi-master
replication system :)It isn't systematically usable as such, without a whole lot of
end-user assembly.One comment they make.... "Heavy write activity can cause excessive
locking,
leading to poor performance. In fact, write performance is oftenworse
than
that of a single server. Read requests can be sent to any server."
I'm not sure I agree with that... or maybe MVCC is just fantastic....I
tested it.
The 2 phase commit locking is definitely happening at record level,so
only
if the multimasters all hit the same record is there the potentialfor
lock
conflict.
Why will dB's being randomly used, hit the same records, I think itsa
low
probability to begin with?That's only true if you are certain that the update pattern is NOT
involving a shared set of records. IN GENERAL, heavy write activity
can cause locking to become mighty expensive, which is certainly a
true statement.Not happy with that, I wrote a multithreaded routine and got them to
all
smack the same record, it NEVER ROLLED BACK, and if there is
performance
degradation, I didnt notice it... again probably a testament to the
MVCC
design.
It seems likely to me that this requires some careful validation of
testing.An effect we see is that if a set of transactions are "fighting" over
a single "balance" record, they will essentially serialize over that.On a system with a single CPU, it is not obvious that you'll see a
degradation there because, since you only have the single CPU, it
would be serializing the activity anyways.Try it out on an 8-way SMP system and you may see things differently.
In any event if you look at the documentation, you'll see SPAR is not
multimaster or nothing. Can use say one server in an office and
another
to
pump data to a remote web site... not sure if you would even callthat
multimaster, thats the point, I'm not sure SPAR fits any pure theory
category.There are a few tests I could throw at it that tend to challenge
replication systems vis-a-vis "fidelity of results." I otta see if I
can find them in a readily deployable form.There are two notable anomalies which have been known to break
replication systems:1. Nondeterministic updates:
For instance, functions that are nondeterministic:
insert into rtable values (random(), now());
Or result sets that are nondeterministic:
insert into rtable2 (select * from mytable where some_attr='foo'
order by random() limit 5); -- Where there are 25 records with
some_attr='foo'2. Value swapping:
Consider the table:
create table t1 (mk integer primary key, val text unique not null);
insert into t1 (mk, val) values (1, 'chris');
insert into t1 (mk, val) values (2, 'dave');
insert into t1 (mk, val) values (3, 'brad');begin;
update t1 set mk = 99 where mk = 1;
update t1 set mk = 1 where mk = 3;
update t1 set mk = 3 where mk = 99;
commit;Is there a condition where a pause somewhere in there will cause
replication to break? Note that there have been replication systems
(erServer) that this set of updates can, intermittently, cause to fall
over.
--
let name="cbbrowne" and tld="linuxfinances.info" in String.concat "@"
[name;tld];;
http://cbbrowne.com/info/slony.html
"Feel free to contact me (flames about my english and the useless ofthis driver will be redirected to /dev/null, oh no, it's full...)"
-- Michael Beck, describing the PC-speaker sound device---------------------------(end of
broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
pgpool-II might help you there too I guess...
---------------
Shoaib Mir
EnterpriseDB (www.enterprisedb.com)
Show quoted text
On 12/26/06, Andy Dale <andy.dale@gmail.com> wrote:
Hi,
I have just read the statement that Postgres does have (with end user
assembly) multi-master replication system. Is this just PGCluster or
something else ? if it is not PGCluster, then how can this be achieved ?Cheers,
Andy
On 24/12/06, Shoaib Mir <shoaibmir@gmail.com> wrote:
I guess the latest 8.2 Windows PostgreSQL installer does come with a
Slony option and you can set it up easily using pgadmin too.This link -->
http://people.planetpostgresql.org/xzilla/index.php?/archives/200-Alpha-testing-Slony-on-win32-Crib-Notes.htmlmight help you as well.-----------------
Shoaib Mir
EnterpriseDB (www.enterprisedb.com)On 12/25/06, org@kewlstuff.co.za <org@kewlstuff.co.za> wrote:
Thanks Chris,
I see you a core member of Slony team and a replication guru so I'll
look
into it.
I'm not slamming Slony I think its probably the right tool for type of
work
your company Afilias does. Just wish you would make an official
Windows
version of Slony as well.
Anyway thanks for the education, and I think it would be a good thing
if
your site on replication, was also listed on Postgresql... good
research.
Merry Xmas----- Original Message -----
From: "Christopher Browne" <cbbrowne@acm.org>
To: < pgsql-general@postgresql.org>
Sent: Sunday, December 24, 2006 4:23 AM
Subject: Re: [GENERAL] Clustering & Load Balancing & ReplicationCenturies ago, Nostradamus foresaw when org@kewlstuff.co.za would
write:
Suggest you download my little application and read the
documentation,
you'll see its very different, maybe even interesting.
Maybe they should change that to.... Postgres DOES HAVE a free
multi-master
replication system :)It isn't systematically usable as such, without a whole lot of
end-user assembly.One comment they make.... "Heavy write activity can cause excessive
locking,
leading to poor performance. In fact, write performance is oftenworse
than
that of a single server. Read requests can be sent to any server."
I'm not sure I agree with that... or maybe MVCC is justfantastic.... I
tested it.
The 2 phase commit locking is definitely happening at record level,so
only
if the multimasters all hit the same record is there the potentialfor
lock
conflict.
Why will dB's being randomly used, hit the same records, I thinkits a
low
probability to begin with?That's only true if you are certain that the update pattern is NOT
involving a shared set of records. IN GENERAL, heavy write activity
can cause locking to become mighty expensive, which is certainly a
true statement.Not happy with that, I wrote a multithreaded routine and got them
to all
smack the same record, it NEVER ROLLED BACK, and if there is
performance
degradation, I didnt notice it... again probably a testament to the
MVCC
design.
It seems likely to me that this requires some careful validation of
testing.An effect we see is that if a set of transactions are "fighting"
over
a single "balance" record, they will essentially serialize over
that.
On a system with a single CPU, it is not obvious that you'll see a
degradation there because, since you only have the single CPU, it
would be serializing the activity anyways.Try it out on an 8-way SMP system and you may see things
differently.
In any event if you look at the documentation, you'll see SPAR is
not
multimaster or nothing. Can use say one server in an office and
another
to
pump data to a remote web site... not sure if you would even callthat
multimaster, thats the point, I'm not sure SPAR fits any pure
theory
category.
There are a few tests I could throw at it that tend to challenge
replication systems vis-a-vis "fidelity of results." I otta see ifI
can find them in a readily deployable form.
There are two notable anomalies which have been known to break
replication systems:1. Nondeterministic updates:
For instance, functions that are nondeterministic:
insert into rtable values (random(), now());
Or result sets that are nondeterministic:
insert into rtable2 (select * from mytable where some_attr='foo'
order by random() limit 5); -- Where there are 25 records withsome_attr='foo'
2. Value swapping:
Consider the table:
create table t1 (mk integer primary key, val text unique not null);
insert into t1 (mk, val) values (1, 'chris');
insert into t1 (mk, val) values (2, 'dave');
insert into t1 (mk, val) values (3, 'brad');begin;
update t1 set mk = 99 where mk = 1;
update t1 set mk = 1 where mk = 3;
update t1 set mk = 3 where mk = 99;
commit;Is there a condition where a pause somewhere in there will cause
replication to break? Note that there have been replication systems
(erServer) that this set of updates can, intermittently, cause tofall
over.
--
let name="cbbrowne" and tld=" linuxfinances.info" in String.concat"@"
[name;tld];;
http://cbbrowne.com/info/slony.html
"Feel free to contact me (flames about my english and the uselessof
this driver will be redirected to /dev/null, oh no, it's full...)"
-- Michael Beck, describing the PC-speaker sound device---------------------------(end of
broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
---------------------------(end of
broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
The issue i had with pgpool (1 or 2) was that (correct me if i am wrong) you
had to start the pgpool cluster with both nodes in the same state. I
thought this would mean that if you had a DB fail, before you could
re-introduce it into the pgpool cluster you would have to manually sync it
with the cluster state, is this correct ??
The system i need multi master sync for is highly transactional, so if the
behaviour i stated above is correct it is not suitable. I have tried
s-lony, and while i was pleased with the performance, it is only Single
Master - Multi Slave which is not acceptable as well.
Andy
Show quoted text
On 26/12/06, Shoaib Mir <shoaibmir@gmail.com> wrote:
pgpool-II might help you there too I guess...
---------------
Shoaib Mir
EnterpriseDB (www.enterprisedb.com)On 12/26/06, Andy Dale <andy.dale@gmail.com> wrote:
Hi,
I have just read the statement that Postgres does have (with end user
assembly) multi-master replication system. Is this just PGCluster or
something else ? if it is not PGCluster, then how can this be achieved ?Cheers,
Andy
On 24/12/06, Shoaib Mir < shoaibmir@gmail.com> wrote:
I guess the latest 8.2 Windows PostgreSQL installer does come with a
Slony option and you can set it up easily using pgadmin too.This link -->
http://people.planetpostgresql.org/xzilla/index.php?/archives/200-Alpha-testing-Slony-on-win32-Crib-Notes.htmlmight help you as well.-----------------
Shoaib Mir
EnterpriseDB (www.enterprisedb.com)On 12/25/06, org@kewlstuff.co.za <org@kewlstuff.co.za> wrote:
Thanks Chris,
I see you a core member of Slony team and a replication guru so I'll
look
into it.
I'm not slamming Slony I think its probably the right tool for type
of work
your company Afilias does. Just wish you would make an official
Windows
version of Slony as well.
Anyway thanks for the education, and I think it would be a good
thing if
your site on replication, was also listed on Postgresql... good
research.
Merry Xmas----- Original Message -----
From: "Christopher Browne" <cbbrowne@acm.org>
To: < pgsql-general@postgresql.org>
Sent: Sunday, December 24, 2006 4:23 AM
Subject: Re: [GENERAL] Clustering & Load Balancing & ReplicationCenturies ago, Nostradamus foresaw when org@kewlstuff.co.za would
write:
Suggest you download my little application and read the
documentation,
you'll see its very different, maybe even interesting.
Maybe they should change that to.... Postgres DOES HAVE a free
multi-master
replication system :)It isn't systematically usable as such, without a whole lot of
end-user assembly.One comment they make.... "Heavy write activity can cause
excessive
locking,
leading to poor performance. In fact, write performance is oftenworse
than
that of a single server. Read requests can be sent to anyserver."
I'm not sure I agree with that... or maybe MVCC is just
fantastic.... I
tested it.
The 2 phase commit locking is definitely happening at recordlevel, so
only
if the multimasters all hit the same record is there thepotential for
lock
conflict.
Why will dB's being randomly used, hit the same records, I thinkits a
low
probability to begin with?That's only true if you are certain that the update pattern is NOT
involving a shared set of records. IN GENERAL, heavy writeactivity
can cause locking to become mighty expensive, which is certainly a
true statement.Not happy with that, I wrote a multithreaded routine and got them
to all
smack the same record, it NEVER ROLLED BACK, and if there is
performance
degradation, I didnt notice it... again probably a testament to
the MVCC
design.
It seems likely to me that this requires some careful validation
of
testing.
An effect we see is that if a set of transactions are "fighting"
over
a single "balance" record, they will essentially serialize over
that.
On a system with a single CPU, it is not obvious that you'll see a
degradation there because, since you only have the single CPU, it
would be serializing the activity anyways.Try it out on an 8-way SMP system and you may see things
differently.
In any event if you look at the documentation, you'll see SPAR is
not
multimaster or nothing. Can use say one server in an office and
another
to
pump data to a remote web site... not sure if you would even callthat
multimaster, thats the point, I'm not sure SPAR fits any pure
theory
category.
There are a few tests I could throw at it that tend to challenge
replication systems vis-a-vis "fidelity of results." I otta seeif I
can find them in a readily deployable form.
There are two notable anomalies which have been known to break
replication systems:1. Nondeterministic updates:
For instance, functions that are nondeterministic:
insert into rtable values (random(), now());
Or result sets that are nondeterministic:
insert into rtable2 (select * from mytable where some_attr='foo'
order by random() limit 5); -- Where there are 25 recordswith
some_attr='foo'
2. Value swapping:
Consider the table:
create table t1 (mk integer primary key, val text unique not
null);
insert into t1 (mk, val) values (1, 'chris');
insert into t1 (mk, val) values (2, 'dave');
insert into t1 (mk, val) values (3, 'brad');begin;
update t1 set mk = 99 where mk = 1;
update t1 set mk = 1 where mk = 3;
update t1 set mk = 3 where mk = 99;
commit;Is there a condition where a pause somewhere in there will cause
replication to break? Note that there have been replicationsystems
(erServer) that this set of updates can, intermittently, cause to
fall
over.
--
let name="cbbrowne" and tld=" linuxfinances.info" in String.concat"@"
[name;tld];;
http://cbbrowne.com/info/slony.html
"Feel free to contact me (flames about my english and theuseless of
this driver will be redirected to /dev/null, oh no, it's full...)"
-- Michael Beck, describing the PC-speaker sound device---------------------------(end of
broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
---------------------------(end of
broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
Yes, that is true with pgpool. I did face the same as well.
There is another as well Uni-Cluster (
http://www.continuent.com/index.php?option=com_content&task=view&id=213&Itemid=170),
haven't tried yet but it might help you there...
---------------
Shoaib Mir
EnterpriseDB (www.enterprisedb.com)
Show quoted text
On 12/26/06, Andy Dale <andy.dale@gmail.com> wrote:
The issue i had with pgpool (1 or 2) was that (correct me if i am wrong)
you had to start the pgpool cluster with both nodes in the same state. I
thought this would mean that if you had a DB fail, before you could
re-introduce it into the pgpool cluster you would have to manually sync it
with the cluster state, is this correct ??The system i need multi master sync for is highly transactional, so if the
behaviour i stated above is correct it is not suitable. I have tried
s-lony, and while i was pleased with the performance, it is only Single
Master - Multi Slave which is not acceptable as well.Andy
On 26/12/06, Shoaib Mir <shoaibmir@gmail.com> wrote:
pgpool-II might help you there too I guess...
---------------
Shoaib Mir
EnterpriseDB ( www.enterprisedb.com)On 12/26/06, Andy Dale <andy.dale@gmail.com> wrote:
Hi,
I have just read the statement that Postgres does have (with end user
assembly) multi-master replication system. Is this just PGCluster or
something else ? if it is not PGCluster, then how can this be achieved ?Cheers,
Andy
On 24/12/06, Shoaib Mir < shoaibmir@gmail.com> wrote:
I guess the latest 8.2 Windows PostgreSQL installer does come with a
Slony option and you can set it up easily using pgadmin too.This link -->
http://people.planetpostgresql.org/xzilla/index.php?/archives/200-Alpha-testing-Slony-on-win32-Crib-Notes.htmlmight help you as well.-----------------
Shoaib Mir
EnterpriseDB (www.enterprisedb.com)On 12/25/06, org@kewlstuff.co.za <org@kewlstuff.co.za> wrote:
Thanks Chris,
I see you a core member of Slony team and a replication guru so
I'll look
into it.
I'm not slamming Slony I think its probably the right tool for
type of work
your company Afilias does. Just wish you would make an official
Windows
version of Slony as well.
Anyway thanks for the education, and I think it would be a good
thing if
your site on replication, was also listed on Postgresql... good
research.
Merry Xmas----- Original Message -----
From: "Christopher Browne" <cbbrowne@acm.org>
To: < pgsql-general@postgresql.org>
Sent: Sunday, December 24, 2006 4:23 AM
Subject: Re: [GENERAL] Clustering & Load Balancing & ReplicationCenturies ago, Nostradamus foresaw when org@kewlstuff.co.zawould write:
Suggest you download my little application and read the
documentation,
you'll see its very different, maybe even interesting.
Maybe they should change that to.... Postgres DOES HAVE a free
multi-master
replication system :)It isn't systematically usable as such, without a whole lot of
end-user assembly.One comment they make.... "Heavy write activity can cause
excessive
locking,
leading to poor performance. In fact, write performance isoften worse
than
that of a single server. Read requests can be sent to anyserver."
I'm not sure I agree with that... or maybe MVCC is just
fantastic.... I
tested it.
The 2 phase commit locking is definitely happening at recordlevel, so
only
if the multimasters all hit the same record is there thepotential for
lock
conflict.
Why will dB's being randomly used, hit the same records, Ithink its a
low
probability to begin with?That's only true if you are certain that the update pattern is
NOT
involving a shared set of records. IN GENERAL, heavy write
activity
can cause locking to become mighty expensive, which is certainly
a
true statement.
Not happy with that, I wrote a multithreaded routine and got
them to all
smack the same record, it NEVER ROLLED BACK, and if there is
performance
degradation, I didnt notice it... again probably a testament to
the MVCC
design.
It seems likely to me that this requires some careful validation
of
testing.
An effect we see is that if a set of transactions are "fighting"
over
a single "balance" record, they will essentially serialize over
that.
On a system with a single CPU, it is not obvious that you'll see
a
degradation there because, since you only have the single CPU,
it
would be serializing the activity anyways.
Try it out on an 8-way SMP system and you may see things
differently.
In any event if you look at the documentation, you'll see SPAR
is not
multimaster or nothing. Can use say one server in an office and
another
to
pump data to a remote web site... not sure if you would evencall that
multimaster, thats the point, I'm not sure SPAR fits any pure
theory
category.
There are a few tests I could throw at it that tend to challenge
replication systems vis-a-vis "fidelity of results." I otta seeif I
can find them in a readily deployable form.
There are two notable anomalies which have been known to break
replication systems:1. Nondeterministic updates:
For instance, functions that are nondeterministic:
insert into rtable values (random(), now());
Or result sets that are nondeterministic:
insert into rtable2 (select * from mytable where
some_attr='foo'
order by random() limit 5); -- Where there are 25 records
with
some_attr='foo'
2. Value swapping:
Consider the table:
create table t1 (mk integer primary key, val text unique not
null);
insert into t1 (mk, val) values (1, 'chris');
insert into t1 (mk, val) values (2, 'dave');
insert into t1 (mk, val) values (3, 'brad');begin;
update t1 set mk = 99 where mk = 1;
update t1 set mk = 1 where mk = 3;
update t1 set mk = 3 where mk = 99;
commit;Is there a condition where a pause somewhere in there will cause
replication to break? Note that there have been replicationsystems
(erServer) that this set of updates can, intermittently, cause
to fall
over.
--
let name="cbbrowne" and tld=" linuxfinances.info" inString.concat "@"
[name;tld];;
http://cbbrowne.com/info/slony.html
"Feel free to contact me (flames about my english and theuseless of
this driver will be redirected to /dev/null, oh no, it's
full...)"
-- Michael Beck, describing the PC-speaker sound device
---------------------------(end of
broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
---------------------------(end of
broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
The company i am working for has a trail/evaluation license for p/cluster,
but unfortunately i cannot get it to function as a JBoss datasource.
Cheers,
Andy
Show quoted text
On 26/12/06, Shoaib Mir <shoaibmir@gmail.com> wrote:
Yes, that is true with pgpool. I did face the same as well.
There is another as well Uni-Cluster (http://www.continuent.com/index.php?option=com_content&task=view&id=213&Itemid=170
), haven't tried yet but it might help you there...---------------
Shoaib Mir
EnterpriseDB (www.enterprisedb.com)On 12/26/06, Andy Dale <andy.dale@gmail.com> wrote:
The issue i had with pgpool (1 or 2) was that (correct me if i am wrong)
you had to start the pgpool cluster with both nodes in the same state. I
thought this would mean that if you had a DB fail, before you could
re-introduce it into the pgpool cluster you would have to manually sync it
with the cluster state, is this correct ??The system i need multi master sync for is highly transactional, so if
the behaviour i stated above is correct it is not suitable. I have tried
s-lony, and while i was pleased with the performance, it is only Single
Master - Multi Slave which is not acceptable as well.Andy
On 26/12/06, Shoaib Mir <shoaibmir@gmail.com > wrote:
pgpool-II might help you there too I guess...
---------------
Shoaib Mir
EnterpriseDB ( www.enterprisedb.com)On 12/26/06, Andy Dale <andy.dale@gmail.com> wrote:
Hi,
I have just read the statement that Postgres does have (with end
user assembly) multi-master replication system. Is this just PGCluster or
something else ? if it is not PGCluster, then how can this be achieved ?Cheers,
Andy
On 24/12/06, Shoaib Mir < shoaibmir@gmail.com> wrote:
I guess the latest 8.2 Windows PostgreSQL installer does come with
a Slony option and you can set it up easily using pgadmin too.This link -->
http://people.planetpostgresql.org/xzilla/index.php?/archives/200-Alpha-testing-Slony-on-win32-Crib-Notes.htmlmight help you as well.-----------------
Shoaib Mir
EnterpriseDB (www.enterprisedb.com)On 12/25/06, org@kewlstuff.co.za <org@kewlstuff.co.za> wrote:
Thanks Chris,
I see you a core member of Slony team and a replication guru so
I'll look
into it.
I'm not slamming Slony I think its probably the right tool for
type of work
your company Afilias does. Just wish you would make an official
Windows
version of Slony as well.
Anyway thanks for the education, and I think it would be a good
thing if
your site on replication, was also listed on Postgresql... good
research.
Merry Xmas----- Original Message -----
From: "Christopher Browne" <cbbrowne@acm.org>
To: < pgsql-general@postgresql.org>
Sent: Sunday, December 24, 2006 4:23 AM
Subject: Re: [GENERAL] Clustering & Load Balancing & ReplicationCenturies ago, Nostradamus foresaw when org@kewlstuff.co.zawould write:
Suggest you download my little application and read the
documentation,
you'll see its very different, maybe even interesting.
Maybe they should change that to.... Postgres DOES HAVE afree
multi-master
replication system :)It isn't systematically usable as such, without a whole lot of
end-user assembly.
One comment they make.... "Heavy write activity can cause
excessive
locking,
leading to poor performance. In fact, write performance isoften worse
than
that of a single server. Read requests can be sent to anyserver."
I'm not sure I agree with that... or maybe MVCC is just
fantastic.... I
tested it.
The 2 phase commit locking is definitely happening at recordlevel, so
only
if the multimasters all hit the same record is there thepotential for
lock
conflict.
Why will dB's being randomly used, hit the same records, Ithink its a
low
probability to begin with?That's only true if you are certain that the update pattern is
NOT
involving a shared set of records. IN GENERAL, heavy write
activity
can cause locking to become mighty expensive, which is
certainly a
true statement.
Not happy with that, I wrote a multithreaded routine and got
them to all
smack the same record, it NEVER ROLLED BACK, and if there is
performance
degradation, I didnt notice it... again probably a testament
to the MVCC
design.
It seems likely to me that this requires some careful
validation of
testing.
An effect we see is that if a set of transactions are
"fighting" over
a single "balance" record, they will essentially serialize
over that.
On a system with a single CPU, it is not obvious that you'll
see a
degradation there because, since you only have the single CPU,
it
would be serializing the activity anyways.
Try it out on an 8-way SMP system and you may see things
differently.
In any event if you look at the documentation, you'll see
SPAR is not
multimaster or nothing. Can use say one server in an office
and another
to
pump data to a remote web site... not sure if you would evencall that
multimaster, thats the point, I'm not sure SPAR fits any pure
theory
category.
There are a few tests I could throw at it that tend to
challenge
replication systems vis-a-vis "fidelity of results." I otta
see if I
can find them in a readily deployable form.
There are two notable anomalies which have been known to break
replication systems:1. Nondeterministic updates:
For instance, functions that are nondeterministic:
insert into rtable values (random(), now());
Or result sets that are nondeterministic:
insert into rtable2 (select * from mytable where
some_attr='foo'
order by random() limit 5); -- Where there are 25
records with
some_attr='foo'
2. Value swapping:
Consider the table:
create table t1 (mk integer primary key, val text unique not
null);
insert into t1 (mk, val) values (1, 'chris');
insert into t1 (mk, val) values (2, 'dave');
insert into t1 (mk, val) values (3, 'brad');begin;
update t1 set mk = 99 where mk = 1;
update t1 set mk = 1 where mk = 3;
update t1 set mk = 3 where mk = 99;
commit;Is there a condition where a pause somewhere in there will
cause
replication to break? Note that there have been replication
systems
(erServer) that this set of updates can, intermittently, cause
to fall
over.
--
let name="cbbrowne" and tld=" linuxfinances.info" inString.concat "@"
[name;tld];;
http://cbbrowne.com/info/slony.html
"Feel free to contact me (flames about my english and theuseless of
this driver will be redirected to /dev/null, oh no, it's
full...)"
-- Michael Beck, describing the PC-speaker sound device
---------------------------(end of
broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
---------------------------(end of
broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
Why are you going for a multimaster case here? are you doing it for load
balancing? if then you can also do it horizontally with a multi disk
setup... Slony can be a real good candidate here as well with Linux HA in
combination.
For going on a vertical solution you can try OpenSSI and see if that can
work for you, haven't tried that myself but will like to hear about
PostgreSQL configuration with OpenSSI
-------------------
Shoaib Mir
EnterpriseDB ( www.enterprisedb.com)
Show quoted text
On 12/26/06, Andy Dale <andy.dale@gmail.com> wrote:
The company i am working for has a trail/evaluation license for p/cluster,
but unfortunately i cannot get it to function as a JBoss datasource.Cheers,
Andy
On 26/12/06, Shoaib Mir <shoaibmir@gmail.com> wrote:
Yes, that is true with pgpool. I did face the same as well.
There is another as well Uni-Cluster (http://www.continuent.com/index.php?option=com_content&task=view&id=213&Itemid=170
), haven't tried yet but it might help you there...---------------
Shoaib Mir
EnterpriseDB ( www.enterprisedb.com)On 12/26/06, Andy Dale <andy.dale@gmail.com> wrote:
The issue i had with pgpool (1 or 2) was that (correct me if i am
wrong) you had to start the pgpool cluster with both nodes in the same
state. I thought this would mean that if you had a DB fail, before you
could re-introduce it into the pgpool cluster you would have to manually
sync it with the cluster state, is this correct ??The system i need multi master sync for is highly transactional, so if
the behaviour i stated above is correct it is not suitable. I have tried
s-lony, and while i was pleased with the performance, it is only Single
Master - Multi Slave which is not acceptable as well.Andy
On 26/12/06, Shoaib Mir <shoaibmir@gmail.com > wrote:
pgpool-II might help you there too I guess...
---------------
Shoaib Mir
EnterpriseDB ( www.enterprisedb.com)On 12/26/06, Andy Dale <andy.dale@gmail.com> wrote:
Hi,
I have just read the statement that Postgres does have (with end
user assembly) multi-master replication system. Is this just PGCluster or
something else ? if it is not PGCluster, then how can this be achieved ?Cheers,
Andy
On 24/12/06, Shoaib Mir < shoaibmir@gmail.com> wrote:
I guess the latest 8.2 Windows PostgreSQL installer does come
with a Slony option and you can set it up easily using pgadmin too.This link -->
http://people.planetpostgresql.org/xzilla/index.php?/archives/200-Alpha-testing-Slony-on-win32-Crib-Notes.htmlmight help you as well.-----------------
Shoaib Mir
EnterpriseDB (www.enterprisedb.com)On 12/25/06, org@kewlstuff.co.za <org@kewlstuff.co.za> wrote:
Thanks Chris,
I see you a core member of Slony team and a replication guru
so I'll look
into it.
I'm not slamming Slony I think its probably the right tool for
type of work
your company Afilias does. Just wish you would make an
official Windows
version of Slony as well.
Anyway thanks for the education, and I think it would be a
good thing if
your site on replication, was also listed on Postgresql...
good research.
Merry Xmas----- Original Message -----
From: "Christopher Browne" <cbbrowne@acm.org>
To: < pgsql-general@postgresql.org>
Sent: Sunday, December 24, 2006 4:23 AM
Subject: Re: [GENERAL] Clustering & Load Balancing &
ReplicationCenturies ago, Nostradamus foresaw when org@kewlstuff.co.zawould write:
Suggest you download my little application and read the
documentation,
you'll see its very different, maybe even interesting.
Maybe they should change that to.... Postgres DOES HAVE afree
multi-master
replication system :)It isn't systematically usable as such, without a whole lot
of
end-user assembly.
One comment they make.... "Heavy write activity can cause
excessive
locking,
leading to poor performance. In fact, write performance isoften worse
than
that of a single server. Read requests can be sent to anyserver."
I'm not sure I agree with that... or maybe MVCC is just
fantastic.... I
tested it.
The 2 phase commit locking is definitely happening atrecord level, so
only
if the multimasters all hit the same record is there thepotential for
lock
conflict.
Why will dB's being randomly used, hit the same records, Ithink its a
low
probability to begin with?That's only true if you are certain that the update pattern
is NOT
involving a shared set of records. IN GENERAL, heavy write
activity
can cause locking to become mighty expensive, which is
certainly a
true statement.
Not happy with that, I wrote a multithreaded routine and
got them to all
smack the same record, it NEVER ROLLED BACK, and if there
is performance
degradation, I didnt notice it... again probably a
testament to the MVCC
design.
It seems likely to me that this requires some careful
validation of
testing.
An effect we see is that if a set of transactions are
"fighting" over
a single "balance" record, they will essentially serialize
over that.
On a system with a single CPU, it is not obvious that you'll
see a
degradation there because, since you only have the single
CPU, it
would be serializing the activity anyways.
Try it out on an 8-way SMP system and you may see things
differently.
In any event if you look at the documentation, you'll see
SPAR is not
multimaster or nothing. Can use say one server in an office
and another
to
pump data to a remote web site... not sure if you wouldeven call that
multimaster, thats the point, I'm not sure SPAR fits any
pure theory
category.
There are a few tests I could throw at it that tend to
challenge
replication systems vis-a-vis "fidelity of results." I otta
see if I
can find them in a readily deployable form.
There are two notable anomalies which have been known to
break
replication systems:
1. Nondeterministic updates:
For instance, functions that are nondeterministic:
insert into rtable values (random(), now());
Or result sets that are nondeterministic:
insert into rtable2 (select * from mytable where
some_attr='foo'
order by random() limit 5); -- Where there are 25
records with
some_attr='foo'
2. Value swapping:
Consider the table:
create table t1 (mk integer primary key, val text unique not
null);
insert into t1 (mk, val) values (1, 'chris');
insert into t1 (mk, val) values (2, 'dave');
insert into t1 (mk, val) values (3, 'brad');begin;
update t1 set mk = 99 where mk = 1;
update t1 set mk = 1 where mk = 3;
update t1 set mk = 3 where mk = 99;
commit;Is there a condition where a pause somewhere in there will
cause
replication to break? Note that there have been replication
systems
(erServer) that this set of updates can, intermittently,
cause to fall
over.
--
let name="cbbrowne" and tld=" linuxfinances.info" inString.concat "@"
[name;tld];;
http://cbbrowne.com/info/slony.html
"Feel free to contact me (flames about my english and theuseless of
this driver will be redirected to /dev/null, oh no, it's
full...)"
-- Michael Beck, describing the PC-speaker sound device
---------------------------(end of
broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
---------------------------(end of
broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
We are trying to achieve High Availability over load balancing, so basically
we always try and have 2 databases in the same state while both are active,
and if one goes down it should (hopefully) failover seemlessly. Thanks for
the info on OpenSSI, as i had not heard of this before and will look into it
further.
Cheers,
Andy
Show quoted text
On 26/12/06, Shoaib Mir <shoaibmir@gmail.com> wrote:
Why are you going for a multimaster case here? are you doing it for load
balancing? if then you can also do it horizontally with a multi disk
setup... Slony can be a real good candidate here as well with Linux HA in
combination.For going on a vertical solution you can try OpenSSI and see if that can
work for you, haven't tried that myself but will like to hear about
PostgreSQL configuration with OpenSSI-------------------
Shoaib Mir
EnterpriseDB ( www.enterprisedb.com)On 12/26/06, Andy Dale <andy.dale@gmail.com> wrote:
The company i am working for has a trail/evaluation license for
p/cluster, but unfortunately i cannot get it to function as a JBoss
datasource.Cheers,
Andy
On 26/12/06, Shoaib Mir <shoaibmir@gmail.com> wrote:
Yes, that is true with pgpool. I did face the same as well.
There is another as well Uni-Cluster (http://www.continuent.com/index.php?option=com_content&task=view&id=213&Itemid=170
), haven't tried yet but it might help you there...---------------
Shoaib Mir
EnterpriseDB ( www.enterprisedb.com)On 12/26/06, Andy Dale <andy.dale@gmail.com> wrote:
The issue i had with pgpool (1 or 2) was that (correct me if i am
wrong) you had to start the pgpool cluster with both nodes in the same
state. I thought this would mean that if you had a DB fail, before you
could re-introduce it into the pgpool cluster you would have to manually
sync it with the cluster state, is this correct ??The system i need multi master sync for is highly transactional, so
if the behaviour i stated above is correct it is not suitable. I have tried
s-lony, and while i was pleased with the performance, it is only Single
Master - Multi Slave which is not acceptable as well.Andy
On 26/12/06, Shoaib Mir <shoaibmir@gmail.com > wrote:
pgpool-II might help you there too I guess...
---------------
Shoaib Mir
EnterpriseDB ( www.enterprisedb.com)On 12/26/06, Andy Dale <andy.dale@gmail.com> wrote:
Hi,
I have just read the statement that Postgres does have (with end
user assembly) multi-master replication system. Is this just PGCluster or
something else ? if it is not PGCluster, then how can this be achieved ?Cheers,
Andy
On 24/12/06, Shoaib Mir < shoaibmir@gmail.com> wrote:
I guess the latest 8.2 Windows PostgreSQL installer does come
with a Slony option and you can set it up easily using pgadmin too.This link -->
http://people.planetpostgresql.org/xzilla/index.php?/archives/200-Alpha-testing-Slony-on-win32-Crib-Notes.htmlmight help you as well.-----------------
Shoaib Mir
EnterpriseDB (www.enterprisedb.com)On 12/25/06, org@kewlstuff.co.za <org@kewlstuff.co.za> wrote:
Thanks Chris,
I see you a core member of Slony team and a replication guru
so I'll look
into it.
I'm not slamming Slony I think its probably the right tool
for type of work
your company Afilias does. Just wish you would make an
official Windows
version of Slony as well.
Anyway thanks for the education, and I think it would be a
good thing if
your site on replication, was also listed on Postgresql...
good research.
Merry Xmas----- Original Message -----
From: "Christopher Browne" <cbbrowne@acm.org>
To: < pgsql-general@postgresql.org>
Sent: Sunday, December 24, 2006 4:23 AM
Subject: Re: [GENERAL] Clustering & Load Balancing &
ReplicationCenturies ago, Nostradamus foresaw when
org@kewlstuff.co.za would write:
Suggest you download my little application and read the
documentation,
you'll see its very different, maybe even interesting.
Maybe they should change that to.... Postgres DOES HAVE afree
multi-master
replication system :)It isn't systematically usable as such, without a whole
lot of
end-user assembly.
One comment they make.... "Heavy write activity can cause
excessive
locking,
leading to poor performance. In fact, write performanceis often worse
than
that of a single server. Read requests can be sent to anyserver."
I'm not sure I agree with that... or maybe MVCC is just
fantastic.... I
tested it.
The 2 phase commit locking is definitely happening atrecord level, so
only
if the multimasters all hit the same record is there thepotential for
lock
conflict.
Why will dB's being randomly used, hit the same records,I think its a
low
probability to begin with?That's only true if you are certain that the update
pattern is NOT
involving a shared set of records. IN GENERAL, heavy
write activity
can cause locking to become mighty expensive, which is
certainly a
true statement.
Not happy with that, I wrote a multithreaded routine and
got them to all
smack the same record, it NEVER ROLLED BACK, and if there
is performance
degradation, I didnt notice it... again probably a
testament to the MVCC
design.
It seems likely to me that this requires some careful
validation of
testing.
An effect we see is that if a set of transactions are
"fighting" over
a single "balance" record, they will essentially serialize
over that.
On a system with a single CPU, it is not obvious that
you'll see a
degradation there because, since you only have the single
CPU, it
would be serializing the activity anyways.
Try it out on an 8-way SMP system and you may see things
differently.
In any event if you look at the documentation, you'll see
SPAR is not
multimaster or nothing. Can use say one server in an
office and another
to
pump data to a remote web site... not sure if you wouldeven call that
multimaster, thats the point, I'm not sure SPAR fits any
pure theory
category.
There are a few tests I could throw at it that tend to
challenge
replication systems vis-a-vis "fidelity of results." I
otta see if I
can find them in a readily deployable form.
There are two notable anomalies which have been known to
break
replication systems:
1. Nondeterministic updates:
For instance, functions that are nondeterministic:
insert into rtable values (random(), now());
Or result sets that are nondeterministic:
insert into rtable2 (select * from mytable where
some_attr='foo'
order by random() limit 5); -- Where there are 25
records with
some_attr='foo'
2. Value swapping:
Consider the table:
create table t1 (mk integer primary key, val text unique
not null);
insert into t1 (mk, val) values (1, 'chris');
insert into t1 (mk, val) values (2, 'dave');
insert into t1 (mk, val) values (3, 'brad');begin;
update t1 set mk = 99 where mk = 1;
update t1 set mk = 1 where mk = 3;
update t1 set mk = 3 where mk = 99;
commit;Is there a condition where a pause somewhere in there will
cause
replication to break? Note that there have been
replication systems
(erServer) that this set of updates can, intermittently,
cause to fall
over.
--
let name="cbbrowne" and tld=" linuxfinances.info" inString.concat "@"
[name;tld];;
http://cbbrowne.com/info/slony.html
"Feel free to contact me (flames about my english andthe useless of
this driver will be redirected to /dev/null, oh no, it's
full...)"
-- Michael Beck, describing the PC-speaker sound device
---------------------------(end of
broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
---------------------------(end of
broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
On Tue, 2006-12-26 at 15:11 +0000, Andy Dale wrote:
...
You guys please avoid top-posting.
--
The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
We are trying to achieve High Availability over load balancing, so
basically we always try and have 2 databases in the same state while >>both
are active,
What problems do you see with Slony + Linux HA combo there? I think a Slony
failover can do the same and promote a slave node to master in case of a
master node failure, and then you can actually do the load balancing by
using the slave nodes in the system.
Please let me know on any updates for OpenSSI if you get a chance to try
that out
-------------------
Shoaib Mir
EnterpriseDB ( www.enterprisedb.com)
Show quoted text
On 12/26/06, Andy Dale <andy.dale@gmail.com> wrote:
We are trying to achieve High Availability over load balancing, so
basically we always try and have 2 databases in the same state while both
are active, and if one goes down it should (hopefully) failover seemlessly.
Thanks for the info on OpenSSI, as i had not heard of this before and will
look into it further.Cheers,
Andy
On 26/12/06, Shoaib Mir <shoaibmir@gmail.com> wrote:
Why are you going for a multimaster case here? are you doing it for load
balancing? if then you can also do it horizontally with a multi disk
setup... Slony can be a real good candidate here as well with Linux HA in
combination.For going on a vertical solution you can try OpenSSI and see if that can
work for you, haven't tried that myself but will like to hear about
PostgreSQL configuration with OpenSSI-------------------
Shoaib Mir
EnterpriseDB ( www.enterprisedb.com)On 12/26/06, Andy Dale < andy.dale@gmail.com> wrote:
The company i am working for has a trail/evaluation license for
p/cluster, but unfortunately i cannot get it to function as a JBoss
datasource.Cheers,
Andy
On 26/12/06, Shoaib Mir <shoaibmir@gmail.com> wrote:
Yes, that is true with pgpool. I did face the same as well.
There is another as well Uni-Cluster (http://www.continuent.com/index.php?option=com_content&task=view&id=213&Itemid=170
), haven't tried yet but it might help you there...---------------
Shoaib Mir
EnterpriseDB ( www.enterprisedb.com)On 12/26/06, Andy Dale <andy.dale@gmail.com> wrote:
The issue i had with pgpool (1 or 2) was that (correct me if i am
wrong) you had to start the pgpool cluster with both nodes in the same
state. I thought this would mean that if you had a DB fail, before you
could re-introduce it into the pgpool cluster you would have to manually
sync it with the cluster state, is this correct ??The system i need multi master sync for is highly transactional,
so if the behaviour i stated above is correct it is not suitable. I have
tried s-lony, and while i was pleased with the performance, it is only
Single Master - Multi Slave which is not acceptable as well.Andy
On 26/12/06, Shoaib Mir <shoaibmir@gmail.com > wrote:
pgpool-II might help you there too I guess...
---------------
Shoaib Mir
EnterpriseDB ( www.enterprisedb.com)On 12/26/06, Andy Dale <andy.dale@gmail.com> wrote:
Hi,
I have just read the statement that Postgres does have (with
end user assembly) multi-master replication system. Is this just PGCluster
or something else ? if it is not PGCluster, then how can this be achieved ?Cheers,
Andy
On 24/12/06, Shoaib Mir < shoaibmir@gmail.com> wrote:
I guess the latest 8.2 Windows PostgreSQL installer does
come with a Slony option and you can set it up easily using pgadmin too.This link -->
http://people.planetpostgresql.org/xzilla/index.php?/archives/200-Alpha-testing-Slony-on-win32-Crib-Notes.htmlmight help you as well.-----------------
Shoaib Mir
EnterpriseDB (www.enterprisedb.com)On 12/25/06, org@kewlstuff.co.za <org@kewlstuff.co.za>
wrote:Thanks Chris,
I see you a core member of Slony team and a replication
guru so I'll look
into it.
I'm not slamming Slony I think its probably the right tool
for type of work
your company Afilias does. Just wish you would make an
official Windows
version of Slony as well.
Anyway thanks for the education, and I think it would be a
good thing if
your site on replication, was also listed on Postgresql...
good research.
Merry Xmas----- Original Message -----
From: "Christopher Browne" <cbbrowne@acm.org>
To: < pgsql-general@postgresql.org>
Sent: Sunday, December 24, 2006 4:23 AM
Subject: Re: [GENERAL] Clustering & Load Balancing &
ReplicationCenturies ago, Nostradamus foresaw when
org@kewlstuff.co.za would write:
Suggest you download my little application and read the
documentation,
you'll see its very different, maybe even interesting.
Maybe they should change that to.... Postgres DOES HAVEa free
multi-master
replication system :)It isn't systematically usable as such, without a whole
lot of
end-user assembly.
One comment they make.... "Heavy write activity can
cause excessive
locking,
leading to poor performance. In fact, write performanceis often worse
than
that of a single server. Read requests can be sent toany server."
I'm not sure I agree with that... or maybe MVCC is just
fantastic.... I
tested it.
The 2 phase commit locking is definitely happening atrecord level, so
only
if the multimasters all hit the same record is therethe potential for
lock
conflict.
Why will dB's being randomly used, hit the samerecords, I think its a
low
probability to begin with?That's only true if you are certain that the update
pattern is NOT
involving a shared set of records. IN GENERAL, heavy
write activity
can cause locking to become mighty expensive, which is
certainly a
true statement.
Not happy with that, I wrote a multithreaded routine
and got them to all
smack the same record, it NEVER ROLLED BACK, and if
there is performance
degradation, I didnt notice it... again probably a
testament to the MVCC
design.
It seems likely to me that this requires some careful
validation of
testing.
An effect we see is that if a set of transactions are
"fighting" over
a single "balance" record, they will essentially
serialize over that.
On a system with a single CPU, it is not obvious that
you'll see a
degradation there because, since you only have the
single CPU, it
would be serializing the activity anyways.
Try it out on an 8-way SMP system and you may see things
differently.
In any event if you look at the documentation, you'll
see SPAR is not
multimaster or nothing. Can use say one server in an
office and another
to
pump data to a remote web site... not sure if you wouldeven call that
multimaster, thats the point, I'm not sure SPAR fits
any pure theory
category.
There are a few tests I could throw at it that tend to
challenge
replication systems vis-a-vis "fidelity of results." I
otta see if I
can find them in a readily deployable form.
There are two notable anomalies which have been known to
break
replication systems:
1. Nondeterministic updates:
For instance, functions that are nondeterministic:
insert into rtable values (random(), now());
Or result sets that are nondeterministic:
insert into rtable2 (select * from mytable where
some_attr='foo'
order by random() limit 5); -- Where there are 25
records with
some_attr='foo'
2. Value swapping:
Consider the table:
create table t1 (mk integer primary key, val text unique
not null);
insert into t1 (mk, val) values (1, 'chris');
insert into t1 (mk, val) values (2, 'dave');
insert into t1 (mk, val) values (3, 'brad');begin;
update t1 set mk = 99 where mk = 1;
update t1 set mk = 1 where mk = 3;
update t1 set mk = 3 where mk = 99;
commit;Is there a condition where a pause somewhere in there
will cause
replication to break? Note that there have been
replication systems
(erServer) that this set of updates can, intermittently,
cause to fall
over.
--
let name="cbbrowne" and tld=" linuxfinances.info" inString.concat "@"
[name;tld];;
http://cbbrowne.com/info/slony.html
"Feel free to contact me (flames about my english andthe useless of
this driver will be redirected to /dev/null, oh no, it's
full...)"
-- Michael Beck, describing the PC-speaker sound device
---------------------------(end of
broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
---------------------------(end of
broadcast)---------------------------
TIP 5: don't forget to increase your free space map
settings
On Tue, 26 Dec 2006, Shoaib Mir wrote:
We are trying to achieve High Availability over load balancing, so
basically we always try and have 2 databases in the same state while >>both
are active,What problems do you see with Slony + Linux HA combo there? I think a Slony
failover can do the same and promote a slave node to master in case of a
master node failure, and then you can actually do the load balancing by
using the slave nodes in the system.Please let me know on any updates for OpenSSI if you get a chance to try
that out-------------------
Shoaib Mir
EnterpriseDB ( www.enterprisedb.com)
I have a simillar problem, such that the HA aspect involves multiple data
centers. I can deploy multiple systems within a single data center and
achieve HA within that data center using the techniques discussed in this
thread. What I have not been able to come up with a good semi-general
purpose solution to is cross-data-center HA. As a result we are using very
application specific techniques to try to address this, but the longer
term maintenance we suspect will be high. How do others deal with this
problem?
- Marc
On Dec 26, 2006, at 7:30 AM, Marc Evans wrote:
What I have not been able to come up with a good semi-general
purpose solution to is cross-data-center HA.
Why does log shipping not work for you?
On Tue, 26 Dec 2006, Ben wrote:
On Dec 26, 2006, at 7:30 AM, Marc Evans wrote:
What I have not been able to come up with a good semi-general purpose
solution to is cross-data-center HA.Why does log shipping not work for you?
Well, it may, but is short, I believe that this comes down to how to deal
with various failure scenarios. For example, consider a situation where a
data center is cut off from communications with other data centers. Do all
data centers need to switch to a read-only operational mode, or is one of
the centers elected the master? What about a pair of islands, wherein 2
seperate sets of data centers are cut off from each other, but see some
subset of other data centers? For a general purpose solution, some form of
multi-master seems logical, but of course, doesn't yet exist.
- Marc