Open source databases '60 per cent cheaper'

Started by Kaare Rasmussen, about 19 years ago, 88 messages
#1Kaare Rasmussen
kaare@jasonic.dk

At least EnterpriseDB is mentioned:

http://www.itweek.co.uk/vnunet/news/2168971/open-source-databases-slice

--

Med venlig hilsen
Kaare Rasmussen, Jasonic

Jasonic Telefon: +45 3816 2582
Nordre Fasanvej 12
2000 Frederiksberg Email: kaare@jasonic.dk

#2Joshua D. Drake
jd@commandprompt.com
In reply to: Kaare Rasmussen (#1)
Re: Open source databases '60 per cent cheaper'

On Tue, 2006-11-21 at 16:45 +0100, Kaare Rasmussen wrote:

At least EnterpriseDB is mentioned:

http://www.itweek.co.uk/vnunet/news/2168971/open-source-databases-slice

Too bad EnterpriseDB isn't Open Source.

Joshua D. Drake

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

#3David Fetter
david@fetter.org
In reply to: Joshua D. Drake (#2)
Re: Open source databases '60 per cent cheaper'

On Tue, Nov 21, 2006 at 07:58:22AM -0800, Joshua D. Drake wrote:

On Tue, 2006-11-21 at 16:45 +0100, Kaare Rasmussen wrote:

At least EnterpriseDB is mentioned:

http://www.itweek.co.uk/vnunet/news/2168971/open-source-databases-slice

Too bad EnterpriseDB isn't Open Source.

Too bad, also, that the word "PostgreSQL" doesn't appear anywhere in
that article.

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666
Skype: davidfetter

Remember to vote!

#4Kaare Rasmussen
kaare@jasonic.dk
In reply to: David Fetter (#3)
Re: Open source databases '60 per cent cheaper'

Too bad, also, that the word "PostgreSQL" doesn't appear anywhere in
that article.

OK, so, this article does a little better:
http://www.internetnews.com/dev-news/article.php/3644946

--

Med venlig hilsen
Kaare Rasmussen, Jasonic

Jasonic Telefon: +45 3816 2582
Nordre Fasanvej 12
2000 Frederiksberg Email: kaare@jasonic.dk

#5Bruce Momjian
bruce@momjian.us
In reply to: David Fetter (#3)
Re: Open source databases '60 per cent

David Fetter wrote:

On Tue, Nov 21, 2006 at 07:58:22AM -0800, Joshua D. Drake wrote:

On Tue, 2006-11-21 at 16:45 +0100, Kaare Rasmussen wrote:

At least EnterpriseDB is mentioned:

http://www.itweek.co.uk/vnunet/news/2168971/open-source-databases-slice

Too bad EnterpriseDB isn't Open Source.

Too bad, also, that the word "PostgreSQL" doesn't appear anywhere in
that article.

I know EnterpriseDB is trying to have PostgreSQL mentioned in all their
articles, but I suppose because it is an article that includes all open
source databases (or based on open source databases like EnterpriseDB),
it wasn't possible.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#6Joshua D. Drake
jd@commandprompt.com
In reply to: Bruce Momjian (#5)
Re: Open source databases '60 per cent cheaper'

I know EnterpriseDB is trying to have PostgreSQL mentioned in all their
articles, but I suppose because it is an article that includes all open
source databases (or based on open source databases like EnterpriseDB),
it wasn't possible.

Honestly, I applaud the EnterpriseDB PR machine, and there is no reason
they should have to provide a mention to PostgreSQL. If they choose to,
I thank them as a community member. If not, it is their product and I
wish them the best.

However what does need to stop is the false statement that EnterpriseDB
is Open Source.

Sincerely,

Joshua D. Drake


#7Bruce Momjian
bruce@momjian.us
In reply to: Joshua D. Drake (#6)
Re: Open source databases '60 per cent

Joshua D. Drake wrote:

I know EnterpriseDB is trying to have PostgreSQL mentioned in all their
articles, but I suppose because it is an article that includes all open
source databases (or based on open source databases like EnterpriseDB),
it wasn't possible.

Honestly, I applaud the EnterpriseDB PR machine, and there is no reason
they should have to provide a mention to PostgreSQL. If they choose to,
I thank them as a community member. If not, it is their product and I
wish them the best.

EnterpriseDB tries to get PostgreSQL mentioned if possible. There are
strong ethical and business reasons to do that.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#8Joshua D. Drake
jd@commandprompt.com
In reply to: Bruce Momjian (#7)
Re: Open source databases '60 per cent

On Tue, 2006-11-21 at 15:56 -0500, Bruce Momjian wrote:

Joshua D. Drake wrote:

I know EnterpriseDB is trying to have PostgreSQL mentioned in all their
articles, but I suppose because it is an article that includes all open
source databases (or based on open source databases like EnterpriseDB),
it wasn't possible.

Honestly, I applaud the EnterpriseDB PR machine, and there is no reason
they should have to provide a mention to PostgreSQL. If they choose to,
I thank them as a community member. If not, it is their product and I
wish them the best.

EnterpriseDB tries to get PostgreSQL mentioned if possible. There are
strong ethical and business reasons to do that.

I am not suggesting they don't. I was just saying from my perspective I
didn't have a problem if they did, or didn't. My only complaint is being
addressed off list :)

Sincerely,

Joshua D. Drake


#9Simon Riggs
simon@2ndquadrant.com
In reply to: Joshua D. Drake (#6)
Re: Open source databases '60 per cent cheaper'

On Tue, 2006-11-21 at 11:21 -0800, Joshua D. Drake wrote:

However what does need to stop is the false statement that EnterpriseDB
is Open Source.

We need to differentiate between statements made by journalists and
claims made by companies.

Anyway, I'm very interested in getting Synchronous Replication into
PostgreSQL 8.3. Can I gauge your interest in making Mammoth Replicator
Open Source to assist with that project? I'd be very happy to work with
you in an open manner on that.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#10Alvaro Herrera
alvherre@commandprompt.com
In reply to: Simon Riggs (#9)
Re: Open source databases '60 per cent cheaper'

Simon Riggs wrote:

On Tue, 2006-11-21 at 11:21 -0800, Joshua D. Drake wrote:

However what does need to stop is the false statement that EnterpriseDB
is Open Source.

We need to differentiate between statements made by journalists and
claims made by companies.

Anyway, I'm very interested in getting Synchronous Replication into
PostgreSQL 8.3. Can I gauge your interest in making Mammoth Replicator
Open Source to assist with that project? I'd be very happy to work with
you in an open manner on that.

Mammoth Replicator is not synchronous anyway ...

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#11Simon Riggs
simon@2ndquadrant.com
In reply to: Alvaro Herrera (#10)
Re: Open source databases '60 per cent cheaper'

On Wed, 2006-11-22 at 10:44 -0300, Alvaro Herrera wrote:

Simon Riggs wrote:

On Tue, 2006-11-21 at 11:21 -0800, Joshua D. Drake wrote:

However what does need to stop is the false statement that EnterpriseDB
is Open Source.

We need to differentiate between statements made by journalists and
claims made by companies.

Anyway, I'm very interested in getting Synchronous Replication into
PostgreSQL 8.3. Can I gauge your interest in making Mammoth Replicator
Open Source to assist with that project? I'd be very happy to work with
you in an open manner on that.

Mammoth Replicator is not synchronous anyway ...

That's a shame. I thought we might be able to get a head start in that way.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#12Markus Schiltknecht
markus@bluegap.ch
In reply to: Simon Riggs (#11)
Re: Open source databases '60 per cent cheaper'

Hi,

Simon Riggs wrote:

Mammoth Replicator is not synchronous anyway ...

That's a shame. I thought we might be able to get a head start in that way.

Huh? Why should that be a shame? Do you have anything better to propose?

Maybe you can get away with Sequoia? Or support my efforts with Postgres-R?

Regards

Markus

#13Simon Riggs
simon@2ndquadrant.com
In reply to: Markus Schiltknecht (#12)
Re: Open source databases '60 per cent cheaper'

On Wed, 2006-11-22 at 15:28 +0100, Markus Schiltknecht wrote:

Simon Riggs wrote:

Mammoth Replicator is not synchronous anyway ...

That's a shame. I thought we might be able to get a head start in that way.

Huh? Why should that be a shame?

Because I wanted it to be synchronous and it is not...

Do you have anything better to propose?

On -hackers, I think, but not yet.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#14Joshua D. Drake
jd@commandprompt.com
In reply to: Simon Riggs (#13)
Re: Open source databases '60 per cent cheaper'

On Wed, 2006-11-22 at 15:21 +0000, Simon Riggs wrote:

On Wed, 2006-11-22 at 15:28 +0100, Markus Schiltknecht wrote:

Simon Riggs wrote:

Mammoth Replicator is not synchronous anyway ...

That's a shame. I thought we might be able to get a head start in that way.

Huh? Why should that be a shame?

Because I wanted it to be synchronous and it is not...

In theory, it wouldn't be too difficult (especially once 1.8 is done) to
make Replicator Synchronous. We haven't worked out all the gory details
but it is certainly plausible.

But to be honest, the demand for Synchronous is far less than the hype.

Sincerely,

Joshua D. Drake

Do you have anything better to propose?

On -hackers, I think, but not yet.


#15Shane Ambler
pgsql@007Marketing.com
In reply to: Simon Riggs (#11)
Re: Open source databases '60 per cent cheaper'

Simon Riggs wrote:

On Wed, 2006-11-22 at 10:44 -0300, Alvaro Herrera wrote:

Simon Riggs wrote:

On Tue, 2006-11-21 at 11:21 -0800, Joshua D. Drake wrote:

However what does need to stop is the false statement that EnterpriseDB
is Open Source.

We need to differentiate between statements made by journalists and
claims made by companies.

Anyway, I'm very interested in getting Synchronous Replication into
PostgreSQL 8.3. Can I gauge your interest in making Mammoth Replicator
Open Source to assist with that project? I'd be very happy to work with
you in an open manner on that.

Mammoth Replicator is not synchronous anyway ...

That's a shame. I thought we might be able to get a head start in that way.

Have you looked at PGCluster? - Synchronous multi master replication

It is listed at pgFoundry

--

Shane Ambler
pgSQL@007Marketing.com

Get Sheeky @ http://Sheeky.Biz

#16Joshua D. Drake
jd@commandprompt.com
In reply to: Shane Ambler (#15)
Re: Open source databases '60 per cent cheaper'

On Thu, 2006-11-23 at 03:30 +1030, Shane Ambler wrote:

Simon Riggs wrote:

On Wed, 2006-11-22 at 10:44 -0300, Alvaro Herrera wrote:

Simon Riggs wrote:

On Tue, 2006-11-21 at 11:21 -0800, Joshua D. Drake wrote:

However what does need to stop is the false statement that EnterpriseDB
is Open Source.

We need to differentiate between statements made by journalists and
claims made by companies.

Anyway, I'm very interested in getting Synchronous Replication into
PostgreSQL 8.3.

On a side note to this.. you said *into*.. my understanding is the
policy of the community is *no* replication is in core.

Joshua D. Drake


#17Markus Schiltknecht
markus@bluegap.ch
In reply to: Joshua D. Drake (#16)
Re: Open source databases '60 per cent cheaper'

Hi,

Joshua D. Drake wrote:

On a side note to this.. you said *into*.. my understanding is the
policy of the community is *no* replication is in core.

I don't think that policy is written in stone. But for a replication
solution to go into core, it had better cover a lot of use cases,
i.e. for sure sync *and* async replication.

Regards

Markus

#18Joshua D. Drake
jd@commandprompt.com
In reply to: Markus Schiltknecht (#17)
Re: Open source databases '60 per cent cheaper'

On Wed, 2006-11-22 at 18:21 +0100, Markus Schiltknecht wrote:

Hi,

Joshua D. Drake wrote:

On a side note to this.. you said *into*.. my understanding is the
policy of the community is *no* replication is in core.

I don't think that policy is written in stone. But for a replication
solution to go into core, it had better cover a lot of use cases,
i.e. for sure sync *and* async replication.

Perhaps we should re-read the archives. It has been a pretty solid
policy for *years* and it comes up before every release and it always
comes back to:

PostgreSQL doesn't ship an integrated replication solution, BECAUSE no
single replication solution can fit the need.

Sincerely,

Joshua D. Drake

Regards

Markus


#19Markus Schiltknecht
markus@bluegap.ch
In reply to: Joshua D. Drake (#18)
Re: Open source databases '60 per cent cheaper'

Hi,

Joshua D. Drake wrote:

PostgreSQL doesn't ship an integrated replication solution, BECAUSE no
single replication solution can fit the need.

Yes, that's what I'm saying.

Please do not mix cause and effect: no replication solution got into
core because none fit all the needs. It's not that we have a policy
stating that we don't want any replication solution in core. What could
possibly be the reasons for such a policy?

Regards

Markus

#20Jeff Davis
pgsql@j-davis.com
In reply to: Joshua D. Drake (#18)
Re: Open source databases '60 per cent cheaper'

On Wed, 2006-11-22 at 09:30 -0800, Joshua D. Drake wrote:

On Wed, 2006-11-22 at 18:21 +0100, Markus Schiltknecht wrote:

Hi,

Joshua D. Drake wrote:

On a side note to this.. you said *into*.. my understanding is the
policy of the community is *no* replication is in core.

I don't think that policy is written in stone. But for a replication
solution to go into core, it had better cover a lot of use cases,
i.e. for sure sync *and* async replication.

Perhaps we should re-read the archives. It has been a pretty solid
policy for *years* and it comes up before every release and it always
comes back to:

PostgreSQL doesn't ship an integrated replication solution, BECAUSE no
single replication solution can fit the need.

I always got the impression that it had more to do with whether it
needed to be in core to work or not.

If there is some great replication solution that a lot of people need
and it will only work with a change to core, that change might make it
in.

However, there may not be nifty syntax changes nor GUCs in core to
support a specific implementation of a replicator.

Regards,
Jeff Davis

#21Joshua D. Drake
jd@commandprompt.com
In reply to: Markus Schiltknecht (#19)
Re: Open source databases '60 per cent cheaper'

On Wed, 2006-11-22 at 18:42 +0100, Markus Schiltknecht wrote:

Hi,

Joshua D. Drake wrote:

PostgreSQL doesn't ship an integrated replication solution, BECAUSE no
single replication solution can fit the need.

Yes, that's what I'm saying.

Please do not mix cause and effect: no replication solution got into
core because none fit all the needs. It's not that we have a policy
stating that we don't want any replication solution in core. What could
possibly be the reasons for such a policy?

Because there will never be a solution that can fit the need?

Slony and Replicator are similar only in that they are both asynchronous
master-slave solutions; other than that, we are vastly different and
serve very different needs.

PgCluster would be a good choice for specific workloads but is a bad
choice for many workloads.

uni/cluster is fine if you want a web app and no database logic.

The list goes on, and on.

Sincerely,

Joshua D. Drake

Regards

Markus


#22Joshua D. Drake
jd@commandprompt.com
In reply to: Jeff Davis (#20)
Re: Open source databases '60 per cent cheaper'

If there is some great replication solution that a lot of people need
and it will only work with a change to core, that change might make it
in.

However, there may not be nifty syntax changes nor GUCs in core to
support a specific implementation of a replicator.

There is definitely another reason though :). Adding a replication
solution that is integrated *will* increase development overhead in
terms of support.

Replication touches (a lot) of places.

Joshua D. Drake


#23Markus Schiltknecht
markus@bluegap.ch
In reply to: Jeff Davis (#20)
Integrating Replication into Core

Hi,

[ moving to -hackers, that seems more appropriate. ]

Jeff Davis wrote:

If there is some great replication solution that a lot of people need
and it will only work with a change to core, that change might make it
in.

That's what I'm saying. Although it's hypothetical.

However, there may not be nifty syntax changes nor GUCs in core to
support a specific implementation of a replicator.

I'd love to get into that one. Some of the people who have attended my
talk at the summit might know that I've introduced the following syntax
to Postgres-R:

ALTER DATABASE testdb START REPLICATION IN GROUP testgroup USING egcs;

And I'm using the system catalogs to store replication settings. What's
so wrong with that?

Joshua D. Drake wrote:

There is definitely another reason though :). Adding a replication
solution that is integrated *will* increase development overhead in
terms of support.

Sure. It's an additional feature after all. Refusing to add stuff to
core because it increases development overhead certainly is a dead end.

Replication touches (a lot) of places.

Yes, that's exactly why I'm going the integrated way with Postgres-R.
:-)

Regards

Markus

#24Alvaro Herrera
alvherre@commandprompt.com
In reply to: Markus Schiltknecht (#23)
Re: Integrating Replication into Core

Markus Schiltknecht wrote:

However, there may not be nifty syntax changes nor GUCs in core to
support a specific implementation of a replicator.

I'd love to get into that one. Some of the people who have attended my
talk at the summit might know that I've introduced the following syntax
to Postgres-R:

ALTER DATABASE testdb START REPLICATION IN GROUP testgroup USING egcs;

And I'm using the system catalogs to store replication settings. What's
so wrong with that?

I don't know if there's anything wrong, but in Mammoth Replicator, the
syntax to enable replication of a single table is

ALTER TABLE foo ENABLE REPLICATION

and we store the replication settings in system catalogs as well.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#25Markus Schiltknecht
markus@bluegap.ch
In reply to: Alvaro Herrera (#24)
Re: Integrating Replication into Core

Hi,

Alvaro Herrera wrote:

I don't know if there's anything wrong, but in Mammoth Replicator, the
syntax to enable replication of a single table is

ALTER TABLE foo ENABLE REPLICATION

and we store the replication settings in system catalogs as well.

Oh, that's nice to know.

Regards

Markus

#26Andrew Dunstan
andrew@dunslane.net
In reply to: Alvaro Herrera (#24)
Re: Integrating Replication into Core

Alvaro Herrera wrote:

Markus Schiltknecht wrote:

However, there may not be nifty syntax changes nor GUCs in core to
support a specific implementation of a replicator.

I'd love to get into that one. Some of the people who have attended my
talk at the summit might know that I've introduced the following syntax
to Postgres-R:

ALTER DATABASE testdb START REPLICATION IN GROUP testgroup USING egcs;

And I'm using the system catalogs to store replication settings. What's
so wrong with that?

I don't know if there's anything wrong, but in Mammoth Replicator, the
syntax to enable replication of a single table is

ALTER TABLE foo ENABLE REPLICATION

and we store the replication settings in system catalogs as well.

Wasn't there supposed to be some discussion among replication authors to
try to come up with at least some common hooks?

If everybody invents their own grammar, GUC vars, etc. etc. it will be
impossible to handle down the track. We'd be faced with a choice of
never having any replication in core, or picking one and leaving the
others out in the cold. This is supposed to be a *community*.

cheers

andrew

#27Jeff Davis
pgsql@j-davis.com
In reply to: Joshua D. Drake (#22)
Re: Open source databases '60 per cent cheaper'

On Wed, 2006-11-22 at 10:07 -0800, Joshua D. Drake wrote:

If there is some great replication solution that a lot of people need
and it will only work with a change to core, that change might make it
in.

However, there may not be nifty syntax changes nor GUCs in core to
support a specific implementation of a replicator.

There is definitely another reason though :). Adding a replication
solution that is integrated *will* increase development overhead in
terms of support.

Replication touches (a lot) of places.

Yes, you're absolutely right.

However, hypothetically, if there was a great replication solution that
helped a lot of people, and it needed some changes to core, I don't
think the patch would be rejected as long as:

(1) The changes are as minimal as possible
(2) The changes are applicable to an entire class of replication, not
just a single implementation

Regards,
Jeff Davis

#28Jonah H. Harris
jonah.harris@gmail.com
In reply to: Andrew Dunstan (#26)
Re: Integrating Replication into Core

On 11/22/06, Andrew Dunstan <andrew@dunslane.net> wrote:

Wasn't there supposed to be some discussion among replication authors to
try to come up with at least some common hooks?

That was my understanding as well.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation | fax: 732.331.1301
33 Wood Ave S, 3rd Floor | jharris@enterprisedb.com
Iselin, New Jersey 08830 | http://www.enterprisedb.com/

#29Joshua D. Drake
jd@commandprompt.com
In reply to: Andrew Dunstan (#26)
Re: Integrating Replication into Core

Wasn't there supposed to be some discussion among replication authors to
try to come up with at least some common hooks?

Well yes, but as far as I know that never happened, and we have been
implementing the new version with the above syntax for a year and our
GUC variables have been around for over 4 years.

If everybody invents their own grammar, GUC vars, etc. etc. it will be
impossible to handle down the track. We'd be faced with a choice of
never having any replication in core, or picking one and leaving the
others out in the cold. This is supposed to be a *community*.

Agreed.

Joshua D. Drake

cheers

andrew


#30Jeff Davis
pgsql@j-davis.com
In reply to: Markus Schiltknecht (#23)
Re: Integrating Replication into Core

On Wed, 2006-11-22 at 19:23 +0100, Markus Schiltknecht wrote:

Hi,

[ moving to -hackers, that seems more appropriate. ]

Jeff Davis wrote:

If there is some great replication solution that a lot of people need
and it will only work with a change to core, that change might make it
in.

That's what I'm saying. Although it's hypothetical.

However, there may not be nifty syntax changes nor GUCs in core to
support a specific implementation of a replicator.

I'd love to get into that one. Some of the people who have attended my
talk at the summit might know that I've introduced the following syntax
to Postgres-R:

ALTER DATABASE testdb START REPLICATION IN GROUP testgroup USING egcs;

And I'm using the system catalogs to store replication settings. What's
so wrong with that?

Nothing's wrong with that approach. My prediction, however, is that:

(1) Similar replication solutions will first agree on some common hooks
they need in the backend that may have no actual SQL syntax associated,
and get patches in
(2) then agree on some implementation details
(3) then agree on the syntax

To talk about getting syntax in the backend now seems like putting the
cart before the horse, to me anyway. But there's nothing wrong with
having SQL syntax for the replication.

Regards,
Jeff Davis
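
As an illustration of step (1) in the message above, here is a minimal sketch of what "common hooks in the backend with no SQL syntax associated" could look like. It is written in Python purely for brevity and is not PostgreSQL code: the class `Backend`, the event names `row_change` and `commit`, and the payload fields are all hypothetical names invented for this example.

```python
from typing import Callable, Dict, List

# A hook is any callable that accepts an event payload (a plain dict here).
Hook = Callable[[dict], None]


class Backend:
    """Toy stand-in for a database backend that exposes generic hooks."""

    # Hypothetical event names; a real set would have to be agreed on
    # by the replication projects themselves.
    EVENTS = ("row_change", "commit")

    def __init__(self) -> None:
        self._hooks: Dict[str, List[Hook]] = {e: [] for e in self.EVENTS}

    def register(self, event: str, callback: Hook) -> None:
        """Attach a replicator callback to one of the known events."""
        if event not in self._hooks:
            raise KeyError(f"unknown hook event: {event}")
        self._hooks[event].append(callback)

    def _fire(self, event: str, payload: dict) -> None:
        # Every registered replicator sees the same payload, in
        # registration order.
        for callback in self._hooks[event]:
            callback(payload)

    def write_row(self, table: str, row: dict) -> None:
        self._fire("row_change", {"table": table, "row": row})

    def commit(self, xid: int) -> None:
        self._fire("commit", {"xid": xid})
```

Two different replicators could each call `backend.register("row_change", ...)` with their own callback. The backend then needs no grammar changes and no knowledge of which replicator is attached, which is the property the hook proposal is after.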

#31Markus Schiltknecht
markus@bluegap.ch
In reply to: Andrew Dunstan (#26)
Re: Integrating Replication into Core

Hi,

Andrew Dunstan wrote:

Wasn't there supposed to be some discussion among replication authors to
try to come up with at least some common hooks?

Yes, Andrew Sullivan even opened a PgFoundry project and a mailing list.
But up to now, only the GORDA project has proposed some hooks.

For Postgres-R, I definitely don't want to settle on any hooks yet,
because I want to stay flexible. Hooks would only get in my way and
serve no purpose.

If everybody invents their own grammar, GUC vars, etc. etc. it will be
impossible to handle down the track.

Why is that? I can very well change all of the configuration stuff, I
just don't see any use for that.

We'd be faced with a choice of
never having any replication in core, or picking one and leaving the
others out in the cold.

...or wait for *the one* superior set of hooks we never can come up with?

Remember that the problem in replication is not interfacing with the
database. That can and has been solved in multiple different ways. And
interfaces can change (especially as long as they are still part of
experimental software).

Regards

Markus

#32Markus Schiltknecht
markus@bluegap.ch
In reply to: Joshua D. Drake (#29)
Re: Integrating Replication into Core

Hi,

Joshua D. Drake wrote:

Well yes, but as far as I know that never happened, and we have been
implementing the new version with the above syntax for a year and our
GUC variables have been around for over 4 years.

Sorry, new version of what? what GUC variables?

Regards

Markus

#33Joshua D. Drake
jd@commandprompt.com
In reply to: Markus Schiltknecht (#32)
Re: Integrating Replication into Core

On Wed, 2006-11-22 at 20:23 +0100, Markus Schiltknecht wrote:

Hi,

Joshua D. Drake wrote:

Well yes, but as far as I know that never happened, and we have been
implementing the new version with the above syntax for a year and our
GUC variables have been around for over 4 years.

Sorry, new version of what? what GUC variables?

Our new version of replicator (1.7) has been in development for a year
and that is the version that supports ALTER TABLE.

The GUC variables we use have mostly been static for years. 1.7 has some
clean up etc..

Sincerely,

Joshua D. Drake

Regards

Markus


#34Simon Riggs
simon@2ndquadrant.com
In reply to: Markus Schiltknecht (#23)
Re: Integrating Replication into Core

On Wed, 2006-11-22 at 19:23 +0100, Markus Schiltknecht wrote:

Jeff Davis wrote:

If there is some great replication solution that a lot of people need
and it will only work with a change to core, that change might make it
in.

That's what I'm saying. Although it's hypothetical.

My interest is in extending Warm Standby [8.2] to include the following
forms of replication:
1. asynchronous WAL-record level transfer to Standby server
2. synchronous WAL-record level transfer to Standby server
I foresee that this would likely require some improvements
in Group Commit, but I've not done the design for this *yet*.

I would also like to include some performance optimisations into Core
that are specifically aimed at improving Slony performance. (I'm more
than happy if those things also increase performance of other
situations). That's a slightly different thing from embedding Slony in Core,
which I am *not* suggesting. Suggestions welcome.

This will then give PostgreSQL:
- improved performance for the most popular production replication
system for PostgreSQL (Slony)
- a capability for Synchronous Replication, when it is requested

That's the limit of my ambitions for 8.3.

Personally, I won't be investing time in multi-master solutions for a
host of reasons; please just regard that as a personal time allocation
decision rather than a suggestion to prevent others from doing so.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
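
To make the difference between the two transfer modes listed in the message above concrete, here is a minimal sketch in Python. It only models the ordering of "commit returns" versus "standby has the record"; it is not based on any PostgreSQL code or design. The classes `Primary` and `Standby` and their methods are hypothetical names invented for this example.

```python
from collections import deque


class Standby:
    """Toy standby server that collects WAL records in arrival order."""

    def __init__(self) -> None:
        self.received = []

    def receive(self, record: str) -> bool:
        self.received.append(record)
        return True  # acknowledgement sent back to the primary


class Primary:
    """Toy primary that ships each WAL record to one standby."""

    def __init__(self, standby: Standby, synchronous: bool) -> None:
        self.standby = standby
        self.synchronous = synchronous
        self.wal = []           # local write-ahead log
        self.unsent = deque()   # records not yet shipped (async mode only)

    def commit(self, record: str) -> None:
        self.wal.append(record)
        if self.synchronous:
            # Synchronous: do not report the commit as done until the
            # standby has acknowledged the record.
            if not self.standby.receive(record):
                raise RuntimeError("standby did not acknowledge record")
        else:
            # Asynchronous: report the commit immediately and ship the
            # record later; the standby may lag behind the primary.
            self.unsent.append(record)

    def ship_pending(self) -> None:
        """Background transfer step used in asynchronous mode."""
        while self.unsent:
            self.standby.receive(self.unsent.popleft())
```

In the asynchronous case a crash of the primary between `commit` and `ship_pending` would leave the standby without the record, which is the window that the synchronous mode closes at the cost of waiting for the acknowledgement.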

#35Markus Schiltknecht
markus@bluegap.ch
In reply to: Jeff Davis (#30)
Re: Integrating Replication into Core

Hi,

Jeff Davis wrote:

Nothing's wrong with that approach. My prediction, however, is that:

(1) Similar replication solutions will first agree on some common hooks
they need in the backend that may have no actual SQL syntax associated,
and get patches in

Well, before that, you need to know what hooks you need. And that again
involves lots of implementation details. So it is better to implement
without hooks first; otherwise you might later notice that there is
something you didn't think of.

(2) then agree on some implementation details
(3) then agree on the syntax

To talk about getting syntax in the backend now seems like putting the
cart before the horse, to me anyway.

That was just an example. Postgres-R actually already does a lot behind
the scenes if you type that command. So, yes, the horse definitely came
before the cart.

But there's nothing wrong with
having SQL syntax for the replication.

Okay. After reading the tsearch2 discussion I got a different impression,
but that might just have been me.

Regards

Markus

#36Markus Schiltknecht
markus@bluegap.ch
In reply to: Joshua D. Drake (#33)
Re: Integrating Replication into Core

Hi,

Joshua D. Drake wrote:

Joshua D. Drake wrote:

Well yes, but as far as I know that never happened, and we have been
implementing the new version with the above syntax for a year and our
GUC variables have been around for over 4 years.

Sorry, new version of what? what GUC variables?

Our new version of replicator (1.7) has been in development for a year
and that is the version that supports ALTER TABLE.

The GUC variables we use have mostly been static for years. 1.7 has some
clean up etc..

Aha. Well, could you name the places where you'd need hooks? Would you
like to use hooks? What purpose would that serve you?

Regards

Markus

#37Joshua D. Drake
jd@commandprompt.com
In reply to: Markus Schiltknecht (#36)
Re: Integrating Replication into Core

On Wed, 2006-11-22 at 20:35 +0100, Markus Schiltknecht wrote:

Hi,

Joshua D. Drake wrote:

Joshua D. Drake wrote:

Well yes, but as far as I know that never happened, and we have been
implementing the new version with the above syntax for a year and our
GUC variables have been around for over 4 years.

Sorry, new version of what? what GUC variables?

Our new version of replicator (1.7) has been in development for a year
and that is the version that supports ALTER TABLE.

The GUC variables we use have mostly been static for years. 1.7 has some
clean up etc..

Aha. Well, could you name the places where you'd need hooks? Would you
like to use hooks? What purpose would that serve you?

I would be the wrong person to ask; however, I can say that I don't see a
need for the hooks. If we (the community) somehow created some
reasonable generic interface, we would likely make use of it, but other
than that, I am happy with how we are doing it.

Sincerely,

Joshua D. Drake


#38Andrew Dunstan
andrew@dunslane.net
In reply to: Markus Schiltknecht (#35)
Re: Integrating Replication into Core

Markus Schiltknecht wrote:

But there's nothing wrong with
having SQL syntax for the replication.

Okay. After reading the tsearch2 discussion I got another feeling, but
that might just have been me.

The objection then was that about 8 or 9 new commands were proposed, and
that a functional interface might be just as good.

What sort of grammar support do you want?

cheers

andrew

#39Markus Schiltknecht
markus@bluegap.ch
In reply to: Andrew Dunstan (#38)
Re: Integrating Replication into Core

Hi,

Andrew Dunstan wrote:

What sort of grammar support do you want?

Support? I would have just extended the bison gram.y myself. :-)

I don't yet know what I will need. I'll probably have to add settings
per database, some per table, others per transaction. I thought about
some additions to existing ALTER DATABASE and ALTER TABLE commands as
well as some SET variables, probably within the syntax of SET TRANSACTION...

Stuffing them into such a syntax seems more consistent to me than using
function calls.
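
As an illustration only (not part of the original message): settings that exist per database, per table and per transaction need some lookup order. The sketch below assumes the narrowest scope wins and uses invented setting names; neither the names nor the precedence come from Postgres-R.

```python
from collections import ChainMap

# Hypothetical setting names and values, invented for this sketch.
database_settings = {"replication": "off", "mode": "async", "group": "testgroup"}
table_settings = {"replication": "on"}    # e.g. set via ALTER TABLE
transaction_settings = {"mode": "sync"}   # e.g. set via SET TRANSACTION

# Assumed precedence: transaction overrides table, table overrides database.
# ChainMap searches its mappings from left to right.
effective = ChainMap(transaction_settings, table_settings, database_settings)

for name in ("replication", "mode", "group"):
    print(name, "=", effective[name])
```

A function-call interface could carry the same three layers; the difference under discussion is only where the values are declared.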

Regards

Markus

#40Alvaro Herrera
alvherre@commandprompt.com
In reply to: Andrew Dunstan (#26)
Re: Integrating Replication into Core

Andrew Dunstan wrote:

Wasn't there supposed to be some discussion among replication authors to
try to come up with at least some common hooks?

If everybody invents their own grammar, GUC vars, etc. etc. it will be
impossible to handle down the track. We'd be faced with a choice of
never having any replication in core, or picking one and leaving the
others out in the cold. This is supposed to be a *community*.

I don't have the expectation that Mammoth Replicator will ever be
open-sourced (this is my personal opinion; the company owner may
differ). And even if it were, I doubt it would serve as a basis for
whatever community effort to build a replication engine. I don't think
it's in anybody's best interest to base design decisions on Mammoth
Replicator "experience". The projects that are already open source are
in a much better standing for that (GORDA, Postgres-R, Slony, etc).

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#41Markus Schiltknecht
markus@bluegap.ch
In reply to: Bruce Momjian (#5)
Re: Integrating Replication into Core

Hi,

[ copying back to -hackers. ]

Andrew Dunstan wrote:

You have totally misunderstood me. I mean, what sort of grammar changes
do you need. Oh, and *you* might submit changes, but *we* have to
support them

Agreed, as long as *we* includes me. I'm not reading it like that, but I
don't know how you meant it.

if they go into the core.

Sure, if...

I think the group of developers working on PostgreSQL can be extended,
by accepting patches, in the hope that the original authors keep
supporting it - especially because it's easy to revert patches again.
Not accepting extensions because the current group thinks they can't
support it won't help in attracting more developers and enlarging that
group.

I've been carrying Postgres-R along since 7.4, for example. You can be very sure
that I won't drop it because it got into core. (Bad example, because
this is not going to happen anytime soon, if at all.)

Regards

Markus

#42Jeff Davis
pgsql@j-davis.com
In reply to: Markus Schiltknecht (#35)
Re: Integrating Replication into Core

On Wed, 2006-11-22 at 20:31 +0100, Markus Schiltknecht wrote:

Hi,

Jeff Davis wrote:

Nothing's wrong with that approach. My prediction, however, is that:

(1) Similar replication solutions will first agree on some common hooks
they need in the backend that may have no actual SQL syntax associated,
and get patches in

Well, before that, you need to know what hooks you need. And that again
involves lots of implementation details. Thus better first implement
without hooks, otherwise you might later notice that there is something
you didn't think of.

I think you misunderstand my point. I was talking about replication
implementations that already exist. They already have patches on the
backend that are necessary for their solution to work.

The idea is to design a single set of hooks that can be used to
implement an entire class of replication. This only makes sense after
existing solutions come to some agreement. I view that as a first step,
assuming that it is necessary to alter the core in order to implement
the class of replication in question.

Once that step is complete, ideally you'd be able to implement Postgres-
R without having to patch the postgresql backend to accomplish it
(except for maybe adding the syntax for your solution). Then, when a
syntax is agreed upon, you won't need to patch the backend at all. Isn't
that the goal, to be able to implement your replication without patching
the backend?

Regards,
Jeff Davis

#43Markus Schiltknecht
markus@bluegap.ch
In reply to: Jeff Davis (#42)
Re: Integrating Replication into Core

Hi,

Jeff Davis wrote:

I think you misunderstand my point.

That may well be. Please keep in mind that I'm not a native English
speaker, thus please speak loud and clear ;-)

I was talking about replication
implementations that already exist. They already have patches on the
backend that are necessary for their solution to work.

Do they? I'm only aware of the GORDA patch. The old Postgres-R patches
are out of date. Sequoia, PgPool and PgPool-II obviously do not need
patches. Slony-II, Postgres-R (8) (mine) as well as PGCluster-II are not
open sourced, yet. And I haven't heard much regarding hooks from any of
the proprietary vendors (except Joshua's recent statement that he's
happy without such hooks).

The idea is to design a single set of hooks that can be used to
implement an entire class of replication. This only makes sense after
existing solutions come to some agreement. I view that as a first step,
assuming that it is necessary to alter the core in order to implement
the class of replication in question.

As there's not even *one* existing and open replication solution which
needs patching the backend, you are basing your statements on a false
premise. Thus, speaking of hooks as a "first step" is very confusing, at
least.

Once that step is complete, ideally you'd be able to implement Postgres-
R without having to patch the postgresql backend to accomplish it
(except for maybe adding the syntax for your solution). Then, when a
syntax is agreed upon, you won't need to patch the backend at all. Isn't
that the goal, to be able to implement your replication without patching
the backend?

No, it's not. What would that buy me? My goal is to write a widely
usable replication system. How that interacts with the backend is of
much less importance to me. And currently fiddling with the backend is
much easier than maintaining hooks and keep all the replication stuff
separate.

Postgres-R can be one of the solutions used to decide what hooks we
need. Waiting for hooks to establish before implementing Postgres-R
would be what you call 'putting the cart before the horse'.

Regards

Markus

#44José Orlando Pereira
jop@lsd.di.uminho.pt
In reply to: Markus Schiltknecht (#31)
Re: Integrating Replication into Core

On Wednesday 22 November 2006 7:21 pm, Markus Schiltknecht wrote:

Yes, Andrew Sullivan even opened a PgFoundry project and a mailing list.
But up to now, only the GORDA project has proposed some hooks.

For Postgres-R, I definitely don't want to settle for any hooks, yet,
because I want to keep flexible. Hooks would only get into my way and
serve no purpose.

If everybody invents their own grammar, GUC vars, etc. etc. it will be
impossible to handle down the track.

Why is that? I can very well change all of the configuration stuff, I
just don't see any use for that.

Indeed, we in GORDA have also come up with yet another set of changes to
grammar and GUC variables. This is not the ideal scenario. :-(

I understand that different people have different motives not to agree with
the GORDA-style hook based approach. Therefore, I suggest that we try to
agree on small but sure steps. The worst outcome of this would be that we all
end up with smaller patches to maintain...

The configuration stuff seems to be a good place to start. What about each of
us summarizing their changes to grammar and GUC to the hooks list to get the
discussion started on a solid ground?

BTW, we have released a new version of the GORDA platform. This version has
been completely rewritten (reusing some code from PL-J) and has a lot more
functionality. The announcement will follow shortly.

--
Jose Orlando Pereira

#45Markus Schiltknecht
markus@bluegap.ch
In reply to: José Orlando Pereira (#44)
Re: Integrating Replication into Core

Hello Jose,

José Orlando Pereira wrote:

Indeed, we in GORDA have also come up with yet another set of changes to
grammar and GUC variables. This is not the ideal scenario. :-(

I understand that different people have different motives not to agree with
the GORDA-style hook based approach. Therefore, I suggest that we try to
agree on small but sure steps.

I appreciate your efforts to come up with hooks. But as I've already
stated, I'm not ready to settle down for concrete hooks for Postgres-R
(8), so I probably can't help.

The worst outcome of this would be that we all
end up with smaller patches to maintain...

Do you really maintain patches? I'm maintaining a source tree and I'd
like to keep it that way, as of now.

I'd rather work together in other areas; for example, what do
you use for testing? I've read that the Sequoia people use their
home-grown (and closed source) test suite. I'm about to write the third
generation of my own test suite...

For simulations, I'm using qemu, sometimes also trying Xen, but that
does not run on my laptop. :-(

Perhaps we can share test suites, or even automated benchmarks? IMO, we
would gain a whole lot more with that than with hooks.

Regards

Markus

#46alfranio correia junior
alfranio@lsd.di.uminho.pt
In reply to: Jeff Davis (#42)
Re: Integrating Replication into Core

The idea is to design a single set of hooks that can be used to
implement an entire class of replication. This only makes sense after
existing solutions come to some agreement. I view that as a first step,
assuming that it is necessary to alter the core in order to implement
the class of replication in question.

Once that step is complete, ideally you'd be able to implement Postgres-
R without having to patch the postgresql backend to accomplish it
(except for maybe adding the syntax for your solution). Then, when a
syntax is agreed upon, you won't need to patch the backend at all. Isn't
that the goal, to be able to implement your replication without patching
the backend?

We should go in that direction.

In a database life cycle, there are different events that may be useful
for different replication solutions. For instance, we may say:
- database startup and shutdown
- connection startup and shutdown
- transaction begin, commit, rollback
- statement request
- updates (i.e., insert, delete, update)
- logging

First, we should agree on which events we need to support a set of
replication protocols (e.g., gorda, postgres-r, slony-i and ii, etc).
Then, we should decide how such events will be notified.

In particular, the gorda project decided to use "special triggers" but
any sort of callback would be great for us. We adopted these hooks
because we thought that it would be useful to different applications
(e.g, materialized views).
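
A minimal sketch of the "any sort of callback" idea, assuming a simple in-process registry. The event names are taken from the list above; the `HookRegistry` class and its methods are hypothetical and are not GORDA's or PostgreSQL's actual interface.

```python
from collections import defaultdict
from enum import Enum, auto


class Event(Enum):
    """Lifecycle events from the list above (identifiers are illustrative)."""
    DATABASE_STARTUP = auto()
    DATABASE_SHUTDOWN = auto()
    CONNECTION_STARTUP = auto()
    CONNECTION_SHUTDOWN = auto()
    TXN_BEGIN = auto()
    TXN_COMMIT = auto()
    TXN_ROLLBACK = auto()
    STATEMENT = auto()
    UPDATE = auto()
    LOG = auto()


class HookRegistry:
    """Maps each event to the callbacks a replication module registered."""

    def __init__(self):
        self._callbacks = defaultdict(list)

    def register(self, event, callback):
        self._callbacks[event].append(callback)

    def notify(self, event, **details):
        # Call every callback in registration order and collect the results.
        return [cb(event, **details) for cb in self._callbacks[event]]


registry = HookRegistry()
shipped = []  # stands in for messages sent to remote replicas

registry.register(Event.TXN_COMMIT,
                  lambda event, **d: shipped.append((event.name, d["xid"])))

registry.notify(Event.TXN_BEGIN, xid=42)   # nothing registered for this event
registry.notify(Event.TXN_COMMIT, xid=42)  # the commit callback records it
print(shipped)
```

Agreeing on the set of events first, as suggested, amounts to fixing the members of `Event` before arguing about how callbacks are declared.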

Third, we should discuss what interface would be provided to inject
information into remote replicas. Is the SPI_* interface good? How do we
inject binary data into tables? I know that PostgreSQL allows that,
but is the interface provided enough? Would it not be interesting to
inject things directly into the log?

Fourth, we should have a discussion on locks, high priority
transactions, notifications on blocking, etc...

And finally, we may be able to discuss meta information, syntax, etc...

Regards,

Alfranio Junior.

#47José Orlando Pereira
jop@lsd.di.uminho.pt
In reply to: Markus Schiltknecht (#45)
Re: Integrating Replication into Core

On Thursday 23 November 2006 11:46 am, Markus Schiltknecht wrote:

I appreciate your efforts to come up with hooks.

Thank you. :-)

But as I've already
stated, I'm not ready to settle down for concrete hooks for Postgres-R
(8), so I probably can't help.

Sure, I know that you don't like hooks.

I just suggested that we should compare *interfaces* to configure replication
(i.e. variable names, grammar, etc), since it looks like we have a bunch of
different syntaxes to achieve the same.

It might turn out that there is no common ground, but it is worth trying it.

I'd rather work together in other areas; for example, what do
you use for testing? I've read that the Sequoia people use their
home-grown (and closed source) test suite. I'm about to write the third
generation of my own test suite...

It is somewhat difficult to share a test-suite if we have to maintain multiple
versions of the code that sets up the replicated db.

See the point? ;)

The worst outcome of this would be that we all
end up with smaller patches to maintain...

Do you really maintain patches? I'm maintaining a source tree and I'd
like to keep it that way, as of now.

We do maintain a patch, as you do, unless you have forked from mainline for
good. Using a good revision control system helps (we use Canonical's Bazaar,
BTW), but does not fundamentally change the problem.

The smaller the diff, the better.

--
Jose Orlando Pereira

#48Markus Schiltknecht
markus@bluegap.ch
In reply to: José Orlando Pereira (#47)
Re: Integrating Replication into Core

Hi,

[ I suggest moving from hackers to replica-hooks-discuss@pgfoundry.org,
as that's what that list has been created for. ]

José Orlando Pereira wrote:

Sure, I know that you don't like hooks.

Yes, but that's yet another story. ;-)

I just suggested that we should compare *interfaces* to configure replication
(i.e. variable names, grammar, etc), since it looks like we have a bunch of
different syntaxes to achieve the same.

The same?

Let's see. I currently have these additional commands:

ALTER DATABASE testdb START REPLICATION
IN GROUP testgroup USING egcs;

and

ALTER DATABASE testdb ACCEPT REPLICATION
FROM GROUP testgroup USING egcs;

I've added a system table pg_replication_gcs to describe the different
group communication systems and connections to them:

   Table "pg_catalog.pg_replication_gcs"
  Column  |  Type   | Modifiers
----------+---------+-----------
 rgcsname | name    | not null
 rgcstype | integer | not null
 rgcsport | integer | not null
 rgcssock | text    |

(Splitting into rgcsport and rgcssock proved to be not very helpful.)

And I've added two fields to pg_database to define the GCS and the group
in which to replicate a database:

 ..
 datreplgcs | oid  | not null
 ..
 datreplgrp | text |

But as I said: these might change any time. And I certainly will have to
add others, but no idea what those additions will look like.

When comparing to the Mammoth Replicator syntax that Alvaro posted, this
seems very different. PGCluster-II does not use a GCS at all. And I
haven't seen others.

It is somewhat difficult to share a test-suite if we have to maintain multiple
versions of the code that sets up the replicated db.

Well, we wouldn't have to share test cases, but at least the *suite*.
All the code which starts and stops postmasters, does initdb etc..

Probably that's just me, but I'm not aware of any (OSS) project which
can emulate a network (or even a GCS), start and stop processes as
requested and check how they react upon different inputs. If you know
such a thing, please email me! (I've looked at STAF, but that seems
overly complex and targeted at completely different use-case.)
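
For illustration only (not part of the original message): the shared part of such a suite, the code that runs initdb and starts and stops postmasters, could start as small as the sketch below. It only builds command lines for the standard `initdb` and `pg_ctl` tools; the directory layout, port numbers and function names are assumptions made up for this example.

```python
def initdb_command(datadir):
    """Command that creates a fresh data directory."""
    return ["initdb", "-D", datadir]


def start_command(datadir, port, logfile):
    """Command that starts a postmaster on `port` and waits until it is up."""
    return ["pg_ctl", "-D", datadir, "-l", logfile,
            "-o", f"-p {port}", "-w", "start"]


def stop_command(datadir):
    """Command that performs a fast shutdown and waits for completion."""
    return ["pg_ctl", "-D", datadir, "-m", "fast", "-w", "stop"]


def cluster_plan(base_dir, nodes, first_port=5433):
    """Return the commands that bring up `nodes` local instances, in order."""
    plan = []
    for i in range(nodes):
        datadir = f"{base_dir}/node{i}"  # hypothetical layout
        plan.append(initdb_command(datadir))
        plan.append(start_command(datadir, first_port + i, f"{datadir}.log"))
    return plan


plan = cluster_plan("/tmp/repl-test", nodes=2)
for cmd in plan:
    # A real harness would pass each list to subprocess.run(cmd, check=True).
    print(" ".join(cmd))
```

Everything replication-specific (joining a group, checking replica state) would sit on top of this and could stay private to each project.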

See the point? ;)

Sure, but it's wishful thinking.

We do maintain a patch, as you do, unless you have forked from mainline for
good. Using a good revision control system helps (we use Canonical's Bazaar,
BTW), but does not fundamentally change the problem.

I'm using monotone. And I don't need much time to fiddle with patches. A
simple 'mtn diff -r ${TRUNK_REVISION}' does all I need. That's why I'd
still say that I don't maintain a patch.

The smaller the diff, the better.

I disagree. Where exactly does size of the patch matter for you?

The number you mean, which is important, is the number of points in the
code where you need to interact with the database, i.e. the number of
hooks you would need. Because as PostgreSQL moves along, changes at
these points are probably necessary. But that number certainly has
nothing to do with the patch size.
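
To make the distinction concrete, here is a rough sketch (not from the original message) that measures a unified diff two ways: by changed lines and by hunks. Treating each hunk as one "point of interaction" is only an approximation assumed for this example, and the file names are invented.

```python
def diff_stats(unified_diff):
    """Return (hunks, changed_lines) for a unified diff given as text.

    Limitation: a removed line whose content starts with "--" shows up as
    "---..." and is skipped as if it were a file header.
    """
    hunks = 0
    changed = 0
    for line in unified_diff.splitlines():
        if line.startswith("@@"):
            hunks += 1
        elif line.startswith(("+++", "---")):
            continue  # file headers, not content changes
        elif line.startswith(("+", "-")):
            changed += 1
    return hunks, changed


# One large self-contained addition: a single touch point, many lines.
big_patch = ("--- a/repl.c\n+++ b/repl.c\n@@ -0,0 +1,200 @@\n"
             + "+new line\n" * 200)

# Three one-line calls scattered through existing code: three touch points.
scattered = "".join(
    f"--- a/f{i}.c\n+++ b/f{i}.c\n@@ -10,2 +10,3 @@\n ctx\n+hook();\n ctx\n"
    for i in range(3)
)

print(diff_stats(big_patch))   # (1, 200)
print(diff_stats(scattered))   # (3, 3)
```

The first patch is far larger but touches existing code in one place; the second is tiny but has three places that can break when the surrounding code changes.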

Regards

Markus

#49Dimitri Fontaine
dim@dalibo.com
In reply to: Markus Schiltknecht (#45)
Re: Integrating Replication into Core

Hi Markus,

On Thursday 23 November 2006 12:46, Markus Schiltknecht wrote:

For simulations, I'm using qemu, sometimes also trying Xen, but that
does not run on my laptop. :-(

So you still only have your laptop as 'development facility' ?

At dalibo we have a couple of machines we're not using anymore, partly because
we don't have a need for them nowadays, mainly because it's end-of-life
hardware, not that trustworthy.

They are dual Pentium III machines, 2*700MHz, 1GB RAM, 2 IDE disks (20GB system and
either 20GB or 120GB data), with two 100Mbps network cards per machine.
A direct link should be possible to set up.

We can provide you access to those two servers for you to test Postgres-R if
you want to.

Regards,
--
Dimitri Fontaine
http://www.dalibo.com/

#50Markus Schiltknecht
markus@bluegap.ch
In reply to: Dimitri Fontaine (#49)
Re: Integrating Replication into Core

Hello Dimitri,

Dimitri Fontaine wrote:

So you still only have your laptop as 'development facility' ?

Yes.

At dalibo we have a couple of machines we're not using anymore, partly because
we don't have a need for them nowadays, mainly because it's end-of-life
hardware, not that trustworthy.

They are dual Pentium III machines, 2*700MHz, 1GB RAM, 2 IDE disks (20GB system and
either 20GB or 120GB data), with two 100Mbps network cards per machine.
A direct link should be possible to set up.

Thank you very much. But I think two machines is not quite enough. :-(

Having a whole cluster emulated on my laptop allows me to work on the
road. That's a very nice thing (tm). And the emulated machines are
probably already faster than PIIIs... (Memory is the limiting factor,
unfortunately I can't stuff more than 2GB in my laptop.)

Regards

Markus

#51David Boreham
david_list@boreham.org
In reply to: Markus Schiltknecht (#48)
Re: Integrating Replication into Core

Markus Schiltknecht wrote:

Probably that's just me, but I'm not aware of any (OSS) project which
can emulate a network (or even a GCS), start and stop processes as
requested and check how they react upon different inputs.

I've worked on an emulated test rig for a replication system (not RDBMS
but for LDAP).
We used netem (OSS) for the network emulation and a pile of python and
shell scripts and
C client test apps.
Testing replication is hard, of course, and you have to roll most of it
yourself :(

If you know such a thing, please email me! (I've looked at STAF, but
that seems overly complex and targeted at completely different use-case.)

In my experience test frameworks tend to provide less useful
functionality than one might hope.
Sometimes to the point that they're hardly worth bothering with at all.

#52Markus Schiltknecht
markus@bluegap.ch
In reply to: David Boreham (#51)
Re: Integrating Replication into Core

Hi,

David Boreham wrote:

I've worked on an emulated test rig for a replication system (not RDBMS
but for LDAP).
We used netem (OSS)

Thanks. I've already heard about that one some while ago, but didn't
remember it. I'll have another look.

for the network emulation and a pile of python and
shell scripts and
C client test apps.
Testing replication is hard, of course, and you have to roll most of it
yourself :(

Yeah, I'm also using python for that.

If you know such a thing, please email me! (I've looked at STAF, but
that seems overly complex and targeted at completely different use-case.)

In my experience test frameworks tend to provide less useful
functionality than one might hope.
Sometimes to the point that they're hardly worth bothering with at all.

ACK. Same experience here.

Regards

Markus

#53Markus Schiltknecht
markus@bluegap.ch
In reply to: David Boreham (#51)
Re: Integrating Replication into Core

Hi,

David Boreham wrote:

We used netem (OSS) for the network emulation and a pile of python and
shell scripts and

LOL, I've just figured that netem is the project behind:

tc qdisc ... netem ...

I'm already using that, too ;-) Just wasn't aware it's called netem.
Sounds silly, since the name is in the command line, I know...
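
As a small illustration (not part of the original message), a Python test driver can assemble such a command line before handing it to the shell. The device name, delay and loss values below are made-up examples, and the helper only covers the `delay` and `loss` netem parameters.

```python
def netem_command(device, delay_ms=None, loss_percent=None):
    """Build (but do not run) a `tc qdisc add ... netem` argument list."""
    args = ["tc", "qdisc", "add", "dev", device, "root", "netem"]
    if delay_ms is not None:
        args += ["delay", f"{delay_ms}ms"]
    if loss_percent is not None:
        args += ["loss", f"{loss_percent}%"]
    return args


# Hypothetical values: 100 ms of added latency and 1% packet loss on eth0.
cmd = netem_command("eth0", delay_ms=100, loss_percent=1)
print(" ".join(cmd))
```

Running the resulting command needs root privileges, so a harness would typically execute it inside the emulated machines rather than on the host.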

Regards

Markus

#54alfranio correia junior
alfranio@lsd.di.uminho.pt
In reply to: Markus Schiltknecht (#48)
Re: Integrating Replication into Core

I just suggested that we should compare *interfaces* to configure
replication (i.e. variable names, grammar, etc), since it looks like
we have a bunch of different syntaxes to achieve the same.

The same?

Let's see. I currently have these additional commands:

ALTER DATABASE testdb START REPLICATION
IN GROUP testgroup USING egcs;

and

ALTER DATABASE testdb ACCEPT REPLICATION
FROM GROUP testgroup USING egcs;

We have the following commands:

SET TRANSACTION MASTER

and

CREATE TRIGGER <name> for { STARTUP | SHUTDOWN |
BEGIN TRANSACTION | COMMIT TRANSACTION | ROLLBACK TRANSACTION }
execute procedure <func> ( <funcargs> )

It is worth noting that none of them have references to replication.
Metainformation on replication is stored in normal tables.

I think that we should discuss requirements first instead of going
towards syntax. The latter is the last step to achieve a common
set of ideas.

I suggest the following road map.

In a database life cycle, there are different events that may be useful
for different replication solutions. For instance, we may say:
- database startup and shutdown
- connection startup and shutdown
- transaction begin, commit, rollback
- statement request
- updates (i.e., insert, delete, update)
- logging

First, we should agree on which events we need to support a set of
replication protocols (e.g., gorda, postgres-r, slony-i and ii, etc).
Then, we should decide how such events will be notified.

In particular, the gorda project decided to use "special triggers" but
any sort of callback would be great for us. We adopted these hooks
because we thought that it would be useful to different applications
(e.g, materialized views).

Third, we should discuss what interface would be provided to inject
information into remote replicas. Is the SPI_* interface good? How do we
inject binary data into tables? I know that PostgreSQL allows that,
but is the interface provided enough? Would it not be interesting to
inject things directly into the log?

Fourth, we should have a discussion on locks, high priority
transactions, notifications on blocking, etc...

And finally, we may be able to discuss meta information, syntax, etc...

What do you think ?

#55Markus Schiltknecht
markus@bluegap.ch
In reply to: alfranio correia junior (#54)
Re: [Replica-hooks-discuss] Integrating Replication ino

Hi,

alfranio correia junior wrote:

We have the following commands:

SET TRANSACTION MASTER

and

CREATE TRIGGER <name> for { STARTUP | SHUTDOWN |
BEGIN TRANSACTION | COMMIT TRANSACTION | ROLLBACK TRANSACTION }
execute procedure <func> ( <funcargs> )

Okay.

I think that we should discuss requirements first instead of going
towards syntax. The latter is the last step to achieve a common
set of ideas.

I still maintain the point that I want to check requirements first. For
that I need a working prototype. And I'm easy with prototyping in C in
the backend code. If there's really a requirement for hooks, I can add
them and decouple from PostgreSQL source code later on.

What do you currently base your hooks on? IMO it's just naive to expect
to be able to define hooks now, especially hooks as general as you seem
to be heading to (I've read about sync and async multi master
replication, single master replication as well as materialized views).

Another point: modularization is nice and well, where appropriate. But
here I don't see how it could help the user. Or do you expect users to
plug in and out replication solutions like USB sticks? I think most
users want to have *one* replication solution that works. Out of the
box. Maybe they want one which can do sync as well as async replication,
sure. But hooks don't give you that, nor do they make it any easier.

I agree that it's helpful to modularize it in code. But you don't need
hooks for that.

I know I'm probably somewhat alone with that point of view.

Regards

Markus

#56alfranio correia junior
alfranio@lsd.di.uminho.pt
In reply to: Markus Schiltknecht (#55)
Re: [Replica-hooks-discuss] Integrating Replication ino

Hi !!!

I still maintain the point that I want to check requirements first. For
that I need a working prototype. And I'm easy with prototyping in C in
the backend code. If there's really a requirement for hooks, I can add
them and decouple from PostgreSQL source code later on.

I agree with you. You should build prototypes and try things in order to
figure out exactly what we need.
However, based on the experience that you already have in developing
such prototypes, most likely there are different features that you would like
to see in PostgreSQL. What are they?

What do you currently base your hooks on? IMO it's just naive to expect
to be able to define hooks now, especially hooks as general as you seem
to be heading to (I've read about sync and async multi master
replication, single master replication as well as materialized views).

You have "prototypes" built upon such hooks: sync and async, single
master and multi master. However, I am not arguing that hooks are the
solution to any problem. But they work for the limited view that we have
on the subject.

Another point: modularization is nice and well, where appropriate. But
here I don't see how it could help the user. Or do you expect users to
plug in and out replication solutions like USB sticks? I think most
users want to have *one* replication solution that works. Out of the
box. Maybe they want one which can do sync as well as async replication,
sure. But hooks don't give you that, nor do they make it any easier.

I don't expect that. But I would like to test different replication
protocols without patching PostgreSQL. And I believe that we might
come up with a set of in-core features that would enable this.

Regards,

Alfranio.

#57Florian Pflug
In reply to: Markus Schiltknecht (#55)
Re: [Replica-hooks-discuss] Integrating Replication ino

Markus Schiltknecht wrote:

Another point: modularization is nice and well, where appropriate. But
here I don't see how it could help the user. Or do you expect users to
plug in and out replication solutions like USB sticks? I think most
users want to have *one* replication solution that works. Out of the
box. Maybe they want one which can do sync as well as async replication,
sure. But hooks don't give you that, nor do they make it any easier.

I, as a mostly-user, fully subscribe to that point of view. IMHO one
of the biggest mistakes mysql made was those "pluggable storage
managers". While all those different storage managers (innodb, bdb,
myisam, ...) _look_ interchangeable from an interface point of view
(You just specify which one to use when creating the table, right?),
they all have _different_ semantics. Just forget to write "with innodb"
in _one_ of your table definitions, and transaction isolation goes
out of the window :-(.

I understand that different usecases need different replication
solutions - but I think "Hey, let's just make them plugins" is
not the way to go. It would work if all replication solutions
had _exactly_ the same semantics - but if they do, then what is
the point of all the different solutions anyway?

Just my 2 eurocents...
Greetings, Florian Pflug

#58Jim Nasby
jim.nasby@enterprisedb.com
In reply to: Shane Ambler (#15)
Re: Open source databases '60 per cent cheaper'

On Nov 22, 2006, at 11:00 AM, Shane Ambler wrote:

Simon Riggs wrote:

On Wed, 2006-11-22 at 10:44 -0300, Alvaro Herrera wrote:

Simon Riggs wrote:

Anyway, I'm very interested in getting Synchronous Replication
into PostgreSQL 8.3. Can I gauge your interest in making Mammoth
Replicator
Open Source to assist with that project? I'd be very happy to
work with
you in an open manner on that.

Mammoth Replicator is not synchronous anyway ...

That's a shame. I thought we might be able to get a head start in
that way.

Have you looked at PGCluster? - Synchronous multi master replication

The downside to pgCluster (and the synchronous pgpool replication as
well) is that it's statement-based. That means something like

INSERT INTO table VALUES( ..., now(), ... );

Doesn't work right at all. Continuent has statement-based replication
that handles common cases (like now() and random()), but it doesn't
handle everything (and it's limited to JDBC connections).
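
A toy model of why this goes wrong, added for illustration (not part of the original message). Each replica evaluates now() against its own clock, so replaying the same statement text yields different rows. The contrast with shipping already-evaluated values is this sketch's own addition, not something the products above are claimed to do; the `Replica` class and the clock values are invented.

```python
import itertools


class Replica:
    """A toy replica with its own clock and a single table."""

    def __init__(self, clock_start):
        self._clock = itertools.count(clock_start)
        self.rows = []

    def now(self):
        # Each node reads its own clock, so values differ between nodes.
        return next(self._clock)

    def execute_statement(self, payload):
        # Statement-based: now() is re-evaluated locally on every node.
        row = (payload, self.now())
        self.rows.append(row)
        return row

    def apply_row(self, row):
        # Value shipping: store the row exactly as the origin computed it.
        self.rows.append(row)


# Statement-based replication: both nodes run the same INSERT text.
a, b = Replica(1000), Replica(1007)
a.execute_statement("x")
b.execute_statement("x")
statement_based_match = a.rows == b.rows

# Shipping evaluated values: only the origin calls now().
c, d = Replica(1000), Replica(1007)
d.apply_row(c.execute_statement("x"))
value_shipping_match = c.rows == d.rows

print(statement_based_match, value_shipping_match)  # False True
```

The same divergence applies to random() and to anything else whose result depends on the node that evaluates it.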

I'll agree with Josh Drake that the demand for synchronous
replication isn't as great as the hype (or the demand for async), but
it definitely does exist.

Regarding putting stuff in core, ISTR a group was set up on pgFoundry
to discuss what features could be added to core to help various
replication systems (such as being able to put triggers on DDL), but
I don't know if anything ever came of it.
--
Jim Nasby jim.nasby@enterprisedb.com
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#59Andrew Sullivan
ajs@crankycanuck.ca
In reply to: Markus Schiltknecht (#17)
Re: Open source databases '60 per cent cheaper'

On Wed, Nov 22, 2006 at 06:21:49PM +0100, Markus Schiltknecht wrote:

I don't think that policy is written in stone. But for a replication
solution to go into core, it should better cover a lot of use cases,
i.e. for sure sync *and* async replication.

Some time ago, I set up a project on pgfoundry to try to work out
what those cases are. So far, exactly one person has sent anything
to the list about it. There were two messages copied to that list
just in the last day, though.

Anyway, I hereby re-affirm my commitment to try to write down what we
are trying to achieve, if I can get anyone to tell me what that is.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
If they don't do anything, we don't need their acronym.
--Josh Hamilton, on the US FEMA

#60Andrew Sullivan
ajs@crankycanuck.ca
In reply to: Andrew Dunstan (#26)
Re: Integrating Replication into Core

On Wed, Nov 22, 2006 at 01:58:34PM -0500, Andrew Dunstan wrote:

Wasn't there supposed to be some discussion among replication authors to
try to come up with at least some common hooks?

That was what I was aiming at, yes.

http://pgfoundry.org/projects/replica-hooks/

A

--
Andrew Sullivan | ajs@crankycanuck.ca
Unfortunately reformatting the Internet is a little more painful
than reformatting your hard drive when it gets out of whack.
--Scott Morris

#61Andrew Sullivan
ajs@crankycanuck.ca
In reply to: Markus Schiltknecht (#31)
Re: Integrating Replication into Core

On Wed, Nov 22, 2006 at 08:21:23PM +0100, Markus Schiltknecht wrote:

For Postgres-R, I definitely don't want to settle for any hooks, yet,
because I want to keep flexible. Hooks would only get into my way and
serve no purpose.

Let me make the following argument to the contrary. This is a
rationale argument for the other discussion, and not a discussion of
the hooks themselves, so I think it's still appropriate for -hackers.

The reason to write down what the _requirements_ are for hooks is so
that the community can get to work on any of the general approaches
to replication that they want. These hooks might, in fact, turn out
to be nothing more than a layer of indirection in the core PostgreSQL
code.

The reason the earlier attempts at Postgres-R didn't ever make it out
of testing was precisely, I argue, because there just wasn't an
interface for the rest of the PostgreSQL project (maybe not
interested in replication) to keep stable. So merely keeping up with
the pace of change in the core code turned into a significant
undertaking. Those are cycles stolen from the more useful work of
making the replication code work better.

The same thing is true of other pieces that have fallen by the side:
because the whole of the PostgreSQL project moves so quickly, a small
number of people working on a large feature set in relative isolation
can end up spending way too much time keeping up with the core, and
not enough time working on the features they desire. The result is a
loss to everyone.

So that's why I was trying to outline what, at least, the
requirements are.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
Users never remark, "Wow, this software may be buggy and hard
to use, but at least there is a lot of code underneath."
--Damien Katz

#62Andrew Sullivan
ajs@crankycanuck.ca
In reply to: Florian G. Pflug (#57)
Re: [Replica-hooks-discuss] Integrating Replication ino

I'm responding with a short answer here. But more of this sort of
discussion would really help our meta discussion on what the problem
is we're trying to solve. I'm trying to host that on the other list
just on the grounds that -hackers has enough traffic about _actual_
features without cluttering it with discussion of wishlist items that
nobody is yet committed to do the work on.

On Fri, Nov 24, 2006 at 04:21:11PM +0100, Florian G. Pflug wrote:

managers". While all those different storage managers (innodb, bdb,
myisam, ...) _look_ interchangeable from an interface point of view
(You just specify which one to use when creating the table, right?),
they all have _different_ semantics.

Yes. But one way MySQL could have done that right was to identify in
their core that they needed an idea of storage management state.
Then BEGIN; INSERT INTO innodb_table; UPDATE myisam_table; COMMIT;
would fail in the way the ACID gods intended. But that, of course,
would have required writing down in advance how these things should
work. Which is what I'm proposing to do.
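A minimal sketch of what such a core-level notion of storage management state could look like, written in Python rather than anything resembling MySQL's own code. The `Transaction` class, the `ENGINE_IS_TRANSACTIONAL` table and the exception name are hypothetical, made up for illustration; the only fact relied on is that InnoDB is transactional and MyISAM is not.

```python
# Hypothetical sketch: the core tracks which storage engine each table uses
# and refuses, inside an explicit transaction, any write it could not roll back.

ENGINE_IS_TRANSACTIONAL = {"innodb": True, "myisam": False}


class NonTransactionalWrite(Exception):
    """Raised when a transaction touches a table it cannot roll back."""


class Transaction:
    def __init__(self, table_engines):
        self.table_engines = table_engines  # table name -> engine name
        self.pending = []                   # writes not yet committed
        self.committed = []

    def write(self, table, statement):
        engine = self.table_engines[table]
        if not ENGINE_IS_TRANSACTIONAL[engine]:
            # Fail loudly instead of silently applying a write outside the
            # transaction; the caller is expected to roll back.
            raise NonTransactionalWrite(f"{table} uses {engine}")
        self.pending.append((table, statement))

    def rollback(self):
        self.pending.clear()

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()


def run(table_engines, writes):
    """Apply all writes atomically; return what was committed."""
    txn = Transaction(table_engines)
    try:
        for table, statement in writes:
            txn.write(table, statement)
        txn.commit()
    except NonTransactionalWrite:
        txn.rollback()
    return txn.committed


engines = {"innodb_table": "innodb", "myisam_table": "myisam"}
mixed = run(engines, [("innodb_table", "INSERT"), ("myisam_table", "UPDATE")])
clean = run(engines, [("innodb_table", "INSERT")])
```

In this sketch the mixed transaction commits nothing at all, while the InnoDB-only one goes through. The point is only that the rule has to be written down in one place before any engine can be held to it.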

A

--
Andrew Sullivan | ajs@crankycanuck.ca
In the future this spectacle of the middle classes shocking the avant-
garde will probably become the textbook definition of Postmodernism.
--Brad Holland

#63Joshua D. Drake
jd@commandprompt.com
In reply to: Andrew Sullivan (#61)
Re: Integrating Replication into Core

The reason the earlier attempts at Postgres-R didn't ever make it out
of testing was precisely, I argue, because there just wasn't an
interface for the rest of the PostgreSQL project (maybe not
interested in replication) to keep stable. So merely keeping up with
the pace of change in the core code turned into a significant
undertaking. Those are cycles stolen from the more useful work of
making the replication code work better.

Actually I don't buy this argument. The only major change in
*postgresql* that has slowed down Replicator is the move from
users/groups to roles. We added a feature in the internal 1.6 release to
replicate users/groups.

We are currently behind because of things that have really nothing to do
with PostgreSQL and more to do with reworking an evolutionary code base
to be more manageable.

I don't know much (anything) about Postgres-R but my guess is that the
only major change that would have affected that project in recent years
would have been two phase commit and that is only if they chose to take
advantage of it.

Sincerely,

Joshua D. Drake

#64Andrew Sullivan
ajs@crankycanuck.ca
In reply to: Joshua D. Drake (#63)
Re: Integrating Replication into Core

On Sat, Nov 25, 2006 at 11:05:34AM -0800, Joshua D. Drake wrote:

Actually I don't buy this argument. The only major change in

Ok, good. So why isn't Postgres-R something we have _now_? The work
that I've seen on it, so far (and I speak as someone who invested a
significant amount of staff time, cash money, and -- frankly --
"political" credibility in software based on that idea) is that there
isn't a way to make it production-grade without pretty severe
constraints on what it can do.

It was that unhappy discovery that led me to say, "Can we please
_write down_ what we think 'replication' might require, and what the
trade-offs can be?" I'm trying to write requirements in public here;
but all I get is silence. This frustrates me partly because, as
someone who stuck his neck out to make sure Slony was released as
free software, I hear a lot of demands for features people apparently
want without much in the way of design proposals -- never mind code --
to achieve those features. When Jan delivered the initial release of
Slony, it was preceded by a design doc. I note on -hackers long
emails from (for example) Tom doing something very similar when
proposing a major feature. What I'm trying to do is to get the
replication-interested community of PostgreSQL users to say "here's
what we mean by 'replication'" before we all go off inventing the
grammar. We need to have a clue about the domain of discourse before
we start settling the variable assignments.

It seems to me that every single replication discussion on -hackers
amounts to a bunch of futile attempts by colour blind people (of
which I am one) to describe the colour 'high note', while their
interlocutors describe the sound 'red'. I'm trying to get us to say
what it would mean even to do the describing.

Specifying requirements for what software is supposed to do is one of
those thankless tasks that everyone complains is never done in the
free software community. I am offering, earnestly, to do that. I
just need a few people to tell me what _they think_ the software in
question ought to do. I set up a mailing list. I have solicited
comments. I'm not sure what else to do, but so far, I have the
positive remarks of Jose (GORDA), the remarks of Markus (which amount
to "this is a waste of time", unless I misread him), and nothing
else.

Surely, in a community that spends time on the topic of whether
replication "should be in the back end", we oughta be able to come up
with 10 or so people who are willing to say what "being in the back
end" would mean. At the moment, this trivial goal is all I'm aiming
for.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
When my information changes, I alter my conclusions. What do you do sir?
--attr. John Maynard Keynes

#65Joshua D. Drake
jd@commandprompt.com
In reply to: Andrew Sullivan (#64)
Re: Integrating Replication into Core

Andrew Sullivan wrote:

On Sat, Nov 25, 2006 at 11:05:34AM -0800, Joshua D. Drake wrote:

Actually I don't buy this argument. The only major change in

Ok, good. So why isn't Postgres-R something we have _now_?

That is a good question and, as I mentioned, I don't know much about
Postgres-R. My point was directly to the argument that a fast moving
PostgreSQL somehow limits the ability for replication to be built. That
argument, I believe, is false.

I originally responded to the rest of your email but thought better of
it. The only thing I can say is, my experience is that something like
replication will only be productively completed, outside the community.

Jan, for the most part, created his own community with Slony. Postgres-R
is doing the same, as are others such as pgPool.

The fact that they are all their own communities, not to mention
several closed source products (Replicator, Unicluster), pretty much sets
the whole thing up to fail IMHO.

Otherwise you are just herding cats.

Joshua D. Drake

#66David Boreham
david_list@boreham.org
In reply to: Markus Schiltknecht (#53)
Re: Integrating Replication into Core

Markus Schiltknecht wrote:

LOL, I've just figured that netem is the project behind:

tc qdisc ... netem ...

I'm already using that, too ;-) Just wasn't aware it's called netem.
Sounds silly, since the name is in the command line, I know...

Heh. AFAIK netem is the tc stuff that isn't much use for production
router use (e.g. introduce a 10ms packet delay on this kind of
traffic...). We used a mixture of netem and regular tc kernel modules
in a Linux box that had 6 NICs, with Python driving it. Each
replication node test machine was connected with a straight-through
patch cable to one of the NICs on the 'spider' machine. The Python
could set up the netem/tc on the router such that various test
scenarios with different bandwidth/delay values were implemented, and
also, of course, loss of connectivity by dropping all packets on an
interface. Each test machine had two NICs, the second one being used
to communicate with it out of band from the replication traffic and
network emulation. Then on top of all this the actual replication
tests were run. One of the things we were interested in was
replication throughput vs network latency, so we also measured
performance and made acceptable performance a test pass condition.

If you want really fancy network emulation you'd need to use nistnet.
It can do some things that are not possible with netem (statistical
packet drop, for example). However, IMHO this is only appropriate for
testing TCP/IP stack implementations. Varying latency, throughput, and
introducing connectivity outages is good enough for user mode code, I
believe. Nistnet is not in the stock kernel, whereas netem is.
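A rough sketch of how a Python driver for such a setup might build its tc command lines. This is not the original test harness: the function names, the interface names and the scenario table are assumptions made up for illustration, and the `rate` option assumes a netem version that supports it (otherwise a separate qdisc would be needed for bandwidth). The code only builds argument lists and does not run anything.

```python
def netem_args(dev, delay_ms=None, rate_kbit=None, drop_all=False):
    """Build the argument list for one `tc qdisc replace ... netem` call."""
    args = ["tc", "qdisc", "replace", "dev", dev, "root", "netem"]
    if drop_all:
        # Simulate loss of connectivity by dropping every packet.
        return args + ["loss", "100%"]
    if delay_ms is not None:
        args += ["delay", f"{delay_ms}ms"]
    if rate_kbit is not None:
        # Assumes a netem version that supports the `rate` option.
        args += ["rate", f"{rate_kbit}kbit"]
    return args


def clear_args(dev):
    """Build the argument list that removes the emulation from one interface."""
    return ["tc", "qdisc", "del", "dev", dev, "root"]


# Hypothetical scenario table: router interface -> link conditions for the
# replication node plugged into that interface.
SCENARIO = {
    "eth1": {"delay_ms": 10},
    "eth2": {"delay_ms": 150, "rate_kbit": 512},
    "eth3": {"drop_all": True},
}

commands = [netem_args(dev, **link) for dev, link in sorted(SCENARIO.items())]
```

To apply a scenario, each list could be handed to `subprocess.run(cmd, check=True)` as root on the router box, with `clear_args` used to reset an interface between tests.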

#67Jonah H. Harris
jonah.harris@gmail.com
In reply to: Jim Nasby (#58)
Re: Open source databases '60 per cent cheaper'

On 11/24/06, Jim Nasby <jim.nasby@enterprisedb.com> wrote:

The downside to pgCluster (and the synchronous pgpool replication as
well) is that it's statement-based. That means something like

INSERT INTO table VALUES( ..., now(), ... );

Doesn't work right at all.

Umm, as this is the second incorrect statement you've made about
PGCluster, perhaps you really should spend some time reading the docs
before posting. Just for the record, PGCluster does support proper
replication of now().

I'll agree with Josh Drake that the demand for synchronous
replication isn't as great as the hype (or the demand for async), but
it definitely does exist.

Agreed.

--
Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
33 Wood Ave S, 3rd Floor | jharris@enterprisedb.com
Iselin, New Jersey 08830 | http://www.enterprisedb.com/

#68Markus Schaber
schabi@logix-tt.com
In reply to: Jim Nasby (#58)
Re: Open source databases '60 per cent cheaper'

Hi, Jim,

Jim Nasby wrote:

The downside to pgCluster (and the synchronous pgpool replication as
well) is that it's statement-based. That means something like

INSERT INTO table VALUES( ..., now(), ... );

Doesn't work right at all.

Hmm, I guess that the new RETURNING feature of PostgreSQL 8.2 can help
with this problem.

Regards,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org

#69Bruce Momjian
bruce@momjian.us
In reply to: Andrew Sullivan (#64)
Re: Integrating Replication into Core

Have you looked at the new HA/load balancing section of the docs?

http://developer.postgresql.org/pgdocs/postgres/high-availability.html

I got a lot of feedback on that. Perhaps it can be a starting point for
you.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#70Alvaro Herrera
alvherre@commandprompt.com
In reply to: Joshua D. Drake (#65)
Re: Integrating Replication into Core

Joshua D. Drake wrote:

Andrew Sullivan wrote:

On Sat, Nov 25, 2006 at 11:05:34AM -0800, Joshua D. Drake wrote:

Actually I don't buy this argument. The only major change in

Ok, good. So why isn't Postgres-R something we have _now_?

That is a good question and, as I mentioned, I don't know much about
Postgres-R. My point was directly to the argument that a fast moving
PostgreSQL somehow limits the ability for replication to be built. That
argument, I believe, is false.

I originally responded to the rest of your email but thought better of
it. The only thing I can say is, my experience is that something like
replication will only be productively completed, outside the community.

This is like nVidia saying that "open source developers are not
competent enough to understand the coding of a graphics card driver".

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#71Andrew Sullivan
ajs@crankycanuck.ca
In reply to: Bruce Momjian (#69)
Re: Integrating Replication into Core

On Mon, Nov 27, 2006 at 07:27:32AM -0500, Bruce Momjian wrote:

Have you looked at the new HA/load balancing section of the docs?

http://developer.postgresql.org/pgdocs/postgres/high-availability.html

I got a lot of feedback on that. Perhaps it can be a starting point for
you.

Yes, I have; and yes, it helps.

What I was hoping to do, though, as well, was come up with the list
of facilities that developers of these various systems say they need.
I don't expect this will happen quickly (which is why I figured it
needed a project -- if I could do it in six weeks, then we wouldn't
need a mailing list and the like). But it seemed to me that, with so
many projects on the go, getting a list together of what the
developers of those systems say they need would be the obvious way to
define, later, what hooks, if any, are needed in the core system.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
If they don't do anything, we don't need their acronym.
--Josh Hamilton, on the US FEMA

#72Andrew Sullivan
ajs@crankycanuck.ca
In reply to: Jonah H. Harris (#67)
Re: Open source databases '60 per cent cheaper'

On Sun, Nov 26, 2006 at 04:09:20PM -0500, Jonah H. Harris wrote:

Umm, as this is the second incorrect statement you've made about
PGCluster, perhaps you really should spend some time reading the docs
before posting. Just for the record, PGCluster does support proper
replication of now().

It's sort of irrelevant whether there are tricks to solve things like
now(), because the central problem is that of shipping statements
around.

The problem with statement-based replication is that it can't provide
generic support for replication of those statements that are
sensitive to state. For example, a stored procedure that contains a
call to CURRENT_TIMESTAMP will not be replicated perfectly.
(PGCluster, in a test that I ran some time ago, replicated the call
to the stored procedure. That means that the value of
CURRENT_TIMESTAMP could differ on different members of the group.)
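The effect can be reproduced with a small sketch, here using Python's sqlite3 module as a stand-in for the replicated nodes (an assumption for illustration; this is not how PGCluster itself works). The `stamp()` function plays the role of a stored procedure that reads the clock internally, and each node is given a fixed, slightly different clock value so the result is repeatable.

```python
import sqlite3


def make_node(clock_value):
    """Create an in-memory node whose stamp() function reads a node-local clock."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created TEXT)")
    # Stand-in for a stored procedure that calls CURRENT_TIMESTAMP internally.
    conn.create_function("stamp", 0, lambda: clock_value)
    return conn


def rows(conn):
    return conn.execute("SELECT id, created FROM events ORDER BY id").fetchall()


STATEMENT = "INSERT INTO events (id, created) VALUES (1, stamp())"

# Statement-based: the same SQL text is executed independently on every node.
node_a = make_node("2006-11-27 10:00:00.000")
node_b = make_node("2006-11-27 10:00:00.137")
for node in (node_a, node_b):
    node.execute(STATEMENT)
statement_based_diverged = rows(node_a) != rows(node_b)

# Row-based: only the origin runs the statement; the resulting rows are shipped.
origin = make_node("2006-11-27 10:00:00.000")
replica = make_node("2006-11-27 10:00:00.137")
origin.execute(STATEMENT)
replica.executemany(
    "INSERT INTO events (id, created) VALUES (?, ?)", rows(origin)
)
row_based_diverged = rows(origin) != rows(replica)
```

Replaying the statement leaves the two nodes with different timestamps, while shipping the origin's rows keeps them identical, because the replica never evaluates `stamp()` itself.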

Unless you put a complete interpreter and planner in between the
clients and the replicas, this is a simple, in-principle limitation
of statement based replication. If you _do_ put a complete
interpreter in the way, then you have created a single point of
failure. This is why synchronous replication is hard. (The other
trick, of course, is network-based IPC, which is what PGCluster-II is
aiming at. I'm eagerly awaiting results!)

None of this is to say that PGCluster isn't useful for some purposes.
But I don't see any value in pretending that PGCluster can ever
represent a completely generic multimaster system, when it has such
an important limitation. It is certainly the best available
multimaster system at the moment, and I can think of applications
where it would probably be a very good choice for use.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
"The year's penultimate month" is not in truth a good way of saying
November.
--H.W. Fowler

#73Neil Conway
neilc@samurai.com
In reply to: Markus Schaber (#68)
Re: Open source databases '60 per cent cheaper'

On Mon, 2006-11-27 at 11:39 +0100, Markus Schaber wrote:

Hmm, I guess that the new RETURNING feature of PostgreSQL 8.2 can help
with this problem.

I don't see how. The basic problem is that SQL is nondeterministic in
general; you can't safely assume that evaluating the same sequence of
SQL statements on all nodes will yield the same final database state.

-Neil

#74Joshua D. Drake
jd@commandprompt.com
In reply to: Alvaro Herrera (#70)
Re: Integrating Replication into Core

I originally responded to the rest of your email but thought better of
it. The only thing I can say is, my experience is that something like
replication will only be productively completed, outside the community.

This is like nVidia saying that "open source developers are not
competent enough to understand the coding of a graphics card driver".

I believe you misunderstood me. I am not saying that replication can not
be built in an Open Source manner. Slony is a perfect example of that. I
am saying that involving the larger, general PostgreSQL community in
such a task would be counter-productive.

Sincerely,

Joshua D. Drake

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

#75Hannu Krosing
hannu@skype.net
In reply to: Joshua D. Drake (#74)
Re: Integrating Replication into Core

On Mon, 2006-11-27 at 07:50, Joshua D. Drake wrote:

I originally responded to the rest of your email but thought better of
it. The only thing I can say is, my experience is that something like
replication will only be productively completed, outside the community.

This is like nVidia saying that "open source developers are not
competent enough to understand the coding of a graphics card driver".

I believe you misunderstood me. I am not saying that replication can not
be built in an Open Source manner. Slony is a perfect example of that. I
am saying that involving the larger, general PostgreSQL community in
such a task would be counter-productive.

As several different approaches to "replication" involve the same
requirements and/or touching the same places in code, it seems a good
idea to at least get some more or less formal descriptions from the
parties involved.

Also, it seems that largely the same things are needed for other
projects, like precomputed/materialized views and auditing or some other
non-replication data moving methods.

While each of the replication and non-replication projects does do its
own thing, it may still be beneficial to try to provide some hooks in
right places for them. Not all projects need to use all of them but
having all projects patch the same places in core code will make it
pretty much impossible to use more than one at a time.

As an example, one may want to have both synchronous auditing data and
async replication to be done on the same live database, both gathered at
the point of data manipulation and both moved to different machines.
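A toy sketch of that last example, assuming a single shared hook point that several consumers can attach to. `HookRegistry`, the `row_change` hook name and the list-based "machines" are all hypothetical stand-ins for illustration, not an existing PostgreSQL interface.

```python
from collections import defaultdict


class HookRegistry:
    """Hypothetical registry: several consumers can attach to one hook point."""

    def __init__(self):
        self._callbacks = defaultdict(list)

    def register(self, hook_name, callback):
        self._callbacks[hook_name].append(callback)

    def fire(self, hook_name, change):
        for callback in self._callbacks[hook_name]:
            callback(change)


hooks = HookRegistry()
audit_machine = []      # stands in for the machine receiving audit data
replica_machine = []    # stands in for the async replica
replication_queue = []  # changes captured but not yet shipped

# Synchronous auditing: the record is delivered before the write returns.
hooks.register("row_change", audit_machine.append)
# Asynchronous replication: the change is only queued at write time.
hooks.register("row_change", replication_queue.append)


def write_row(table, change):
    """Apply a change and hand it to every registered consumer."""
    table.append(change)
    hooks.fire("row_change", change)


def ship_queued_changes():
    """Run later, outside the writing transaction."""
    while replication_queue:
        replica_machine.append(replication_queue.pop(0))


live_table = []
write_row(live_table, {"op": "INSERT", "id": 1})
state_after_write = (list(audit_machine), list(replica_machine))
ship_queued_changes()
```

Both consumers see the change at the point of data manipulation, but only the audit copy exists when the write returns; the replica catches up when the queue is shipped. Neither consumer had to patch `write_row` itself.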

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

#76Chris Browne
cbbrowne@acm.org
In reply to: Bruce Momjian (#5)
Re: Open source databases '60 per cent cheaper'

neilc@samurai.com (Neil Conway) writes:

On Mon, 2006-11-27 at 11:39 +0100, Markus Schaber wrote:

Hmm, I guess that the new RETURNING feature of PostgreSQL 8.2 can help
with this problem.

I don't see how. The basic problem is that SQL is nondeterministic in
general; you can't safely assume that evaluating the same sequence of
SQL statements on all nodes will yield the same final database state.

Well, RETURNING means that there's some place which receives the
return set, which means that there is, in principle, a place where the
data would be capturable.

But it is important to point out that the "nondeterministic"
possibilities do not end with NOW() or CURRENT_TIMESTAMP; other
nondeterministic things that can cause trouble include:

a) A sequence value, which is set *somewhat* invisibly,

b) A query of the form
insert into some_table (select * from a_table);
where the SELECT subquery is not of the form:
SELECT [fields] FROM [a_table] WHERE [Fully Deterministic Criterion]
ORDER BY [some criterion suitable to be a primary key for the result set]
LIMIT [anything];

If you could capture the RETURNING data, and replicate that, well,
that at least provides a way to get around the nondeterminism problem.
That would work out well for both INSERT and DELETE. I'm not sure
it'll work as well for UPDATE; that doesn't return both old and new
column values :-(.
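Case b) can be sketched with Python's sqlite3 module standing in for two nodes (an assumption for illustration only; the table names follow the example above). Both nodes hold the same rows, but loaded in a different physical order, and the same INSERT ... SELECT is replayed on each.

```python
import sqlite3


def make_node(insert_order):
    """Create a node holding the same rows, loaded in the given physical order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a_table (k INTEGER, v TEXT)")
    conn.execute("CREATE TABLE some_table (k INTEGER, v TEXT)")
    conn.executemany("INSERT INTO a_table VALUES (?, ?)", insert_order)
    return conn


def replay(conn, statement):
    """Execute the replicated statement and return the resulting table contents."""
    conn.execute(statement)
    return conn.execute("SELECT k, v FROM some_table ORDER BY k").fetchall()


data = [(1, "one"), (2, "two"), (3, "three")]
node_orders = (data, data[::-1])

# No ORDER BY: which two rows are copied depends on each node's scan order.
UNORDERED = "INSERT INTO some_table SELECT k, v FROM a_table LIMIT 2"
# ORDER BY on a unique key makes the LIMIT pick the same rows everywhere.
ORDERED = "INSERT INTO some_table SELECT k, v FROM a_table ORDER BY k LIMIT 2"

unordered_results = [replay(make_node(order), UNORDERED) for order in node_orders]
ordered_results = [replay(make_node(order), ORDERED) for order in node_orders]
```

With the ORDER BY in place both nodes end up with the same two rows. Without it, nothing obliges the nodes to agree, so `unordered_results` may differ between them, which is exactly what a statement-shipping replicator cannot detect.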
--
"cbbrowne","@","acm.org"
http://linuxdatabases.info/info/internet.html
"Problem solving under linux has never been the circus that it is
under AIX." -- Pete Ehlke in comp.unix.aix

#77Jeff Davis
pgsql@j-davis.com
In reply to: Markus Schiltknecht (#43)
Re: Integrating Replication into Core

On Thu, 2006-11-23 at 08:50 +0100, Markus Schiltknecht wrote:

Hi,

Jeff Davis wrote:

I think you misunderstand my point.

That may well be. Please keep in mind that I'm not a native English
speaker, thus please speak loud and clear ;-)

I was talking about replication
implementations that already exist. They already have patches on the
backend that are necessary for their solution to work.

Do they? I'm only aware of the GORDA patch. The old Postgres-R patches
are out of date. Sequoia, PgPool and PgPool-II obviously do not need
patches. Slony-II, Postgres-R (8) (mine) as well as PGCluster-II are not
open sourced, yet. And I haven't heard much regarding hooks from any of
the proprietary vendors (except Joshua's recent statement that he's
happy without such hooks).

Because we're talking about replication, I don't think we can limit the
discussion to current open source solutions. I could be mistaken, but I
am under the impression that commercial replication solutions do patch
the backend.

The idea is to design a single set of hooks that can be used to
implement an entire class of replication. This only makes sense after
existing solutions come to some agreement. I view that as a first step,
assuming that it is necessary to alter the core in order to implement
the class of replication in question.

As there's not even *one* existing and open replication solution which
needs patching the backend, you are basing your statements on a false
premise. Thus, speaking of hooks as a "first step" is very confusing, at
least.

You're right, there is no agreement yet. When I say "first step," I mean
that it's the first step toward getting any form of replication support
in the _backend_, _not_ a first step toward a replication solution at
all. It may be a long time before the backend has replication-specific
support of any kind, but many replication projects have passed the first
step toward replication a long time ago.

I am not advocating replication support in the backend (since I don't
even know what form that would take), nor am I saying that it will
appear soon. I am just saying that replication-specific syntax is
unlikely to appear before other replication-specific details.

Regards,
Jeff Davis

#78Joshua D. Drake
jd@commandprompt.com
In reply to: Jeff Davis (#77)
Re: Integrating Replication into Core

Do they? I'm only aware of the GORDA patch. The old Postgres-R patches
are out of date. Sequoia, PgPool and PgPool-II obviously do not need
patches. Slony-II, Postgres-R (8) (mine) as well as PGCluster-II are not
open sourced, yet. And I haven't heard much regarding hooks from any of
the proprietary vendors (except Joshua's recent statement that he's
happy without such hooks).

Because we're talking about replication, I don't think we can limit the
discussion to current open source solutions. I could be mistaken, but I
am under the impression that commercial replication solutions do patch
the backend.

Quite.

I am not advocating replication support in the backend (since I don't
even know what form that would take), nor am I saying that it will

patch -p1 < replicator.diff

;)

Joshua D. Drake

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

#79Markus Schiltknecht
markus@bluegap.ch
In reply to: Jeff Davis (#77)
Re: Integrating Replication into Core

Hi,

Jeff Davis wrote:

Because we're talking about replication, I don't think we can limit the
discussion to current open source solutions. I could be mistaken, but I
am under the impression that commercial replication solutions do patch
the backend.

Sure. But as you see, at least Joshua D. Drake is quite happy with

patch -p1 < all_his_replicator_changes.diff

I'm, too. Because I don't think I could get anything useful into core.
And as long as I'd still have to patch the backend, what would that
serve me?

I really think this decision should be left to the developers of
replication systems. We *will* ask core, if we want to have something
added (as did the GORDA project). I state the same in my FAQ at [1].

You're right, there is no agreement yet. When I say "first step," I mean
that it's the first step toward getting any form of replication support
in the _backend_, _not_ a first step toward a replication solution at
all.

Okay, sorry, then I misread you.

It may be a long time before the backend has replication-specific
support of any kind, but many replication projects have passed the first
step toward replication a long time ago.

Have they? Have you heard requests for specific additions into core from
any of them?

I am not advocating replication support in the backend (since I don't
even know what form that would take), nor am I saying that it will
appear soon. I am just saying that replication-specific syntax is
unlikely to appear before other replication-specific details.

Sure.

Regards

Markus

[1]: http://www.postgres-r.org/about/faqs

#80Markus Schiltknecht
markus@bluegap.ch
In reply to: Andrew Sullivan (#64)
Re: Integrating Replication into Core

Hello Andrew,

Andrew Sullivan wrote:

On Sat, Nov 25, 2006 at 11:05:34AM -0800, Joshua D. Drake wrote:

Actually I don't buy this argument.

Neither do I. I can only reiterate that interfacing with the database
backend is *not* the problem. I've been porting Postgres-R forward since
7.4 and only a few changes were necessary since then. And using a decent
version control system simplifies the task of propagating from CVS HEAD
to my branch. The few conflicts that arose were mostly trivial to
resolve (renaming or slight calling convention changes).

Andrew Sullivan wrote:

Ok, good. So why isn't Postgres-R something we have _now_?

(I note you don't count my version of Postgres-R (8), that might be
reasonable depending on your definition of 'having Postgres-R'.)

I can't speak for others, but I just don't have much spare time left.
And it's a complex matter involving lots of corner cases like network
outages, crashes of the replication manager or GCS daemon, etc. Testing
and making it production grade software really takes a lot of time. IMO
this is where replication solutions could work together, because all of
them need to simulate a cluster somehow, to test their project. But this
certainly has nothing to do with PostgreSQL Core.

Another point for me is that the feedback I got on Postgres-R since
Toronto is very close to zero. Some people haven't even noticed that
there is Postgres-R code for 8.2. Or they don't count my variant for
some reasons. For example Tom Lane who recently pointed out Postgres-R
as an example of code drift in [1]. No offense, it's just very
contradictory to the hype around replication.

The work that I've seen on it, so far (and I speak as someone who
invested a significant amount of staff time, cash money, and --
frankly -- "political" credibility in software based on that idea) is
that there isn't a way to make it production-grade without pretty
severe constraints on what it can do.

Right, the Postgres-R algorithm has limitations. And it certainly does
not fit all use cases. The Toronto Meeting has opened my eyes in that
aspect and I'm thankful for that.

It was that unhappy discovery that led me to say, "Can we please
_write down_ what we think 'replication' might require, and what the
trade-offs can be?" I'm trying to write requirements in public here;
but all I get is silence. This frustrates me partly because, as
someone who stuck his neck out to make sure Slony was released as
free software, I hear a lot of demands for features people apparently
want without much in the way of design proposals -- never mind code --
to achieve those features. When Jan delivered the initial release of
Slony, it was preceded by a design doc. I note on -hackers long
emails from (for example) Tom doing something very similar when
proposing a major feature. What I'm trying to do is to get the
replication-interested community of PostgreSQL users to say "here's
what we mean by 'replication'" before we all go off inventing the
grammar. We need to have a clue about the domain of discourse before
we start settling the variable assignments.

As you surely have noticed, I've been discussing back and forth with
Bruce about replication for the documentation. I've been doing that
because I wanted to clarify what 'replication' is, what we are talking
about when we say 'multi-master replication' or 'data partitioning', etc.

Sadly, only very few people from the 'replication interested community'
were discussing. I've even been trying to get more of them involved.

It seems to me that every single replication discussion on -hackers
amounts to a bunch of futile attempts by colour blind people (of
which I am one) to describe the colour 'high note', while their
interlocutors describe the sound 'red'. I'm trying to get us to say
what it would mean even to do the describing.

Specifying requirements for what software is supposed to do is one of
those thankless tasks that everyone complains is never done in the
free software community. I am offering, earnestly, to do that. I
just need a few people to tell me what _they think_ the software in
question ought to do. I set up a mailing list. I have solicited
comments. I'm not sure what else to do, but so far, I have the
positive remarks of Jose (GORDA), the remarks of Markus (which amount
to "this is a waste of time", unless I misread him), and nothing
else.

I'm sorry if this sounded that negative. Defining what software is
supposed to do is certainly necessary, especially as long as replication
discussions on -hackers look like what you described above. Thus we
should first define what we mean, to make sure we are talking
about the same thing when speaking of 'multi-master replication', for example.

Please note that I've never raised my voice against that. I'm just
saying: it's not time for hooks or any other framework, yet. We don't
even agree that we need hooks to interface with the database. Even
having to define points in code where I could hook would limit me in an
unacceptable way, if I couldn't redefine them whenever I wanted.

Surely, in a community that spends time on the topic of whether
replication "should be in the back end", we oughta be able to come up
with 10 or so people who are willing to say what "being in the back
end" would mean. At the moment, this trivial goal is all I'm aiming
for.

Being in the back end for me means, I can code in C, use shared memory
and system catalogs, add another sub-process to PostgreSQL, introduce
another operation mode for (remote) backends, mess with the postmaster
and communicate to the backends via shared memory and signals (IPC).

IPC is even a good example of something which could be of use for me.
Back in April, I sent a patch implementing internal message passing
(see [2]). It's a very general feature I need and, as pointed out in
the mail, it could even be of use for others. But I have no hope for it
to make it into core, because I've never seen something accepted which
could perhaps be of use in the future.
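
For readers who have not seen this kind of facility, here is a minimal sketch of passing messages between processes through a shared memory segment. It is not the IMessages patch: the class name MessageArea, the header layout and the length-prefix framing are all assumptions made for illustration, and the locking and signalling a real backend would need are left out.

```python
import struct
from multiprocessing import shared_memory

HEADER = struct.Struct("II")  # (write offset, read offset) within the data area
LENGTH = struct.Struct("I")   # length prefix stored in front of each message


class MessageArea:
    """Hypothetical append-only message area in a shared memory segment.

    No wrap-around and no locking: one sender and one receiver are assumed
    to take turns, which a real implementation could not rely on.
    """

    def __init__(self, name=None, size=4096):
        if name is None:
            # Create a fresh segment and initialise both offsets to zero.
            self.shm = shared_memory.SharedMemory(create=True, size=size)
            HEADER.pack_into(self.shm.buf, 0, 0, 0)
        else:
            # Attach to a segment that another handle or process created.
            self.shm = shared_memory.SharedMemory(name=name)

    @property
    def name(self):
        return self.shm.name

    def send(self, payload):
        """Append one message; return False if the segment is full."""
        write_off, read_off = HEADER.unpack_from(self.shm.buf, 0)
        start = HEADER.size + write_off
        end = start + LENGTH.size + len(payload)
        if end > self.shm.size:
            return False
        LENGTH.pack_into(self.shm.buf, start, len(payload))
        self.shm.buf[start + LENGTH.size:end] = payload
        HEADER.pack_into(self.shm.buf, 0, end - HEADER.size, read_off)
        return True

    def receive(self):
        """Return the oldest unread message, or None if there is none."""
        write_off, read_off = HEADER.unpack_from(self.shm.buf, 0)
        if read_off == write_off:
            return None
        start = HEADER.size + read_off
        (length,) = LENGTH.unpack_from(self.shm.buf, start)
        body = start + LENGTH.size
        payload = bytes(self.shm.buf[body:body + length])
        HEADER.pack_into(self.shm.buf, 0, write_off, read_off + LENGTH.size + length)
        return payload

    def close(self, unlink=False):
        self.shm.close()
        if unlink:
            self.shm.unlink()


# Two handles on the same segment stand in for two cooperating processes.
sender = MessageArea()
receiver = MessageArea(name=sender.name)
sender.send(b"change set for txn 42")
print(receiver.receive())
receiver.close()
sender.close(unlink=True)
```

In a backend the offsets would have to be protected by a lock, and the receiver would be woken by a signal rather than polling.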

I've very well noticed that you and others offered to help in various
ways. Thank you for that. But I also got the impression that there's an
urge towards hooks or a framework or something so that PostgreSQL can
provide that and refer to it as "having everything needed" for
replication. That sounds marketing driven, IMO.

I can assure you that I will continue to work on Postgres-R. I think its
design has been described well enough already. I will post more
design ideas for extensions and additions on the Postgres-R or on the
replica-hooks mailing list as soon as I have them completely thought
through and written down. And for sure I'll let you know if and how you
or others can help me.

Regards

Markus

[1]: Tom Lane: Re: Getting a move on for 8.2 beta: http://archives.postgresql.org/pgsql-hackers/2006-09/msg00139.php

[2]: My Patch for IMessages: http://archives.postgresql.org/pgsql-patches/2006-04/msg00047.php

#81Andrew Sullivan
ajs@crankycanuck.ca
In reply to: Markus Schiltknecht (#80)
Re: Integrating Replication into Core

On Tue, Nov 28, 2006 at 02:19:51PM +0100, Markus Schiltknecht wrote:

(I note you don't count my version of Postgres-R (8), that might be
reasonable depending on your definition of 'having Postgres-R'.)

Yes; what I meant was "production-grade, ready to go." I've played
with your code. I'm mightily impressed that you managed to get it
working. But I don't think it's ready for production use tomorrow in
the environments where this sort of availability is actually worth
the cost (think "money depends on this"). That's what I mean by
"have".

and making it production grade software really takes a lot of time. IMO
this is where replication solutions could work together, because all of
them need to simulate a cluster somehow, to test their project. But this
certainly has nothing to do with PostgreSQL Core.

I agree with you that such supporting tools would be a very good
thing. Maybe nothing else is needed. Like I said before, a negative
result is still a result.

Another point for me is that the feedback I got on Postgres-R since
Toronto is very close to zero. Some people haven't even noticed that
there is Postgres-R code for 8.2.

Well, part of the problem is there isn't much to say to code that I
can't look at. I can play with it on the live CD, but so far the
source isn't on the web page at postgres-r.org, which is the only
source I know for it. This makes the whole matter trickier for
potential adopters, because it's basically a black box.

As you surely have noticed, I've been discussing back and forth with
Bruce about replication for the documentation. I've been doing that
because I wanted to clarify what 'replication' is, what we are talking
about when we say 'multi-master replication' or 'data partitioning', etc.

Yes, I think those docs are very good. But it's one thing to say,
"This is what replication means," &c., and quite another to say,
"Here are the sorts of things we plan to do, which have to work with
that pile of code over there."

I'm sorry if this sounded that negative.

No, not negative. Remember, as I said, if it turns out that we can't
actually come up with an outline of replication framework necessary
conditions, we have also discovered something. That's a useful
result, because it tells us that the next thing we need to do
is figure out where the exclusive features are, so we can say "you
can have A or B, but not both."

through and written down. And for sure I'll let you know if and how you
or others can help me.

Ok, thanks.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
When my information changes, I alter my conclusions. What do you do sir?
--attr. John Maynard Keynes

#82Jeff Davis
pgsql@j-davis.com
In reply to: Markus Schiltknecht (#79)
Re: Integrating Replication into Core

On Tue, 2006-11-28 at 08:42 +0100, Markus Schiltknecht wrote:

You're right, there is no agreement yet. When I say "first step," I mean
that it's the first step toward getting any form of replication support
in the _backend_, _not_ a first step toward a replication solution at
all.

Okay, sorry, then I misread you.

It may be a long time before the backend has replication-specific
support of any kind, but many replication projects have passed the first
step toward replication a long time ago.

Have they? Have you heard requests for specific additions into core from
any of them?

I think you misread me again. I was again trying to make a distinction
between the progress of replication _for_ postgresql (which has been
very good, way past the first step) and the progress of replication
natively in the community version of the postgresql core, which has a
long way to go.

I wasn't very clear, but I don't think you actually disagree with me.

Regards,
Jeff Davis

#83Markus Schiltknecht
markus@bluegap.ch
In reply to: Andrew Sullivan (#81)
Re: Integrating Replication into Core

Hi,

Andrew Sullivan wrote:

Yes; what I meant was "production-grade, ready to go." I've played
with your code. I'm mightily impressed that you managed to get it
working. But I don't think it's ready for production use tomorrow in
the environments where this sort of availability is actually worth
the cost (think "money depends on this"). That's what I mean by
"have".

Agreed.

I agree with you that such supporting tools would be a very good
thing. Maybe nothing else is needed. Like I said before, a negative
result is still a result.

Okay.

Well, part of the problem is there isn't much to say to code that I
can't look at. I can play with it on the live CD, but so far the
source isn't on the web page at postgres-r.org, which is the only
source I know for it. This makes the whole matter trickier for
potential adopters, because it's basically a black box.

Very understandable. I'm trying to find ways to open source Postgres-R.

Yes, I think those docs are very good. But it's one thing to say,
"This is what replication means," &c., and quite another to say,
"Here are the sorts of things we plan to do, which have to work with
that pile of code over there."

ACK.

I'm sorry if this sounded that negative.

No, not negative. Remember, as I said, if it turns out that we can't
actually come up with an outline of replication framework necessary
conditions, we have also discovered something. That's a useful
result, because it tells us that the next thing we need to do
is figure out where the exclusive features are, so we can say "you
can have A or B, but not both."

Okay.

through and written down. And for sure I'll let you know if and how you
or others can help me.

Ok, thanks.

Thank you.

Markus

#84Brad Nicholson
bnichols@ca.afilias.info
In reply to: Simon Riggs (#34)
Re: Integrating Replication into Core

On Wed, 2006-11-22 at 19:27 +0000, Simon Riggs wrote:

On Wed, 2006-11-22 at 19:23 +0100, Markus Schiltknecht wrote:

Jeff Davis wrote:

If there is some great replication solution that a lot of people need
and it will only work with a change to core, that change might make it
in.

That's what I'm saying. Although it's hypothetical.

My interest is in extending Warm Standby [8.2] to include the following
forms of replication:
1. asynchronous WAL-record level transfer to Standby server
2. synchronous WAL-record level transfer to Standby server
My foresight includes that this would likely require some improvements
in Group Commit, but I've not done the design for this *yet*.
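
To make the difference between the two modes concrete, here is a toy model, not PostgreSQL code. The class names Primary and Standby and the in-memory lists standing in for the WAL are assumptions for illustration only. In asynchronous mode a commit returns once the record is written locally and the record is shipped later; in synchronous mode the commit does not return until the standby has acknowledged the record.

```python
from collections import deque


class Standby:
    """Hypothetical standby that applies records in arrival order."""

    def __init__(self):
        self.applied = []

    def receive(self, record):
        self.applied.append(record)
        return True  # acknowledgement sent back to the primary


class Primary:
    """Hypothetical primary that ships WAL records to one standby."""

    def __init__(self, standby, synchronous):
        self.standby = standby
        self.synchronous = synchronous
        self.wal = []          # records written locally
        self.unsent = deque()  # records not yet shipped (asynchronous mode)

    def commit(self, record):
        self.wal.append(record)
        if self.synchronous:
            # Synchronous: the commit only returns after the standby acks.
            if not self.standby.receive(record):
                raise RuntimeError("standby did not acknowledge the record")
        else:
            # Asynchronous: return at once, ship the record later.
            self.unsent.append(record)
        return len(self.wal)

    def ship_pending(self):
        """Send queued records to the standby (asynchronous mode only)."""
        while self.unsent:
            self.standby.receive(self.unsent.popleft())


async_standby = Standby()
async_primary = Primary(async_standby, synchronous=False)
async_primary.commit("INSERT row 1")
print(async_standby.applied)  # the standby lags until ship_pending() runs
async_primary.ship_pending()
print(async_standby.applied)
```

The window between commit() and ship_pending() is where an asynchronous standby can lose transactions if the primary fails; the synchronous variant closes that window at the cost of waiting for the acknowledgement on every commit.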

I would also like to include some performance optimisations into Core
that are specifically aimed at improving Slony performance. (I'm more
than happy if those things also increase performance of other
situations). That's a slightly different thing to embedding Slony in Core,
which I am *not* suggesting. Suggestions welcome.

This will then give PostgreSQL:
- improved performance for the most popular production replication
system for PostgreSQL (Slony)
- a capability for Synchronous Replication, when it is requested

That's the limit of my ambitions for 8.3.

Very curious slony user here. Can I ask what you have planned for 8.3
in regards to Slony performance?

--
Brad Nicholson 416-673-4106
Database Administrator, Afilias Canada Corp.

#85Simon Riggs
simon@2ndquadrant.com
In reply to: Brad Nicholson (#84)
Re: Integrating Replication into Core

On Tue, 2006-11-28 at 14:22 -0500, Brad Nicholson wrote:

On Wed, 2006-11-22 at 19:27 +0000, Simon Riggs wrote:

I would also like to include some performance optimisations into Core
that are specifically aimed at improving Slony performance. (I'm more
than happy if those things also increase performance of other
situations). That's a slightly different thing to embedding Slony in Core,
which I am *not* suggesting. Suggestions welcome.

Very curious slony user here. Can I ask what you have planned for 8.3
in regards to Slony performance?

Discussion opened on slony-general list. See you there.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#86Jim Nasby
decibel@decibel.org
In reply to: Markus Schiltknecht (#83)
Re: Integrating Replication into Core

On Nov 28, 2006, at 10:18 AM, Markus Schiltknecht wrote:

Well, part of the problem is there isn't much to say to code that I
can't look at. I can play with it on the live CD, but so far the
source isn't on the web page at postgres-r.org, which is the only
source I know for it. This makes the whole matter trickier for
potential adopters, because it's basically a black box.

Very understandable. I'm trying to find ways to open source
Postgres-R.

Related to that, and your comment about people not using
Postgres-R... I think it's going to be very, very hard to get people to
seriously consider using Postgres-R while it's essentially a fork of
the community code, with little/no visibility into what changes have
been made and how they could affect data stored in the database.
Contrast this with Slony, where there are no back-end changes and the
trigger code (which is essentially the only thing that touches your
live data) is readily visible just via \df+. That makes it very easy
for people to convince themselves that Slony is unlikely to hose
their data. Of course at this point there's enough people using Slony
that that's no longer a concern, but back when it was introduced it
would have been.

Given the nature of Postgres-R, I suppose there's no real way people
could become comfortable without looking at most/all of the code,
since it does tie pretty deeply into the backend. But that's one way
that having published hooks would help; if you could at least put the
code that touches the guts of the database and the source data out in
the open, people might be more willing to give Postgres-R a try.

You also mentioned putting IPC in the backend, since it's something
that you need. I think breaking something as complex as replication
into smaller chunks that can stand on their own is a great idea.
Oracle's replication does this, and I wish Slony would. Having access
to the queuing/communications mechanism that the Slony folks have
built would be very useful. So I'd definitely encourage making
subsets of Postgres-R functionality available, and promoting them via
pgFoundry.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#87Markus Schiltknecht
markus@bluegap.ch
In reply to: Jim Nasby (#86)
Re: Integrating Replication into Core

Hi,

Jim Nasby wrote:

Related to that, and your comment about people not using Postgres-R...

I commented about the feedback I got, which would include rants about
why it's not open source or such. But I didn't even get such responses.

I'm not expecting anybody to use Postgres-R currently. I don't use it in
production myself. And the LiveCD currently serves mainly as evidence
of real code behind my words. ;-)

I think it's going to be very, very hard to get people to seriously
consider using Postgres-R while it's essentially a fork of the community
code, with little/no visibility into what changes have been made and how
they could affect data stored in the database.

Agreed.

Given the nature of Postgres-R, I suppose there's no real way people
could become comfortable without looking at most/all of the code, since
it does tie pretty deeply into the backend.

Most *people* use PostgreSQL in production without having ever looked at
its source code. Why should *they* want to look at Postgres-R sources?

I surely see that I could gain *developers'* acceptance by opening up the
source code. Please note that I'm absolutely for open source software, I
always wanted to release my changes to Postgres-R under a BSD license
one day.

I'm so much for open source software that I want to make a living from
writing OSS. I simply don't know exactly how to do that, yet. So I'm
keeping Postgres-R closed to leave me more options open.

But that's one way that
having published hooks would help; if you could at least put the code
that touches the guts of the database and the source data out in the
open, people might be more willing to give Postgres-R a try.

I don't really buy that argument. It would be quite some work for me and
not really help other developers, because the real code is still hidden
away.

You also mentioned putting IPC in the backend, since it's something that
you need. I think breaking something as complex as replication into
smaller chunks that can stand on their own is a great idea.

Agreed.

But once again, responses to my trivial IMessages implementation
were... zero. Not even complaints about how lacking it is. Or discussing
performance of pipes vs. this shared memory message passing approach.
Nothing. Why should I work on something nobody else seems to be
interested in?

Oracle's
replication does this, and I wish Slony would. Having access to the
queuing/communications mechanism that the Slony folks have built would
be very useful. So I'd definitely encourage making subsets of Postgres-R
functionality available, and promoting them via pgFoundry.

Agreed.

I myself have thought about splitting some things out (i.e. this IPC
stuff, another chunk to split out could be the GCS interface). It could
make testing and development easier. But making it available via
pgFoundry and promoting it as a separate project is another story which
certainly depends on some interested people asking for it.

If Linus hadn't got any answers to his famous post "What would you like
to see most in minix?" he most probably wouldn't have published Linux.

Regards

Markus

#88Andrew Sullivan
ajs@crankycanuck.ca
In reply to: Jim Nasby (#86)
Re: Integrating Replication into Core

On Sun, Dec 03, 2006 at 10:04:46PM -0800, Jim Nasby wrote:

Oracle's replication does this, and I wish Slony would. Having access
to the queuing/communications mechanism that the Slony folks have
built would be very useful.

Abstraction patches are welcome ;-)

Seriously, though, part of what I'm attempting to achieve (and that
it keeps happening here suggests to me that another list was a bad
idea) is to identify these _elements_. Then we can recycle them,
after all.

A
--
Andrew Sullivan | ajs@crankycanuck.ca
"The year's penultimate month" is not in truth a good way of saying
November.
--H.W. Fowler