Few questions on postgresql (dblink, 2pc, clustering)

Started by Jim Workeover 21 years ago63 messagesgeneral
Jump to latest
#1Jim Worke
jimworke@inbox.lv

Hi list,

My boss would like to convert from MySQL to PostgreSQL, but there are a few
things that hold us back.

We'd like to have the databases in several physical servers, with referential
integrity between the tables. Reading from Oracle articles, their
referential integrity can be done using their dblink and triggers. What I'd
like to ask is whether contrib/dblink is stable? Seeing dblink in a contrib
directory does bring a doubt (though we've read many success stories with
dblink). And can the Oracle's way of doing referential integrity be used in
PostgreSQL? Is it recommended to do that?

I've read the PostgreSQL's TODO list, and I can't find anything about 2-phase
commit being implemented in 8.0. Is this correct? Because I also read
there's a 2PC patch being implemented (http://snaga.org/pgsql/). But
progress seems to have stopped (correct me if I'm wrong).

Basically, our concern is that dblink, 2PC implementation are there, but not
in the PostgreSQL mainstream.

Another thing that bothers us is that we can't find any multi-master
clustering solution in PostgreSQL. We're actually evaluating MySQL's own
clustering solution, but it's production quality release is still slated for
MySQL 5.0.

#2Bruce Momjian
bruce@momjian.us
In reply to: Jim Worke (#1)
Re: Few questions on postgresql (dblink, 2pc, clustering)

Jim Worke wrote:

Hi list,

My boss would like to convert from MySQL to PostgreSQL, but there are a few
things that hold us back.

We'd like to have the databases in several physical servers, with referential
integrity between the tables. Reading from Oracle articles, their
referential integrity can be done using their dblink and triggers. What I'd
like to ask is whether contrib/dblink is stable? Seeing dblink in a contrib

It is certainly stable, but it does not have transaction semantics like
local table queries.

directory does bring a doubt (though we've read many success stories with
dblink). And can the Oracle's way of doing referential integrity be used in
PostgreSQL? Is it recommended to do that?

Again, no persistent locking.,

I've read the PostgreSQL's TODO list, and I can't find anything about 2-phase
commit being implemented in 8.0. Is this correct? Because I also read
there's a 2PC patch being implemented (http://snaga.org/pgsql/). But
progress seems to have stopped (correct me if I'm wrong).

2-phase isn't in 8.0 but I expect it in 8.1.

Basically, our concern is that dblink, 2PC implementation are there, but not
in the PostgreSQL mainstream.

You need to understand the limitations of dblink and see if it will work
for you. I can't imagine MySQl is allowing you to do this cleanly so I
don't see why it would hold up a MySQL -> PostgreSQL migration.

Another thing that bothers us is that we can't find any multi-master
clustering solution in PostgreSQL. We're actually evaluating MySQL's own
clustering solution, but it's production quality release is still slated for
MySQL 5.0.

The only multi-master I know of is pgcluster. There is talking of
moving Slony from master/slave to multi-master but work has not started
yet.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#3Jim Worke
jimworke@inbox.lv
In reply to: Bruce Momjian (#2)
Re: Few questions on postgresql (dblink, 2pc, clustering)

On Sunday 22 August 2004 11:02, Bruce Momjian wrote:

2-phase isn't in 8.0 but I expect it in 8.1.

Is it possible to know when is 8.1 going to be released for production (an
estimate)?

Basically, our concern is that dblink, 2PC implementation are there, but
not in the PostgreSQL mainstream.

You need to understand the limitations of dblink and see if it will work
for you. I can't imagine MySQl is allowing you to do this cleanly so I
don't see why it would hold up a MySQL -> PostgreSQL migration.

Hmm... forgive me for saying it wrongly. We're actually "thinking" of
migrating to PostgreSQL. Here's our case:

We're going to do a major upgrading on our PHP code (from PHP 3 style to PHP
5.0), and was thinking of changing the database to PostgreSQL too.
Currently, the number of transaction is not high, but we'd like to have a
more scalable solution.

MySQL does not allow cross-server database connection such as dblink. So,
we're thinking of 3 alternatives:

1) Wait for MySQL clustering to be stable and put all our databases in the
cluster
2) Migrate to PostgreSQL and use dblink to solve the referential integrity
3) Migrate to PostgreSQL clustering solution

If (2) and (3) is not viable, then we'd rather not migrate the database to
PostgreSQL for now (if it ain't broke, don't fix it)...

So, it's not actually holding us up, but just that we're not able to make
decision yet.

Another thing that bothers us is that we can't find any multi-master
clustering solution in PostgreSQL. We're actually evaluating MySQL's own
clustering solution, but it's production quality release is still slated
for MySQL 5.0.

The only multi-master I know of is pgcluster. There is talking of
moving Slony from master/slave to multi-master but work has not started
yet.

I don't mean to be rude or anything, but having 3rd-party solution is a scary
option for a business enterprise. I know that they're stable and all, but if
it's not supported by PostgreSQL themselves (i.e. included in PostgreSQL as a
whole package), we're afraid that we have to change our code/design in case
the product has stopped progress.

For example, pgcluster's patch is for PostgreSQL 7.3.6. It's not in sync with
PostgreSQL's current version (I'm not blaming the guy... He's created a very
good solution and I'm thankful for that). It's just that for my company (and
I guess many other companies too), it's more appealing to have a database
solution that comes in a package.

#4Thomas Hallgren
thhal@mailblocks.com
In reply to: Jim Worke (#3)
Unsupported 3rd-party solutions (Was: Few questions on postgresql (dblink, 2pc, clustering))

Jim Worke wrote:

I don't mean to be rude or anything, but having 3rd-party solution is a scary
option for a business enterprise. I know that they're stable and all, but if
it's not supported by PostgreSQL themselves (i.e. included in PostgreSQL as a
whole package), we're afraid that we have to change our code/design in case
the product has stopped progress.

For example, pgcluster's patch is for PostgreSQL 7.3.6. It's not in sync with
PostgreSQL's current version (I'm not blaming the guy... He's created a very
good solution and I'm thankful for that). It's just that for my company (and
I guess many other companies too), it's more appealing to have a database
solution that comes in a package.

Those are very interesting remarks. I'm the author of PL/Java, a module
that also could be thought of as "not supported by PostgreSQL
themselves", and I've made the same reflection as you have. It would be
beneficial to have some organisational entity within Postgres where this
issue could be addressed (i.e. packaging/synchronization and supported
configurations). I think it could give a real boost to PostgreSQL as such.

Sure, an open source community does not make support commitments. But
the PostgreSQL community is large and that creates (a sense of) safety
and continuity. This sense is not automatically transferred to the
"3rd-party solutions".

From a users perspective and perhaps especially from the decision
makers perspective, the fact that you have to download various modules
from gborg etc. is indeed scary. Who will support your chosen solution a
year from now? IMHO, if PosgreSQL is aiming for larger business
acceptance, this has to be resolved. Contributors like myself must be
given the opportunity to get things "verified" and checked in as
"supported". It would do PostgreSQL an awful lot of good if there where
supported configurations including replication, server side language
support (Perl, Tcl, Java, etc.), JDBC and ODCB drivers, and other things
that you'd normally find in commercial enterprise solutions.

Regards,

Thomas Hallgren

#5Gaetano Mendola
mendola@bigfoot.com
In reply to: Jim Worke (#3)
Re: Few questions on postgresql (dblink, 2pc, clustering)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jim Worke wrote:
| On Sunday 22 August 2004 11:02, Bruce Momjian wrote:
|
|>2-phase isn't in 8.0 but I expect it in 8.1.
|
|
| Is it possible to know when is 8.1 going to be released for production (an
| estimate)?

Consider that 8.0 will be release *may be* during the end of this year.
Usually a development cycle between two release is 9 month and + 3 month
beta let me say: 8.1 will be release in 12 months. The core will try to
have a shortest cycle for 8.1 but I'll not bet on it.

|>>Basically, our concern is that dblink, 2PC implementation are there, but
|>>not in the PostgreSQL mainstream.
|>
|>You need to understand the limitations of dblink and see if it will work
|>for you. I can't imagine MySQl is allowing you to do this cleanly so I
|>don't see why it would hold up a MySQL -> PostgreSQL migration.
| Hmm... forgive me for saying it wrongly. We're actually "thinking" of
| migrating to PostgreSQL. Here's our case:
|
| We're going to do a major upgrading on our PHP code (from PHP 3 style to PHP
| 5.0), and was thinking of changing the database to PostgreSQL too.
| Currently, the number of transaction is not high, but we'd like to have a
| more scalable solution.
|
| MySQL does not allow cross-server database connection such as dblink. So,
| we're thinking of 3 alternatives:
|
| 1) Wait for MySQL clustering to be stable and put all our databases in the
| cluster
| 2) Migrate to PostgreSQL and use dblink to solve the referential integrity
| 3) Migrate to PostgreSQL clustering solution

May I know why are you sticky on the idea of spread your database among
various servers ? Free your mysql-minded. If you idea is an horizontal
scale solution then open your wallet and buy Oracle.
Postgresql scale very well vertically.

<SciencieFiction>
Another solution is hack the postmaster in order to have two parallel
postmaster running on the same server ( first phase ), when you did
this successfully then the second phase ( to hack too ) is buy the
hardware that permit more servers to share an unique shared memory
segment and then with the help of SAN you can have two postmaster that
are running on two different server that are belonging to a SAN and the
common shared memory segment.
</ScienceFiction>

Right now your only solution is buy a multiprocessor machine.

Regards
Gaetano Mendola

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBKG6x7UpzwH2SGd4RAn06AKCQ50Nbp8qvNlMQt2TZqCEcrsMWdgCgphRC
aAn1xCqgGYIh0KtSy3s4zSI=
=iDku
-----END PGP SIGNATURE-----

#6Jim Worke
jimworke@inbox.lv
In reply to: Thomas Hallgren (#4)
Re: Unsupported 3rd-party solutions (Was: Few questions on postgresql (dblink, 2pc, clustering))

On Sunday 22 August 2004 16:45, you wrote:

Jim Worke wrote:

I don't mean to be rude or anything, but having 3rd-party solution is a
scary option for a business enterprise. I know that they're stable and
all, but if it's not supported by PostgreSQL themselves (i.e. included in
PostgreSQL as a whole package), we're afraid that we have to change our
code/design in case the product has stopped progress.

For example, pgcluster's patch is for PostgreSQL 7.3.6. It's not in sync
with PostgreSQL's current version (I'm not blaming the guy... He's
created a very good solution and I'm thankful for that). It's just that
for my company (and I guess many other companies too), it's more
appealing to have a database solution that comes in a package.

Those are very interesting remarks. I'm the author of PL/Java, a module
that also could be thought of as "not supported by PostgreSQL
themselves", and I've made the same reflection as you have. It would be
beneficial to have some organisational entity within Postgres where this
issue could be addressed (i.e. packaging/synchronization and supported
configurations). I think it could give a real boost to PostgreSQL as such.

Sure, an open source community does not make support commitments. But
the PostgreSQL community is large and that creates (a sense of) safety
and continuity. This sense is not automatically transferred to the
"3rd-party solutions".

From a users perspective and perhaps especially from the decision
makers perspective, the fact that you have to download various modules
from gborg etc. is indeed scary. Who will support your chosen solution a
year from now? IMHO, if PosgreSQL is aiming for larger business
acceptance, this has to be resolved. Contributors like myself must be
given the opportunity to get things "verified" and checked in as
"supported". It would do PostgreSQL an awful lot of good if there where
supported configurations including replication, server side language
support (Perl, Tcl, Java, etc.), JDBC and ODCB drivers, and other things
that you'd normally find in commercial enterprise solutions.

I'm CC'ing this to the postgresql mailing list.

I fully agree to your statement (to get things "verified" and checked in as
"supported"). Hopefully there's a way out for this...

#7Sim Zacks
sim@compulab.co.il
In reply to: Jim Worke (#6)
Re: Unsupported 3rd-party solutions (Was: Few questions on postgresql (dblink, 2pc, clustering))

I think the major question would still remain, "What is that worth?",
if PostGreSQL says that the software works with their product. If there was a problem with it, or
if the developer decided not to upgrade to a new version with the main
product, there is still no warranty. PostGreSQL will probably not pick
up the development and upgrade it themselves.

This is very similar to asking the Linux distributions to officially support
PostGreSQL. They do to a level, but that includes just the
installation. If they upgrade and PostGreSQL in not compliant with
them anymore they will tell you they do not support it anymore. If you
think about it, PostGreSQL is a third-party product to the Operating
System. I don't think PostGreSQL is officially "REDHAT compliant" (as
an example). Then this supporting software is third-party to the DBMS. This
seems to be the way OpenSource works.

I agree with you that a third-party solution may not be the most
comfortable thing to do, but that is one of the inherent risks when
going Open Source. There is also no guarantee on PostGreSQL doing what
it is supposed to or what they say it does. The only difference
between using PostGreSQL and using a product that works with
PostGreSQL is that you have increased your risk. It was not like you
were in the safe-zone and were determining whether to take a risk or
not.

________________________________________________________________________________

On Sunday 22 August 2004 16:45, you wrote:

Jim Worke wrote:

I don't mean to be rude or anything, but having 3rd-party solution is a
scary option for a business enterprise. I know that they're stable and
all, but if it's not supported by PostgreSQL themselves (i.e. included in
PostgreSQL as a whole package), we're afraid that we have to change our
code/design in case the product has stopped progress.

For example, pgcluster's patch is for PostgreSQL 7.3.6. It's not in sync
with PostgreSQL's current version (I'm not blaming the guy... He's
created a very good solution and I'm thankful for that). It's just that
for my company (and I guess many other companies too), it's more
appealing to have a database solution that comes in a package.

Those are very interesting remarks. I'm the author of PL/Java, a module
that also could be thought of as "not supported by PostgreSQL
themselves", and I've made the same reflection as you have. It would be
beneficial to have some organisational entity within Postgres where this
issue could be addressed (i.e. packaging/synchronization and supported
configurations). I think it could give a real boost to PostgreSQL as such.

Sure, an open source community does not make support commitments. But
the PostgreSQL community is large and that creates (a sense of) safety
and continuity. This sense is not automatically transferred to the
"3rd-party solutions".

From a users perspective and perhaps especially from the decision
makers perspective, the fact that you have to download various modules
from gborg etc. is indeed scary. Who will support your chosen solution a
year from now? IMHO, if PosgreSQL is aiming for larger business
acceptance, this has to be resolved. Contributors like myself must be
given the opportunity to get things "verified" and checked in as
"supported". It would do PostgreSQL an awful lot of good if there where
supported configurations including replication, server side language
support (Perl, Tcl, Java, etc.), JDBC and ODCB drivers, and other things
that you'd normally find in commercial enterprise solutions.

I'm CC'ing this to the postgresql mailing list.

I fully agree to your statement (to get things "verified" and checked in as
"supported"). Hopefully there's a way out for this...

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

#8Chris Browne
cbbrowne@acm.org
In reply to: Jim Worke (#3)
Re: Few questions on postgresql (dblink, 2pc, clustering)

After a long battle with technology, mendola@bigfoot.com (Gaetano Mendola), an earthling, wrote:

| MySQL does not allow cross-server database connection such as dblink. So,
| we're thinking of 3 alternatives:
|
| 1) Wait for MySQL clustering to be stable and put all our databases in the
| cluster
| 2) Migrate to PostgreSQL and use dblink to solve the referential integrity
| 3) Migrate to PostgreSQL clustering solution

May I know why are you sticky on the idea of spread your database
among various servers ? Free your mysql-minded. If you idea is an
horizontal scale solution then open your wallet and buy Oracle.
Postgresql scale very well vertically.

Indeed.

It seems quite unclear what the proposed merit of "clustering" is
_supposed_ to be, and I think it bears asking Jim Worke to explain in
more detail what they're trying to accomplish by it.

- If the goal is improved redundancy, then I'd argue that doing all
the updates on the "master" and having several "hot replicas" being
maintained using Slony-I would accomplish that without technology
changes.

- If the goal is to somehow get better performance by partitioning
work across multiple servers, then this can be attained by having
those several "hot replicas," and directing as many read-only
operations to replicas as possible. Updates have to go to the
"master;" by dropping out query load, that lets the "master" be
occupied primarily with write operations.

- System reliability does NOT improve if write operations are
spread across several servers. We have been setting up extra
replicas of some databases on some new servers lately, and people
that _haven't_ thought it out have briefly imagined it a good idea
to spread the 'masters' across more servers, which would _hurt_
reliability, in fact.

The situation we have is that our app needs access to two
databases: one that stores "state," and another that logs activity.

The theory that people come up with is that when we have 3 main "big
servers," we should write the "state" to one, and logs to another.

Reality rears its ugly head: Doing that makes the system more
vulnerable, as if _either_ of those servers goes down, that will
cause the application to go into convulsions. Putting "state" and
"logs" on the same server, and replicating everywhere else is, in
fact, the more reliable choice. If the ONE server that's "master"
goes down, the application won't be happy, but that was always going
to be the case.

The point: Distributing writes across many hosts makes the system
vulnerable to the possibility of _any_ of them going down.

Furthermore, it is not evident that distributing writes will be able
to actually improve performance, because it introduces substantial
additional communications overhead.

At some point, there will be parts of the write activities that have
to be handled in a serial manner, in one place. For instance,
updating and reporting account balances must ultimately be thus
handled. If there are 10 customers fighting over who gets to order
your last 20 pallets of "Grapple Grommets," the handling of who
actually gets that inventory has _got_ to be serialized in one place
if you don't want to run the risk of the over-commitment of perhaps
promising 200 pallets worth when you only have 20.

2PC provides a way of managing that serialization across multiple
databases; it has two inherent problems relating to what is already
said above:

1. The communications overhead involved in coordinating requests
across multiple databases means that it's unlikely to be a
"performance win";

2. There are deadlock situations that 2PC is vulnerable to that
make it _less_ reliable than having just one database, when
there is the possibility of hardware or communications
failure.

Note: These aren't PostgreSQL-specific issues; they apply to the
various classes of "clustering solutions." It may be that Oracle or
DB2 or Informix have some particular features that partially relieve
some of the performance problems with "partitioning," but that still
leaves a big bill to pay, and you'd better be sure you're getting
some actual value for the extra coin...
--
let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];;
http://www3.sympatico.ca/cbbrowne/spiritual.html
Signs of a Klingon Programmer #8: "What is this talk of 'release'?
Klingons do not make software 'releases.' Our software 'escapes'
leaving a bloody trail of designers and quality assurance people in
its wake."

#9Thomas Hallgren
thhal@mailblocks.com
In reply to: Sim Zacks (#7)
Re: Unsupported 3rd-party solutions (Was: Few questions on postgresql

Sim,

I think the major question would still remain, "What is that worth?",
if PostGreSQL says that the software works with their product. If there was a problem with it, or
if the developer decided not to upgrade to a new version with the main
product, there is still no warranty. PostGreSQL will probably not pick
up the development and upgrade it themselves.

I agree that if that was the case, the whole exercise would be pointless.

I agree with you that a third-party solution may not be the most
comfortable thing to do, but that is one of the inherent risks when
going Open Source. There is also no guarantee on PostGreSQL doing what
it is supposed to or what they say it does. The only difference
between using PostGreSQL and using a product that works with
PostGreSQL is that you have increased your risk. It was not like you
were in the safe-zone and were determining whether to take a risk or
not.

I agree here too. But I think there's more to it.

There's a lot of PostgreSQL users out there today. The community is very
large and the risk of it going down the drain any time soon is extremely
small. The risk of using a third party product however, is on the
opposite side of that scale. This is bad for all partys involved. It's
bad for PostgreSQL as a whole, for the third party product, and for all
users, potential or existing.

So, my suggestion is that the "verify" to get a "supported" status would
actually mean something. It would mean that the PostgreSQL community has
a real interest in the product and that they have every intention to
maintain it. Such a "commitment" would of course not be binding in any
way but it would state an intent. I think that would be very valuable to
everyone!

Combine that with pre-packaged distributions containing the supported
modules to be found at the PostgreSQL web site. You ask "What is that
worth?". I for one claim it's worth a great deal. It would really
motivate me (and I imagine others) to go further with my product. It
would give potential users a "product" that contain everything they want
it to contain. It strengthen PostgreSQL's position as the most advanced
free database in the market even more.

Regards,

Thomas Hallgren

#10Mike Nolan
nolan@gw.tssi.com
In reply to: Sim Zacks (#7)
Re: Unsupported 3rd-party solutions (Was: Few questions on postgresql (dblink, 2pc, clustering)) (fwd)

I agree with you that a third-party solution may not be the most
comfortable thing to do, but that is one of the inherent risks when
going Open Source. There is also no guarantee on PostGreSQL doing what
it is supposed to or what they say it does. The only difference
between using PostGreSQL and using a product that works with
PostGreSQL is that you have increased your risk. It was not like you
were in the safe-zone and were determining whether to take a risk or
not.

Having been an Oracle user, there's no guarantee from them that things will
work either, and a boatload of lawyers have made sure that their licensing
forms absolve them of any contingent liability for undiagnosed or unfixed
bugs.

Being a paid customer (with support fees that quickly add up to far more
than the initial licensing fees) just gives you more of a right to complain
to somebody, it doesn't guarantee you that things get fixed, or certainly
not in a particular time frame. (I remember complaining at Oracle
user conventions about a particular deficiency in Oracle 7 for several
years, I don't think it got resolved until Oracle 8.)

In my 30 years or so of MIS experience, most of the last decade as a DBA,
I've seen many more serious problems caused by improper use of properly
working tools than by tools that don't work properly.
--
Mike Nolan

#11The Hermit Hacker
scrappy@hub.org
In reply to: Thomas Hallgren (#9)
Re: Unsupported 3rd-party solutions (Was: Few questions

On Sun, 22 Aug 2004, Thomas Hallgren wrote:

Combine that with pre-packaged distributions containing the supported
modules to be found at the PostgreSQL web site. You ask "What is that
worth?". I for one claim it's worth a great deal. It would really
motivate me (and I imagine others) to go further with my product. It
would give potential users a "product" that contain everything they want
it to contain. It strengthen PostgreSQL's position as the most advanced
free database in the market even more.

Its actually worth alot less then you think ... for instance, if we were
to import Slony as part of 8.0 (we won't, but its an example) since it is
"todays fashion for replication", there is no guarantee that for 8.1, we
don't replace it with something that happens to be better then ... then
the person that deployed Slony because it was 'the fashion of the day' is
now using something that is no longer considered "supported"? Do they
change then?

libpq++ is a better example, because it happened ... nobody was working on
it, and someone came along and decided that since he felt it was too
broken, he started to develop his own which became better then what we had
in the tree, even though the one in the tree was considered "the supported
one" ...

The benefit (and drawback) to OSS is that it is possible to have several
potential solutions to the same problem ... in 6 months time, who says
there is no second (or third) implementation of PL/Java out there? Or
pl/PHP?

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Hallgren (#4)
Re: Unsupported 3rd-party solutions (Was: Few questions on postgresql (dblink, 2pc, clustering))

Thomas Hallgren <thhal@mailblocks.com> writes:

... the fact that you have to download various modules
from gborg etc. is indeed scary. Who will support your chosen solution a
year from now? IMHO, if PosgreSQL is aiming for larger business
acceptance, this has to be resolved. Contributors like myself must be
given the opportunity to get things "verified" and checked in as
"supported". It would do PostgreSQL an awful lot of good if there where
supported configurations including replication, server side language
support (Perl, Tcl, Java, etc.), JDBC and ODCB drivers, and other things
that you'd normally find in commercial enterprise solutions.

Supported by *whom* exactly? It won't be the core committee; we have
more than enough to do managing the server itself.

Whoever is actually doing this "verifying" and "supporting" can take
on the work of producing the "supported configuration" package too;
IMHO it would really be pretty meaningless to do otherwise.

I think the place where this most naturally falls is with the commercial
Linux distributors (Red Hat, Suse, etc). They're already in the
business of assembling disparate upstream sources and making sure those
bits play nicely together.

regards, tom lane

#13Thomas Hallgren
thhal@mailblocks.com
In reply to: The Hermit Hacker (#11)
Re: Unsupported 3rd-party solutions (Was: Few questions

Marc,

Its actually worth alot less then you think ... for instance, if we
were to import Slony as part of 8.0 (we won't, but its an example)
since it is "todays fashion for replication", there is no guarantee
that for 8.1, we don't replace it with something that happens to be
better then ... then the person that deployed Slony because it was
'the fashion of the day' is now using something that is no longer
considered "supported"? Do they change then?

libpq++ is a better example, because it happened ... nobody was
working on it, and someone came along and decided that since he felt
it was too broken, he started to develop his own which became better
then what we had in the tree, even though the one in the tree was
considered "the supported one" ...

I never said that the conglomerate of "supported" add-ons should be
static. On the contrary! The entity behind what I'm proposing should
strive to provide "best of breed" solutions. The fact that you actually
replace a dying module with a superior vital one is not negative to
anyone as long as you provide a migration path.

What do you think would happen if the libpq++ replacement was
acknowledged and supported by PostgreSQL provided the people behind it
made migration simple for existing libpq++ users (the question is
hypothetical, I have no clue how easy or difficult such a migration is
in real life)?

The benefit (and drawback) to OSS is that it is possible to have
several potential solutions to the same problem ... in 6 months time,
who says there is no second (or third) implementation of PL/Java out
there? Or pl/PHP?

PL/Java is a good example since it's one of two server side Java
solutions. The other one is PL/J. While they provide the same
functionality, they are fundamentally different in implementation and
the best choice is likely to vary depending on the intended use (this is
b.t.w. very true for different replication solutions also, so the
remainder of this argument is not specific to server side java).

There's nothing wrong with this. PostgreSQL could support more than one
solution and provide guidelines describing when one is a better solution
then the other(s). Further more, PostgreSQL could put up some
requirements to the contributors so that they where inclined to expose
somewhat similar interfaces to the end user, thereby making it easy for
the end user to try out more than one.

Should yet another solution pop up, well then the contributor of that
one has to motivate why it too should be supported. Perhaps the
motivation is that it's very similar to an existing solution but it
shows superior performance and stability. When that happens (and is
proven), the existing solution is replaced. The contributor of the new
solution must of course ensure ease of migration.

A solution that is judged equal or not as good should be rejected on the
base that a viable solution that perfoms same task is already supported.
It's no different from accepting or rejecting patches really. It's just
on a larger scale.

Some efforts could be made to encourage potential contributors of new
modules to get involved with existing work instead of providing new
solutions for already existing ones (such as with the libpg++). The
advantage with OSS is that an offer to contribute is very often gladly
accepted by the project owner :-)

Regards,

Thomas Hallgren

#14Thomas Hallgren
thhal@mailblocks.com
In reply to: Tom Lane (#12)
Re: Unsupported 3rd-party solutions (Was: Few questions on postgresql

Tom,

Supported by *whom* exactly? It won't be the core committee; we have
more than enough to do managing the server itself.

I don't doubt that for a second. What I'm suggesting must be staffed
somehow. The core committee must be involved though or the whole idea
falls apart. You *are* PostgreSQL (at least to me).

Whoever is actually doing this "verifying" and "supporting" can take
on the work of producing the "supported configuration" package too;
IMHO it would really be pretty meaningless to do otherwise.

Agree.

I think the place where this most naturally falls is with the commercial
Linux distributors (Red Hat, Suse, etc). They're already in the
business of assembling disparate upstream sources and making sure those
bits play nicely together.

Here I don't agree. It's very important that the packaging is made by
PostgreSQL. I'm not contributing PL/Java for the benefit of Red Hat or
Suse. I'm doing it because I want to improve the database. Also, when a
Solaris or Windows customer wants a database solution, it's higly
unlikely that they'd consult a commercial Linux distributor.

Regards,

Thomas Hallgren

#15Chris Browne
cbbrowne@acm.org
In reply to: Jim Worke (#3)
Re: Unsupported 3rd-party solutions (Was: Few questions on postgresql

Oops! thhal@mailblocks.com (Thomas Hallgren) was seen spray-painting on a wall:

So, my suggestion is that the "verify" to get a "supported" status
would actually mean something. It would mean that the PostgreSQL
community has a real interest in the product and that they have
every intention to maintain it. Such a "commitment" would of course
not be binding in any way but it would state an intent. I think that
would be very valuable to everyone!

I'd suggest that the "pgxs" work is pointed at making it easier to do
this sort of thing.

Make it easier to add in third party packages in a repeatable,
testable way, and that makes it easier to make the "3rd party"
material supportable.
--
select 'cbbrowne' || '@' || 'ntlug.org';
http://www3.sympatico.ca/cbbrowne/emacs.html
"What if you slept? And what if, in your sleep, you dreamed?
And what if, in your dream, you went to heaven and there
plucked a strange and beautiful flower? And what if, when
you awoke, you had the flower in your hand? Ah, what then?"
--Coleridge

#16Bruce Momjian
bruce@momjian.us
In reply to: Thomas Hallgren (#14)
Re: Unsupported 3rd-party solutions (Was: Few questions on

Thomas Hallgren wrote:

Tom,

Supported by *whom* exactly? It won't be the core committee; we have
more than enough to do managing the server itself.

I don't doubt that for a second. What I'm suggesting must be staffed
somehow. The core committee must be involved though or the whole idea
falls apart. You *are* PostgreSQL (at least to me).

Have you seen the discussion about the MONEY type on the hackers list
today? We don't even have the manpower to _remove_ or _adjust_ an
existing data type that we know has been broken for perhaps +3 years.
How are we going to find time to do more about integrating and
supporting more software?

Answer me that and we can move forward.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#17Thomas Hallgren
thhal@mailblocks.com
In reply to: Thomas Hallgren (#13)
Re: Unsupported 3rd-party solutions (Was: Few questions

Bruce Momjian wrote:

Have you seen the discussion about the MONEY type on the hackers list

today? We don't even have the manpower to _remove_ or _adjust_ an
existing data type that we know has been broken for perhaps +3 years.
How are we going to find time to do more about integrating and
supporting more software?

Answer me that and we can move forward.

I see PostgreSQL taking a major leap forward with 8.0. You will
undoubtedly appeal to a much larger market. That in turn will increase
the pressure. You are getting increasingly popular and somehow you must
see to that this popularity is reflected in the amount of sponsoring
that you receive. If you don't, you will face even bigger resource
problems in the future.

A good way to appeal to sponsors is to extend what you do today to
include more features so what I'm suggesting would probably be a good
thing in that respect.

Perhaps I'm a bit to visionary. But I really think you (the core
commitee) need to consider this.

Regards,

Thomas Hallgren

#18Bruce Momjian
bruce@momjian.us
In reply to: Thomas Hallgren (#17)
Re: Unsupported 3rd-party solutions (Was: Few questions on

Thomas Hallgren wrote:

I see PostgreSQL taking a major leap forward with 8.0. You will
undoubtedly appeal to a much larger market. That in turn will increase
the pressure. You are getting increasingly popular and somehow you must
see to that this popularity is reflected in the amount of sponsoring
that you receive. If you don't, you will face even bigger resource
problems in the future.

A good way to appeal to sponsors is to extend what you do today to
include more features so what I'm suggesting would probably be a good
thing in that respect.

Perhaps I'm a bit to visionary. But I really think you (the core
commitee) need to consider this.

I have already heard from one company-sponsored guy who might want to
take this on, so there is hope.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#19Jim Worke
jimworke@inbox.lv
In reply to: Chris Browne (#8)
Re: Few questions on postgresql (dblink, 2pc, clustering)

On Sunday 22 August 2004 21:44, you wrote:

After a long battle with technology, mendola@bigfoot.com (Gaetano Mendola),

an earthling, wrote:

| MySQL does not allow cross-server database connection such as dblink.
| So, we're thinking of 3 alternatives:
|
| 1) Wait for MySQL clustering to be stable and put all our databases in
| the cluster
| 2) Migrate to PostgreSQL and use dblink to solve the referential
| integrity 3) Migrate to PostgreSQL clustering solution

May I know why are you sticky on the idea of spread your database
among various servers ? Free your mysql-minded. If you idea is an
horizontal scale solution then open your wallet and buy Oracle.
Postgresql scale very well vertically.

Indeed.

It seems quite unclear what the proposed merit of "clustering" is
_supposed_ to be, and I think it bears asking Jim Worke to explain in
more detail what they're trying to accomplish by it.

Sorry. Been rather confused with so many things that need to be done,
sometimes I focused on what I want rather than what I need.

- If the goal is improved redundancy, then I'd argue that doing all
the updates on the "master" and having several "hot replicas" being
maintained using Slony-I would accomplish that without technology
changes.

Yes indeed it would and I'm going to try it out soon :)

Actually we're also planning an exercise to replicate the database to a data
recovery site. When everything is ready, we'd like the new site to be the
primary site (the clients will be switched to the new site at an instance by
my ISP). That's why I was thinking that, if there is a multi-master
replication, the clients can be switched to the new site and the old site
still have the new data. So any transaction consolidation would not be
disrupted.

I guess I should've said the above earlier rather than ask ABOUT the
solution :)

- If the goal is to somehow get better performance by partitioning
work across multiple servers, then this can be attained by having
those several "hot replicas," and directing as many read-only
operations to replicas as possible. Updates have to go to the
"master;" by dropping out query load, that lets the "master" be
occupied primarily with write operations.

I was hoping not having to change the application code by changing queries to
one server and updates to another server.

- System reliability does NOT improve if write operations are
spread across several servers. We have been setting up extra
replicas of some databases on some new servers lately, and people
that _haven't_ thought it out have briefly imagined it a good idea
to spread the 'masters' across more servers, which would _hurt_
reliability, in fact.

The situation we have is that our app needs access to two
databases: one that stores "state," and another that logs activity.

The theory that people come up with is that when we have 3 main "big
servers," we should write the "state" to one, and logs to another.

Reality rears its ugly head: Doing that makes the system more
vulnerable, as if _either_ of those servers goes down, that will
cause the application to go into convulsions. Putting "state" and
"logs" on the same server, and replicating everywhere else is, in
fact, the more reliable choice. If the ONE server that's "master"
goes down, the application won't be happy, but that was always going
to be the case.

The point: Distributing writes across many hosts makes the system
vulnerable to the possibility of _any_ of them going down.

Yes, that is true. So separating databases across servers shouldn't be the
way to go. Thanks.

#20The Hermit Hacker
scrappy@hub.org
In reply to: Thomas Hallgren (#13)
Re: Unsupported 3rd-party solutions (Was: Few questions

On Sun, 22 Aug 2004, Thomas Hallgren wrote:

PL/Java is a good example since it's one of two server side Java
solutions. The other one is PL/J. While they provide the same
functionality, they are fundamentally different in implementation and
the best choice is likely to vary depending on the intended use (this is
b.t.w. very true for different replication solutions also, so the
remainder of this argument is not specific to server side java).

Since I (and I don't believe anyone else on core) uses Java ... shouldn't
it be up to the developer of the PL/J* modules to do this? We can't weigh
which one is better then the other, as we don't use it ...

Also, how does someone support something that they don't use? Again, that
is the developer of PL/J*'s job to do, not ours ...

Should yet another solution pop up, well then the contributor of that
one has to motivate why it too should be supported. Perhaps the
motivation is that it's very similar to an existing solution but it
shows superior performance and stability. When that happens (and is
proven), the existing solution is replaced. The contributor of the new
solution must of course ensure ease of migration.

At that rate, we'll have to distribute via CD to anyone that wants
PostgreSQL ... cause downloading it via FTP won't be a viable option
anymore :)

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Hallgren (#17)
#22The Hermit Hacker
scrappy@hub.org
In reply to: Tom Lane (#21)
#23Thomas Hallgren
thhal@mailblocks.com
In reply to: The Hermit Hacker (#20)
#24Jan Wieck
JanWieck@Yahoo.com
In reply to: Jim Worke (#1)
#25Bruce Momjian
bruce@momjian.us
In reply to: Thomas Hallgren (#23)
#26The Hermit Hacker
scrappy@hub.org
In reply to: Jim Worke (#3)
#27Csaba Nagy
nagy@ecircle-ag.com
In reply to: Bruce Momjian (#25)
#28Bruce Momjian
bruce@momjian.us
In reply to: Csaba Nagy (#27)
#29Gaetano Mendola
mendola@bigfoot.com
In reply to: Bruce Momjian (#28)
#30The Hermit Hacker
scrappy@hub.org
In reply to: Csaba Nagy (#27)
#31The Hermit Hacker
scrappy@hub.org
In reply to: Gaetano Mendola (#29)
#32Gaetano Mendola
mendola@bigfoot.com
In reply to: The Hermit Hacker (#30)
#33Thomas Hallgren
thhal@mailblocks.com
In reply to: Jim Worke (#3)
#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Hallgren (#33)
#35The Hermit Hacker
scrappy@hub.org
In reply to: Thomas Hallgren (#33)
#36Thomas Hallgren
thhal@mailblocks.com
In reply to: Tom Lane (#34)
#37Thomas Hallgren
thhal@mailblocks.com
In reply to: Bruce Momjian (#25)
#38Thomas Hallgren
thhal@mailblocks.com
In reply to: Bruce Momjian (#25)
#39Ben
bench@silentmedia.com
In reply to: Thomas Hallgren (#38)
#40Joshua D. Drake
jd@commandprompt.com
In reply to: Thomas Hallgren (#37)
#41Joshua D. Drake
jd@commandprompt.com
In reply to: Ben (#39)
#42Bruce Momjian
bruce@momjian.us
In reply to: Thomas Hallgren (#38)
#43The Hermit Hacker
scrappy@hub.org
In reply to: Thomas Hallgren (#36)
#44The Hermit Hacker
scrappy@hub.org
In reply to: Joshua D. Drake (#40)
#45Jim Worke
jimworke@inbox.lv
In reply to: Bruce Momjian (#25)
#46Bruce Momjian
bruce@momjian.us
In reply to: Jim Worke (#45)
#47Jan Wieck
JanWieck@Yahoo.com
In reply to: Bruce Momjian (#42)
#48Bruce Momjian
bruce@momjian.us
In reply to: Jan Wieck (#47)
#49Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jan Wieck (#47)
#50Thomas Hallgren
thhal@mailblocks.com
In reply to: The Hermit Hacker (#35)
#51Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Thomas Hallgren (#37)
#52Jan Wieck
JanWieck@Yahoo.com
In reply to: Tom Lane (#49)
#53The Hermit Hacker
scrappy@hub.org
In reply to: Jan Wieck (#52)
#54Thomas Hallgren
thhal@mailblocks.com
In reply to: Karsten Hilbert (#51)
#55The Hermit Hacker
scrappy@hub.org
In reply to: Thomas Hallgren (#54)
#56Chris Browne
cbbrowne@acm.org
In reply to: Thomas Hallgren (#4)
#57Thomas Hallgren
thhal@mailblocks.com
In reply to: Chris Browne (#56)
#58Thomas Hallgren
thhal@mailblocks.com
In reply to: Thomas Hallgren (#57)
#59The Hermit Hacker
scrappy@hub.org
In reply to: Thomas Hallgren (#58)
#60Thomas Hallgren
thhal@mailblocks.com
In reply to: The Hermit Hacker (#59)
#61The Hermit Hacker
scrappy@hub.org
In reply to: Thomas Hallgren (#58)
#62Jan Wieck
JanWieck@Yahoo.com
In reply to: The Hermit Hacker (#59)
#63Chris Browne
cbbrowne@acm.org
In reply to: Thomas Hallgren (#58)