Justifying a PG over MySQL approach to a project

Started by Gauthier, Dave | over 16 years ago | 51 messages | general
#1Gauthier, Dave
dave.gauthier@intel.com

Hi Everyone:

Tomorrow, I will need to present to a group of managers (who know nothing about DBs) why I chose to use PG over MySQL in a project, MySQL being the more popular DB choice with other engineers, and managers fearing things that are "different" (risk). I have a few hard technical reasons (check constraint, deferred constraint checking, array data type), but I'm looking for "it's more reliable" reasons. Again, the audience is managers. Is there an impartial, 3rd party evaluation of the 2 DBs out there that identifies PG as being more reliable? It might mention things like fewer incidences of corrupt tables/indexes, fewer daemon crashes, better recovery after system crashes, etc... ?

Thanks !
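
For readers unfamiliar with two of the features named above, here is a minimal sketch of a check constraint and of deferred constraint checking. It is an illustration only: it uses Python's bundled sqlite3 module as a stand-in so that it runs without a database server (an assumption, not something from the original post), and the table and function names are hypothetical. PostgreSQL accepts the same CHECK and DEFERRABLE INITIALLY DEFERRED clauses; array columns are not shown because SQLite has no array type.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical tables."""
    # isolation_level=None: no implicit transactions, we issue BEGIN/COMMIT.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this per connection
    conn.execute("""
        CREATE TABLE project (
            id     INTEGER PRIMARY KEY,
            budget INTEGER NOT NULL CHECK (budget >= 0)
        )
    """)
    conn.execute("""
        CREATE TABLE task (
            id         INTEGER PRIMARY KEY,
            project_id INTEGER NOT NULL
                REFERENCES project(id) DEFERRABLE INITIALLY DEFERRED
        )
    """)
    return conn


def check_rejects_negative_budget(conn):
    """Return True if the CHECK constraint refuses a negative budget."""
    try:
        conn.execute("INSERT INTO project (id, budget) VALUES (1, -5)")
    except sqlite3.IntegrityError:
        return True
    return False


def deferred_fk_allows_child_first(conn):
    """Insert the child row before its parent inside one transaction.

    The foreign key is only checked at COMMIT, so the order does not matter.
    Returns the number of task rows after the commit.
    """
    conn.execute("BEGIN")
    conn.execute("INSERT INTO task (id, project_id) VALUES (10, 2)")
    conn.execute("INSERT INTO project (id, budget) VALUES (2, 100)")
    conn.execute("COMMIT")
    return conn.execute("SELECT COUNT(*) FROM task").fetchone()[0]
```

Calling check_rejects_negative_budget(make_db()) shows the bad row being refused by the database itself rather than by application code, which is the point of pushing such rules into the schema.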

#2Frank Heikens
frankheikens@mac.com
In reply to: Gauthier, Dave (#1)
Re: Justifying a PG over MySQL approach to a project

Managers want support; they can't live without it. Every piece of
software has its flaws and needs patches. Each PostgreSQL release is
supported for 5 years; the latest version (8.4) will be supported at
least until 2014. In total there are 6 supported versions as we speak,
7.4 - 8.4. MySQL has active support for 5.0 and 5.1, but 5.0 will only
be supported for the next two weeks and 5.1 until December next year,
unless you pay for an extended support contract. After 5.1 there is no
other stable version at this moment; nobody knows what comes next.

http://wiki.postgresql.org/wiki/PostgreSQL_Release_Support_Policy
http://www.mysql.com/about/legal/lifecycle/#calendar

Good luck!

#3Thomas Kellerer
spam_eater@gmx.net
In reply to: Gauthier, Dave (#1)
Re: Justifying a PG over MySQL approach to a project

Gauthier, Dave wrote on 16.12.2009 22:02:

Hi Everyone:

Tomorrow, I will need to present to a group of managers (who know
nothing about DBs) why I chose to use PG over MySQL in a project,

What kind of project is that?

If you are developing something that you are selling to other people, MySQL's GPL license will force you to buy a commercial license in order to distribute your application unless it is GPL as well.

You don't have such constraints with PostgreSQL.

There are some features that you might want to mention as well:

- ANSI standard windowing functions
- ANSI standard common table expressions
- XML support (not necessarily important, but can potentially be nice)

Something that drives me nuts with MySQL: it behaves differently depending on the configuration settings, with different defaults on different operating systems (regarding case sensitivity, for example) or depending on the default storage engine selected (thinking about ANSI mode, strict tables, the ability to store invalid dates, inserting 0 instead of NULL, and all those little things...).

That makes the QA for a project much more complicated, especially if you don't have control over the installation at the customer's site.

PostgreSQL behaves the same ("syntactically"), regardless of where or how it was installed.

Thomas

#4Greg Smith
gsmith@gregsmith.com
In reply to: Gauthier, Dave (#1)
Re: Justifying a PG over MySQL approach to a project

You've probably already found
http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007
which was my long treatment of this topic (and overdue for an update).

The main thing I intended to put into such an update when I get to it is
talking about the really deplorable bug handling situation for MySQL,
which is part of how all the data corruption issues show up. There's a
good overview of its general weirdness at
http://www.xaprb.com/blog/2007/08/12/what-would-make-me-buy-mysql-enterprise/
and the following series of pages lead you through my favorite set of bugs:

http://www.mysqlperformanceblog.com/2007/10/04/mysql-quality-of-old-and-new-features/
http://bugs.mysql.com/bug.php?id=28591
http://bugs.mysql.com/bug.php?id=31001
http://bugs.mysql.com/bug.php?id=37830

Basically, they made a performance optimization *in the stable release*
and fundamentally broke very basic behavior which didn't get caught by
their internal QA at all. That's a disaster that opens up serious
questions about both their project planning/structure and their QA too,
far as I'm concerned.

They do have a regression test suite:
http://dev.mysql.com/doc/refman/5.0/en/mysql-test-suite.html

But it's not really clear that they run it on every platform, i.e.
http://ourdelta.org/hidden-tests-of-the-mysql-testsuite

This supports the rumors I've heard that the development on the database
regularly cheats by just disabling tests that don't work right in some
situations, just so they can ship saying "there are no known issues!".
Obviously that's hearsay, but it sure seems to fit the facts we do know.

Meanwhile, PostgreSQL never does anything but bug fixes in its stable
version updates: http://www.postgresql.org/support/versioning

The PostgreSQL regression testing build farm is also completely public,
and there is no tolerance for failed tests in the community:
http://buildfarm.postgresql.org/cgi-bin/show_status.pl

The other main reason why PostgreSQL has fewer corruption issues, IMHO,
is that there's exactly one "storage engine" and everybody works on it.
What the MySQL community calls options in storage engines I call split
QA, and a source of new types of failures that aren't possible if you
only have one underlying storage codebase to worry about.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.com

#5Thomas Kellerer
spam_eater@gmx.net
In reply to: Greg Smith (#4)
Re: Justifying a PG over MySQL approach to a project

Greg Smith wrote on 16.12.2009 22:44:

You've probably already found
http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007
which was my long treatment of this topic (and overdue for an update).

There is an update:

http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL_2009

#6Greg Smith
gsmith@gregsmith.com
In reply to: Thomas Kellerer (#5)
Re: Justifying a PG over MySQL approach to a project

Thomas Kellerer wrote:

Greg Smith wrote on 16.12.2009 22:44:

You've probably already found
http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007

which was my long treatment of this topic (and overdue for an update).

There is an update:

http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL_2009

You just found where my work in progress on producing an update lives.
There are minimal changes there so far.


#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Smith (#4)
Re: Justifying a PG over MySQL approach to a project

Greg Smith <greg@2ndquadrant.com> writes:

They do have a regression test suite:
http://dev.mysql.com/doc/refman/5.0/en/mysql-test-suite.html

But it's not really clear that they run it on every platform, i.e.
http://ourdelta.org/hidden-tests-of-the-mysql-testsuite

They definitely don't run it on every combination of allegedly-supported
options. I had to turn on --with-big-tables in the Red Hat build awhile
ago, which is probably a good thing anyway (though if so, why isn't it
default?); the reason I had to do it was the regression tests started
showing obvious failures without it, proving that they don't bother to
run any internal tests without it.

I'm not sure how thorough our buildfarm coverage is for different option
combinations, but the fact that their test suite takes circa four hours
to run is *not* an advantage for them in the comparison. They clearly
haven't got the resources to run all the cases they ought to. (BTW,
that's 4 hours for standard "make check", not any of the optional tests
referred to in the above-cited blog entry.)

This supports the rumors I've heard that the development on the database
regularly cheats by just disabling tests that don't work right in some
situations, just so they can ship saying "there are no known issues!".

Oh, absolutely. They actually have a standard mechanism built into the
test harness for disabling tests that are currently failing, and the set
that are so disabled changes with every update. Compare the contents
of mysql-test/t/disabled.def in various releases sometime.
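
The comparison suggested above can be sketched in a few lines of Python. This is only an illustration: it assumes each entry in disabled.def has the form `test_name : reason`, with `#` starting a comment line, and the function names are hypothetical. Feed it the text of the file from two releases.

```python
def parse_disabled(text):
    """Return the set of test names listed in a disabled.def-style file.

    Assumed layout (not verified against every release):
        test_name : reason the test is disabled
    Blank lines and lines starting with '#' are ignored.
    """
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name = line.split(":", 1)[0].strip()
        if name:
            names.add(name)
    return names


def compare_disabled(old_text, new_text):
    """Summarise how the disabled-test list changed between two releases."""
    old = parse_disabled(old_text)
    new = parse_disabled(new_text)
    return {
        "newly_disabled": sorted(new - old),
        "re_enabled": sorted(old - new),
        "still_disabled": sorted(old & new),
    }
```

A long "still_disabled" list across several releases would be the interesting part: those are tests that have stayed switched off rather than being fixed.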

regards, tom lane

#8Scott Marlowe
scott.marlowe@gmail.com
In reply to: Greg Smith (#4)
Re: Justifying a PG over MySQL approach to a project

On Wed, Dec 16, 2009 at 2:44 PM, Greg Smith <greg@2ndquadrant.com> wrote:

You've probably already found
http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007
which was my long treatment of this topic (and overdue for an update).

The main thing I intended to put into such an update when I get to it is
talking about the really deplorable bug handling situation for MySQL, which
is part of how all the data corruption issues show up.  There's a good
overview of its general weirdness at
http://www.xaprb.com/blog/2007/08/12/what-would-make-me-buy-mysql-enterprise/
and the following series of pages lead you through my favorite set of bugs:

http://www.mysqlperformanceblog.com/2007/10/04/mysql-quality-of-old-and-new-features/
http://bugs.mysql.com/bug.php?id=28591
http://bugs.mysql.com/bug.php?id=31001
http://bugs.mysql.com/bug.php?id=37830

Basically, they made a performance optimization *in the stable release* and
fundamentally broke very basic behavior which didn't get caught by their
internal QA at all.  That's a disaster that opens up serious questions about
both their project planning/structure and their QA too, far as I'm
concerned.

The important point here is that the bug was introduced to a stable
branch, fixed halfway, then detected again, then fixed yet again.
This does not instil confidence in their QA or code review.

As a for instance of who runs PostgreSQL and who runs MySQL, we have
slashdot and the .info and .org TLDs. When you go to slashdot.org and
it's not working right, that's MySQL acting up. When you can't get to
any .info or .org domains, that's PostgreSQL.

I've had slashdot have a non-functioning database underneath it quite
a few times (note that the site stays up, but you can't edit anything
because it's all static). I've never once had the .org or .info TLDs
go down on me.

#9Steve Atkins
steve@blighty.com
In reply to: Scott Marlowe (#8)
Re: Justifying a PG over MySQL approach to a project

On Dec 16, 2009, at 3:05 PM, Scott Marlowe wrote:

On Wed, Dec 16, 2009 at 2:44 PM, Greg Smith <greg@2ndquadrant.com> wrote:

You've probably already found
http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007
which was my long treatment of this topic (and overdue for an update).

The main thing I intended to put into such an update when I get to it is
talking about the really deplorable bug handling situation for MySQL, which
is part of how all the data corruption issues show up. There's a good
overview of its general weirdness at
http://www.xaprb.com/blog/2007/08/12/what-would-make-me-buy-mysql-enterprise/
and the following series of pages lead you through my favorite set of bugs:

http://www.mysqlperformanceblog.com/2007/10/04/mysql-quality-of-old-and-new-features/
http://bugs.mysql.com/bug.php?id=28591
http://bugs.mysql.com/bug.php?id=31001
http://bugs.mysql.com/bug.php?id=37830

Basically, they made a performance optimization *in the stable release* and
fundamentally broke very basic behavior which didn't get caught by their
internal QA at all. That's a disaster that opens up serious questions about
both their project planning/structure and their QA too, far as I'm
concerned.

The important point here is that the bug was introduced to a stable
branch, fixed halfway, then detected again, then fixed yet again.
This does not instil confidence in their QA or code review.

As a for instance of who runs PostgreSQL and who runs MySQL, we have
slashdot and the .info and .org TLDs. When you go to slashdot.org and
it's not working right, that's MySQL acting up. When you can't get to
any .info or .org domains, that's PostgreSQL.

My information is quite dated, but as I understand it that's not actually
true.

PostgreSQL is used for domain registration management at those domains
(amongst others). It's not used for anything related to resolution of those
domains in real time that I'm aware of. If you were unable to register
or transfer a .org domain, that would be a PostgreSQL failure.

I've had slashdot have a non-functioning database underneath it quite
a few times (note that the site stays up, but you can't edit anything
because it's all static). I've never once had the .org or .info TLDs
go down on me.

Let's not draw too much attention to the database that's responsible
for that stability. :)

Cheers,
Steve

#10Peter Geoghegan
In reply to: Steve Atkins (#9)
Re: Justifying a PG over MySQL approach to a project

EnterpriseDB wrote a white paper called "PostgreSQL vs. MySQL: A
Comparison of Enterprise Suitability", which is fairly accessible:

http://downloads.enterprisedb.com/whitepapers/White_Paper_PostgreSQL_MySQL.pdf

Regards,
Peter Geoghegan

#11Craig Ringer
craig@2ndquadrant.com
In reply to: Gauthier, Dave (#1)
Re: Justifying a PG over MySQL approach to a project

On 17/12/2009 5:02 AM, Gauthier, Dave wrote:

Hi Everyone:

Tomorrow, I will need to present to a group of managers (who know
nothing about DBs) why I chose to use PG over MySQL in a project, MySQL
being the more popular DB choice with other engineers, and managers
fearing things that are "different" (risk). I have a few hard technical
reasons (check constraint, deferred constraint checking, array data
type), but I'm looking for "it's more reliable" reasons. Again, the
audience is managers. Is there an impartial, 3rd party evaluation of
the 2 DBs out there that identifies PG as being more reliable? It might
mention things like fewer incidences of corrupt tables/indexes, fewer
daemon crashes, better recovery after system crashes, etc... ?

In all honesty, I don't know if there's much out there in terms of
impartial analysis. Most of it is done by someone with some sort of a
preference that tends to make itself known.

It also depends a _lot_ on what you are doing with the database. What
sorts of data are you storing? How important to you is that data? What
sorts of client workloads do you expect - huge numbers of clients
running frequent simple queries, or fewer clients with big complex
queries? How much data do you expect to store? etc. All these have a
real bearing on database choice, and it's hard to give good answers
without some knowledge of those details.

One thing I'd like to highlight now: when people say "MySQL is faster"
or "Pg is slow" they tend to (a) be referring to very old versions of
Pg, and (b) be using the very fast but very unsafe MyISAM table type in
MySQL, which is great until it eats your data. So beware of speed claims
not backed by very solid configuration details.

Anyway, just to be different let's try to look at why you might choose
MySQL over PostgreSQL, instead of getting all us Pg folks listing why
you should pick Pg. To me, Pg is the default safe and sane choice, and I
need to seek reasons why I might use MySQL instead for a particular
task. So:

*scratches head*

- MySQL is horizontally scalable via clustering and multi-master
replication (though you must beware of numerous gotchas). PostgreSQL can
be used with read-only slaves via Slony/Bucardo/etc replication, but is
limited to a single authoritative master.

(There's work ongoing to enable read-only hot standby slaves with
failover, but no multi-master is on the horizon).

- If you don't care about your data, MySQL used with MyISAM is *crazy*
fast for lots of small simple queries. Big enough apps will still need
something like memcached on top of that, though. If using MySQL+MyISAM
this way you must be prepared to deal with table corruption on
crashes/outages/power loss, lack of any transactional behaviour, etc.
There's also some bizarre error "handling" they use to avoid aborting a
non-transactional operation on a MyISAM table halfway through, so you
must be very careful to make sure your updates are valid before
attempting them. But... why not just use memcached over something
somewhat slower but a lot safer? I guess this one isn't a plus.

- It's a cool tool when you want to query and integrate data from all
sorts of disparate sources, thanks to its support for pluggable storage
engines. If you want something for data analysis and integration rather
than safe storage it's well worth looking at.

--
Craig Ringer

#12Mike Christensen
mike@kitchenpc.com
In reply to: Craig Ringer (#11)
Re: Justifying a PG over MySQL approach to a project

Quick question about the following statement:

"but no multi-master is on the horizon"

From what I understand, there are several multi-master solutions such as
Bucardo, rubyrep, PgPool and PgPool II, PgCluster and Sequoia. Also
Postgres-R, which is still in development. Perhaps you just meant there's
nothing available out of the box? Thanks!

Mike

#13Massa, Harald Armin
chef@ghum.de
In reply to: Gauthier, Dave (#1)
Re: Justifying a PG over MySQL approach to a project

Dave,

please also check out the licence and cost terms in detail.
Especially: is it certain that the planned usage will continue to be
within the allowed bounds for MySQL-GPL? If not, are the costs for
MySQL-commercial budgeted, or has a reserve been funded?

PostgreSQL has a GIANT advantage here, with a very, very clear licence
which allows basically anything relevant, without the need to buy
commercial licences.

Harald

--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pigeon
-
%s is too gigantic of an industry to bend to the whims of reality

#14Erik Jones
ejones@engineyard.com
In reply to: Craig Ringer (#11)
Re: Justifying a PG over MySQL approach to a project

On Dec 16, 2009, at 10:30 PM, Craig Ringer wrote:

- If you don't care about your data, MySQL used with MyISAM is *crazy* fast for lots of small simple queries.

This one causes me no end of grief, as too often it's simply touted as "MyISAM is fast(er)" while leaving off the bit about "for lots of small, simple queries". Developers then pick MySQL with MyISAM storage and scratch their heads saying, "But! I heard it was faster...," when I tell them the reason their app is crawling is that they have even moderately complex reads or writes starving out the rest of their app, thanks to the table locks required by MyISAM. As you mentioned, for the type of active workloads that MyISAM is good for, you might as well just use memcached over something more reliable and/or concurrent, or even a simple key-value or document store if you really don't need transactions.

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k

#15Greg Sabino Mullane
greg@turnstep.com
In reply to: Craig Ringer (#11)
Re: Justifying a PG over MySQL approach to a project

- MySQL is horizontally scalable via clustering and multi-master
replication (though you must beware of numerous gotchas). PostgreSQL can
be used with read-only slaves via Slony/Bucardo/etc replication, but is
limited to a single authoritative master.

(There's work ongoing to enable read-only hot standby slaves with
failover, but no multi-master is on the horizon).

Well that's refreshing: usually Bucardo is mistaken for a system that
only does master-master and not master-slave, rather than vice-versa. :)
You can have two authoritative masters with Bucardo, in addition to
any number of slaves radiating from one or both of those (as well as
just simple master->slaves).

- It's a cool tool when you want to query and integrate data from all
sorts of disparate sources, thanks to its support for pluggable storage
engines. If you want something for data analysis and integration rather
than safe storage it's well worth looking at.

What sort of sources? I'm curious here to find areas we can improve upon.

--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 200912170927
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

#16Merlin Moncure
mmoncure@gmail.com
In reply to: Gauthier, Dave (#1)
Re: Justifying a PG over MySQL approach to a project

On Wed, Dec 16, 2009 at 4:02 PM, Gauthier, Dave <dave.gauthier@intel.com> wrote:

Hi Everyone:

Tomorrow, I will need to present to a group of managers (who know nothing
about DBs) why I chose to use PG over MySQL in a project, MySQL being the
more popular DB choice with other engineers, and managers fearing things
that are “different” (risk). I have a few hard technical reasons (check
constraint, deferred constraint checking, array data type), but I’m looking
for “it’s more reliable” reasons. Again, the audience is managers. Is
there an impartial, 3rd party evaluation of the 2 DBs out there that
identifies PG as being more reliable? It might mention things like fewer
incidences of corrupt tables/indexes, fewer daemon crashes, better recovery
after system crashes, etc... ?

The #1 useful/practical/business-sense feature that PostgreSQL has
over MySQL (and, afaik, over most commercial databases too) is
transactional DDL. You can update live systems, and if anything goes
wrong your changes roll back.
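
A minimal sketch of what that buys you, for anyone who has not seen it: a schema change wrapped in a transaction disappears completely if the upgrade fails partway. The example uses Python's bundled sqlite3 module as a stand-in because SQLite also has transactional DDL and needs no server (an assumption made for the sake of a runnable example); the table, column and function names are hypothetical. Against PostgreSQL the same BEGIN / ALTER TABLE / ROLLBACK sequence applies.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one hypothetical table."""
    # isolation_level=None: no implicit transactions, we issue BEGIN ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT)")
    return conn


def column_names(conn, table):
    """List the column names of a table (field 1 of PRAGMA table_info)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def upgrade_schema(conn, fail):
    """Add a column inside a transaction; roll everything back on error.

    `fail` simulates something going wrong after the ALTER TABLE has run.
    Returns True if the upgrade was committed, False if it was rolled back.
    """
    conn.execute("BEGIN")
    try:
        conn.execute("ALTER TABLE account ADD COLUMN email TEXT")
        if fail:
            raise RuntimeError("simulated failure partway through the upgrade")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # the new column vanishes with the transaction
        return False
    return True
```

After upgrade_schema(conn, fail=True) the table still has only its original columns, so a failed deployment leaves nothing half-applied to clean up by hand.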

merlin

#17Gauthier, Dave
dave.gauthier@intel.com
In reply to: Massa, Harald Armin (#13)
Re: Justifying a PG over MySQL approach to a project

They just called the meeting, or at least that part of it. There seems to be a battle brewing; some MySQL advocates are angry, concerned, fearful... I don't know why for sure. My managers, who advocate my position and PG, are preparing, but the decision will be made by higher-ups who really don't know anything about DBs. They just talk in terms of risk and cost and schedules and, yes, licenses. So I'll let them articulate the defense of PG on those terms. I'm just an engineer. I've been feeding them the valuable input I've been getting from this forum, and thanks to all who have contributed. Really!

#18Grzegorz Jaśkiewicz
gryzman@gmail.com
In reply to: Gauthier, Dave (#17)
Re: Justifying a PG over MySQL approach to a project

On Thu, Dec 17, 2009 at 3:55 PM, Gauthier, Dave <dave.gauthier@intel.com> wrote:

They just called the meeting, or at least that part of it. There seems to be a battle brewing; some MySQL advocates are angry, concerned, fearful... I don't know why for sure.

In places like that it is inevitable. There's always going to be a crowd
that fears change. They don't generate reasoned opinions; it
is just the fear of change.
It might be hard to fight that, since managers will make the decision
based on reports that they can trust. Scared folks often generate a
lot of feedback. Just like in politics :)

My managers, who advocate my position and PG, are preparing, but the decision will be made by higher-ups who really don't know anything about DBs. They just talk in terms of risk and cost and schedules and, yes, licenses. So I'll let them articulate the defense of PG on those terms. I'm just an engineer. I've been feeding them the valuable input I've been getting from this forum, and thanks to all who have contributed. Really!

Well, give them the best report ever. Also, skip the crap they won't
understand. Try writing the stuff they will understand first, then
give them reasons why they would want to consider it - in their own
language. Skip the engineering stuff. Managers often have a very short
attention span. As soon as it smells like something they don't understand,
they will stop reading it.

--
GJ

#19Gauthier, Dave
dave.gauthier@intel.com
In reply to: Erik Jones (#14)
Re: Justifying a PG over MySQL approach to a project

Actually, the DB I'm working on is rather small but has a somewhat complex system of constraints and triggers that maintain the data. Queries will outnumber writes (20x at least). And the DB has to be mirrored at a sister site a couple thousand miles away, so I'm looking for a robust DB replication system for that.

These are the key points they will be worried about...
- DB uptime (most important), including recovery time after disasters (e.g. power outages)
- Data integrity. I'm addressing this with constraints and using triggers to populate columns with derived data.
- Data quality. NO CORRUPT TABLES / INDEXES
- Retrofitting existing apps to work with PG. For Perl/DBI it's a small change to the DBD designation. Some Tcl-MySQL code is tougher. I'm proposing changing everything to go through ODBC as a standard now, and for the future.
- Cost of maintenance. Do I have to babysit this DB 4 hours every day, or does it run by itself? Is this like Oracle where we have to hire professional 24x7 DBAs, or is this hands-off? That kind of question.

I have a DB up and working. It runs great, no problems, but it is very lightly loaded and/or used at this time. Having worked with PG in the past, I'm not worried about this piece.

I am more concerned with getting a robust DB replication system up and running. Bucardo looks pretty good, but I've just started looking at the options. Any suggestions?

Thanks!

#20Gauthier, Dave
dave.gauthier@intel.com
In reply to: Greg Sabino Mullane (#15)
Re: Justifying a PG over MySQL approach to a project

How difficult is it to switch the master's hat from one DB instance to another? Let's say the master in a master-slave scenario goes down but the slave is fine. Can I designate the slave as being the new master, use it for read/write, and then just call the broken master the new slave once it comes back to life (something like that)?

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Greg Sabino Mullane
Sent: Thursday, December 17, 2009 9:28 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Justifying a PG over MySQL approach to a project

-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

- MySQL is horizontally scalable via clustering and multi-master
replication (though you must beware of numerous gotchas). PostgreSQL can
be used with read-only slaves via Slony/Bucardo/etc replication, but is
limited to a single authoritative master.

(There's work ongoing to enable read-only hot standby slaves with
failover, but no multi-master is on the horizon).

Well that's refreshing: usually Bucardo is mistaken for a system that
only does master-master and not master-slave, rather than vice-versa. :)
You can have two authoritative masters with Bucardo, in addition to
any number of slaves radiating from one or both of those (as well as
just simple master->slaves).

- It's a cool tool when you want to query and integrate data from all
sorts of disparate sources, thanks to its support for pluggable storage
engines. If you want something for data analysis and integration rather
than safe storage it's well worth looking at.

What sort of sources? I'm curious here to find areas we can improve upon.

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 200912170927
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAksqP9kACgkQvJuQZxSWSshbUQCg3CfvpeivDi6gg2bkr74I17Qe
RKAAnRu3GTUQ3Bg3R2Fq3eOsgK4N0xd1
=5r9R
-----END PGP SIGNATURE-----

#21Richard Broersma
richard.broersma@gmail.com
In reply to: Gauthier, Dave (#20)
#22Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Gauthier, Dave (#19)
#23Greg Sabino Mullane
greg@turnstep.com
In reply to: Gauthier, Dave (#20)
#24Kevin Kempter
kevink@consistentstate.com
In reply to: Richard Broersma (#21)
#25John R Pierce
pierce@hogranch.com
In reply to: Gauthier, Dave (#1)
#26Madison Kelly
linux@alteeve.com
In reply to: Gauthier, Dave (#1)
#27Scott Marlowe
scott.marlowe@gmail.com
In reply to: Madison Kelly (#26)
#28Gauthier, Dave
dave.gauthier@intel.com
In reply to: Greg Sabino Mullane (#23)
#29Scott Marlowe
scott.marlowe@gmail.com
In reply to: Gauthier, Dave (#28)
#30Greg Sabino Mullane
greg@turnstep.com
In reply to: Gauthier, Dave (#28)
#31David Boreham
david_list@boreham.org
In reply to: Scott Marlowe (#29)
#32Scott Marlowe
scott.marlowe@gmail.com
In reply to: David Boreham (#31)
#33Lincoln Yeoh
lyeoh@pop.jaring.my
In reply to: Scott Marlowe (#32)
#34David Boreham
david_list@boreham.org
In reply to: Lincoln Yeoh (#33)
#35Lincoln Yeoh
lyeoh@pop.jaring.my
In reply to: David Boreham (#34)
#36Lincoln Yeoh
lyeoh@pop.jaring.my
In reply to: Greg Smith (#4)
#37Merlin Moncure
mmoncure@gmail.com
In reply to: Lincoln Yeoh (#36)
#38Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: Lincoln Yeoh (#36)
#39Merlin Moncure
mmoncure@gmail.com
In reply to: Ron Mayer (#38)
#40Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ron Mayer (#38)
#41Gauthier, Dave
dave.gauthier@intel.com
In reply to: Lincoln Yeoh (#36)
#42Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: Gauthier, Dave (#41)
#43Greg Smith
gsmith@gregsmith.com
In reply to: Ron Mayer (#42)
#44Greg Smith
gsmith@gregsmith.com
In reply to: Merlin Moncure (#39)
#45Rakotomandimby Mihamina
mihamina@gulfsat.mg
In reply to: Gauthier, Dave (#1)
#46Scott Ribe
scott_ribe@killerbytes.com
In reply to: Rakotomandimby Mihamina (#45)
#47Erik Jones
ejones@engineyard.com
In reply to: Scott Ribe (#46)
#48Scott Marlowe
scott.marlowe@gmail.com
In reply to: Erik Jones (#47)
#49Gauthier, Dave
dave.gauthier@intel.com
In reply to: Erik Jones (#47)
#50Scott Marlowe
scott.marlowe@gmail.com
In reply to: Gauthier, Dave (#49)
#51Robert Hodges
robert.hodges@continuent.com
In reply to: Gauthier, Dave (#49)