Article on MySQL vs. Postgres

Started by Tim Perdue · almost 26 years ago · 50 messages · hackers
#1 Tim Perdue
tperdue@valinux.com

On Wednesday or Thursday, I'm going to be publishing my article on MySQL
vs. Postgres on PHPBuilder.com.

Before I do that I want to confirm the major problem I had w/postgres:
the 8K tuple limit. When trying to import some tables from MySQL,
postgres kept choking because MySQL has no such limit on the size of a
row in the database (text fields on MySQL can be multi-megabyte).

Is it even possible to import large text fields into postgres? If not,
how in the world can anyone use this to store message board posts,
resumes, etc? Do you have to use pgsql-specific large-object
import/export commands?

I actually intended the article to be a win for Postgres, as I've used
it and had good luck with it for such a long time, but if you look at
the results below, it seems very positive for MySQL.

Performance/Scalability:

MySQL was about 50-60% faster in real-world web serving, but it crumbles
under a real load. Postgres on the other hand scaled 3x higher than
MySQL before it started to crumble on the same machine. Unfortunately,
Postgres would probably still lose on a high-traffic website because
MySQL can crank out the pages so much faster, number of concurrent
connections is hard to compare. MySQL also seems to make better use of
multiple-processor machines like the quad-xeon I tested on. Postgres
never saturated all 4 processors as MySQL did.

Tools:
MySQL has some nice admin tools that allow you to watch individual
connections and queries as they progress and tools to recover from
corruption. I haven't seen any similar tools for postgres.

Long-term stability:
Postgres is undoubtedly the long-run winner in stability, whereas MySQL
will freak out or die when left running for more than a month at a time.
But if you ever do have a problem with postgres, you generally have to
nuke the database and recover from a backup, as there are no known tools
to fix index and database corruption. For a long-running postgres
database, you will occasionally have to drop indexes and re-create them,
causing downtime.

Usability:
Both databases use a similar command-line interface. Postgres uses
"slash commands" to help you view database structures. MySQL uses a more
memorable, uniform syntax like "Show Tables; Show Databases; Describe
table_x;" and has better support for altering/changing tables, columns,
and even databases.

Features:
Postgres is undoubtedly far, far more advanced than MySQL is. Postgres
now supports foreign keys, which can help with referential integrity.
Postgres supports subselects and has better support for creating tables
as the result of queries. The "transaction" support that MySQL lacks is
included in Postgres, although you'll never miss it on a website, unless
you're building something for a bank - and if you're doing that, you'll
use Oracle.

Tim

--
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723

#2 Mikheev, Vadim
vmikheev@SECTORBASE.COM
In reply to: Tim Perdue (#1)
RE: Article on MySQL vs. Postgres

Before I do that I want to confirm the major problem I had w/postgres:
the 8K tuple limit. When trying to import some tables from MySQL,
postgres kept choking because MySQL has no such limit on the size of a
row in the database (text fields on MySQL can be multi-megabyte).

Jan is working on TOAST for 7.1 - there will be no 8K limit any more...

Long-term stability:
Postgres is undoubtedly the long-run winner in stability, whereas MySQL
will freak out or die when left running for more than a month at a time.
But if you ever do have a problem with postgres, you generally have to
nuke the database and recover from a backup, as there are no known tools
to fix index and database corruption. For a long-running postgres
database, you will occasionally have to drop indexes and re-create them,
causing downtime.

I'm implementing WAL for 7.1 - there will be true after-crash recovery...

Vadim

#3 Egon Schmid (@work)
eschmid@php.net
In reply to: Tim Perdue (#1)
Re: Article on MySQL vs. Postgres

Tim Perdue wrote:

On Wednesday or Thursday, I'm going to be publishing my article on MySQL
vs. Postgres on PHPBuilder.com.

Cool!

Features:
Postgres is undoubtedly far, far more advanced than MySQL is. Postgres
now supports foreign keys, which can help with referential integrity.
Postgres supports subselects and has better support for creating tables
as the result of queries. The "transaction" support that MySQL lacks is
included in Postgres, although you'll never miss it on a website, unless
you're building something for a bank - and if you're doing that, you'll
use Oracle.

Since MySQL version 3.23.16 it has supported transactions with
Sleepycat's DB3, and since version 3.23.19 it has been under the GPL.

-Egon

--
SIX Offene Systeme GmbH · Stuttgart - Berlin - New York
Sielminger Straße 63 · D-70771 Leinfelden-Echterdingen
Fon +49 711 9909164 · Fax +49 711 9909199 http://www.six.de
PHP booth at Europe's largest Linux fair: 'LinuxTag 2001'
more info @ http://www.dynamic-webpages.de/

#4 'JanWieck@t-online.de'
JanWieck@t-online.de
In reply to: Tim Perdue (#1)
Re: Article on MySQL vs. Postgres

Tim Perdue wrote:

On Wednesday or Thursday, I'm going to be publishing my article on MySQL
vs. Postgres on PHPBuilder.com.

Before I do that I want to confirm the major problem I had w/postgres:
the 8K tuple limit. When trying to import some tables from MySQL,
postgres kept choking because MySQL has no such limit on the size of a
row in the database (text fields on MySQL can be multi-megabyte).

I just committed the first portion of TOAST, enabling lztext
fields to hold multiple megabytes too. But it's not the answer
for such big objects. I have plans to add Oracle-like
large object handling in a future version.

I actually intended the article to be a win for Postgres, as I've used
it and had good luck with it for such a long time, but if you look at
the results below, it seems very positive for MySQL.

It's never a good plan to start out with an intention about
which of the competitors should finally look good. It shows
between the lines.

Performance/Scalability:

MySQL was about 50-60% faster in real-world web serving, but it crumbles
under a real load. Postgres on the other hand scaled 3x higher than
MySQL before it started to crumble on the same machine. Unfortunately,
Postgres would probably still lose on a high-traffic website because
MySQL can crank out the pages so much faster, number of concurrent
connections is hard to compare. MySQL also seems to make better use of
multiple-processor machines like the quad-xeon I tested on. Postgres
never saturated all 4 processors as MySQL did.

The question in this case is "what is real-world web
serving"? To spit out static HTML pages loaded into a
database? To handle discussion forums like OpenACS with high
concurrency and the need for transactions?

Web applications differ in database usage as much as any
other type of application. From huge amounts of static, never
changing data to complex data structures with many
dependencies constantly in motion. There is no such one
"real world web scenario".

Tools:
MySQL has some nice admin tools that allow you to watch individual
connections and queries as they progress and tools to recover from
corruption. I haven't seen any similar tools for postgres.

Yep, we need a lot more nice tools.

Long-term stability:
Postgres is undoubtedly the long-run winner in stability, whereas MySQL
will freak out or die when left running for more than a month at a time.
But if you ever do have a problem with postgres, you generally have to
nuke the database and recover from a backup, as there are no known tools
to fix index and database corruption. For a long-running postgres
database, you will occasionally have to drop indexes and re-create them,
causing downtime.

Not true IMHO. We had some problems with indices in the past.
But you can drop/recreate them online, and someone running a
query concurrently might just use a sequential scan during
that time. All other corruption needs backup and recovery.
WAL is on its way.

Usability:
Both databases use a similar command-line interface. Postgres uses
"slash commands" to help you view database structures. MySQL uses a more
memorable, uniform syntax like "Show Tables; Show Databases; Describe
table_x;" and has better support for altering/changing tables, columns,
and even databases.

Since professional application development starts with a data
design, such "describe" commands and "alter" features are
unimportant. The more someone needs them, the more I know
that he isn't well educated.

Production installations don't need any "alter" command at
all. New features are developed in the development area,
tested with real-life data in the test environment, and moved
to the production server, including any required data
conversion step, during a downtime.

24/7 scenarios require hot standby: online synchronized
databases with hardware takeover. All of that is far outside
our scope for now.

Features:
Postgres is undoubtedly far, far more advanced than MySQL is. Postgres
now supports foreign keys, which can help with referential integrity.
Postgres supports subselects and has better support for creating tables
as the result of queries. The "transaction" support that MySQL lacks is
included in Postgres, although you'll never miss it on a website, unless
you're building something for a bank - and if you're doing that, you'll
use Oracle.

FOREIGN KEY doesn't just help with referential integrity, it
guarantees it. No application ever has to worry whether it will
find the customer for a problem report. It does a SELECT and
has it, or it would never have found the problem report in the
first place - period.

And for big, functionally expanding web sites, it does so even
if one of a dozen programmers forgot it once. If the
constraint says you cannot delete a customer who has paid until
the end of the year, the database won't let you, even if one of
the 7 CGI programs that can delete customers doesn't check.

Transactions are the basis for any data integrity, especially
in the web environment. Almost every web server I've seen has
some timeout for CGI, ADP, ASP or whatever they call it. As
soon as your page needs to update more than one table, you
run the risk of getting aborted right in between, leaving the
current activity half done - no matter whether the database
supports FOREIGN KEY. I could live without foreign keys, but
transactions are essential.

Fortunately the MySQL team has changed its point of view on
that detail and made noticeable progress in that area by
integrating BDB. The latest beta does support transactions
including rollback, as they announced. As far as I can see, the
integration of BDB only buys them transactions, at the cost
of performance and maintenance effort. So the need for it
cannot be as small as you think.

Final notes:

I hate these "MySQL" vs. "PostgreSQL" articles that want to
say "this one is the better one". Each has its advantages
and disadvantages. Both have a long TODO.

Your article might better analyze a couple of different
"real-world web services", telling what DB usage profile they
have and then suggesting which of the two databases is the
better choice in each case.

MySQL is a tool and PostgreSQL is a tool. But as with other
tools, a hammer doesn't help if you need a screwdriver.

Please don't set out to tell anyone that either of these
databases is "the best". You'd do both communities a
disservice. Help people to choose the right database for
their current needs, and tell them to reevaluate their choice
for the next project instead of blindly staying with the same
database. We'll end up with a lot of customers using both
databases in parallel for different needs.

At the bottom line, both teams share the same idea: open
source. Anyone who pays a license fee is a loss (loser?) for
all of us.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#5 Benjamin Adida
ben@mit.edu
In reply to: Tim Perdue (#1)
Re: Article on MySQL vs. Postgres

on 7/4/00 3:42 PM, Tim Perdue at tperdue@valinux.com wrote:

Before I do that I want to confirm the major problem I had w/postgres:
the 8K tuple limit. When trying to import some tables from MySQL,
postgres kept choking because MySQL has no such limit on the size of a
row in the database (text fields on MySQL can be multi-megabyte).

It's possible in the current version to up your tuple limit to 16K before
compilation, and you can use lztext, the compressed text type, which should
give you up to 32K of storage. Netscape's textarea limit is 32K, so that's a
good basis for doing a number of web-based things. Anything that is
multi-megabyte is really not something I'd want to store in an RDBMS.

I actually intended the article to be a win for Postgres, as I've used
it and had good luck with it for such a long time, but if you look at
the results below, it seems very positive for MySQL.

Jan said that each tool has its value, and that's true. I recommend you
define your evaluation context before you write this. Is this for running a
serious mission-critical web site? Is it for logging web site hits with
tolerance for data loss and a need for doing simple reporting?

Performance/Scalability:

MySQL was about 50-60% faster in real-world web serving, but it crumbles
under a real load. Postgres on the other hand scaled 3x higher than
MySQL before it started to crumble on the same machine. Unfortunately,
Postgres would probably still lose on a high-traffic website because
MySQL can crank out the pages so much faster, number of concurrent
connections is hard to compare. MySQL also seems to make better use of
multiple-processor machines like the quad-xeon I tested on. Postgres
never saturated all 4 processors as MySQL did.

What kind of queries did you perform? Did you use connection pooling (a lot
of PHP apps don't, from what I've seen)? How does the performance get
affected when a query in Postgres with subselects has to be split into 4
different queries in MySQL? Postgres is process-based, each connection
resulting in one process. If you use connection pooling with at least as
many connections as you have processors, you should see it scale quite well.
In fact, for serious load-testing, you should have 10-15 pooled connections.
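The pooling Ben describes can be sketched in a few lines. This is a minimal illustration in Python, using sqlite3 (a shared-cache in-memory database) as a stand-in for a Postgres client library; the class name, method names, and sizes are invented for the example:

```python
import queue
import sqlite3

class ConnectionPool:
    """Minimal fixed-size connection pool (illustrative sketch)."""

    def __init__(self, dsn, size=4):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a pooled connection be handed
            # to whichever worker thread asks for it.
            conn = sqlite3.connect(dsn, uri=True, check_same_thread=False)
            self._idle.put(conn)

    def acquire(self):
        # Blocks until a connection is free, capping concurrent database
        # work at `size` - the effect Ben is describing.
        return self._idle.get()

    def release(self, conn):
        self._idle.put(conn)

# A shared-cache in-memory database, so all pooled connections see the same data.
pool = ConnectionPool("file:pooldemo?mode=memory&cache=shared", size=4)

c1 = pool.acquire()
c1.execute("CREATE TABLE hits (n INTEGER)")
c1.execute("INSERT INTO hits VALUES (1)")
c1.commit()
pool.release(c1)

c2 = pool.acquire()  # may hand back a different connection than c1
count = c2.execute("SELECT count(*) FROM hits").fetchone()[0]
pool.release(c2)
```

With a process-per-connection server, keeping the pool near the processor count bounds the number of backend processes while letting every web worker share them.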

I *strongly* question your intuition on Postgres running web sites. MySQL's
write performance is very poor, which forces excessive caching (see sites
like Slashdot) to prevent updates from blocking entire web site serving.
Yes, the BDB addition might be useful. Let's see some performance tests
using BDB tables.

Postgres is undoubtedly the long-run winner in stability, whereas MySQL
will freak out or die when left running for more than a month at a time.
But if you ever do have a problem with postgres, you generally have to
nuke the database and recover from a backup, as there are no known tools
to fix index and database corruption. For a long-running postgres
database, you will occasionally have to drop indexes and re-create them,
causing downtime.

Dropping indexes and recreating them does not cause downtime. I've run a
couple of postgres-backed web sites for months on end with no issues. I've
survived a heavy slashdotting on my dual Pentium II-400, with Postgres
WRITES and READS on every Slashdot-referred hit, resulting in perfectly
respectable serving times (less than 3-4 seconds to serve > 20K of data on
each hit). No caching optimization of any kind on the app layer. And I'd
forgotten to vacuum my database for a few days.

Features:
Postgres is undoubtedly far, far more advanced than MySQL is. Postgres
now supports foreign keys, which can help with referential integrity.
Postgres supports subselects and has better support for creating tables
as the result of queries. The "transaction" support that MySQL lacks is
included in Postgres, although you'll never miss it on a website, unless
you're building something for a bank - and if you're doing that, you'll
use Oracle.

I'm just shocked at this. Where did this "transactions aren't necessary"
school of thinking originate? I've been developing database-backed web sites
for 5 years now, and I can't conceive of building a serious web site without
transactions. How do you guarantee that a record and its children records
are all stored together successfully? Do you run on a magic power grid that
never fails? Do you never have code-related error conditions that require
rolling back a series of database edits?

One quick point: while you may well be personally unbiased, VA Linux just
endorsed and funded MySQL. SourceForge uses MySQL. How do you expect to
convince readers that you're being objective in this comparison?

-Ben

#6 Peter Eisentraut
peter_e@gmx.net
In reply to: Tim Perdue (#1)
Re: Article on MySQL vs. Postgres

Tim Perdue writes:

the 8K tuple limit.

BLCKSZ in src/include/config.h -- But it's being worked on these very
days.

Postgres never saturated all 4 processors as MySQL did.

Blame that on your operating system?

MySQL has some nice admin tools that allow you to watch individual
connections and queries as they progress

ps
tail -f <serverlog>

and tools to recover from corruption. I haven't seen any similar tools
for postgres.

I always like this one -- "tools to recover from corruption". If your
database is truly corrupted then there's nothing you can do about it, you
need a backup. If your database engine just creates garbage once in a
while then the solution is to fix the database engine, not to provide
external tools to clean up after it.

as there are no known tools to fix index

REINDEX

Both databases use a similar command-line interface. Postgres uses
"slash commands" to help you view database structures. MySQL uses a more
memorable, uniform syntax like "Show Tables; Show Databases; Describe
table_x;"

Yeah, but once you have memorized ours then it will be shorter to type. :)
And you get tab completion. And what's so non-uniform about ours?

The "transaction" support that MySQL lacks is included in Postgres,
although you'll never miss it on a website,

Think again. Transactions and multi-version concurrency control are
essential for any multi-user web site that expects any writes at all. I'll
reiterate the old Bugzilla bug: User A issues a search that "takes
forever". User B wants to update some information in the database, waits
for user A. Now *every* user in the system, reading or writing, is blocked
waiting for A (and B).

But you don't even have to go that far. What if you just update two
separate tables at once?

If your web site is truly read only, yes, you don't need transactions. But
then you don't need a database either. If your web site does writes, you
need transactions, or you're really not trying hard enough.
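Peter's two-table case is easy to demonstrate. Here is a minimal sketch in Python, with sqlite3 standing in for the database and invented table names (a forum message plus a per-forum counter); the point is only the atomicity:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE forum_stats (forum_id INTEGER PRIMARY KEY, msg_count INTEGER);
    INSERT INTO forum_stats VALUES (1, 0);
""")

def post_message(body, fail_after_insert=False):
    """Write to both tables as one unit: either both changes commit
    or, on any error, both are rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO messages (body) VALUES (?)", (body,))
            conn.execute("UPDATE forum_stats SET msg_count = msg_count + 1 "
                         "WHERE forum_id = 1")
            if fail_after_insert:
                # Simulate the web-server timeout striking between updates.
                raise RuntimeError("request aborted")
    except RuntimeError:
        pass  # the rollback already undid the half-done page

post_message("first post")                         # both tables updated
post_message("lost post", fail_after_insert=True)  # neither table updated

messages = conn.execute("SELECT count(*) FROM messages").fetchone()[0]
counter = conn.execute("SELECT msg_count FROM forum_stats").fetchone()[0]
```

Without the transaction, the second call would leave an orphaned message with no matching counter bump - exactly the half-done state being argued about.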

--
Peter Eisentraut Sernanders väg 10:115
peter_e@gmx.net 75262 Uppsala
http://yi.org/peter-e/ Sweden

#7 Tim Perdue
tperdue@valinux.com
In reply to: Benjamin Adida (#5)
Re: Article on MySQL vs. Postgres

Benjamin Adida wrote:

Jan said that each tool has its value, and that's true. I recommend you
define your evaluation context before you write this. Is this for running a
serious mission-critical web site? Is it for logging web site hits with
tolerance for data loss and a need for doing simple reporting?

This is for what most people do with PHP and databases - run
semi-critical, medium-traffic sites. Anyone running a mission-critical
site would have to look elsewhere for true robustness. I would not at
this time recommend that any serious, life-threatening app run on either
database.

Performance/Scalability:

MySQL was about 50-60% faster in real-world web serving, but it crumbles
under a real load. Postgres on the other hand scaled 3x higher than
MySQL before it started to crumble on the same machine. Unfortunately,
Postgres would probably still lose on a high-traffic website because
MySQL can crank out the pages so much faster, number of concurrent
connections is hard to compare. MySQL also seems to make better use of
multiple-processor machines like the quad-xeon I tested on. Postgres
never saturated all 4 processors as MySQL did.

What kind of queries did you perform?

I took a real-world page from our site
<http://sourceforge.net/forum/forum.php?forum_id=1> and made it portable
to both databases. Of course, I could not import the "body" of the
message into postgres because of the 8k limitation, so the body had to
be dropped from both databases.

The "nested" view of this page requires joins against three tables and
some recursion to show submessages.

The test was conducted with "ab" (apache benchmark software) using
varying numbers of concurrent connections and 1000 total page views.

The "10% inserts" test is most realistic, as about 10% of all page views
in a discussion forum involve posting to the database. I used a
random-number generator in the PHP script to insert a row into the table
10% of the time. If you look at the results, you'll see that MySQL was
actually harmed somewhat more by the writes than postgres was.
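The 10% write mix is easy to reproduce with a trivial driver. This is a Python sketch of the idea (the original driver was a PHP script; the table name and seed here are invented):

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE forum_posts (body TEXT)")

def simulate_page_view(conn, rng):
    """One simulated page view: roughly 10% of views also write a post,
    matching the read/write mix described above."""
    wrote = rng.random() < 0.10
    if wrote:
        conn.execute("INSERT INTO forum_posts VALUES ('benchmark post')")
    # ...the read-only joins for the page body would run here...
    return wrote

rng = random.Random(1)  # fixed seed so a run is repeatable
writes = sum(simulate_page_view(conn, rng) for _ in range(1000))
conn.commit()
```

Over 1000 page views this lands near 100 writes, mirroring the "10% inserts" runs below.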

Here are the actual results I saw on my quad-xeon machine:

postgres:

concurrency w/pconnects:
10 cli - 10.27 pg/sec 333.69 kb/s
20 cli - 10.24 pg/sec 332.86 kb/s
30 cli - 10.25 pg/sec 333.01 kb/s
40 cli - 10.0 pg/sec 324.78 kb/s
50 cli - 10.0 pg/sec 324.84 kb/s
75 cli - 9.58 pg/sec 311.43 kb/s
90 cli - 9.48 pg/sec 307.95 kb/s
100 cli - 9.23 pg/sec 300.00 kb/s
110 cli - 9.09 pg/sec 295.20 kb/s
120 cli - 9.28 pg/sec 295.02 kb/s (2.2% failure)

concurrency w/10% inserts & pconnects:
30 cli - 9.97 pg/sec 324.11 kb/s
40 cli - 10.08 pg/sec 327.40 kb/s
75 cli - 9.51 pg/sec 309.13 kb/s

MySQL:

Concurrency Tests w/pconnects:
30 cli - 16.03 pg/sec 521.01 kb/s
40 cli - 15.64 pg/sec 507.18 kb/s *failures
50 cli - 15.43 pg/sec 497.88 kb/s *failures
75 cli - 14.70 pg/sec 468.64 kb/s *failures
90 - mysql dies
110 - mysql dies
120 - mysql dies

Concurrency Tests w/o pconnects:
10 cli - 16.55 pg/sec 537.63 kb/s
20 cli - 15.99 pg/sec 519.51 kb/s
30 cli - 15.55 pg/sec 505.19 kb/s
40 cli - 15.46 pg/sec 490.01 kb/s 4.7% failure
50 cli - 15.59 pg/sec 482.24 kb/s 8.2% failure
75 cli - 17.65 pg/sec 452.08 kb/s 36.3% failure
90 cli - mysql dies

concurrency w/10% inserts & pconnects:
20 cli - 16.37 pg/sec 531.79 kb/s
30 cli - 16.15 pg/sec 524.64 kb/s
40 cli - 22.04 pg/sec 453.82 kb/s 37.8% failure

Did you use connection pooling (a lot

I used persistent connections, yes. Without them, Postgres' showing was
far poorer, with MySQL showing about 2x the performance.

of PHP apps don't, from what I've seen)? How does the performance get
affected when a query in Postgres with subselects has to be split into 4
different queries in MySQL?

I'd really love to see a case where a real-world page view requires 4x
the queries on MySQL. If you are doing subselects like that on a website
in real-time you've got serious design problems and postgres would
fold up and quit under the load anyway.

Postgres is process-based, each connection
resulting in one process. If you use connection pooling with at least as
many connections as you have processors, you should see it scale quite well.
In fact, for serious load-testing, you should have 10-15 pooled connections.

I *strongly* question your intuition on Postgres running web sites. MySQL's

Specifically, what is the problem with my "intuition"? All I did in the
prior message was report my results and ask for feedback before I post
it.

write performance is very poor, which forces excessive caching (see sites
like Slashdot) to prevent updates from blocking entire web site serving.
Yes, the BDB addition might be useful. Let's see some performance tests
using BDB tables.

I wouldn't use BDB tables as MySQL 3.23.x isn't stable and I wouldn't
use it until it is.

Postgres is undoubtedly the long-run winner in stability, whereas MySQL
will freak out or die when left running for more than a month at a time.
But if you ever do have a problem with postgres, you generally have to
nuke the database and recover from a backup, as there are no known tools
to fix index and database corruption. For a long-running postgres
database, you will occasionally have to drop indexes and re-create them,
causing downtime.

Dropping indexes and recreating them does not cause downtime. I've run a
couple of postgres-backed web sites for months on end with no issues. I've
survived a heavy slashdotting on my dual Pentium II-400, with Postgres
WRITES and READS on every Slashdot-referred hit, resulting in perfectly
respectable serving times (less than 3-4 seconds to serve > 20K of data on
each hit). No caching optimization of any kind on the app layer. And I'd
forgotten to vacuum my database for a few days.

Not sure why you're arguing with this as this was a clear win for
postgres.

Do you run on a magic power grid that
never fails?

Reality is that postgres is as likely - or more likely - to wind up with
corrupted data than MySQL. I'm talking physical corruption, where I have
to destroy the database and recover from a dump. Just a couple of months
ago I sent a message about "Eternal Vacuuming", in which case I had to
destroy and recover a multi-gigabyte database.

Further, I have had situations where postgres actually had DUPLICATE ids
in a primary key field, probably due to some abort or other nasty
situation in the middle of a commit. How did I recover from that? Well,
I had to run a count(*) next to each ID, select out the rows where
there was more than one of each "unique" id, reinsert those rows,
and drop and rebuild the indexes and reset the sequences.

I've only been using MySQL for about a year (as compared to 2 years for
postgres), but I have never seen either of those problems with MySQL.

Do you never have code-related error conditions that require
rolling back a series of database edits?

Personally, I check every query in my PHP code. On the rare occasion
that it fails, I show an error and get out. Even with postgres, I have
always checked success or failure of a query and shown an appropriate
error. Never in two years of programming PHP/postgres have I ever used
commit/rollback, and I have written some extremely complex web apps
(SourceForge being a prime example). Geocrawler.com runs on postgres and,
again, I NEVER saw any need for any kind of rollback at all.

The statelessness of the web pretty much obviates the need for
locks/rollbacks, as each process is extremely quick and runs from start
to finish instantly. It's not like the old days, where you pull data down
into a local application, work on it, then upload it again.

Only now, with some extremely complex stuff that we're doing on
SourceForge would I like to see locks and rollbacks (hence my recent
interest in benchmarking and comparing the two). Your average web
programmer will almost never run into that in the short term.

One quick point: while you may well be personally unbiased, VA Linux just
endorsed and funded MySQL. SourceForge uses MySQL. How do you expect to
convince readers that you're being objective in this comparison?

Your own strong biases are shown in your message. I do this stuff
because I'm curious and want to find out for myself. Most readers will
find it interesting as I did. Few will switch from MySQL to postgres or
vice versa because of it.

Another clarification: PHPBuilder is owned by internet.com, a competitor
of VA Linux/Andover.

Tim

--
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723

#8 Mike Mascari
mascarm@mascari.com
In reply to: Benjamin Adida (#5)
Re: Article on MySQL vs. Postgres

Tim Perdue wrote:

This is for what most people do with PHP and databases - run
semi-critical medium-traffic sites. Anyone running a mission-critical
site would have to look elsewhere for true robustness. I would not at
this time recommend any serious, life-threatening app run On either
database.

I've seen problems with block read errors in large Oracle
databases which fail their alleged CRC check -- intermittent core
dumps which required a dump/restore of 25 years of insurance
claims data (40 gig - it was a lot at the time). After being down
for days and restoring on a new box, the same errors occurred.

I'd really love to see a case where a real-world page view requires 4x
the queries on MySQL. If you are doing subselects like that on a website
in real-time you've got serious design problems and postgres would
fold up and quit under the load anyway.

This can be true for Internet sites, of course. But with
corporate Intranet sites that dish-out and process ERP data, the
queries can become quite complex while concurrency is limited to
< 1000 users.

Further, I have had situations where postgres actually had DUPLICATE ids
in a primary key field, probably due to some abort or other nasty
situation in the middle of a commit. How did I recover from That? Well,
I had to run a count(*) next to each ID and select out the rows where
there was more than one of each "unique" id, then reinsert those rows
and drop and rebuild the indexes and reset the sequences.

Umm...

DELETE FROM foo WHERE EXISTS
    (SELECT f.key FROM foo f
     WHERE f.key = foo.key AND f.oid > foo.oid);

I believe there's even a purely SQL (non-oid) method of doing
this as well.
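Mike's query can be tried out directly. Here is a runnable sketch in Python with sqlite3, whose implicit rowid plays the role Postgres' oid plays in his query; the table and column names are made up for the demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A table whose supposedly-unique key has picked up duplicates,
# as in Tim's report.
conn.execute("CREATE TABLE foo (key INTEGER, payload TEXT)")
conn.executemany("INSERT INTO foo VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (2, "b dup"), (3, "c"), (3, "c dup")])

# The oid trick, with rowid standing in for oid: delete every row that
# has a newer twin with the same key, so exactly one row per key survives.
conn.execute("""
    DELETE FROM foo WHERE EXISTS
        (SELECT 1 FROM foo f
         WHERE f.key = foo.key AND f.rowid > foo.rowid)
""")
conn.commit()
surviving_keys = [r[0] for r in conn.execute("SELECT key FROM foo ORDER BY key")]
```

The surviving row for each key is the most recently inserted one; flip the comparison to keep the oldest instead.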

Personally, I check every query in my PHP code. On the rare occasion
that it fails, I show an error and get out. Even with postgres, I have
always checked success or failure of a query and shown an appropriate
error. Never in two years of programming PHP/postgres have I ever used
commit/rollback, and I have written some extremely complex web apps
(sourceforge being a prime example). Geocrawler.com runs on postgres and
again, I NEVER saw any need for any kind of rollback at all.

This is the nature of the application. In the same example above,
how can I "charge" a cost center for the purchase of products in
an in-house distribution center and "deduct" the resulting
quantity from the distribution center's on-hand inventory sanely
without transactions?

Mike Mascari

#9 Chris Bitmead
chrisb@nimrod.itg.telstra.com.au
In reply to: Benjamin Adida (#5)
Re: Article on MySQL vs. Postgres

Tim Perdue wrote:

I'd really love to see a case where a real-world page view requires 4x
the queries on MySQL. If you are doing subselects like that on a website
in real-time you've got serious design problems and postgres would
fold up and quit under the load anyway.

Why? There are some subselect queries that have no problems running in
real-time. There are some non-subselect queries which one should never
attempt in real time. There is nothing fundamentally wrong with using
subselects for page views if it works for you. Nor is there anything
necessarily wrong with a design that requires subselects.

Do you run on a magic power grid that
never fails?

Reality is that postgres is as likely - or more likely - to wind up with
corrupted data than MySQL.

What do you base this statement on? With your sample size of one
corrupted postgres database? Also do you include inconsistent data in
your definition of corrupted data?

Never in two years of programming PHP/postgres have I ever used
commit/rollback, and I have written some extremely complex web apps
(sourceforge being a prime example).

I would humbly suggest that you are doing it wrong then.

Geocrawler.com runs on postgres and
again, I NEVER saw any need for any kind of rollback at all.

So what do you do when you get an error and "get out" as you put it?
Leave the half-done work in the database?

The statelessness of the web pretty much obviates the needs for
locks/rollbacks as each process is extremely quick and runs from start
to finish instantly. It's not like the old days where you pull data down
into a local application, work on it, then upload it again.

Even in the "old days" you should never keep a transaction open while
you "work on it". Transactions should *always* be short, and the web
changes nothing.

Really, REALLY there is nothing different about the web to traditional
applications as far as the db is concerned.

#10The Hermit Hacker
scrappy@hub.org
In reply to: Tim Perdue (#1)
Re: Article on MySQL vs. Postgres

On Tue, 4 Jul 2000, Tim Perdue wrote:

Performace/Scalability:

MySQL was about 50-60% faster in real-world web serving, but it
crumbles under a real load. Postgres on the other hand scaled 3x
higher than MySQL before it started to crumble on the same machine.
Unfortunately, Postgres would probably still lose on a high-traffic
website because MySQL can crank out the pages so much faster

Actually, this one depends a lot on how the site is set
up/programmed. I did work with a friend several months ago using the
newest released versions of MySQL and PostgreSQL ... we loaded (with some
massaging) the exact same data/tables onto both on the *exact* same
machine, and the exact same operating system. When we ran their existing
web site, without modifications, on both MySQL and PgSQL, MySQL was
substantially faster ... when we spent a little bit of time looking at the
queries used, we found that due to MySQL's lack of sub-queries, each page
being loaded had to do multiple queries to get the same information that
we could get out of PgSQL using one. Once we optimized the queries, our
timings to load the page went to something like 3sec for MySQL and 1sec
for PgSQL ... (vs something like, if I recall correctly, 19sec for
PgSQL before the optimization) ...

Same with some recent work I did with UDMSearch ... by default,
UDMSearch does 2+n queries to the database to get the information it
requires ... by re-writing the 'n' queries that are performed as a single
IN query, I was able to cut searches down from taking ~1sec × n queries
to one ~3sec query ...
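
The 2+n-to-IN rewrite can be sketched like this (UDMSearch's real schema is not shown here; the table and ids below are assumed for illustration):

```python
# Sketch: collapse n per-id lookups into one IN query.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE doc (id INTEGER PRIMARY KEY, url TEXT);
    INSERT INTO doc VALUES (1,'a'),(2,'b'),(3,'c'),(4,'d');
""")
matched_ids = [1, 3, 4]  # stand-in for the ids the search phase returned

# Naive: n round trips, one SELECT per matched id
naive = [conn.execute("SELECT url FROM doc WHERE id = ?", (i,)).fetchone()[0]
         for i in matched_ids]

# Rewritten: a single IN query fetching all matched rows at once
placeholders = ",".join("?" * len(matched_ids))
rows = conn.execute(
    f"SELECT id, url FROM doc WHERE id IN ({placeholders})", matched_ids).fetchall()
by_id = dict(rows)
batched = [by_id[i] for i in matched_ids]

assert naive == batched  # same result, one query instead of n
```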

The point being that if you do a 1:1 comparison, MySQL will be
faster ... if you use features in PgSQL that don't exist in MySQL, you can
knock that speed difference down considerably, if not surpass MySQL,
depending on the circumstance ...

#11The Hermit Hacker
scrappy@hub.org
In reply to: 'JanWieck@t-online.de' (#4)
Re: Article on MySQL vs. Postgres

Final notes:

I hate these "MySQL" vs. "PostgreSQL" articles that want to
say "this one is the better". Each one has it's advantages
and disadvantages. Both have a long TODO.

Also, none of the 'comparisons' take the time to deal with the fact that
one's "disadvantages" can generally be overcome using its
"advantages" (ie. speed issues with PostgreSQL can generally be overcome
by making use of its high-end features (ie. subselects)) ...

#12The Hermit Hacker
scrappy@hub.org
In reply to: Benjamin Adida (#5)
Re: Article on MySQL vs. Postgres

On Tue, 4 Jul 2000, Benjamin Adida wrote:

Dropping indexes and recreating them does not cause downtime. I've run a

Just got hit with a 'bits moved; recreate index' on the PostgreSQL search
engine ... dropped and re-created the index on the fly, no server shutdown ...

couple of postgres-backed web sites for months on end with no issues. I've
survived a heavy slashdotting on my dual Pentium II-400, with Postgres
WRITES and READS on every Slashdot-referred hit, resulting in perfectly
respectable serving times (less than 3-4 seconds to serve > 20K of data on
each hit). No caching optimization of any kind on the app layer. And I'd
forgotten to vacuum my database for a few days.

We had a *very* old version of PostgreSQL running on a Pentium acting as
an accounting/authentication backend to a RADIUS server for an ISP
... uptime for the server itself was *almost* 365 days (someone hit the
power switch by accident, meaning to power down a different machine
*sigh*) ... PostgreSQL server had been up for something like 6 months
without any problems, with the previous downtime being to upgrade the
server ...

Features:

Postgres is undoubtedly far, far more advanced than MySQL is. Postgres
now supports foreign keys, which can help with referential integrity.
Postgres supports subselects and better support for creating tables as
the result of queries. The "transaction" support that MySQL lacks is
included in Postgres, although you'll never miss it on a website, unless
you're building something for a bank, and if you're doing that, you'll
use oracle.

I'm just shocked at this. Where did this "transactions aren't necessary"
school of thinking originate?

Ummm, hate to disparage someone else, and I may actually be incorrect, but
I'm *almost* certain that MySQL docs, at one time, had this in it
... where they were explaining why they didn't have and never would have
transaction support. Obviously this mentality has changed since, with
the recent addition of transactions through a third-party database product
(re: Berkeley DB) ...

I've been developing database-backed web sites for 5 years now, and I
can't conceive of building a serious web site without transactions.
How do you guarantee that a record and its children records are all
stored together successfully? Do you run on a magic power grid that
never fails? Do you never have code-related error conditions that
require rolling back a series of database edits?

Actually, hate to admit it, but it wasn't until recently that I clued into
what transactions were for and how they were used :( I now use them for
just about everything I do, and couldn't imagine doing without them ...

#13The Hermit Hacker
scrappy@hub.org
In reply to: Peter Eisentraut (#6)
Re: Article on MySQL vs. Postgres

On Wed, 5 Jul 2000, Peter Eisentraut wrote:

If your web site is truly read only, yes, you don't need transactions. But
then you don't need a database either. If your web site does writes, you
need transactions, or you're really not trying hard enough.

... or not popular enough :)

#14The Hermit Hacker
scrappy@hub.org
In reply to: Tim Perdue (#7)
Re: Article on MySQL vs. Postgres

On Tue, 4 Jul 2000, Tim Perdue wrote:

Benjamin Adida wrote:

Jan said that each tool has its value, and that's true. I recommend you
define your evaluation context before you write this. Is this for running a
serious mission-critical web site? Is it for logging web site hits with
tolerance for data loss and a need for doing simple reporting?

This is for what most people do with PHP and databases - run
semi-critical medium-traffic sites. Anyone running a mission-critical
site would have to look elsewhere for true robustness. I would not at
this time recommend any serious, life-threatening app run on either
database.

Someone want to give me an example of something that would be
life-threatening that would run on a database? I can think of loads of
mission critical stuff, but life threatening? As for mission critical,
mission critical is in the eye of the end-user ... all my clients run
PostgreSQL for their backend needs, and I can guarantee you that each and
every one of them considers it a mission critical element to their sites
... then again, I have 3+ years of personal experience with PostgreSQL to
back me up ..

I took a real-world page from our site
<http://sourceforge.net/forum/forum.php?forum_id=1> and made it portable
to both databases. Of course, I could not import the "body" of the

did you take the time to optimize the queries to take advantage of
features that MySQL doesn't have, or just straight plug-n-play?

of PHP apps don't, from what I've seen)? How does the performance get
affected when a query in Postgres with subselects has to be split into 4
different queries in MySQL?

I'd really love to see a case where a real-world page view requires 4x
the queries on MySQL. If you are doing subselects like that on a website
in real-time you've got serious design problems and postgres would
fold-up and quit under the load anyway.

Odd, I'll have to let one of my clients know that their site has design
flaws ... wait, no, they had 3x the queries in MySQL as in PgSQL, so that
probably doesn't apply ...

Do you run on a magic power grid that
never fails?

Reality is that postgres is as likely - or more likely - to wind up with
corrupted data than MySQL. I'm talking physical corruption where I have
to destroy the database and recover from a dump.

Odd, in my 3+ years of PostgreSQL development, I've yet to have a
project/database corrupt such that I had to restore from backups *knock on
wood* ... INDEX corruption, yup ... 'DROP INDEX/CREATE INDEX' fixes that
though. Physical database corruption, nope ...

Further, I have had situations where postgres actually had DUPLICATE
ids in a primary key field, probably due to some abort or other nasty
situation in the middle of a commit. How did I recover from That?
Well, I had to run a count(*) next to each ID and select out the rows
where there was more than one of each "unique" id, then reinsert those
rows and drop and rebuild the indexes and reset the sequences.

Odd, were you using transactions here, or transactionless?
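
As an aside, the count(*)-per-id recovery Tim describes can be sketched with a single GROUP BY/HAVING query; the toy table below is assumed, not his actual schema:

```python
# Sketch: find "unique" ids that occur more than once after index corruption.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER, body TEXT);  -- no PK: the index was corrupt
    INSERT INTO posts VALUES (1,'a'),(2,'b'),(2,'b dup'),(3,'c');
""")

dupes = conn.execute("""
    SELECT id, COUNT(*) AS n
    FROM posts
    GROUP BY id
    HAVING COUNT(*) > 1
""").fetchall()
# dupes lists each duplicated id with its occurrence count, ready for manual
# de-duplication before the index is rebuilt
```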

Do you never have code-related error conditions that require
rolling back a series of database edits?

Personally, I check every query in my PHP code. On the rare occasion
that it fails, I show an error and get out. Even with postgres, I have
always checked success or failure of a query and shown an appropriate
error. Never in two years of programming PHP/postgres have I ever used
commit/rollback, and I have written some extremely complex web apps
(sourceforge being a prime example). Geocrawler.com runs on postgres and
again, I NEVER saw any need for any kind of rollback at all.

Wait ... how does checking every query help if QUERY2 fails after QUERY1
is sent, and you aren't using transactions?
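
A minimal sketch of the point (schema invented for illustration): with autocommit and no transaction, QUERY1's work survives QUERY2's failure, even though every query was error-checked:

```python
# Sketch: per-query error checking without transactions leaves half-done work.
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL);
    CREATE TABLE order_lines (order_id INTEGER NOT NULL,
                              item TEXT NOT NULL CHECK (item <> ''));
""")

# QUERY1 succeeds and is committed immediately ...
conn.execute("INSERT INTO orders VALUES (1, 9.99)")
# ... QUERY2 fails its CHECK constraint, and the error handler "gets out":
try:
    conn.execute("INSERT INTO order_lines VALUES (1, '')")
except sqlite3.IntegrityError:
    pass  # the error was caught and reported, but order 1 is now an orphan

orphans = conn.execute("""
    SELECT COUNT(*) FROM orders o
    WHERE NOT EXISTS (SELECT 1 FROM order_lines l WHERE l.order_id = o.id)
""").fetchone()[0]
# the parent row persists with no children; a transaction would have
# rolled QUERY1 back along with the failed QUERY2
```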

Only now, with some extremely complex stuff that we're doing on
SourceForge would I like to see locks and rollbacks (hence my recent
interest in benchmarking and comparing the two). Your average web
programmer will almost never run into that in the short term.

Cool, at least I'm not considered average :) I *always* use transactions
in my scripts ... *shrug* then again, I'm heavily into 'the rules of
normalization', so tend to not crowd everything into one table.

#15Tim Perdue
tperdue@valinux.com
In reply to: The Hermit Hacker (#14)
Re: Article on MySQL vs. Postgres

The Hermit Hacker wrote:

Further, I have had situations where postgres actually had DUPLICATE
ids in a primary key field, probably due to some abort or other nasty
situation in the middle of a commit. How did I recover from That?
Well, I had to run a count(*) next to each ID and select out the rows
where there was more than one of each "unique" id, then reinsert those
rows and drop and rebuild the indexes and reset the sequences.

Odd, were you using transactions here, or transactionless?

Does it matter? I suppose it was my programming error that somehow I got
duplicate primary keys in a table in the database where that should be
totally impossible under any circumstance? Another stupid
transactionless program I'm sure.

At any rate, it appears that the main problem I had with postgres (the
8K tuple limit) is being fixed and I will mention that in my writeup.

Tim

--
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723

#16Michael Mayo
michael-a-mayo@worldnet.att.net
In reply to: Tim Perdue (#1)
Re: Article on MySQL vs. Postgres

----- Original Message -----
From: "Tim Perdue" <tperdue@valinux.com>

Before I do that I want to confirm the major problem I had w/postgres:
the 8K tuple limit.

Just wanted to point out that this is not *exactly* true. While the
default limit is 8k, all that is required to change it to 32k is to change
one line of text in config.h (BLCKSZ, from 8k to 32k). This is pointed out
in the FAQ, so I would really consider the *default* to be 8k and the
*limit* to be 32k. IMHO 32k is good enough for 99% of tuples in a typical
bulletin-board-like application, and it is not unreasonable to reject posts >
32k in size. You might want to evaluate performance with the 32k
tuples, though; it might increase or decrease depending on the application.
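
For concreteness, the one-line edit Mike describes would look roughly like this in a 7.0-era source tree (the exact file path is recalled from memory and may differ in your checkout):

```
/* src/include/config.h -- PostgreSQL 7.0.x */
#define BLCKSZ 32768    /* default: 8192; the maximum tuple size tracks this */
```

A rebuild and initdb are required after changing the block size.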

-Mike

#17Constantin Teodorescu
In reply to: Tim Perdue (#1)
Re: Article on MySQL vs. Postgres

Tim Perdue wrote:

MySQL was about 50-60% faster in real-world web serving ...

Sorry if I missed it, but I searched all the messages in the thread
for information about the PostgreSQL version used in the test and
didn't find anything.

Tim, what version of PostgreSQL did you use? Hope it's 7.x.

Constantin Teodorescu
FLEX Consulting Braila, ROMANIA

#18Tim Perdue
tperdue@valinux.com
In reply to: Constantin Teodorescu (#17)
Re: Article on MySQL vs. Postgres

Constantin Teodorescu wrote:

Tim, what version of PostgreSQL did you use? Hope it's 7.x.

Yes, 7.0.2

Tim

--
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723

#19Hannu Krosing
hannu@tm.ee
In reply to: The Hermit Hacker (#14)
Re: Article on MySQL vs. Postgres

The Hermit Hacker wrote:

On Tue, 4 Jul 2000, Tim Perdue wrote:

Further, I have had situations where postgres actually had DUPLICATE
ids in a primary key field, probably due to some abort or other nasty
situation in the middle of a commit. How did I recover from That?
Well, I had to run a count(*) next to each ID and select out the rows
where there was more than one of each "unique" id, then reinsert those
rows and drop and rebuild the indexes and reset the sequences.

Odd, were you using transactions here, or transactionless?

Actually, I think I remember a recent bug report about some condition that
failed the uniqueness check when inside a transaction ;(

I think the report came with a fix ;)

------------
Hannu

#20Hannu Krosing
hannu@tm.ee
In reply to: Tim Perdue (#15)
Re: Article on MySQL vs. Postgres

Tim Perdue wrote:

The Hermit Hacker wrote:

Further, I have had situations where postgres actually had DUPLICATE
ids in a primary key field, probably due to some abort or other nasty
situation in the middle of a commit. How did I recover from That?
Well, I had to run a count(*) next to each ID and select out the rows
where there was more than one of each "unique" id, then reinsert those
rows and drop and rebuild the indexes and reset the sequences.

Odd, were you using transactions here, or transactionless?

Does it matter? I suppose it was my programming error that somehow I got
duplicate primary keys in a table in the database where that should be
totally impossible under any circumstance? Another stupid
transactionless program I'm sure.

At any rate, it appears that the main problem I had with postgres (the
8K tuple limit) is being fixed and I will mention that in my writeup.

Currently (as of 7.0.x) you could use BLCKSZ=32K plus the lztext datatype
and get text fields of about 64-128K, depending on the data, if you are
desperately after big text fields.

-----------
Hannu
