more anti-postgresql FUD

Started by Merlin Moncure over 19 years ago · 69 messages
#1 Merlin Moncure
mmoncure@gmail.com

http://www.zabbix.com/manual/v1.1/install.php

in section PostgreSQL vs MySQL :
[quoting]
Regarding the choice between PostgreSQL and MySQL, MySQL is
recommended for several reasons:

* MySQL is faster

recent benchmarks using ZABBIX clearly show that PostgreSQL
(7.1.x) is at least 10 times slower than MySQL (3.23.29)

Note: These results are predictable. ZABBIX server processes use
simple SQL statements like single row INSERT, UPDATE and simple SELECT
operators. In such environment, use of advanced SQL engine (like
PostgreSQL) is overkill.
* no need to constantly run resource-hungry command "vacuum" for MySQL
* MySQL is used as a primary development platform.

If you do use PostgreSQL, zabbix_server will periodically (defined in
HousekeepingFrequency) execute command vacuum analyze.
[done]

anybody know these guys? this is right off the mysql anti-postgresql
advocacy page.

merlin

#2 Joshua D. Drake
jd@commandprompt.com
In reply to: Merlin Moncure (#1)
Re: more anti-postgresql FUD

Merlin Moncure wrote:

http://www.zabbix.com/manual/v1.1/install.php

in section PostgreSQL vs MySQL :
[quoting]
Regarding the choice between PostgreSQL and MySQL, MySQL is
recommended for several reasons:

* MySQL is faster

recent benchmarks using ZABBIX clearly show that PostgreSQL
(7.1.x) is at least 10 times slower than MySQL (3.23.29)

Note: These results are predictable. ZABBIX server processes use
simple SQL statements like single row INSERT, UPDATE and simple SELECT
operators. In such environment, use of advanced SQL engine (like
PostgreSQL) is overkill.
* no need to constantly run resource-hungry command "vacuum" for MySQL
* MySQL is used as a primary development platform.

If you do use PostgreSQL, zabbix_server will periodically (defined in
HousekeepingFrequency) execute command vacuum analyze.
[done]

anybody know these guys? this is right off the mysql anti-postgresql
advocacy page.

Well they may be right that far back. But 7.1 is years and years old.

Joshua D. Drake


---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

#3 Merlin Moncure
mmoncure@gmail.com
In reply to: Joshua D. Drake (#2)
Re: more anti-postgresql FUD

On 10/10/06, Joshua D. Drake <jd@commandprompt.com> wrote:

Merlin Moncure wrote:

http://www.zabbix.com/manual/v1.1/install.php

in section PostgreSQL vs MySQL :

Well they may be right that far back. But 7.1 is years and years old.

Joshua D. Drake

no excuse. that would be like postgresql having a documentation
chapter comparing nagios and zabbix as recommended network monitoring
tools and favoring nagios, with advocacy information lifted from a
nagios developer's blog. while this is shady in and of itself, it
also carries a burden of keeping the information up to date. if that
were in wikipedia i would be hitting the delete button.

FUD from another open source project is really poor form, particularly
when the projects are not in competing segments, where a little bit of
competitive rivalry is expected.

merlin

#4 Peter Eisentraut
peter_e@gmx.net
In reply to: Merlin Moncure (#1)
Re: more anti-postgresql FUD

Merlin Moncure wrote:

http://www.zabbix.com/manual/v1.1/install.php

in section PostgreSQL vs MySQL :
[quoting]
Regarding the choice between PostgreSQL and MySQL, MySQL is
recommended for several reasons:

I don't see any fear, uncertainty, or doubt there.

* MySQL is faster

It probably is for that application. Of course their referenced
benchmark is outdated, but that does not make it FUD.

* no need to constantly run resource-hungry command "vacuum" for
MySQL

Also a good, albeit outdated, reason.

* MySQL is used as a primary development platform.

Another good reason.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#4)
Re: more anti-postgresql FUD

Peter Eisentraut <peter_e@gmx.net> writes:

* MySQL is used as a primary development platform.

Another good reason.

Actually that's *the* reason --- it's always going to be hard for
Postgres to look good for an application that's been designed/optimized
for MySQL. The application has already made whatever compromises it
had to for that platform, and dropping it onto a different DB won't
magically undo them.

Some days I think database independence is a myth.

regards, tom lane

#6 Chris Browne
cbbrowne@acm.org
In reply to: Merlin Moncure (#1)
Re: more anti-postgresql FUD

mmoncure@gmail.com ("Merlin Moncure") writes:

http://www.zabbix.com/manual/v1.1/install.php
anybody know these guys? this is right off the mysql anti-postgresql
advocacy page.

On the upside, they actually indicated what versions they were working
with.

If they're so out of date that their documentation indicates they're
still using MySQL 3.23, it must be a spectacularly out of date
project, and therefore of little interest.
--
output = reverse("ofni.secnanifxunil" "@" "enworbbc")
http://www3.sympatico.ca/cbbrowne/spiritual.html
A computer scientist is someone who, when told to "Go to Hell," sees
the "go to," rather than the destination, as harmful.
-- Dr. Roger M. Firestone, rfire@cais.cais.com

#7 Merlin Moncure
mmoncure@gmail.com
In reply to: Peter Eisentraut (#4)
Re: more anti-postgresql FUD

On 10/10/06, Peter Eisentraut <peter_e@gmx.net> wrote:

Merlin Moncure wrote:

http://www.zabbix.com/manual/v1.1/install.php

in section PostgreSQL vs MySQL :
[quoting]
Regarding the choice between PostgreSQL and MySQL, MySQL is
recommended for several reasons:

I don't see any fear, uncertainty, or doubt there.

* MySQL is faster

It probably is for that application. Of course their references
benchmark is outdated, but that does not make it FUD.

ok, i'll grant that you are technically correct (not exactly FUD);
there seems to be no issue of intent here. however, if you are going
to compare two databases on a core feature such as performance, it's
important to keep your information up to date and factual. regardless
of the specifics in the document, it's important to consider the
perception an uninformed person will come away with. this should
also be considered in light of the relatively public controversy
(albeit in narrow circles) surrounding some of the comments in the
mysql documentation in 2001.

merlin

#8 Brandon Aiken
BAiken@winemantech.com
In reply to: Merlin Moncure (#1)
Re: more anti-postgresql FUD

MySQL 3.23.29 is pre-InnoDB
(http://dev.mysql.com/doc/refman/4.1/en/innodb-in-mysql-3-23.html), so
this database is not transactional, not ACIDic, and does not support
row-level locking or foreign key referential integrity. At this point,
MySQL lacked support for subqueries, UNIONs, VIEWs, and nearly
everything else beyond basic CRUD.

I bet I can design a program that interfaces with flat data files so
fast it makes any RDBMS pale in comparison. SQLite does just that, and
it's ACID compliant! Performance is not the only motivation for using an
RDBMS. Data integrity and relational modeling are also big considerations.

--
Brandon Aiken
CS/IT Systems Engineer

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Merlin Moncure
Sent: Tuesday, October 10, 2006 1:56 PM
To: PgSQL General
Subject: [GENERAL] more anti-postgresql FUD

http://www.zabbix.com/manual/v1.1/install.php

in section PostgreSQL vs MySQL :
[quoting]
Regarding the choice between PostgreSQL and MySQL, MySQL is
recommended for several reasons:

* MySQL is faster

recent benchmarks using ZABBIX clearly show that PostgreSQL
(7.1.x) is at least 10 times slower than MySQL (3.23.29)

Note: These results are predictable. ZABBIX server processes use
simple SQL statements like single row INSERT, UPDATE and simple SELECT
operators. In such environment, use of advanced SQL engine (like
PostgreSQL) is overkill.
* no need to constantly run resource-hungry command "vacuum" for
MySQL
* MySQL is used as a primary development platform.

If you do use PostgreSQL, zabbix_server will periodically (defined in
HousekeepingFrequency) execute command vacuum analyze.
[done]

anybody know these guys? this is right off the mysql anti-postgresql
advocacy page.

merlin


#9 Jorge Godoy
jgodoy@gmail.com
In reply to: Tom Lane (#5)
Re: more anti-postgresql FUD

Tom Lane <tgl@sss.pgh.pa.us> writes:

Some days I think database independence is a myth.

I believe it is as real as Santa Claus and the Easter Bunny. All of us know
that those three exist, right? :-)

--
Jorge Godoy <jgodoy@gmail.com>

#10 David Fetter
david@fetter.org
In reply to: Tom Lane (#5)
Re: more anti-postgresql FUD

On Tue, Oct 10, 2006 at 02:50:44PM -0400, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

* MySQL is used as a primary development platform.

Another good reason.

Actually that's *the* reason --- it's always going to be hard for
Postgres to look good for an application that's been
designed/optimized for MySQL. The application has already made
whatever compromises it had to for that platform, and dropping it
onto a different DB won't magically undo them.

Some days I think database independence is a myth.

I do, too, but only on the weekdays ending in 'y' ;)

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666
Skype: davidfetter

Remember to vote!

#11 Andrew Kelly
akelly@corisweb.org
In reply to: Tom Lane (#5)
Re: more anti-postgresql FUD

On Tue, 2006-10-10 at 14:50 -0400, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

* MySQL is used as a primary development platform.

Another good reason.

Actually that's *the* reason --- it's always going to be hard for
Postgres to look good for an application that's been designed/optimized
for MySQL. The application has already made whatever compromises it
had to for that platform, and dropping it onto a different DB won't
magically undo them.

Some days I think database independence is a myth.

If it's not even possible to get trustworthy, duplicate renderings of
XHTML/CSS on popular browsers without tweaks, we can truly never expect
something as utopian as that.
Sadly.

Andy

#12 Noname
alexei.vladishev@gmail.com
In reply to: Merlin Moncure (#1)
Re: more anti-postgresql FUD

Hello,

I'm the author and maintainer of ZABBIX and its manual. I would like to
add some comments to the thread.

First of all, ZABBIX supports three database engines: MySQL, Oracle and
PostgreSQL. It uses absolutely standard SQL, the same for all three
engines. We have absolutely no intention of pushing or recommending any
one of them. I'm a big fan of PostgreSQL, and given the choice I would
pick PostgreSQL for anything except ZABBIX.

Unfortunately PostgreSQL performs much more slowly than MySQL when doing
a large number of updates against one single table. By its nature ZABBIX
has to execute hundreds of updates per second on large installations,
and PostgreSQL cannot handle this nicely.

Do a simple test to see my point:

1. create table test (id int4, aaa int4, primary key (id));
2. insert into test values (0,1);
3. Execute "update test set aaa=1 where id=0;" in an endless loop

I just did the test on PostgreSQL 7.4.12 and MySQL 5.0.22 (MyISAM;
sorry, I had no InnoDB configured). Ubuntu 6.06, AMD64, 2GB RAM, default
database settings.

MySQL performs very well, approximately 15000-20000 updates per second
with no degradation of performance.

PostgreSQL does approximately 1600 updates per second for the first
10,000 records, about 200 rps by the first 100k records, and then slower
and slower, degrading to 10-20 rps(!!!) when reaching 300k.
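[The shape of this degradation, fast at first and then collapsing as dead row versions pile up, is what one would expect from MVCC updates with no tuple reuse in between. The following is an editor's toy model, not a benchmark of either database: no real server is involved, the function names are invented, and each update is charged a synthetic unit cost for stepping over previously accumulated dead versions, which a vacuum would mark reusable.]

```python
# Toy model of the single-row update benchmark. No real database is
# involved; all "costs" are synthetic unit steps, purely to illustrate
# the mechanism behind the reported slowdown.

def in_place_update_costs(n_updates):
    """MyISAM-style in-place update: cost is independent of history."""
    return [1] * n_updates

def mvcc_update_costs(n_updates, vacuum_every=None):
    """MVCC-style update without tuple reuse: every UPDATE leaves a dead
    row version behind, and later statements pay to step over them.
    A periodic plain vacuum marks dead versions reusable, which bounds
    the cost instead of letting it grow without limit."""
    dead = 0
    costs = []
    for i in range(1, n_updates + 1):
        costs.append(1 + dead)   # step over accumulated dead versions
        dead += 1                # this UPDATE dead-ends one more version
        if vacuum_every is not None and i % vacuum_every == 0:
            dead = 0             # vacuum: dead tuples become reusable
    return costs

if __name__ == "__main__":
    n = 300_000
    print("in-place, worst cost:", max(in_place_update_costs(n)))      # 1
    print("mvcc, never vacuumed:", max(mvcc_update_costs(n)))          # 300000
    print("mvcc, vacuum per 10k:", max(mvcc_update_costs(n, 10_000)))  # 10000
```

[In the model, the unvacuumed cost grows linearly with the update count, while the periodically vacuumed cost is bounded by the vacuum interval, which matches the qualitative curve reported above.]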

The manual states that PostgreSQL works ten times slower for ZABBIX; in
reality it is much worse.

Yes, I'm aware of autovacuuming, etc. But it eats resources, and I
cannot afford to run it periodically because I want steady performance
from my application. I do not want to see ZABBIX performing slower just
because of a database housekeeper.

Several years ago I contacted PostgreSQL developers but unfortunately
the only answer was "Run vacuum. We won't change PostgreSQL to reuse
unused tuples for updates".

Perhaps something has changed in recent releases of PostgreSQL, but I
doubt it. Please correct me if I'm wrong.

Kind regards,
Alexei

#13 Andrew Sullivan
ajs@crankycanuck.ca
In reply to: Tom Lane (#5)
Re: more anti-postgresql FUD

On Tue, Oct 10, 2006 at 02:50:44PM -0400, Tom Lane wrote:

Some days I think database independence is a myth.

On the day when you don't, please tell me what application you found
where it isn't. I want to buy the developers a drink. Or maybe a
bar.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
When my information changes, I alter my conclusions. What do you do sir?
--attr. John Maynard Keynes

#14 Guy Rouillier
guyr@masergy.com
In reply to: Andrew Sullivan (#13)
Re: more anti-postgresql FUD

Andrew Sullivan wrote:

On Tue, Oct 10, 2006 at 02:50:44PM -0400, Tom Lane wrote:

Some days I think database independence is a myth.

On the day when you don't, please tell me what application you found
where it isn't. I want to buy the developers a drink. Or maybe a
bar.

The Mantis bug tracking software (http://www.mantisbt.org/) now works
with PostgreSQL (it was developed with MySQL). It works equally well
with both, including automated installation.

--
Guy Rouillier

#15 Joshua D. Drake
jd@commandprompt.com
In reply to: Andrew Sullivan (#13)
Re: more anti-postgresql FUD

Andrew Sullivan wrote:

On Tue, Oct 10, 2006 at 02:50:44PM -0400, Tom Lane wrote:

Some days I think database independence is a myth.

On the day when you don't, please tell me what application you found
where it isn't. I want to buy the developers a drink. Or maybe a
bar.

Command Prompt will help sponsor that community event ;)

Joshua D. Drake


#16 Steve Crawford
scrawford@pinpointresearch.com
In reply to: Guy Rouillier (#14)
Re: more anti-postgresql FUD

Guy Rouillier wrote:

Andrew Sullivan wrote:

On Tue, Oct 10, 2006 at 02:50:44PM -0400, Tom Lane wrote:

Some days I think database independence is a myth.

On the day when you don't, please tell me what application you found
where it isn't. I want to buy the developers a drink. Or maybe a
bar.

The Mantis bug tracking software http://www.mantisbt.org/ now works with
PostgreSQL (was developed with MySQL.) It works equally well with both,
including automated installation.

I find that "database independence" == "lowest common denominator". I
may have missed something, but a quick scan of the Mantis code didn't
reveal any use of triggers, rules, foreign-keys, user-defined types, etc.

Whenever I see that a project has been "ported" to PostgreSQL I can
usually be sure that it is not a project that was designed to take
advantage of the features and capabilities that PG offers.

But I suspect that porting something that uses all the features of mySql
to PostgreSQL will be far easier than porting something that uses all
the features of PostgreSQL over to mySql (if it is possible at all).

Cheers,
Steve

#17 Chris Browne
cbbrowne@acm.org
In reply to: Merlin Moncure (#1)
Re: more anti-postgresql FUD

ajs@crankycanuck.ca (Andrew Sullivan) writes:

On Tue, Oct 10, 2006 at 02:50:44PM -0400, Tom Lane wrote:

Some days I think database independence is a myth.

On the day when you don't, please tell me what application you found
where it isn't. I want to buy the developers a drink. Or maybe a
bar.

You're sitting across the street from the local headquarters of the
vendor of just such an application.

Of course, they don't use foreign keys, triggers, dates are
represented as char(10), and validity checking is expected to be coded
into either (traditionally) transaction screens or (new technologies)
BAPIs (Business APIs). Really, the application was designed with IMS
<http://en.wikipedia.org/wiki/Information_Management_System> in mind.

And *they* can afford to pay for a whole bar, once you pay the annual
licensing fee :-(.

Oh, and a cluster of IBM p570s would probably be enough to run a 20
user system :-(. [Actually, that's probably not *entirely* fair; I
once administered an R/3 system supporting ~30 users on a uniprocessor
DEC Alpha with 256MB of RAM, which by modern standards is pretty
pedestrian...]
--
(format nil "~S@~S" "cbbrowne" "linuxdatabases.info")
http://www3.sympatico.ca/cbbrowne/sap.html
"Access to a COFF symbol table via ldtbread is even less abstract,
really sucks in general, and should be banned from earth."
-- SCSH 0.5.1 unix.c

#18 Tim Tassonis
timtas@cubic.ch
In reply to: Steve Crawford (#16)
Re: more anti-postgresql FUD

Steve Crawford schrieb:

Guy Rouillier wrote:

Andrew Sullivan wrote:

On Tue, Oct 10, 2006 at 02:50:44PM -0400, Tom Lane wrote:

Some days I think database independence is a myth.

On the day when you don't, please tell me what application you found
where it isn't. I want to buy the developers a drink. Or maybe a
bar.

The Mantis bug tracking software http://www.mantisbt.org/ now works with
PostgreSQL (was developed with MySQL.) It works equally well with both,
including automated installation.

I find that "database independence" == "lowest common denominator". I
may have missed something, but a quick scan of the Mantis code didn't
reveal any use of triggers, rules, foreign-keys, user-defined types, etc.

Well, that is hardly surprising. What exactly is your point?

If you want to write portable software, you usually stay with generally
available, standardized features or APIs, be it "database independent",
"platform independent", you name it. You certainly don't go for
user-defined types. I really think all the nice features and
capabilities of PostgreSQL are great, but I would never, ever start
using any of them extensively in a project that might have to run on
another database. Ever heard of vendor lock-in and "embrace and extend"?

Whenever I see that a project has been "ported" to PostgreSQL I can
usually be sure that it is not a project that was designed to take
advantage of the features and capabilities that PG offers.

But I suspect that porting something that uses all the features of mySql
to PostgreSQL will be far easier than porting something that uses all
the features of PostgreSQL over to mySql (if it is possible at all).

You're certainly right here (I did this before), that's why a lot of
projects can support PostgreSQL when they started off with mySql. You
can bet that isn't the case with projects that started off with Oracle
(care to rewrite a few hundred triggers, packages and statements?).

Bye
Tim


Cheers,
Steve

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

#19 Joshua D. Drake
jd@commandprompt.com
In reply to: Tim Tassonis (#18)
Re: more anti-postgresql FUD

Well, that is hardly surprising. What exactly is your point?

If you want to write portable software, you usually stay with generally
available, standardized features or API's, be it "database independent",
"platform independent", you name it. You certainly don't go for
user-defined types. I really think all the nice features and
capabilities of PostgreSQL are great, but I would never, ever start
using any of them extensively in a project that might have to run on
another database. Ever heard of vendor lock-in and "embrace and expand"?

Bah! Ever heard of crappy software because of database independence? I
have yet to see a good application that supports "database independence".

Joshua D. Drake

Whenever I see that a project has been "ported" to PostgreSQL I can
usually be sure that it is not a project that was designed to take
advantage of the features and capabilities that PG offers.

But I suspect that porting something that uses all the features of mySql
to PostgreSQL will be far easier than porting something that uses all
the features of PostgreSQL over to mySql (if it is possible at all).

You're certainly right here (I did this before), that's why a lot of
projects can support PostgreSQL when they started off with mySql. You
can bet that isn't the case with projects that started off with Oracle
(care to rewrite a few hundred triggers, packages and statements?).

Bye
Tim

Cheers,
Steve


#20 Ron Johnson
ron.l.johnson@cox.net
In reply to: Chris Browne (#17)
Re: more anti-postgresql FUD


On 10/11/06 14:48, Chris Browne wrote:

ajs@crankycanuck.ca (Andrew Sullivan) writes:

On Tue, Oct 10, 2006 at 02:50:44PM -0400, Tom Lane wrote:

[snip]

Oh, and a cluster of IBM p570s would probably be enough to run a 20
user system :-(. [Actually, that's probably not *entirely* fair; I
once administered an R/3 system supporting ~30 users on a uniprocessor
DEC Alpha with 256MB of RAM, which by modern standards is pretty
pedestrian...]

<GEEZER>
My first programming job was walloping COBOL on a 1.9 MIPS IBM 43xx
with only 6MB RAM. Supported *70* online users and had a
*relational* database (CA Datacom-DB).

Of course, the FEPs, block-mode terminals and CICS were the crucial
difference.

Damned shame that Unix killed that mentality, and that client-server
was usually implemented so poorly.
</GEEZER>

--
Ron Johnson, Jr.
Jefferson LA USA

Is "common sense" really valid?
For example, it is "common sense" to white-power racists that
whites are superior to blacks, and that those with brown skins
are mud people.
However, that "common sense" is obviously wrong.

#21 Geoffrey
esoteric@3times25.net
In reply to: Ron Johnson (#20)
Re: more anti-postgresql FUD

Ron Johnson wrote:


On 10/11/06 14:48, Chris Browne wrote:

ajs@crankycanuck.ca (Andrew Sullivan) writes:

On Tue, Oct 10, 2006 at 02:50:44PM -0400, Tom Lane wrote:

[snip]

Oh, and a cluster of IBM p570s would probably be enough to run a 20
user system :-(. [Actually, that's probably not *entirely* fair; I
once administered an R/3 system supporting ~30 users on a uniprocessor
DEC Alpha with 256MB of RAM, which by modern standards is pretty
pedestrian...]

<GEEZER>
My first programming job was walloping COBOL on a 1.9 MIPS IBM 43xx
with only 6MB RAM. Supported *70* online users and had a
*relational* database (CA Datacom-DB).

Of course, the FEPs, block-mode terminals and CICS were the crucial
difference.

Damned shame that Unix killed that mentality, and that client-server
was usually implemented so poorly.
</GEEZER>

You had that much memory? Used to run a time reporting system on a 3b2
400, 4MB Ram, WE32100 10MHz processor, 1.1 MIPS. Flat file home grown
database system that used indices in shared memory and semaphore
communication between three continuously running processes.

The application ran in pretty much all the AT&T factories at the time.

--
Until later, Geoffrey

Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety.
- Benjamin Franklin

#22 Ron Johnson
ron.l.johnson@cox.net
In reply to: Geoffrey (#21)
Re: more anti-postgresql FUD


On 10/11/06 19:10, Geoffrey wrote:

Ron Johnson wrote:


On 10/11/06 14:48, Chris Browne wrote:

ajs@crankycanuck.ca (Andrew Sullivan) writes:

On Tue, Oct 10, 2006 at 02:50:44PM -0400, Tom Lane wrote:

[snip]

Oh, and a cluster of IBM p570s would probably be enough to run a 20
user system :-(. [Actually, that's probably not *entirely* fair; I
once administered an R/3 system supporting ~30 users on a uniprocessor
DEC Alpha with 256MB of RAM, which by modern standards is pretty
pedestrian...]

<GEEZER>
My first programming job was walloping COBOL on a 1.9 MIPS IBM 43xx
with only 6MB RAM. Supported *70* online users and had a
*relational* database (CA Datacom-DB).

Of course, the FEPs, block-mode terminals and CICS were the crucial
difference.

Damned shame that Unix killed that mentality, and that client-server
was usually implemented so poorly.
</GEEZER>

You had that much memory? Used to run a time reporting system on a 3b2

Hey, I remember those.

A 3b2 was my first exposure to Unix. At the time I was a VMS
programmer who loved DCL, and was *not* impressed by Unix.

I *still* use VMS at work, and while DCL is really showing its age,
and while bash 3.1 on Linux (which I'm writing this from now) is
light-years better than sh, VMS is still a great "DP" operating system.

400, 4MB Ram, WE32100 10MHz processor, 1.1 MIPS. Flat file home grown
database system that used indices in shared memory and semaphore
communication between three continuously running processes.

The application ran in pretty much all the AT&T factories at the time.

Flat files and minimal user interaction? Bah.

Now, if *one* machine ran a whole AT&T factory, that would be
impressive. Of course, VAX/VMS would do it, and no one would bat an
eyelash.


#23 Merlin Moncure
mmoncure@gmail.com
In reply to: Noname (#12)
Re: more anti-postgresql FUD

On 11 Oct 2006 07:54:52 -0700, alexei.vladishev@gmail.com
<alexei.vladishev@gmail.com> wrote:

Hello,

I'm author and maintainer of ZABBIX and the manual. I would like to add
some comments to the thread.

just so you know, I brought this up after taking a look at the zabbix
software, which is in my opinion excellent. I came across a little
strong in my comments, and peter e was correct in pointing out that
the performance related comments were not 'fud'. I felt a little bad
after opening this thread, but you have to take this in the context of
the bigger picture: the postgresql people have been dealing with
(sometimes) unfounded prejudices for years.

Unfortunately PostgreSQL performs much slower than MySQL doing large
number of updates for one single table. By its nature ZABBIX requires
to execute hundreds of updates per second for large installations.
PostgreSQL cannot handle this nicely.

Do a simple test to see my point:

1. create table test (id int4, aaa int4, primary key (id));
2. insert into test values (0,1);
3. Execute "update test set aaa=1 where id=0;" in an endless loop

this is a very contrived test:
1. nothing really going on
2. no data
3. single user test
4. zabbix doesn't do this, nor does anything else
5. proves nothing.

zabbix is a bit more complex than that, with multiple users, tables and
the occasional join. With a high number of servers in play, things
might go differently than you expect.

I just did the test on PostgreSQL 7.4.12 and MySQL 5.0.22 (MyISAM,
sorry had no configured InnoDB). Ubuntu 6.0.6, AMD64, 2GB, default
database settings.

8.0, 8.1, and 8.2 are all a bit faster at these types of queries, just
so you know. 8.1 and up, i believe, have autovacuum on by default. In
fairness, vacuuming in the 7.1.x days was a different beast.
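[For reference, autovacuum is controlled from postgresql.conf. The fragment below is an editor's sketch: parameter names are as they appear in the 8.1-era documentation, the values are illustrative rather than recommendations, and whether the daemon is enabled by default varies by release.]

```ini
# postgresql.conf -- autovacuum knobs (8.1-era names; values illustrative)
autovacuum = on                       # run the autovacuum daemon
autovacuum_naptime = 60               # seconds between checks of each database
autovacuum_vacuum_threshold = 1000    # minimum dead tuples before vacuuming
autovacuum_vacuum_scale_factor = 0.4  # plus this fraction of the table size
```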

MySQL performs very well, approximately 15000-20000 updates per second
with no degradation of performance.

PostgreSQL does approximately 1600 records per second for the first
10000, then 200rps for the first 100k records, and then slower and
slower downgrading to 10-20 rps(!!!) when reaching 300k.

The manual states that PostgreSQL works ten times slower for ZABBIX, in
reality it is much worser.

Yes, I'm aware of autovacuuming, etc. But it eats resources and I
cannot handle to run it periodically because I want steady performance
from my application. I do not want to see ZABBIX performing slower just
because of database housekeeper.

vacuum gives you stable performance, not vice versa. I would imagine
zabbix's server scalability is driven by a number of factors.

Several years ago I contacted PostgreSQL developers but unfortunately
the only answer was "Run vacuum. We won't change PostgreSQL to reuse
unused tuples for updates".

Perhaps something has changed in recent releases of PostgreSQL, I don't
think so. Please correct me if I'm wrong.

well, I am playing with zabbix with the possible eventuality of
rolling it out on our servers, so I might be able to get you some hard
data on performance. By the way, I'm currently managing a
spectacularly large mysql database which is getting moved to
postgresql with the next release of the software -- in part because I
was able to show that postgresql gave much more reliable performance
in high load environments.

In light of this discussion, I might be interested in running a little
test to see how zabbix would hold up on postgresql under an
artificially high load. If I were to show that things aren't quite so
one-sided as you assumed, would you be willing to say as much in your
documentation? :-)

merlin

#24 Stephen Frost
sfrost@snowman.net
In reply to: Noname (#12)
Re: more anti-postgresql FUD

* alexei.vladishev@gmail.com (alexei.vladishev@gmail.com) wrote:

Unfortunately PostgreSQL performs much slower than MySQL doing large
number of updates for one single table. By its nature ZABBIX requires
to execute hundreds of updates per second for large installations.
PostgreSQL cannot handle this nicely.

If you refuse to vacuum (or to have the table autovacuumed) then sure.
Of course, I don't know of anyone who actually uses PostgreSQL who would
run a system like that.

I just did the test on PostgreSQL 7.4.12 and MySQL 5.0.22 (MyISAM,
sorry had no configured InnoDB). Ubuntu 6.0.6, AMD64, 2GB, default
database settings.

Don't say 'sorry' to us for using MyISAM (though it pretty much
invalidates the test), say 'sorry' to your users... You can try running
Postgres with fsync=off but I would strongly recommend against it in a
production environment (just like I'd strongly recommend against
MyISAM).

MySQL performs very well, approximately 15000-20000 updates per second
with no degradation of performance.

PostgreSQL does approximately 1600 records per second for the first
10000, then 200rps for the first 100k records, and then slower and
slower downgrading to 10-20 rps(!!!) when reaching 300k.

If you periodically vacuum the table (where "periodically" most likely
means after some number of write transactions) I expect you'd find
Postgres performance at *least* stabilize. If you vacuum with a
periodicity reasonably ratioed to your update statement frequency you'd
find that it *improves* performance and Postgres provides *consistent*
performance.

Yes, I'm aware of autovacuuming, etc. But it eats resources and I
cannot afford to run it periodically because I want steady performance
from my application. I do not want to see ZABBIX performing slower just
because of a database housekeeper.

This, above all things imv, would be FUD here. Vacuum/autovacuum aren't
something to be feared as damaging, detrimental, or resource hogging.
Vacuum doesn't take an exclusive lock and moves along quite decently if
done with an appropriate frequency. If you wait far, far, too long to
do a vacuum (to the point where you've got 10x as many dead tuples as
live ones) then sure it'll take a while, but that doesn't make it
resource hogging when you consider what you're having it do.

Several years ago I contacted PostgreSQL developers but unfortunately
the only answer was "Run vacuum. We won't change PostgreSQL to reuse
unused tuples for updates".

That's exactly what vacuum *does*, it marks dead tuples as being
available for reuse. Please understand that vacuum != vacuum full.

Perhaps something has changed in recent releases of PostgreSQL, but I
don't think so. Please correct me if I'm wrong.

I'm afraid there's a bit of a misunderstanding about what vacuum is for
and how it can affect the behaviour of Postgres. Please, please forget
whatever notion you currently have of vacuum and actually run some tests
with it, and post back here (or -performance) if you run into problems,
have questions or concerns. I expect you could also tune autovacuum to
be frequent enough on the appropriate tables that you wouldn't have to
intersperse your own vacuum commands in. Also, as pointed out, current
releases (8.1) have quite a few enhancements and performance
improvements.
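Concretely, the pattern being suggested is something like this (a minimal sketch; the table is the one from the test quoted above, and the batch size is illustrative, not a recommendation):

```sql
-- Schema from the test case quoted earlier in the thread.
CREATE TABLE test (id int4, aaa int4, PRIMARY KEY (id));
INSERT INTO test VALUES (0, 1);

-- The client runs the update in a loop; after each batch, a plain
-- VACUUM marks the dead row versions as reusable. It takes no
-- exclusive lock, so the application keeps running.
UPDATE test SET aaa = 1 WHERE id = 0;   -- executed, say, 10000 times
VACUUM ANALYZE test;                    -- then reclaim dead tuples
                                        -- (run outside a transaction block)
```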

Thanks,

Stephen

#25snacktime
snacktime@gmail.com
In reply to: Noname (#12)
Re: more anti-postgresql FUD

1. create table test (id int4, aaa int4, primary key (id));
2. insert into test values (0,1);
3. Execute "update test set aaa=1 where id=0;" in an endless loop

I just did the test on PostgreSQL 7.4.12 and MySQL 5.0.22 (MyISAM,
sorry had no configured InnoDB). Ubuntu 6.0.6, AMD64, 2GB, default
database settings.

MySQL performs very well, approximately 15000-20000 updates per second
with no degradation of performance.

PostgreSQL does approximately 1600 records per second for the first
10000, then 200rps for the first 100k records, and then slower and
slower downgrading to 10-20 rps(!!!) when reaching 300k.

Something is wrong with your test code. If I had to guess I would say
you did all the updates in a single transaction without committing
them, in which case yes it will slow down until you commit.

#26Noname
alexei.vladishev@gmail.com
In reply to: snacktime (#25)
Re: more anti-postgresql FUD

1. create table test (id int4, aaa int4, primary key (id));
2. insert into test values (0,1);
3. Execute "update test set aaa=1 where id=0;" in an endless loop

I just did the test on PostgreSQL 7.4.12 and MySQL 5.0.22 (MyISAM,
sorry had no configured InnoDB). Ubuntu 6.0.6, AMD64, 2GB, default
database settings.

MySQL performs very well, approximately 15000-20000 updates per second
with no degradation of performance.

PostgreSQL does approximately 1600 records per second for the first
10000, then 200rps for the first 100k records, and then slower and
slower downgrading to 10-20 rps(!!!) when reaching 300k.

Something is wrong with your test code. If I had to guess I would say
you did all the updates in a single transaction without committing
them, in which case yes it will slow down until you commit.

No, I'm not doing all the updates in a single transaction. Is it so
hard to repeat my test in your environment? :) It would take 5min to
see my point.

#27Noname
alexei.vladishev@gmail.com
In reply to: Stephen Frost (#24)
Re: more anti-postgresql FUD

Unfortunately PostgreSQL performs much slower than MySQL doing large
number of updates for one single table. By its nature ZABBIX needs
to execute hundreds of updates per second for large installations.
PostgreSQL cannot handle this nicely.

If you refuse to vacuum (or have the table autovacuumed) then sure. Of
course, I don't know of anyone who actually uses PostgreSQL who would
run a system like that.

In order to keep the performance of a busy application steady, I had to
run vacuum every 10 seconds. As I said earlier, ZABBIX Server
does hundreds of updates per second and the performance of the updates
degrades very fast.

I just did the test on PostgreSQL 7.4.12 and MySQL 5.0.22 (MyISAM,
sorry had no configured InnoDB). Ubuntu 6.0.6, AMD64, 2GB, default
database settings.

Don't say 'sorry' to us for using MyISAM (though it pretty much
invalidates the test), say 'sorry' to your users... You can try running
Postgres with fsync=off but I would strongly recommend against it in a
production environment (just like I'd strongly recommend against
MyISAM).

Yes, I believe fsync=on was in effect during my tests; the option is
commented out in the PostgreSQL config file. It explains the worse
performance of PostgreSQL for the first 20K updates, but my
observations are still valid.

MySQL performs very well, approximately 15000-20000 updates per second
with no degradation of performance.

PostgreSQL does approximately 1600 records per second for the first
10000, then 200rps for the first 100k records, and then slower and
slower downgrading to 10-20 rps(!!!) when reaching 300k.

If you periodically vacuum the table (where periodically most likely
would mean after some number of write transactions) I expect you'd find
Postgres performance to at *least* stabilize. If you vacuum with a
periodicity reasonably proportioned to your update statement frequency you'd
find that it will *improve* performance and Postgres will provide
*consistent* performance.

Yes, I'm aware of autovacuuming, etc. But it eats resources and I
cannot afford to run it periodically because I want steady performance
from my application. I do not want to see ZABBIX performing slower just
because of a database housekeeper.

This, above all things imv, would be FUD here. Vacuum/autovacuum aren't
something to be feared as damaging, detrimental, or resource hogging.
Vacuum doesn't take an exclusive lock and moves along quite decently if
done with an appropriate frequency. If you wait far, far, too long to
do a vacuum (to the point where you've got 10x as many dead tuples as
live ones) then sure it'll take a while, but that doesn't make it
resource hogging when you consider what you're having it do.

Face it, if one does hundreds of updates per second on one table (that's
exactly what ZABBIX does, and not on a one-record(!) table as in my
simple test), performance degrades so fast that vacuum has to be
executed once every 5-15 seconds to keep good performance. The vacuum
will run at least several seconds with high disk io. Do you think it
won't make "PostgreSQL at least 10x slower than MySQL" as stated in the
manual? What are we discussing here? :)

And by the way, ZABBIX does periodically execute vacuum for a subset of
tables; the functionality is built into ZABBIX.

Several years ago I contacted PostgreSQL developers but unfortunately
the only answer was "Run vacuum. We won't change PostgreSQL to reuse
unused tuples for updates".

That's exactly what vacuum *does*, it marks dead tuples as being
available for reuse. Please understand that vacuum != vacuum full.

Perhaps something has changed in recent releases of PostgreSQL, but I
don't think so. Please correct me if I'm wrong.

I'm afraid there's a bit of a misunderstanding about what vacuum is for
and how it can affect the behaviour of Postgres. Please, please forget
whatever notion you currently have of vacuum and actually run some tests
with it, and post back here (or -performance) if you run into problems,
have questions or concerns. I expect you could also tune autovacuum to
be frequent enough on the appropriate tables that you wouldn't have to
intersperse your own vacuum commands in. Also, as pointed out, current
releases (8.1) have quite a few enhancements and performance
improvements.

I will try to experiment with a newer PostgreSQL when I find some time.
I'm sure PostgreSQL is making very good progress, and I'm really happy
to see that PostgreSQL has become an excellent alternative to
Oracle/DB2/Informix.

#28Noname
alexei.vladishev@gmail.com
In reply to: Merlin Moncure (#23)
Re: more anti-postgresql FUD

I'm the author and maintainer of ZABBIX and the manual. I would like to
add some comments to the thread.

just so you know, I brought this up after taking a look at the zabbix
software, which is in my opinion very excellent. I came across a
little strong in my comments and peter e was correct in pointing out
that the performance related comments were not 'fud'. I felt a little
bad after opening this thread but you have to take this in the context of
the bigger picture. The postgresql people have been dealing with
(sometimes) unfounded prejudices for years.

No worries! :)

Unfortunately PostgreSQL performs much slower than MySQL doing large
number of updates for one single table. By its nature ZABBIX needs
to execute hundreds of updates per second for large installations.
PostgreSQL cannot handle this nicely.

Do a simple test to see my point:

1. create table test (id int4, aaa int4, primary key (id));
2. insert into test values (0,1);
3. Execute "update test set aaa=1 where id=0;" in an endless loop

this is a very contrived test:
1. nothing really going on
2. no data
3. single user test
4. zabbix doesn't do this, nor does anything else
5. proves nothing.

zabbix is a bit more complex than that with multiple users, tables and
the occasional join. With a high number of servers in play things
might go differently than you expect.

I cannot agree. Yes, the ZABBIX software is much more complex than the
test here. But the performance of core functions of ZABBIX Server
depends very much on the speed of update operations. The goal of the
test was to demonstrate the very fast performance degradation of the
updates.

I'm sure PostgreSQL would perform nicely for a large database with a
large number of users, but I just wanted to prove my statement from the
manual.

...

well, I am playing with zabbix with the possible eventuality of
rolling it out in our servers I might be able to get you some hard
data on performance. By the way, I'm currently managing a
spectacularly large mysql database which is getting moved to
postgresql with the next release of the software -- in part because I
was able to show that postgresql gave much more reliable performance
in high load environments.

In light of this discussion, I might be interested in running a little
test to see how zabbix would hold up on postgresql under an
artificially high load. If I were to show that things weren't quite so
one-sided as you assumed, would you be willing to say as much in your
documentation? :-)

I would be very interested in any real-life experience running a large
ZABBIX installation under PostgreSQL. Feel free to send me your
results. Yes, I'm ready to change the manual, no doubt! :)

Cheers,
Alexei

#29Chris Mair
chrisnospam@1006.org
In reply to: Noname (#12)
Re: more anti-postgresql FUD

Do a simple test to see my point:

1. create table test (id int4, aaa int4, primary key (id));
2. insert into test values (0,1);
3. Execute "update test set aaa=1 where id=0;" in an endless loop

I just did the test on PostgreSQL 7.4.12 and MySQL 5.0.22 (MyISAM,
sorry had no configured InnoDB). Ubuntu 6.0.6, AMD64, 2GB, default
database settings.

MySQL performs very well, approximately 15000-20000 updates per second
with no degradation of performance.

PostgreSQL does approximately 1600 records per second for the first
10000, then 200rps for the first 100k records, and then slower and
slower downgrading to 10-20 rps(!!!) when reaching 300k.

Hi,
it would be cool if you could at least:

- bundle your updates into transactions of, say, 1000 updates at a time
i.e. wrap a BEGIN; END; around a 1000 of them
- run postgresql with fsync off, since you're using MyISAM
- run at least PostgreSQL 8, since you're running MySQL 5

I'd bet MySQL would still be faster on such an artificial, single user
test, but not *that much* faster.

If you don't want to install 8.0, could you maybe at least do the first
two items (shouldn't be a lot of work)...?

Which client are you using? Just mysql/psql or some API?

Bye, Chris.

#30Noname
alexei.vladishev@gmail.com
In reply to: Chris Mair (#29)
Re: more anti-postgresql FUD

it would be cool if you could at least:

- bundle your updates into transactions of, say, 1000 updates at a time
i.e. wrap a BEGIN; END; around a 1000 of them
- run postgresql with fsync off, since you're using MyISAM
- run at least PostgreSQL 8, since you're running MySQL 5

I'd bet MySQL would still be faster on such an artificial, single user
test, but not *that much* faster.

I'm quite sure the results will be very close to what I got before even
if I do all of the above. My post was not about MySQL vs PostgreSQL. It
was about very fast performance degradation of PostgreSQL in case of
large number of updates provided vacuum is not used.

If you don't want to install 8.0, could you maybe at least do the first
two items (shouldn't be a lot of work)...?

Which client are you using? Just mysql/psql or some API?

C API

Alexei

#31Scott Ribe
scott_ribe@killerbytes.com
In reply to: Geoffrey (#21)
Re: more anti-postgresql FUD

Used to run a time reporting system on a 3b2
400, 4MB Ram, WE32100 10MHz processor, 1.1 MIPS.

You had zeroes? We had to use the letter "O"!

--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice

#32Tim Tassonis
timtas@cubic.ch
In reply to: Joshua D. Drake (#19)
Re: more anti-postgresql FUD

Joshua D. Drake wrote:

Well, that is hardly surprising. What exactly is your point?

If you want to write portable software, you usually stay with generally
available, standardized features or API's, be it "database independent",
"platform independent", you name it. You certainly don't go for
user-defined types. I really think all the nice features and
capabilities of PostgreSQL are great, but I would never, ever start
using any of them extensively in a project that might have to run on
another database. Ever heard of vendor lock-in and "embrace and extend"?

Bah! Ever heard of crappy software because of database independence?

No, actually not. I have certainly heard about buggy, badly performing
software and about software not fitting its goal, but that is mostly due
to other reasons than database independence.

I know a lot of crappy, database dependent applications.

I have yet to see a good application that supports "database
independence".

If you are talking about high-end applications (big databases with lots
of transactions), you're of course right. However, there are a lot of
applications with small or medium-sized databases and not so many
transactions, where you don't need to get the best out of your RDBMS for
decent performance.

With a good design and some experience in portability in general, you
will be able to write a good, "quite" database independent application,
supporting some of the more standardized RDBMS's.

Bye
Tim

#33Jim C. Nasby
jim@nasby.net
In reply to: Tim Tassonis (#32)
Re: more anti-postgresql FUD

On Thu, Oct 12, 2006 at 07:40:42PM +0200, Tim Tassonis wrote:

I have yet to see a good application that supports "database
independence".

If you are talking about high-end applications (big databases with lots
of transactions), you're of course right. However, there are a lot of
applications with small or medium-sized databases and not so many
transactions, where you don't need to get the best out of your RDBMS for
decent performance.

With a good design and some experience in portability in general, you
will be able to write a good, "quite" database independent application,
supporting some of the more standardized RDBMS's.

Actually, back when I worked at http://ud.com we had a fairly complex
database that could see a heavy transactional load and was actually
quite database independent (we developed on DB2, supported Oracle, could
have supported MSSQL and at one point actually had it running on
PostgreSQL). The application made extensive use of functions/stored
procedures, constraints, RI, and triggers.

How did this miracle occur? Well... the database code itself wasn't
actually database independent. It was generated by a bunch of XSLT that
we developed (we called it 'datadef'). In addition to generating
database generation code (SQL), it generated C accessor functions (much
of the code was in C), documentation, and some other things.

In fact, it even had the ability to generate code in a language it could
understand. This allowed us to do enums in such a way that the C code
had a real enum type, and the database had a table that stored the same
information. You could then refer to that enum anywhere in a table
definition in datadef, and an appropriate field definition would be
created, complete with RI back to the appropriate parent table.

Sadly, my understanding is that management didn't want datadef
open-sourced.

But you can actually write good code that will run on multiple
databases if you're willing to write the tools to allow you to do it.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#34Christopher Browne
cbbrowne@acm.org
In reply to: Guy Rouillier (#14)
Re: more anti-postgresql FUD

After a long battle with technology, jim@nasby.net ("Jim C. Nasby"), an earthling, wrote:

But you can actually write good code that will run on multiple
databases if you're willing to write the tools to allow you to do it.

There's an argument out there that we don't actually have relational
databases (the "fine point" there being that it is technically
permissible to create tables that lack a primary key), but rather
"toolboxes" that might be used to construct relational systems.

That kind of fits with that sort of toolkit approach...
--
"cbbrowne","@","gmail.com"
http://cbbrowne.com/info/nonrdbms.html
"I'm sorry, Mr. Kipling, but you just don't know how to use the
English Language." -- Editor of the San Francisco Examiner, informing
Rudyard Kipling, who had one article published in the newspaper, that
he needn't bother submitting a second, 1889

#35Alexander Staubo
alex@purefiction.net
In reply to: Noname (#12)
Re: more anti-postgresql FUD

On Oct 11, 2006, at 16:54 , alexei.vladishev@gmail.com wrote:

I'm the author and maintainer of ZABBIX and the manual. I would like to
add some comments to the thread.

[snip]

I just did the test on PostgreSQL 7.4.12 and MySQL 5.0.22 (MyISAM,
sorry had no configured InnoDB). Ubuntu 6.0.6, AMD64, 2GB, default
database settings.

PostgreSQL 7.4 was released in Nov 2003, and 7.4.12 does not (afaik)
include any performance enhancements. MySQL 5.0.22 came out in May
2006 and, despite the low version number, includes a number of
additional features and performance enhancements.

You might start by comparing apples to apples; "apt-get install
postgresql-8.1".

PostgreSQL does approximately 1600 records per second for the first
10000, then 200rps for the first 100k records, and then slower and
slower downgrading to 10-20 rps(!!!) when reaching 300k.

You are absolutely right that PostgreSQL performs significantly worse
than MySQL at this extremely artificial test.

On my box (Dell PowerEdge 1850, dual Xeon 2.8GHz, 4GB RAM, 10kRPM
SCSI, Linux 2.6.15, Ubuntu) I get 1,100 updates/sec, compared to
10,000 updates/sec with MySQL/InnoDB, using a stock installation of
both. Insert performance is only around 10% worse than MySQL at
around 9,000 rows/sec. Curiously enough, changing shared_buffers,
wal_buffers, effective_cache_size and even fsync seems to have no
effect on update performance, while fsync has a decent effect on
insert performance.

Alexander.

#36Andrew - Supernews
andrew+nonews@supernews.com
In reply to: Merlin Moncure (#1)
Re: more anti-postgresql FUD

On 2006-10-13, Alexander Staubo <alex@purefiction.net> wrote:

On my box (Dell PowerEdge 1850, dual Xeon 2.8GHz, 4GB RAM, 10kRPM
SCSI, Linux 2.6.15, Ubuntu) I get 1,100 updates/sec, compared to
10,000 updates/sec with MySQL/InnoDB, using a stock installation of
both. Insert performance is only around 10% worse than MySQL at
around 9,000 rows/sec. Curiously enough, changing shared_buffers,
wal_buffers, effective_cache_size and even fsync seems to have no
effect on update performance, while fsync has a decent effect on
insert performance.

Your disk probably has write caching enabled. A 10krpm disk should be
limiting you to under 170 transactions/sec with a single connection
and fsync enabled.

I also did some tests on this, and even though the machine I was testing
on had some competing database activity, autovacuum was effective at
keeping the table size stable (at 70-odd pages) when running several
hundred thousand updates on a 1-row table.

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

#37Alexander Staubo
alex@purefiction.net
In reply to: Andrew - Supernews (#36)
Re: more anti-postgresql FUD

On Oct 13, 2006, at 17:13 , Andrew - Supernews wrote:

On 2006-10-13, Alexander Staubo <alex@purefiction.net> wrote:

On my box (Dell PowerEdge 1850, dual Xeon 2.8GHz, 4GB RAM, 10kRPM
SCSI, Linux 2.6.15, Ubuntu) I get 1,100 updates/sec, compared to
10,000 updates/sec with MySQL/InnoDB, using a stock installation of
both. Insert performance is only around 10% worse than MySQL at
around 9,000 rows/sec. Curiously enough, changing shared_buffers,
wal_buffers, effective_cache_size and even fsync seems to have no
effect on update performance, while fsync has a decent effect on
insert performance.

Your disk probably has write caching enabled. A 10krpm disk should be
limiting you to under 170 transactions/sec with a single connection
and fsync enabled.

What formula did you use to get to that number? Is there a generic
way on Linux to turn off (controller-based?) write caching?

Alexander.

#38Merlin Moncure
mmoncure@gmail.com
In reply to: Joshua D. Drake (#2)
Re: more anti-postgresql FUD

On 10/13/06, Roman Neuhauser <neuhauser@sigpipe.cz> wrote:

# mmoncure@gmail.com / 2006-10-10 14:16:19 -0400:

FUD from another open source project is really poor form, particularly
when not in competing segments where a little bit of competitive
rivalry is expected.

OMG WTF what FUD???

please see my later comments. 'fud' is not a great term. however, if
you are going to publish remarks about another project that might be
perceived as disparaging, please keep them up to date and factually
relevant. I can write queries that are 10x slower on mysql than on
postgresql but that ultimately means nothing. the major point though
is that zabbix does *not* run 10x slower on postgresql and I am going
to prove it.

btw, i never said anything disparaging about mysql or zabbix. i am
focused like a laser beam on the comments in the documentation and the
greater implications for the community.

I had a large (several million rows), indexed table, same data, in
MySQL (4.0.x) and PostgreSQL (late 7.4), on the same RHEL or FreeBSD
(don't remember) machine. Walking over the table with

SELECT * FROM TABLE ORDER BY pk LIMIT 10 OFFSET N;

using offset to walk a table is extremely poor form because of:
* poor performance
* single user mentality
* flat file mentality

databases are lousy at this because they inherently do not support
absolute addressing of data -- nor should they, because this is not
what sql is all about. in short, 'offset' is a hack, albeit a useful
one in some cases, but don't gripe when it doesn't deliver the goods.

for server side browsing use cursors or a hybrid pl/pgsql loop. for
client side, browse by fetching relative to the last key:

select * from foo where p > p1 order by p limit k;

in 8.2, we get proper comparisons so you can do this with multiple part keys:

select * from foo where (a, b, c) > (a1, b1, c1) order by a, b, c limit k;

for fast dynamic browsing you can vary k for progressive fetches.
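Putting the pieces above together (a sketch: the table and index names are made up for illustration, and p1 stands for the last key value the client fetched, bound as a parameter):

```sql
-- An index on the ordering key keeps every page fetch an index scan,
-- no matter how deep into the table the client has browsed.
CREATE INDEX foo_p_idx ON foo (p);

-- First page:
SELECT * FROM foo ORDER BY p LIMIT 10;
-- Subsequent pages: seek past the last key seen instead of using OFFSET:
SELECT * FROM foo WHERE p > p1 ORDER BY p LIMIT 10;
```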

or the MySQL equivalent, MySQL was several times faster than
PostgreSQL, but the times were getting longer and longer....
As N grew in increments of 10, it took ages for MySQL to return
the rows. PostgreSQL... Well, it was as "slow" with N=100000 as it was
with N=0.

* MySQL is used as a primary development platform.

How does *this* qualify as FUD? Or are *you* spreading FUD to scare
people from even mentioning the software?

I think zabbix is fine software. I would simply prefer that if
someone were to write what could be perceived as negative things about
postgresql, they would back it up with better facts than 'update foo
set id = 0' run a million times or 'select * from foo limit 1 offset
100000'.

I don't like MySQL. I hate it when people put cheerleading where reason
should prevail.

outside of the 'fud' statement, which was a hastily written reaction,
my tone has been one of constructive criticism.

merlin

#39Andrew - Supernews
andrew+nonews@supernews.com
In reply to: Merlin Moncure (#1)
Re: more anti-postgresql FUD

On 2006-10-13, Alexander Staubo <alex@purefiction.net> wrote:

On Oct 13, 2006, at 17:13 , Andrew - Supernews wrote:

Your disk probably has write caching enabled. A 10krpm disk should be
limiting you to under 170 transactions/sec with a single connection
and fsync enabled.

What formula did you use to get to that number?

It's just the number of disk revolutions per second: a 10krpm disk makes
10,000/60 ≈ 167 revolutions per second. Without caching, each
WAL flush tends to require a whole revolution unless the on-disk layout of
the filesystem is _very_ strange. You can get multiple commits per WAL
flush if you have many concurrent connections, but with a single connection
that doesn't apply.

Is there a generic
way on Linux to turn off (controller-based?) write caching?

I don't use Linux, sorry. Modern SCSI disks seem to ship with WCE=1 on
mode page 8 on the disk, thus enabling evil write caching by default.

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

#40Alexander Staubo
alex@purefiction.net
In reply to: Andrew - Supernews (#39)
Re: more anti-postgresql FUD

On Oct 13, 2006, at 17:35 , Andrew - Supernews wrote:

On 2006-10-13, Alexander Staubo <alex@purefiction.net> wrote:

On Oct 13, 2006, at 17:13 , Andrew - Supernews wrote:

Your disk probably has write caching enabled. A 10krpm disk should be
limiting you to under 170 transactions/sec with a single connection
and fsync enabled.

What formula did you use to get to that number?

It's just the number of disk revolutions per second. Without caching,
each WAL flush tends to require a whole revolution unless the on-disk
layout of the filesystem is _very_ strange. You can get multiple commits
per WAL flush if you have many concurrent connections, but with a single
connection that doesn't apply.

Makes sense. However, in this case I was batching updates in
transactions and committing each txn at 1 second intervals, all on a
single connection. In other words, the bottleneck illustrated by this
test should not be related to fsyncs, and this does not seem to
explain the huge discrepancy between update (1,000/sec) and insert
(9,000 inserts/sec, also in 1-sec txns) performance.

Alexander.

#41Andrew - Supernews
andrew+nonews@supernews.com
In reply to: Merlin Moncure (#1)
Re: more anti-postgresql FUD

On 2006-10-13, Alexander Staubo <alex@purefiction.net> wrote:

Makes sense. However, in this case I was batching updates in
transactions and committing each txn at 1 second intervals, all on a
single connection. In other words, the bottleneck illustrated by this
test should not be related to fsyncs, and this does not seem to
explain the huge discrepancy between update (1,000/sec) and insert
(9,000 inserts/sec, also in 1-sec txns) performance.

Update has to locate the one live row version amongst all the dead ones;
insert doesn't need to bother.

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

#42Roman Neuhauser
neuhauser@sigpipe.cz
In reply to: Merlin Moncure (#1)
Re: more anti-postgresql FUD

# mmoncure@gmail.com / 2006-10-10 14:16:19 -0400:

FUD from another open source project is really poor form, particularly
when not in competing segments where a little bit of competitive
rivalry is expected.

OMG WTF what FUD???

# mmoncure@gmail.com / 2006-10-10 13:55:57 -0400:

http://www.zabbix.com/manual/v1.1/install.php

recent benchmarks using ZABBIX clearly show that PostgreSQL
(7.1.x) is at least 10 times slower than MySQL (3.23.29)

Note: These results are predictable. ZABBIX server processes use
simple SQL statements like single row INSERT, UPDATE and simple SELECT
operators. In such environment, use of advanced SQL engine (like
PostgreSQL) is overkill.

That's true.

* no need to constantly run resource-hungry command "vacuum" for MySQL

Last time I used MySQL that was true.

Some time ago I did a simplistic, but quite telling, test.

I had a large (several million rows), indexed table, same data, in
MySQL (4.0.x) and PostgreSQL (late 7.4), on the same RHEL or FreeBSD
(don't remember) machine. Walking over the table with

SELECT * FROM TABLE ORDER BY pk LIMIT 10 OFFSET N;

or the MySQL equivalent, MySQL was several times faster than
PostgreSQL, but the times were getting longer and longer....
As N grew in increments of 10, it took ages for MySQL to return
the rows. PostgreSQL... Well, it was as "slow" with N=100000 as it was
with N=0.

* MySQL is used as a primary development platform.

How does *this* qualify as FUD? Or are *you* spreading FUD to scare
people from even mentioning the software?

--
I don't like MySQL. I hate it when people put cheerleading where reason
should prevail.

#43Martijn van Oosterhout
kleptog@svana.org
In reply to: Andrew - Supernews (#39)
Re: more anti-postgresql FUD

On Fri, Oct 13, 2006 at 03:35:37PM -0000, Andrew - Supernews wrote:

It's just the number of disk revolutions per second. Without caching, each
WAL flush tends to require a whole revolution unless the on-disk layout of
the filesystem is _very_ strange. You can get multiple commits per WAL
flush if you have many concurrent connections, but with a single connection
that doesn't apply.

Is that really true? In theory block n+1 could be half a revolution
after block n, allowing you to commit two transactions per revolution.

If you work with the assumption that blocks are consecutive I can see
your point, but is that a safe assumption?

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

From each according to his ability. To each according to his ability to litigate.

#44Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martijn van Oosterhout (#43)
Re: more anti-postgresql FUD

Martijn van Oosterhout <kleptog@svana.org> writes:

On Fri, Oct 13, 2006 at 03:35:37PM -0000, Andrew - Supernews wrote:

It's just the number of disk revolutions per second. Without caching, each
WAL flush tends to require a whole revolution unless the on-disk layout of
the filesystem is _very_ strange.

Is that really true? In theory block n+1 could be half a revolution
after block n, allowing you to commit two transactions per revolution.

Not relevant, unless the prior transaction happened to end exactly at a
WAL block boundary. Otherwise, you still have to re-write the back end
of the same disk block the previous transaction wrote into. (In
practice, for the sort of tiny transactions that are at stake here,
quite a few xacts fit into a single WAL block so the same block is
rewritten several times before moving on to the next.)

There was a long thread in -hackers a couple years back exploring ways
to break this "1 xact per disk rotation" barrier with more creative
layouts of the WAL files, but nobody could come up with something that
looked reasonably robust --- ie, both safe and not full of unsupportable
assumptions about knowing exactly where everything actually is on the
disk platter. It'd still be interesting if anyone gets a new idea...

regards, tom lane

#45Merlin Moncure
mmoncure@gmail.com
In reply to: Tom Lane (#44)
Re: more anti-postgresql FUD

On 10/13/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Martijn van Oosterhout <kleptog@svana.org> writes:

Is that really true? In theory block n+1 could be half a revolution
after block n, allowing you to commit two transactions per revolution.

Not relevant, unless the prior transaction happened to end exactly at a

does full page writes setting affect this?

merlin

#46Jeff Davis
pgsql@j-davis.com
In reply to: Merlin Moncure (#45)
Re: more anti-postgresql FUD

On Fri, 2006-10-13 at 13:52 -0400, Merlin Moncure wrote:

On 10/13/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Martijn van Oosterhout <kleptog@svana.org> writes:

Is that really true? In theory block n+1 could be half a revolution
after block n, allowing you to commit two transactions per revolution.

Not relevant, unless the prior transaction happened to end exactly at a

does full page writes setting affect this?

No, full page writes only affects checkpoints.

For a transaction to commit, some bits must hit permanent storage
*somewhere*. If that location is in one general area on disk, you must
either commit several transactions at once (see commit_delay), or you
must wait until the next revolution to get back to that area of the
disk.
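Jeff's point reduces to simple arithmetic: with caching off, a lone connection can flush WAL at most once per platter revolution. A minimal sketch (the RPM figures below are just common drive speeds, not numbers from this thread):

```python
# With write caching disabled, each WAL flush from a single connection
# waits for the platter to come back around, so fsync'd commits/sec is
# bounded by the disk's rotation rate.
def max_commits_per_sec(rpm):
    """Upper bound: one committed transaction per full revolution."""
    return rpm / 60.0

for rpm in (7200, 10000, 15000):
    print(f"{rpm} RPM -> at most ~{max_commits_per_sec(rpm):.0f} commits/sec")
```

This is why a 7200 RPM disk tops out around 120 single-connection commits per second, regardless of how fast the CPU is.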

Regards,
Jeff Davis

#47Jim C. Nasby
jim@nasby.net
In reply to: Merlin Moncure (#45)
Re: more anti-postgresql FUD

On Fri, Oct 13, 2006 at 01:52:10PM -0400, Merlin Moncure wrote:

On 10/13/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Martijn van Oosterhout <kleptog@svana.org> writes:

Is that really true? In theory block n+1 could be half a revolution
after block n, allowing you to commit two transactions per revolution.

Not relevant, unless the prior transaction happened to end exactly at a

does full page writes setting affect this?

If anything it makes it more true, but full pages are only written the
first time a page is dirtied after a checkpoint, so in a
high-transaction system I suspect they don't have a lot of impact.

It would be nice to have stats on how many transactions have to write a
full page, as well as how many have been written, though...
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#48Stephen Frost
sfrost@snowman.net
In reply to: Alexander Staubo (#37)
Re: more anti-postgresql FUD

* Alexander Staubo (alex@purefiction.net) wrote:

What formula did you use to get to that number? Is there a generic
way on Linux to turn off (controller-based?) write caching?

Just a side-note, but if you've got a pretty good expectation that you
won't be without power for 24 consecutive hours ever you can get a
controller with a battery-backed write cache (some will do better than
24 hours too). For the performance concerned... :)

Thanks,

Stephen

#49Joshua D. Drake
jd@commandprompt.com
In reply to: Stephen Frost (#48)
Re: more anti-postgresql FUD

Stephen Frost wrote:

* Alexander Staubo (alex@purefiction.net) wrote:

What formula did you use to get to that number? Is there a generic
way on Linux to turn off (controller-based?) write caching?

Just a side-note, but if you've got a pretty good expectation that you
won't be without power for 24 consecutive hours ever you can get a
controller with a battery-backed write cache (some will do better than
24 hours too). For the performance concerned... :)

Not to mention that if you are *that* concerned you could buy a generator for
500 bucks that will keep the machine alive if you absolutely have to.

There is nothing wrong with a write-back cache as long as you have the
infrastructure to support it.

Joshua D. Drake

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

#50Jeff Davis
pgsql@j-davis.com
In reply to: Jim C. Nasby (#47)
Re: more anti-postgresql FUD

On Fri, 2006-10-13 at 13:07 -0500, Jim C. Nasby wrote:

On Fri, Oct 13, 2006 at 01:52:10PM -0400, Merlin Moncure wrote:

On 10/13/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Martijn van Oosterhout <kleptog@svana.org> writes:

Is that really true? In theory block n+1 could be half a revolution
after block n, allowing you to commit two transactions per revolution.

Not relevant, unless the prior transaction happened to end exactly at a

does full page writes setting affect this?

If anything it makes it more true, but full pages are only written the
first time a page is dirtied after a checkpoint, so in a
high-transaction system I suspect they don't have a lot of impact.

It would be nice to have stats on how many transactions have to write a
full page, as well as how many have been written, though...

Maybe rather than the number of transactions that are forced to write
full pages, would it be useful to know the fraction of the WAL traffic
used for full page writes? Otherwise, a transaction that dirtied one
data page would be counted the same as a transaction that dirtied 100
data pages.

I guess it gets tricky though, because you really need to know the
difference between what the volume of WAL traffic is and what it would
be if full_page_writes was disabled.

That brings up a question. Does a full page write happen in addition to
a record of the changes to that page, or instead of a record of the
changes to that page? If the answer is "in addition" the calculation
would just be a count of the pages dirtied between checkpoints. Or am I
way off base?

But yes, statistics in that area would be useful to know whether you
need to crank up the checkpoint_timeout. Ideas?

Regards,
Jeff Davis

#51AgentM
agentm@themactionfaction.com
In reply to: Joshua D. Drake (#49)
Re: more anti-postgresql FUD

On Oct 13, 2006, at 14:36 , Joshua D. Drake wrote:

Stephen Frost wrote:

* Alexander Staubo (alex@purefiction.net) wrote:

What formula did you use to get to that number? Is there a generic
way on Linux to turn off (controller-based?) write caching?

Just a side-note, but if you've got a pretty good expectation that you
won't be without power for 24 consecutive hours ever you can get a
controller with a battery-backed write cache (some will do better than
24 hours too). For the performance concerned... :)

Not to mention that if you are *that* concerned you could buy a generator
for 500 bucks that will keep the machine alive if you absolutely have to.

There is nothing wrong with a write-back cache as long as you have the
infrastructure to support it.

Why does the battery have to be at that level? It seems like a
reasonable poor man's solution would be to have a standard $50 UPS
plugged in and have the UPS signal postgresql to shut down and sync.
Then, theoretically, it would be safe to run with fsync=off. The
level of risk seems the same, no?

-M

#52Merlin Moncure
mmoncure@gmail.com
In reply to: AgentM (#51)
Re: more anti-postgresql FUD

On 10/13/06, AgentM <agentm@themactionfaction.com> wrote:

Not to mention that if you are *that* concerned you could buy a generator
for 500 bucks that will keep the machine alive if you absolutely have to.

There is nothing wrong with a write-back cache as long as you have the
infrastructure to support it.

Why does the battery have to be at that level? It seems like a
reasonable poor man's solution would be to have a standard $50 UPS
plugged in and have the UPS signal postgresql to shut down and sync.
Then, theoretically, it would be safe to run with fsync=off. The
level of risk seems the same, no?

1. your UPS must be configured to power down your computer or you are
only delaying the inevitable for 10 minutes. (a raid BBU might stay
alive for 24 hours)

2. fewer points of failure: a UPS doesn't help you if your cpu fries,
power supply fries, memory fries, motherboard fries, o/s halts, etc
etc. :-)

3. experience has taught me not to put 100% faith in ups power switchover.

merlin

#53Joshua D. Drake
jd@commandprompt.com
In reply to: Merlin Moncure (#52)
Re: more anti-postgresql FUD

2. fewer points of failure: a UPS doesn't help you if your cpu fries,
power supply fries, memory fries, motherboard fries, o/s halts, etc
etc. :-)

3. experience has taught me not to put 100% faith in ups power switchover.

As a follow-up to this: we have line-conditioning natural gas
generators for all our equipment.

We had an outage once due to power... guess how?

An electrician blew the panel.

Joshua D. Drake

merlin

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

#54Andrew Sullivan
ajs@crankycanuck.ca
In reply to: Tom Lane (#44)
Re: more anti-postgresql FUD

On Fri, Oct 13, 2006 at 01:35:51PM -0400, Tom Lane wrote:

looked reasonably robust --- ie, both safe and not full of unsupportable
assumptions about knowing exactly where everything actually is on the
disk platter. It'd still be interesting if anyone gets a new idea...

Might it be the case that WAL is the one area where, for Postgres,
the cost of using raw disk could conceivably be worth the benefit?
(I.e. you end up having to write a domain-specific filesystemish
thing that is optimised for exactly your cases)? (And before you ask
me, no I'm not volunteering :( )

A

--
Andrew Sullivan | ajs@crankycanuck.ca
A certain description of men are for getting out of debt, yet are
against all taxes for raising money to pay it off.
--Alexander Hamilton

#55Joshua D. Drake
jd@commandprompt.com
In reply to: Noname (#27)
Re: more anti-postgresql FUD

Face it, if one does hundreds of updates per second for one table (that's
exactly what ZABBIX does, and not for a one-record(!) table as in my
simple test), performance degrades so fast that vacuum has to be
executed once per 5-15 seconds to keep good performance. The vacuum
will run for at least several seconds with high disk io. Do you think it
won't make "PostgreSQL at least 10x slower than MySQL" as stated in the
manual? What are we discussing here? :)

I am not sure what we are discussing actually. It is well known that
PostgreSQL cannot handle the type of update load you are talking about,
even with autovacuum.

Now, there are ways to make postgresql handle this *if* you
know what you are doing, with things like partitioning, but out of the
box this guy is right.

That being said, InnoDB would likely suffer from the same problems, and
the only reason his app works the way it does is because he is using MyISAM.

Joshua D. Drake

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

#56Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Sullivan (#54)
Re: more anti-postgresql FUD

Andrew Sullivan <ajs@crankycanuck.ca> writes:

On Fri, Oct 13, 2006 at 01:35:51PM -0400, Tom Lane wrote:

looked reasonably robust --- ie, both safe and not full of unsupportable
assumptions about knowing exactly where everything actually is on the
disk platter. It'd still be interesting if anyone gets a new idea...

Might it be the case that WAL is the one area where, for Postgres,
the cost of using raw disk could conceivably be worth the benefit?

Raw disk wouldn't do much of anything to increase my comfort factor...

In practice, the answer these days for anyone who's remotely serious
is "get a battery-backed write cache", so I'm not sure how tense we
need to be about devising application-level workarounds. BBWC was
rare and expensive the last time we discussed this seriously, but
it's not so much anymore.

regards, tom lane

#57Thomas Kellerer
spam_eater@gmx.net
In reply to: Noname (#12)
Re: more anti-postgresql FUD

alexei.vladishev@gmail.com wrote on 11.10.2006 16:54:

Do a simple test to see my point:

1. create table test (id int4, aaa int4, primary key (id));
2. insert into test values (0,1);
3. Execute "update test set aaa=1 where id=0;" in an endless loop

As others have pointed out, committing the data is a vital step when testing
the performance of a relational/transactional database.

What's the point of updating an infinite number of records and never committing
them? Or were you running in autocommit mode?
Of course MySQL will be faster if you don't have transactions. Just as a plain
text file will be faster than MySQL.

You are claiming that this test simulates the load that your application
puts on the database server. Does this mean that you never commit data when
running on MySQL?

This test also proves (in my opinion) that any multi-db application using
the lowest common denominator simply won't perform equally well on all
platforms. I'm pretty sure the same test would also show very bad performance
on an Oracle server.
It simply ignores the basic optimizations that one should do in a
transactional system (like batching updates, committing transactions, etc.).

Just my 0.02€
Thomas

#58Dann Corbit
DCorbit@connx.com
In reply to: Thomas Kellerer (#57)
Re: more anti-postgresql FUD

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
owner@postgresql.org] On Behalf Of Thomas Kellerer
Sent: Friday, October 13, 2006 2:11 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] more anti-postgresql FUD

alexei.vladishev@gmail.com wrote on 11.10.2006 16:54:

Do a simple test to see my point:

1. create table test (id int4, aaa int4, primary key (id));
2. insert into test values (0,1);
3. Execute "update test set aaa=1 where id=0;" in an endless loop

As others have pointed out, committing the data is a vital step when
testing the performance of a relational/transactional database.

What's the point of updating an infinite number of records and never
committing them? Or were you running in autocommit mode?
Of course MySQL will be faster if you don't have transactions. Just as a
plain text file will be faster than MySQL.

You are claiming that this test simulates the load that your application
puts on the database server. Does this mean that you never commit data
when running on MySQL?

This test also proves (in my opinion) that any multi-db application using
the lowest common denominator simply won't perform equally well on all
platforms. I'm pretty sure the same test would also show very bad
performance on an Oracle server.
It simply ignores the basic optimizations that one should do in a
transactional system (like batching updates, committing transactions, etc.).

Just my 0.02€
Thomas

In a situation where a ludicrously high volume of update transactions is expected, a tool like MonetDB would probably be a good idea:
http://monetdb.cwi.nl/

It's basically the freely available counterpart to TimesTen:
http://www.oracle.com/database/timesten.html

For an in-memory database, the high speed will require heaps and gobs of RAM, but then you will be able to do transactions 10x faster than anything else can.

It might be interesting to add fragmented column tubes in RAM {like MonetDB uses} for highly transactional tables to PostgreSQL some day.


#59Thomas Kellerer
spam_eater@gmx.net
In reply to: Noname (#12)
Re: more anti-postgresql FUD

alexei.vladishev@gmail.com wrote on 11.10.2006 16:54:

Do a simple test to see my point:

1. create table test (id int4, aaa int4, primary key (id));
2. insert into test values (0,1);
3. Execute "update test set aaa=1 where id=0;" in an endless loop

I just did the test on PostgreSQL 7.4.12 and MySQL 5.0.22 (MyISAM,
sorry had no configured InnoDB). Ubuntu 6.0.6, AMD64, 2GB, default
database settings.

MySQL performs very well, approximately 15000-20000 updates per second
with no degradation of performance.

Just a follow-up: if you base your choice of DBMS on this test, you have to
choose HSQLDB. I just ran this test on my WinXP AMD64 box, and it performed
constantly at ~40000 updates per second.

Thomas

#60Chris Mair
chrisnospam@1006.org
In reply to: Noname (#30)
Re: more anti-postgresql FUD

it would be cool if you could at least:

- bundle your updates into transactions of, say, 1000 updates at a time
i.e. wrap a BEGIN; END; around a 1000 of them
- run postgresql with fsync off, since you're using MyISAM
- run PostgreSQL at least 8, since you're running MySQL 5

I'd bet MySQL would still be faster on such an artificial, single user
test, but not *that much* faster.

I'm quite sure the results will be very close to what I get before even
if I do all of the above. My post was not about MySQL vs PostgreSQL. It
was about very fast performance degradation of PostgreSQL in case of
large number of updates provided vacuum is not used.

If you don't want to install 8.0, could you maybe at least do the first
two items (shouldn't be a lot of work)...?

Which client are you using? Just mysql/psql or some API?

C API

Ok,
I did some tests at last on this using the above 3 suggestions.
I ran on 8.1.4 on Linux with fsync=off and did 300k updates bundled
into 100 updates / 1 transaction (100 turned out to be a sweeter
spot than 1000).
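The batching used here can be sketched as follows; a minimal illustration that just emits the SQL text (table and column names come from the test case quoted earlier in the thread, and the interleaved VACUUM mirrors the every-10k/50k runs described below):

```python
# Sketch: bundle single-row UPDATEs into transactions of `batch_size`
# statements and run VACUUM every `vacuum_every` updates (outside any
# transaction). Emits SQL text only; feeding it to psql or a driver
# is left to the reader.
def make_script(n_updates, batch_size=100, vacuum_every=10000):
    stmts = []
    for i in range(n_updates):
        if i % batch_size == 0:
            stmts.append("BEGIN;")
        stmts.append("UPDATE test SET aaa = 1 WHERE id = 0;")
        if i % batch_size == batch_size - 1 or i == n_updates - 1:
            stmts.append("COMMIT;")
            if (i + 1) % vacuum_every == 0:
                stmts.append("VACUUM test;")
    return stmts

script = make_script(300000, batch_size=100, vacuum_every=10000)
```

The point of the structure is that each COMMIT is one WAL flush for 100 updates instead of one per update, while the periodic VACUUM keeps the dead-tuple chain on this one-row table short.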

The box was comparable to yours, I think: 1xOpteron 2.2GHz, 2xSATA
RAID0 (yes, I know...), 1GB RAM

Details and results are here:
http://www.1006.org/misc/20061014_pgupdates_bench/

The interesting part is the graph that shows updates / sec real time
vs. running total of updates:
http://www.1006.org/misc/20061014_pgupdates_bench/results.png

Let's start with the red crosses: that's without vacuum, and yes,
you're right: PG's performance degrades.

But, it doesn't degrade quite as badly as you mentioned
(you mentioned 1600u/s for the first 10k, then 200u/s for
the first 100k). At 100k I'm still at 2700u/s, down
from ~8000u/s. Only after ~140k updates does my rate drop
more quickly; I obviously bump into some limit given by my
setup there. The thing is totally CPU-bound, by the way.

Ok.
So, this is a very bizarre load for PostgreSQL, especially
without any vacuum.

Let's add some vacuum: every 50k (green x) or even every 10k
(blue *) updates - which is a very reasonable thing to do for
this type of load.

With vacuum, I get a stable performance all the way up to
300k updates. Rates are 4700 u/s or even 8500 u/s.

Note the curves show no drops when vacuum is active.

Out of curiosity I did a run having autovacuum visit the db
every 30 seconds (purple squares): even without any special
effort to find good vacuum spots, I can get a good 3300
updates/sec all the way up to 300k updates!

I'd dare to say that if you just ran ZABBIX on 8.1 with
autovacuum on with a shortish interval (30 sec?) you'd
get rid of your performance problems. Time to update
the documentation after all? ;)

Bye,
Chris.

--

Chris Mair
http://www.1006.org

#61Joshua D. Drake
jd@commandprompt.com
In reply to: Chris Mair (#60)
Re: more anti-postgresql FUD

I'd dare to say that if you just ran ZABBIX on 8.1 with
autovacuum on with a shortish interval (30 sec?) you'd
get rid of your performance problems. Time to update
the documentation after all? ;)

I would be curious to see what would happen if you added the vacuum_*
parameters to the equation (such as the delay).

Joshua D. Drake

Bye,
Chris.

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

#62Merlin Moncure
mmoncure@gmail.com
In reply to: Chris Mair (#60)
Re: more anti-postgresql FUD

On 10/14/06, Chris Mair <chrisnospam@1006.org> wrote:

Ok,
I did some tests at last on this using the above 3 suggestions.
I ran on 8.1.4 on Linux with fsync=off and did 300k updates bundled
into 100 updates / 1 transaction (100 turned out to be a sweeter
spot than 1000).

My postgresql 'magic number' is 150. 150 is the point at which I stop
getting meaningful improvements in two important cases: simple
update/insert transactions like yours, and also where the performance
improvement on fetching multiple rows levels off. In other words,
selecting 100k rows in 150-record chunks is marginally slower than
selecting the whole thing at once (and single-record selects are of
course much slower). In code, the number 150 is called 'merlin's
constant' :)

however, it's a pretty safe bet zabbix is not doing updates grouped in
transactions like that. on the other hand, the updates are not so
localized either.

Details and results are here:
http://www.1006.org/misc/20061014_pgupdates_bench/

wow, great chart!

8500 updates/sec is really spectacular. It proves that mvcc bloat on
small tables is controllable. On large tables, the bloat is usually
not as much of a concern and can actually be a good thing. You also
proved, in my opinion conclusively, that running vacuum in high update
environments is a good thing.

With vacuum, I get a stable performance all the way up to
300k updates. Rates are 4700 u/s or even 8500 u/s.

It looks like with careful tuning 10k could be cracked. Also, while
mvcc creates certain scenarios that have to be worked around, you
also get its advantages. Updates lock only the record being written
to, and only against other writers. mysql's MyISAM does full-table
locking... which is going to perform better on a 2p server with 100
users? 4p and 1000 users?

merlin

#63Merlin Moncure
mmoncure@gmail.com
In reply to: Chris Mair (#60)
Re: more anti-postgresql FUD

On 10/14/06, Chris Mair <chrisnospam@1006.org> wrote:

The interesting part is the graph that shows updates / sec real time
vs. running total of updates:
http://www.1006.org/misc/20061014_pgupdates_bench/results.png

one small thing: the variances inside the trendline are caused by
using integer timestamps... each slanted line is one second. The blue
line has a very slight wobble, which is the effect of the vacuum... it's
very slight. Actually, in this test it would probably be good to
vacuum extremely often, like every 100 records or so.

merlin

#64Alvaro Herrera
alvherre@commandprompt.com
In reply to: Merlin Moncure (#63)
Re: more anti-postgresql FUD

Merlin Moncure wrote:

On 10/14/06, Chris Mair <chrisnospam@1006.org> wrote:

The interesting part is the graph that shows updates / sec real time
vs. running total of updates:
http://www.1006.org/misc/20061014_pgupdates_bench/results.png

one small thing: the variances inside the trendline are caused by
using integer timestamps...each slanted line is one second. The blue
line has a very slight wobble which is the effects of the vacuum..its
very slight. Actually in this test it would probably be good to
vacuum extremely often, like every 100 records or so.

I was thinking what would happen if you used 8.2 for this test and had a
process continuously vacuuming the table, i.e. start a new vacuum as
soon as the previous one finished, with a reasonable vacuum_delay
setting (not sure what would qualify as reasonable; probably needs its
own set of tests to determine the sweet spot).

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#65Alban Hertroys
alban@magproductions.nl
In reply to: Merlin Moncure (#38)
Re: more anti-postgresql FUD

Merlin Moncure wrote:

On 10/13/06, Roman Neuhauser <neuhauser@sigpipe.cz> wrote:

SELECT * FROM TABLE ORDER BY pk LIMIT 10 OFFSET N;

using offset to walk a table is extremely poor form because of:
* poor performance
* single user mentality
* flat file mentality

databases are lousy at this because they inherently do not support
absolute addressing of data -- nor should they, because this is not
what sql is all about. in short, 'offset' is a hack, albeit a useful
one in some cases, but don't gripe when it doesn't deliver the goods.

for server side browsing use cursors or a hybrid pl/pgsql loop. for
client side, browse fetching relative to the last key:

select * from foo where p > p1 order by p limit k;

This does require some way for the client to keep a single transaction
open. If this kind of query is performed by a web application (as is
often the case), the "client" is the server side web script engine, and
not all of those beasts are capable of keeping a transaction open across
pages (PHP comes to mind).
This combined with expensive (complex) queries is regularly a pain.

The alternative solution of storing the query results in a temporary
table suffers from the same problem (the transaction is gone after the
first page).

I believe, as a result of this, it is not uncommon to pass the primary
key ids of all results on in a hidden field, so they are available for
quick querying on subsequent pages.

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede

// Integrate Your World //

#66Ivan Sergio Borgonovo
mail@webthatworks.it
In reply to: Alban Hertroys (#65)
Re: more anti-postgresql FUD

On Mon, 16 Oct 2006 11:05:33 +0200
Alban Hertroys <alban@magproductions.nl> wrote:

This does require some way for the client to keep a single
transaction open. If this kind of query is performed by a web
application (as is often the case), the "client" is the server side
web script engine, and not all of those beasts are capable of
keeping a transaction open across pages (PHP comes to mind).
This combined with expensive (complex) queries is regularly a pain.

But which scripting language in a web environment (read: running under Apache) is able to keep transactions open across pages?

--
Ivan Sergio Borgonovo
http://www.webthatworks.it

#67Merlin Moncure
mmoncure@gmail.com
In reply to: Alban Hertroys (#65)
Re: more anti-postgresql FUD

On 10/16/06, Alban Hertroys <alban@magproductions.nl> wrote:

Merlin Moncure wrote:

for server side browsing use cursors or a hybrid pl/pgsql loop. for
client side, browse fetching relative to the last key:

select * from foo where p > p1 order by p limit k;

This does require some way for the client to keep a single transaction
open. If this kind of query is performed by a web application (as is
often the case), the "client" is the server side web script engine, and
not all of those beasts are capable of keeping a transaction open across
pages (PHP comes to mind).
This combined with expensive (complex) queries is regularly a pain.

Server-side browsing requires transactions, so it is unsuitable for
certain types of web environments. However, client-side fetching as I
described is not... it is the right and proper way to solve this
problem. It's also why the sql row-wise comparison is so important,
because it provides an easy way to do this with tables that have
multiple-part keys.

merlin
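The keyset ("fetch relative to the last key") pattern Merlin describes can be sketched over in-memory rows; Python tuples compare lexicographically, which mirrors the SQL row-wise comparison (SELECT * FROM foo WHERE (a, b) > (%s, %s) ORDER BY a, b LIMIT k). All names here are illustrative:

```python
# Each page re-issues the query with the last key seen, instead of an
# ever-growing OFFSET; tuple comparison stands in for SQL's row-wise
# comparison on a multiple-part key.
def fetch_page(rows, last_key, limit):
    """Next `limit` rows whose (multi-part) key sorts after last_key."""
    return [r for r in sorted(rows) if r > last_key][:limit]

rows = [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a')]
page1 = fetch_page(rows, (0, ''), 2)
page2 = fetch_page(rows, page1[-1], 2)  # resumes after the last key of page1
```

With an index on the key, the database version of each page is a cheap range scan, whereas OFFSET must walk and discard all the skipped rows, which is why the cost grows with page number.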

#68Karen Hill
karen_hill22@yahoo.com
In reply to: Merlin Moncure (#38)
Re: more anti-postgresql FUD

"Merlin Moncure" wrote:

SELECT * FROM TABLE ORDER BY pk LIMIT 10 OFFSET N;

using offset to walk a table is extremely poor form because of:
* poor performance
* single user mentality
* flat file mentality

databases are lousy at this because they inherently do not support
absolute addressing of data -- nor should they, because this is not
what sql is all about. in short, 'offset' is a hack, albeit a useful
one in some cases, but don't gripe when it doesn't deliver the goods.

for server side browsing use cursors or a hybrid pl/pgsql loop. for
client side, browse fetching relative to the last key:

select * from foo where p > p1 order by p limit k;

in 8.2, we get proper comparisons so you can do this with multiple part keys:

select * from foo where (a1,b1,c1) > (a,b,c) order by a,b,c limit k;

I have 8.2 Beta 1 (Win32) on my home pc and offset was faster than
fetching relative to the last key as measured by explain analyze. This
was on a table with about 1,000 rows.

regards,

karen

#69Alvaro Herrera
alvherre@commandprompt.com
In reply to: Karen Hill (#68)
Re: more anti-postgresql FUD

Karen Hill wrote:

I have 8.2 Beta 1 (Win32) on my home pc and offset was faster than
fetching relative to the last key as measured by explain analyze. This
was on a table with about 1,000 rows.

For such a small table the difference is probably irrelevant. Try with
several million rows.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support