performace review

Started by Tomi NAover 19 years ago19 messagesgeneral
Jump to latest
#1Tomi NA
hefest@gmail.com

I was just reading http://www.opencrx.org/faq.htm where RDBMS engines
are one of the questions and see pgsql bashed sentence after sentence.
Can anyone offer any insight as to weather it's fact or FUD?

t.n.a.

#2Alexander Staubo
alex@purefiction.net
In reply to: Tomi NA (#1)
Re: performace review

On Oct 7, 2006, at 20:06 , Tomi NA wrote:

I was just reading http://www.opencrx.org/faq.htm where RDBMS engines
are one of the questions and see pgsql bashed sentence after sentence.
Can anyone offer any insight as to weather it's fact or FUD?

As with any use of a database, it is useless and/or impossible to
evaluate blanket statements about performance without also evaluating
the application's use of the database. One person's slow join may be
another person's fast join.

Also, note how that the FAQ says that "the default distribution of
openCRX includes some indexes", implying that the schema must be
adapted to one's database to achieve adequate performance. The FAQ
does not mention whether the cited scaling limits include any
attempts at tuning PostgreSQL.

I don't see PostgreSQL being "bashed sentence after sentence",
however -- the two "known limitations" listed for PostgreSQL are
"slow (even for small datasets)" and "jokes [sic] on 3-table-joins"
-- and among the open-source databases mentioned, PostgreSQL is
described as scaling the highest.

With a running OpenCRX installation you could turn on PostgreSQL's
query logging to identify slow queries or bad query patterns. I'm
sure both the open-source communities would appreciate the feedback.

Alexander.

#3Jonathan Vanasco
postgres@2xlp.com
In reply to: Alexander Staubo (#2)
Re: performace review

On Oct 7, 2006, at 3:31 PM, Alexander Staubo wrote:

I don't see PostgreSQL being "bashed sentence after sentence",
however -- the two "known limitations" listed for PostgreSQL are
"slow (even for small datasets)" and "jokes [sic] on 3-table-joins"
-- and among the open-source databases mentioned, PostgreSQL is
described as scaling the highest.

did you notice this line
---------------

*** Please note that we do not recommend PostgreSQL for production
use. While PostgreSQL may be fine for many settings this DBMS simply
does not deliver the performance required for openCRX (PostgreSQL
takes minutes/hours to calculate 3-table-joins even for small data
sets).

---------------

that sounds to me like someone never vacuumed and analyzed their db.

#4Richard Broersma Jr
rabroersma@yahoo.com
In reply to: Jonathan Vanasco (#3)
Re: performace review

did you notice this line
---------------

*** Please note that we do not recommend PostgreSQL for production
use. While PostgreSQL may be fine for many settings this DBMS simply
does not deliver the performance required for openCRX (PostgreSQL
takes minutes/hours to calculate 3-table-joins even for small data
sets).

---------------

that sounds to me like someone never vacuumed and analyzed their db.

If you look at the db comparison chart, I think the only row that they got right for every RDMS
listed was "FREE vs Commericial". They suggest that postgresql and mysql can only handle 20
concurrent users and this low value doesn't seems realistic to me at all.

Regards,

Richard Broersma Jr.

#5Brandon Aiken
BAiken@winemantech.com
In reply to: Tomi NA (#1)
Re: performace review

It wouldn't surprise me if their bashing were correct, but I doubt that it's PostgreSQL's fault.

I download the db source (inside opencrx-1.9.1-core.postgresql-8.zip) and executed their three schema files, dbcreate-indexes.sql, dbcreate-views.sql, dbcreate-tables.sql. Each of the 118 tables has a three-field composite primary key of 'PRIMARY KEY (object_rid, object_oid, object_idx)'. object_rid and object_oid are both VARCHAR(200). There are *no* foreign key constraints. Each table has between 15 and 50 fields, with 25 looking about average.

Gee, why to table joins take so long? Maybe because a blind monkey created the schema? Normalized databases do tend to perform better, so I hear.

Brandon Aiken

________________________________

From: pgsql-general-owner@postgresql.org on behalf of Tomi NA
Sent: Sat 10/7/2006 2:06 PM
To: PgSQL General
Subject: [GENERAL] performace review

I was just reading http://www.opencrx.org/faq.htm where RDBMS engines
are one of the questions and see pgsql bashed sentence after sentence.
Can anyone offer any insight as to weather it's fact or FUD?

t.n.a.

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

#6Alexander Staubo
alex@purefiction.net
In reply to: Brandon Aiken (#5)
Re: performace review

On Oct 7, 2006, at 23:44 , Brandon Aiken wrote:

I download the db source (inside opencrx-1.9.1-
core.postgresql-8.zip) and executed their three schema files,
dbcreate-indexes.sql, dbcreate-views.sql, dbcreate-tables.sql.
Each of the 118 tables has a three-field composite primary key of
'PRIMARY KEY (object_rid, object_oid, object_idx)'. object_rid and
object_oid are both VARCHAR(200). There are *no* foreign key
constraints. Each table has between 15 and 50 fields, with 25
looking about average.

To be fair, there are a bunch of indexes, but the number of indexes
seems low compared to the number of fields.

Gee, why to table joins take so long? Maybe because a blind monkey
created the schema? Normalized databases do tend to perform
better, so I hear.

*De*normalization is the traditional hack to speed up queries,
because it reduces the need for joins.

Alexander.

#7Joshua D. Drake
jd@commandprompt.com
In reply to: Tomi NA (#1)
Re: performace review

Tomi NA wrote:

I was just reading http://www.opencrx.org/faq.htm where RDBMS engines
are one of the questions and see pgsql bashed sentence after sentence.
Can anyone offer any insight as to weather it's fact or FUD?

It is 100% FUD.

Joshua D. Drake

t.n.a.

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

#8Brandon Aiken
BAiken@winemantech.com
In reply to: Tomi NA (#1)
Re: performace review

Denormalization should reduce the number of joins and reduce the overall number of tables, yes? And the idea is to fully normalize and then back off because of physical limitations in the database you're using *with full knowledge and understanding that you're sacrificing the relational model to do it*. They obviously did not do that. They just built a denormalized database.

Look at one of the views:

CREATE OR REPLACE VIEW kernel_view_027 AS
(( SELECT 0 AS object_rid, act.object_oid, act.object_idx, act0."p$$assigned_to__rid" AS "p$$object_parent__rid", act0."p$$assigned_to__oid" AS "p$$object_parent__oid", act.created_at, act.modified_at, act.created_by, act.modified_by, act.object__class, act.access_level_browse, act.access_level_update, act.access_level_delete, act."owner", act.name, act.description, act.percent_complete, act.priority, act.due_by, act.scheduled_start, act.scheduled_end, act.actual_start, act.actual_end, act.misc1, act.misc2, act.misc3, act.activity_state, act.activity_number, act.severity, act.assigned_to, act."p$$assigned_to__rid", act."p$$assigned_to__oid", ( SELECT acc.full_name
FROM kernel_account acc
WHERE acc.object_rid::text = act."p$$assigned_to__rid"::text AND acc.object_oid::text = act."p$$assigned_to__oid"::text AND acc.object_idx = 0) AS "p$$assigned_to_title"
FROM kernel_activity act
LEFT JOIN kernel_activity act0 ON act0.object_idx = 0 AND act.object_rid::text = act0.object_rid::text AND act.object_oid::text = act0.object_oid::text
UNION
SELECT 0 AS object_rid, act.object_oid, act.object_idx, act0."p$$sender_parent__rid" AS "p$$object_parent__rid", act0."p$$sender_parent__oid" AS "p$$object_parent__oid", act.created_at, act.modified_at, act.created_by, act.modified_by, act.object__class, act.access_level_browse, act.access_level_update, act.access_level_delete, act."owner", act.name, act.description, act.percent_complete, act.priority, act.due_by, act.scheduled_start, act.scheduled_end, act.actual_start, act.actual_end, act.misc1, act.misc2, act.misc3, act.activity_state, act.activity_number, act.severity, act.assigned_to, act."p$$assigned_to__rid", act."p$$assigned_to__oid", ( SELECT acc.full_name
FROM kernel_account acc
WHERE acc.object_rid::text = act."p$$assigned_to__rid"::text AND acc.object_oid::text = act."p$$assigned_to__oid"::text AND acc.object_idx = 0) AS "p$$assigned_to_title"
FROM kernel_activity act
LEFT JOIN kernel_activity act0 ON act0.object_idx = 0 AND act.object_rid::text = act0.object_rid::text AND act.object_oid::text = act0.object_oid::text)
UNION
SELECT 0 AS object_rid, act.object_oid, act.object_idx, p0."p$$party__rid" AS "p$$object_parent__rid", p0."p$$party__oid" AS "p$$object_parent__oid", act.created_at, act.modified_at, act.created_by, act.modified_by, act.object__class, act.access_level_browse, act.access_level_update, act.access_level_delete, act."owner", act.name, act.description, act.percent_complete, act.priority, act.due_by, act.scheduled_start, act.scheduled_end, act.actual_start, act.actual_end, act.misc1, act.misc2, act.misc3, act.activity_state, act.activity_number, act.severity, act.assigned_to, act."p$$assigned_to__rid", act."p$$assigned_to__oid", ( SELECT acc.full_name
FROM kernel_account acc
WHERE acc.object_rid::text = act."p$$assigned_to__rid"::text AND acc.object_oid::text = act."p$$assigned_to__oid"::text AND acc.object_idx = 0) AS "p$$assigned_to_title"
FROM kernel_activity act
JOIN kernel_activityparty p0 ON p0.object_idx = 0 AND p0."p$$object_parent__rid"::text = act.object_rid::text AND p0."p$$object_parent__oid"::text = act.object_oid::text)
UNION
SELECT 0 AS object_rid, act.object_oid, act.object_idx, p0."p$$party_parent__rid" AS "p$$object_parent__rid", p0."p$$party_parent__oid" AS "p$$object_parent__oid", act.created_at, act.modified_at, act.created_by, act.modified_by, act.object__class, act.access_level_browse, act.access_level_update, act.access_level_delete, act."owner", act.name, act.description, act.percent_complete, act.priority, act.due_by, act.scheduled_start, act.scheduled_end, act.actual_start, act.actual_end, act.misc1, act.misc2, act.misc3, act.activity_state, act.activity_number, act.severity, act.assigned_to, act."p$$assigned_to__rid", act."p$$assigned_to__oid", ( SELECT acc.full_name
FROM kernel_account acc
WHERE acc.object_rid::text = act."p$$assigned_to__rid"::text AND acc.object_oid::text = act."p$$assigned_to__oid"::text AND acc.object_idx = 0) AS "p$$assigned_to_title"
FROM kernel_activity act
JOIN kernel_activityparty p0 ON p0.object_idx = 0 AND p0."p$$object_parent__rid"::text = act.object_rid::text AND p0."p$$object_parent__oid"::text = act.object_oid::text;

That's *horrible*. There's typecasting on the join conditions to convert things to text! If there's a join on it, not only should they already be the same type, there ought to be a foreign key constraint on it (even if it is a self-referencing table). The silly thing UNIONs the exact same query four times in order to create a heirarchy! And it has the gloriously descriptive name 'kernel_view_027' (all 19 views share this naming convention).

While creating the tables, I got three errors about an unknown datatype. Yeah, that's great design.

110 of the 118 tables have these fields:
object_rid varchar(200) NOT NULL,
"p$$object_parent__rid" varchar(200),
object_oid varchar(200) NOT NULL,
"p$$object_parent__oid" varchar(200),
object_idx int4 NOT NULL,
object__class varchar(200),
modified_at varchar(20),
created_at varchar(20),
modified_by varchar(20),
created_by varchar(20),
"owner" varchar(50),
access_level_browse int4,
access_level_update int4,
access_level_delete int4,

And this key:
CONSTRAINT kernel_media_pkey PRIMARY KEY (object_rid, object_oid, object_idx)

Wow, yeah. "modified_at" and "created_at". Those should definitely not be timestamps. "owner". Great field name, that. The only keys that don't allow NULLs are the primaries. Clearly each of the 110 tables will need all these fields, then.

Looking at the indexes, the vast majority of them are against one of these 14 universal fields, but only exist on some of the tables.

It's *badly designed*. I've been at this for less than three months and *I* can see it. They're using a relational database as an object database. No wonder their performance sucks!

Brandon Aiken

________________________________

From: Alexander Staubo [mailto:alex@purefiction.net]
Sent: Sat 10/7/2006 5:59 PM
To: Brandon Aiken
Cc: Tomi NA; pgsql-general@postgresql.org
Subject: Re: [GENERAL] performace review

On Oct 7, 2006, at 23:44 , Brandon Aiken wrote:

I download the db source (inside opencrx-1.9.1-
core.postgresql-8.zip) and executed their three schema files,
dbcreate-indexes.sql, dbcreate-views.sql, dbcreate-tables.sql.
Each of the 118 tables has a three-field composite primary key of
'PRIMARY KEY (object_rid, object_oid, object_idx)'. object_rid and
object_oid are both VARCHAR(200). There are *no* foreign key
constraints. Each table has between 15 and 50 fields, with 25
looking about average.

To be fair, there are a bunch of indexes, but the number of indexes
seems low compared to the number of fields.

Gee, why to table joins take so long? Maybe because a blind monkey
created the schema? Normalized databases do tend to perform
better, so I hear.

*De*normalization is the traditional hack to speed up queries,
because it reduces the need for joins.

Alexander.

#9Chris Browne
cbbrowne@acm.org
In reply to: Jonathan Vanasco (#3)
Re: performace review

rabroersma@yahoo.com (Richard Broersma Jr) writes:

did you notice this line
---------------

*** Please note that we do not recommend PostgreSQL for production
use. While PostgreSQL may be fine for many settings this DBMS simply
does not deliver the performance required for openCRX (PostgreSQL
takes minutes/hours to calculate 3-table-joins even for small data
sets).

---------------

that sounds to me like someone never vacuumed and analyzed their db.

If you look at the db comparison chart, I think the only row that
they got right for every RDMS listed was "FREE vs Commericial".
They suggest that postgresql and mysql can only handle 20 concurrent
users and this low value doesn't seems realistic to me at all.

For heavy load, MySQL with myisam has often started really choking at
~10 concurrent users, so that part doesn't seem ludicrously
unrealistic. (Somewhat off? Perhaps. Ludicrously so? Not.)

It's possible that the last time they tried PostgreSQL was with
version 7.1 or 7.2, and things have really changed since then.

This could also be a situation where adding a few useful indexes might
fix a lot of ills. Better to try to help fix the problems so as to
help show that the comparisons are way off base rather than to simply
cast stones...
--
output = ("cbbrowne" "@" "linuxdatabases.info")
http://linuxfinances.info/info/advocacy.html
Epistemology in One Lesson
Reality ruthlessly selects out creatures that embody hypotheses too
inconsistent with reality. Our only choice is whether we participate
by being selected out, or (in Popper's great phrase) by "letting our
ideas die in our stead."
-- Mark Miller

#10Jonathan Vanasco
postgres@2xlp.com
In reply to: Chris Browne (#9)
Re: performace review

On Oct 7, 2006, at 6:41 PM, Chris Browne wrote:

This could also be a situation where adding a few useful indexes might
fix a lot of ills. Better to try to help fix the problems so as to
help show that the comparisons are way off base rather than to simply
cast stones...

i'm too tight for cash to afford being wrong right now...

but I'd otherwise bet that the issue was from not vacuum analyzing

i've routinely had 3,9,12, i think even a 14 table join that would
take forever to run...

until i realized that i added/dropped an index and forgot to run
analyze. then they all work within a matter of split seconds. all of
them.

i've seen not just dramatic, but drastic , changes in performance and
the planner's output before and after a vacuum analyze of the db.

i'm really confident thats the problem. unfortunately, they have a
max_db contact email, and not a postgres. so i don't know who to
check with to see if they ran it or not.

#11Merlin Moncure
mmoncure@gmail.com
In reply to: Tomi NA (#1)
Re: performace review

On 10/7/06, Tomi NA <hefest@gmail.com> wrote:

I was just reading http://www.opencrx.org/faq.htm where RDBMS engines
are one of the questions and see pgsql bashed sentence after sentence.
Can anyone offer any insight as to weather it's fact or FUD?

FUD

postgresql in particular is an enormous fud magnet. on the surface,
these guys are a first class example of what I consider to be the dark
(as in evil) side of data management. o-r mapping run amok! and they
immediatly try to upsell you on the 'clustering' version.

merlin

#12Thomas Hallgren
thhal@mailblocks.com
In reply to: Joshua D. Drake (#7)
Re: performace review

Joshua D. Drake wrote:

Tomi NA wrote:

I was just reading http://www.opencrx.org/faq.htm where RDBMS engines
are one of the questions and see pgsql bashed sentence after sentence.
Can anyone offer any insight as to weather it's fact or FUD?

It is 100% FUD.

What would be the incentive for OpenCRX spreading FUD about PostgreSQL? Does anyone know?

Kind Regards,
Thomas Hallgren

#13Ron Johnson
ron.l.johnson@cox.net
In reply to: Thomas Hallgren (#12)
Re: performace review

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 10/22/06 06:45, Thomas Hallgren wrote:

Joshua D. Drake wrote:

Tomi NA wrote:

I was just reading http://www.opencrx.org/faq.htm where RDBMS engines
are one of the questions and see pgsql bashed sentence after sentence.
Can anyone offer any insight as to weather it's fact or FUD?

It is 100% FUD.

What would be the incentive for OpenCRX spreading FUD about PostgreSQL?
Does anyone know?

That implies malice. The people at OpenCRX apparently really
believe what they wrote.

- --
Ron Johnson, Jr.
Jefferson LA USA

Is "common sense" really valid?
For example, it is "common sense" to white-power racists that
whites are superior to blacks, and that those with brown skins
are mud people.
However, that "common sense" is obviously wrong.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)

iD8DBQFFO2SAS9HxQb37XmcRAswRAKDhpBH1arYnlxhiD/zs6URa1Us7cQCglERQ
O1QslqhBBqqCh8JVwmHJ3wQ=
=4e54
-----END PGP SIGNATURE-----

#14Joshua D. Drake
jd@commandprompt.com
In reply to: Ron Johnson (#13)
Re: performace review

Ron Johnson wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 10/22/06 06:45, Thomas Hallgren wrote:

Joshua D. Drake wrote:

Tomi NA wrote:

I was just reading http://www.opencrx.org/faq.htm where RDBMS engines
are one of the questions and see pgsql bashed sentence after sentence.
Can anyone offer any insight as to weather it's fact or FUD?

It is 100% FUD.

What would be the incentive for OpenCRX spreading FUD about PostgreSQL?
Does anyone know?

That implies malice. The people at OpenCRX apparently really
believe what they wrote.

I believe they probably do believe it and it was probably driven by a
complete lack of understanding of PostgreSQL.

It doesn't have to be malicious for it to be FUD though.

Sincerely,

Joshua D. Drake

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joshua D. Drake (#14)
Re: performace review

"Joshua D. Drake" <jd@commandprompt.com> writes:

Ron Johnson wrote:

That implies malice. The people at OpenCRX apparently really
believe what they wrote.

I believe they probably do believe it and it was probably driven by a
complete lack of understanding of PostgreSQL.

It doesn't have to be malicious for it to be FUD though.

The psychological reason seems clear enough: if they can dismiss
postgres as not being worthy of their time, it saves them a lot of
work in supporting another database. By this point I'd imagine that
their code is sufficiently mysql-centric that trying to have real
support for other databases would be a huge undertaking; ergo, there
will be great resistance to the idea that they should take postgres
seriously. It's a bit of a self-fulfilling prophecy, too, because
once the code is sufficiently tuned for mysql you can indeed show that
any other database sucks running it ...

regards, tom lane

#16Wes Sheldahl
wes.sheldahl@gmail.com
In reply to: Tom Lane (#15)
Re: performace review

On 10/22/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Joshua D. Drake" <jd@commandprompt.com> writes:

Ron Johnson wrote:

That implies malice. The people at OpenCRX apparently really
believe what they wrote.

I believe they probably do believe it and it was probably driven by a
complete lack of understanding of PostgreSQL.

It doesn't have to be malicious for it to be FUD though.

The psychological reason seems clear enough: if they can dismiss
postgres as not being worthy of their time, it saves them a lot of
work in supporting another database. By this point I'd imagine that
their code is sufficiently mysql-centric that trying to have real
support for other databases would be a huge undertaking; ergo, there
will be great resistance to the idea that they should take postgres
seriously. It's a bit of a self-fulfilling prophecy, too, because
once the code is sufficiently tuned for mysql you can indeed show that
any other database sucks running it ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

I think you're right, except that they don't recommend MySQL for production
either; they seem happiest with MS SQL Server and Oracle, which seems a
little odd for an open source project. If they haven't figured out how to
get their app to work well with either MySQL or PostgreSQL, it seems a
little disingenuous to claim they support them on the site's front page. Oh
well. (shrug)

--
Wes Sheldahl
wes.sheldahl@gmail.com

#17Scott Marlowe
smarlowe@g2switchworks.com
In reply to: Joshua D. Drake (#14)
Re: performace review

On Sun, 2006-10-22 at 08:12 -0700, Joshua D. Drake wrote:

Ron Johnson wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 10/22/06 06:45, Thomas Hallgren wrote:

Joshua D. Drake wrote:

Tomi NA wrote:

I was just reading http://www.opencrx.org/faq.htm where RDBMS engines
are one of the questions and see pgsql bashed sentence after sentence.
Can anyone offer any insight as to weather it's fact or FUD?

It is 100% FUD.

What would be the incentive for OpenCRX spreading FUD about PostgreSQL?
Does anyone know?

That implies malice. The people at OpenCRX apparently really
believe what they wrote.

I believe they probably do believe it and it was probably driven by a
complete lack of understanding of PostgreSQL.

Actually, after reading the reply from Brandon Aiken, I believe it was
driven by a complete lack of understanding of relational theory.

#18Merlin Moncure
mmoncure@gmail.com
In reply to: Joshua D. Drake (#14)
Re: performace review

On 10/22/06, Joshua D. Drake <jd@commandprompt.com> wrote:

Ron Johnson wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 10/22/06 06:45, Thomas Hallgren wrote:

Joshua D. Drake wrote:

Tomi NA wrote:

I was just reading http://www.opencrx.org/faq.htm where RDBMS engines
are one of the questions and see pgsql bashed sentence after sentence.
Can anyone offer any insight as to weather it's fact or FUD?

It is 100% FUD.

What would be the incentive for OpenCRX spreading FUD about PostgreSQL?
Does anyone know?

That implies malice. The people at OpenCRX apparently really
believe what they wrote.

I believe they probably do believe it and it was probably driven by a
complete lack of understanding of PostgreSQL.

It doesn't have to be malicious for it to be FUD though.

maybe. the feeling i get reading their page is that they are saying:
"mysql and postgresql, etc are made with tinker toys, you need to run
us on the real stuff". they then try to upsell you to diamond studded
titanium tinker toys, namely oracle rac.

merlin

#19Tony Caduto
tony_caduto@amsoftwaredesign.com
In reply to: Merlin Moncure (#18)
Re: performace review

I believe they probably do believe it and it was probably driven by a
complete lack of understanding of PostgreSQL.

This part kills me:

- slow (even for
small datasets)
- jokes on
3-table-joins

I wonder what version of PG they did their testing/development on? I
bet it was a version well before version 8.x

--
Tony Caduto
AM Software Design
http://www.amsoftwaredesign.com
Home of PG Lightning Admin for Postgresql
Your best bet for Postgresql Administration