The New Slashdot Setup (includes MySql server)
Info on the new slashdot.org setup
<http://slashdot.org/article.pl?sid=00/05/18/1427203&mode=nocomment>
It is interesting because of the plans (meaning $$$) they have to improve
MySql, and because they are the flagship MySql site/application.
In the comment page, replying to the usual "Why not PostgreSql?" thread
someone pointed out an extract from the MySql docs that seems to me
blatantly false
(http://slashdot.org/comments.pl?sid=00/05/18/1427203&cid=131).
--
Alessio F. Bragadini alessio@albourne.com
APL Financial Services http://www.sevenseas.org/~alessio
Nicosia, Cyprus phone: +357-2-750652
"It is more complicated than you think"
-- The Eighth Networking Truth from RFC 1925
Info on the new slashdot.org setup
<http://slashdot.org/article.pl?sid=00/05/18/1427203&mode=nocomment>
interesting because of the plans (meaning $$$) they have to improve
MySql, and because they are the flagship MySql site/application.
In the comment page, replying to the usual "Why not PostgreSql?" thread
someone pointed out an extract from the MySql docs that seems to me
blatantly false
(http://slashdot.org/comments.pl?sid=00/05/18/1427203&cid=131).
Just finished reading the thread. I am surprised how many people
slammed them on their MySQL over PostgreSQL decision. People are
slamming MySQL all over the place. :-)
Seems like inertia was the reason to stay with MySQL. What that means
to me is that for their application space, PostgreSQL already has
superior technology, and people realize it. This means we are on our
way up, and MySQL is, well, ....
--
Bruce Momjian | http://www.op.net/~candle
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
thanks for the pointer ... I just posted my response ... specifically
pointing out how "accurate" the MySQL docs tend to be *rofl*
On Thu, 18 May 2000, Alessio Bragadini wrote:
Info on the new slashdot.org setup
<http://slashdot.org/article.pl?sid=00/05/18/1427203&mode=nocomment>
interesting because of the plans (meaning $$$) they have to improve
MySql, and because they are the flagship MySql site/application.
In the comment page, replying to the usual "Why not PostgreSql?" thread
someone pointed out an extract from the MySql docs that seems to me
blatantly false
(http://slashdot.org/comments.pl?sid=00/05/18/1427203&cid=131).
--
Alessio F. Bragadini alessio@albourne.com
APL Financial Services http://www.sevenseas.org/~alessio
Nicosia, Cyprus phone: +357-2-750652
"It is more complicated than you think"
-- The Eighth Networking Truth from RFC 1925
Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
on 5/18/00 1:12 PM, Bruce Momjian at pgman@candle.pha.pa.us wrote:
Seems like inertia was the reason to stay with MySQL. What that means
to me is that for their application space, PostgreSQL already has
superior technology, and people realize it. This means we are on our
way up, and MySQL is, well, ....
There is this growing desire among some OpenACS people to replicate the
Slashdot functionality in an OpenACS module (probably a weekend's worth of
work). I wish I had a bit more free time to do it. It's time to show what
can be done with a real RDBMS (and a real web application environment, but
that's a different story).
-Ben
On Thu, 18 May 2000, Bruce Momjian wrote:
Info on the new slashdot.org setup
<http://slashdot.org/article.pl?sid=00/05/18/1427203&mode=nocomment>
interesting because of the plans (meaning $$$) they have to improve
MySql, and because they are the flagship MySql site/application.
In the comment page, replying to the usual "Why not PostgreSql?" thread
someone pointed out an extract from the MySql docs that seems to me
blatantly false
(http://slashdot.org/comments.pl?sid=00/05/18/1427203&cid=131).
Just finished reading the thread. I am surprised how many people
slammed them on their MySQL over PostgreSQL decision. People are
slamming MySQL all over the place. :-)
Seems like inertia was the reason to stay with MySQL. What that means
to me is that for their application space, PostgreSQL already has
superior technology, and people realize it. This means we are on our
way up, and MySQL is, well, ....
In SlashDot's defence here ... I doubt there is much they do that would
require half of what we offer ... it's *very* little INSERT/UPDATE/DELETE
and *a lot* of SELECT ...
okay, that is a good point ... I know what a difference in performance
the two-selects-versus-one issue can produce ...
On Thu, 18 May 2000, Alfred Perlstein wrote:
* The Hermit Hacker <scrappy@hub.org> [000518 11:51] wrote:
On Thu, 18 May 2000, Bruce Momjian wrote:
[... earlier thread quoted in full, trimmed ...]
In SlashDot's defence here ... I doubt there is much they do that would
require half of what we offer ... it's *very* little INSERT/UPDATE/DELETE
and *a lot* of SELECT ...
If those guys still are doing multiple selects for each page view after
at least 2 years of being around and choking on the load, they seriously
need to get a clue. mod_perl... belch!
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
* The Hermit Hacker <scrappy@hub.org> [000518 11:51] wrote:
On Thu, 18 May 2000, Bruce Momjian wrote:
[... earlier thread quoted in full, trimmed ...]
In SlashDot's defence here ... I doubt there is much they do that would
require half of what we offer ... it's *very* little INSERT/UPDATE/DELETE
and *a lot* of SELECT ...
If those guys still are doing multiple selects for each page view after
at least 2 years of being around and choking on the load, they seriously
need to get a clue. mod_perl... belch!
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
The Hermit Hacker wrote:
thanks for the pointer ... I just posted my response ... specifically
pointing out how "accurate" the MySQL docs tend to be *rofl*
And now there is a response to your response stating the following
The MySQL people have said exactly the same sort of things about
the PostgreSQL people. So please stop the name-calling and
the quotes around "test", it's not going to get you anywhere.
That being said, the standard MySQL benchmark _still_ is 30 times
faster for MySQL 3.23 than on PostgreSQL 7.0 (with fsync turned off,
_and_ nonstandard speed-up PostgreSQL features like VACUUM enabled,
btw, how does one "enable" vacuum ?
I might add). The main reason seems to be some sort of failure to
use the index in the SELECT and UPDATE test loops on the part of
PostgreSQL.
The benchmark, for the curious, works like this:
First it creates a table with an index:
create table bench1 (id int NOT NULL, id2 int NOT NULL, id3 int NOT NULL, dummy1 char(30));
create unique index bench1_index_ on bench1 using btree (id,id2);
create index bench1_index_1 on bench1 using btree (id3);
Then it fills the table with 300,000 entries with unique id values.
Then, it issues a query like this:
update bench1 set dummy1='updated' where id=1747
which causes the backend to do one thousand read() calls. For each query.
could it be that just updating 1 unique index causes 1k read()'s ?
No wonder it's slow. An EXPLAIN query states that it's using the
index, though. I have no clue what happens here. I've sent this
to the pgsql-general mailing list and have just reposted it to -hackers.
I somehow missed it (on -hackers at least), so I repost it here.
Oh yes, the benchmark also revealed that CREATE TABLE in PostgreSQL 7.0
leaks about 2k of memory.
-------------------
Hannu
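A present-day aside on the benchmark described above: its shape (an indexed table, a bulk load, then single-row UPDATEs by key) is easy to reproduce. The sketch below uses Python's sqlite3 as a stand-in for either database; it keeps the benchmark's table and index names but shrinks the row count, and it is illustrative only, not the actual MySQL benchmark driver.

```python
import sqlite3

# Rough sketch of the benchmark's shape, with sqlite3 as a stand-in.
# (sqlite has no "using btree" clause; its indexes are btrees anyway.)
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table bench1 (id int NOT NULL, id2 int NOT NULL,"
            " id3 int NOT NULL, dummy1 char(30))")
cur.execute("create unique index bench1_index_ on bench1 (id, id2)")
cur.execute("create index bench1_index_1 on bench1 (id3)")

# Fill the table with rows having unique id values (3,000 here, not 300,000).
cur.executemany("insert into bench1 values (?, ?, ?, ?)",
                [(i, i, i, "original") for i in range(3000)])
conn.commit()

# The update loop the text describes: one indexed single-row UPDATE per key.
for key in (1747, 42, 2999):
    cur.execute("update bench1 set dummy1='updated' where id=?", (key,))
conn.commit()

cur.execute("select dummy1 from bench1 where id=1747")
print(cur.fetchone()[0])  # -> updated
```

Each UPDATE should touch exactly one row via the unique index; the interesting question in the thread is how many low-level reads the engine performs to plan and execute it.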
Hi,
Hannu Krosing:
And now there is a response to your response stating the following
That response is from me, actually. (I subscribed to -hackers two hours
ago, so I'm sorry if I missed anything.)
That being said, the standard MySQL benchmark _still_ is 30 times
faster for MySQL 3.23 than on PostgreSQL 7.0 (with fsync turned off,
_and_ nonstandard speed-up PostgreSQL features like VACUUM enabled,
btw, how does one "enable" vacuum ?
run-all-tests ... --fast.
The code has stuff like
$server->vacuum(1,\$dbh) if $opt_fast and defined $server->{vacuum};
sprinkled at strategic places.
--
Matthias Urlichs | noris network GmbH | smurf@noris.de | ICQ: 20193661
The quote was selected randomly. Really. | http://smurf.noris.de/
--
Silence is the element in which great things fashion themselves.
--Thomas Carlyle
Hi,
Chris:
VACUUM is not a speed-up feature, it's a slow-down feature. It reclaims
space and that takes time. It does update system statistics which can
help performance if done after a data load or perhaps once a day.
OK, thanks for the clarification.
But "sprinkling the code" with vacuum sounds like a big performance
killer. Hope you are not counting vacuum as part of your 1000 read()
calls.
Nonono, the 1000 read() calls are triggered by a simple INSERT or UPDATE
call. They actually scan the pg_index table of the benchmark database.
Why they do that is another question entirely. (a) these tables should
have indices, and (b) whatever postgres wants to know should have been
cached someplace. Oh yes, (c) what's in pg_index that needs to be 4
MBytes big?
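The caching complaint in (b) boils down to memoizing the per-table index list instead of rescanning pg_index on every query. A toy sketch of that idea follows; the names and counts are invented for illustration and this is not PostgreSQL's actual code.

```python
# Toy illustration: cache the per-table index list so repeated
# INSERT/UPDATE planning does not rescan the catalog every time.
class Catalog:
    def __init__(self):
        self.pg_index = {"bench1": ["bench1_index_", "bench1_index_1"]}
        self.scans = 0          # counts simulated full catalog scans

    def scan_indexes(self, table):
        self.scans += 1         # stands in for the ~1000 read() calls
        return self.pg_index.get(table, [])

class IndexCache:
    def __init__(self, catalog):
        self.catalog = catalog
        self.cache = {}

    def indexes(self, table):
        if table not in self.cache:             # first lookup: scan once
            self.cache[table] = self.catalog.scan_indexes(table)
        return self.cache[table]                # later lookups: cached

cat = Catalog()
cache = IndexCache(cat)
for _ in range(1000):           # a thousand queries...
    cache.indexes("bench1")
print(cat.scans)  # -> 1  (one scan instead of one per query)
```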
Hi,
Chris:
Matthias Urlichs wrote:
Nonono, the 1000 read() calls are triggered by a simple INSERT or UPDATE
call. They actually scan the pg_index table of the benchmark database.
Does this only happen on the first call to INSERT/UPDATE after
connecting to the database, or does it happen with all subsequent calls
too?
All of them. Whatever the server is looking up here, it's _not_ cached.
-----Original Message-----
From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On
Behalf Of Matthias Urlichs
Why they do that is another question entirely. (a) these tables should
have indices, and (b) whatever postgres wants to know should have been
cached someplace. Oh yes, (c) what's in pg_index that needs to be 4
MBytes big?
What does 'vacuum pg_index' show ?
Regards.
Hiroshi Inoue
Inoue@tpf.co.jp
Matthias Urlichs wrote:
Hi,
Chris:
Matthias Urlichs wrote:
Nonono, the 1000 read() calls are triggered by a simple INSERT or UPDATE
call. They actually scan the pg_index table of the benchmark database.
Does this only happen on the first call to INSERT/UPDATE after
connecting to the database, or does it happen with all subsequent calls
too?
All of them. Whatever the server is looking up here, it's _not_ cached.
Maybe the shared buffer pool isn't large enough to keep all the (4.1MB of)
pg_index pages, so it would read pages from disk every time.
Unfortunately, pg_index currently has no index for scanning the index
entries of a relation.
However, why is pg_index so large?
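Hiroshi's point can be illustrated with a toy LRU buffer pool: once a relation has more pages than there are buffers, a repeated sequential scan evicts each page before it is needed again, so every scan re-reads everything from disk. All numbers below are illustrative, not real PostgreSQL defaults or internals.

```python
from collections import OrderedDict

class BufferPool:
    """Tiny LRU page cache; all numbers are illustrative."""
    def __init__(self, nbuffers):
        self.nbuffers = nbuffers
        self.pages = OrderedDict()
        self.reads = 0                      # simulated disk read() calls

    def fetch(self, page):
        if page in self.pages:
            self.pages.move_to_end(page)    # cache hit
        else:
            self.reads += 1                 # cache miss -> disk read
            self.pages[page] = True
            if len(self.pages) > self.nbuffers:
                self.pages.popitem(last=False)  # evict LRU page

pool = BufferPool(nbuffers=64)
# Scan a 448-page pg_index ten times; the working set never fits,
# so the sequential scan evicts every page before it comes around again.
for _ in range(10):
    for page in range(448):
        pool.fetch(page)
print(pool.reads)  # -> 4480: every page is re-read on every scan
```

With a pool larger than the relation (say 512 buffers), the same ten scans would cost only 448 reads, all of them on the first pass.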
Regards.
Hiroshi Inoue
Inoue@tpf.co.jp
Hi,
Hiroshi Inoue:
What does 'vacuum pg_index' show ?
test=> vacuum pg_index;
NOTICE: Skipping "pg_index" --- only table owner can VACUUM it
VACUUM
OK, so I suppose I should do it as the postgres user...
test=> vacuum pg_index;
VACUUM
The debug output says:
DEBUG: --Relation pg_index--
DEBUG: Pages 448: Changed 0, reaped 448, Empty 0, New 0; Tup 34: Vac 21443, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 164, MaxLen 164;
Re-using: Free/Avail. Space 3574948/3567176; EndEmpty/Avail. Pages 0/447. CPU 0.46s/0.00u sec.
DEBUG: Index pg_index_indexrelid_index: Pages 86; Tuples 34: Deleted 21443. CPU 0.05s/0.36u sec.
DEBUG: Rel pg_index: Pages: 448 --> 1; Tuple(s) moved: 2. CPU 0.03s/0.03u sec.
DEBUG: Index pg_index_indexrelid_index: Pages 86; Tuples 34: Deleted 2. CPU 0.01s/0.00u sec.
... which helped. A lot.
Thanks, everybody. The first quick benchmark run I did afterwards states
that PostgreSQL is now only half as fast as MySQL, instead of the factor
of 30 seen previously, on the MySQL benchmark test. ;-)
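What VACUUM bought here is easy to demonstrate in miniature: after mass deletes (or, in PostgreSQL's case, the dead tuples left behind by updates), the file keeps its bloated size until a vacuum compacts it. The sketch below uses sqlite3's VACUUM as a stand-in; the mechanics differ from PostgreSQL's, but the reclaim-dead-space effect is the same idea.

```python
import os
import sqlite3
import tempfile

# Miniature demonstration: deleted rows leave dead space behind
# until VACUUM compacts the file.
path = os.path.join(tempfile.mkdtemp(), "bloat.db")
conn = sqlite3.connect(path)
conn.execute("create table t (id int, pad text)")
conn.executemany("insert into t values (?, ?)",
                 [(i, "x" * 200) for i in range(20000)])
conn.commit()

conn.execute("delete from t where id > 30")   # leave only 31 live rows
conn.commit()
size_before = os.path.getsize(path)

conn.execute("VACUUM")                        # compact the file
size_after = os.path.getsize(path)
print(size_before > size_after)  # -> True: vacuuming shrinks the file
```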
Hi,
Hiroshi Inoue:
Maybe shared buffer isn't so large as to keep all the (4.1M) pg_index pages.
That seems to be the case.
So it would read pages from disk every time. Unfortunately pg_index
has no index to scan the index entries of a relation now.
Well, it's reasonable that you can't keep an index on the table which
states what the indices are. ;-)
... on the other hand, Apple's HFS file system stores all the information
about the on-disk locations of its files in, you guessed it, a B-Tree
which is saved on disk as an (invisible) file.
Thus, the thing stores the information on where its own sectors are
located inside itself.
To escape this catch-22 situation, the location of the first three
extents (which is usually all it takes anyway) is stored elsewhere.
Possibly, something like this would work with postgres too.
However why is pg_index so large ?
Creating ten thousand tables will do that to you.
Is there an option I can set to increase the appropriate cache, so that
the backend can keep the data in memory?
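For what it's worth, the shared buffer pool size is fixed when the postmaster starts; the -B flag sets the number of shared buffers. A sketch follows — the buffer count and data directory are illustrative values, not recommendations, so check the postmaster reference page for your version before copying it.

```shell
# Start the postmaster with a larger shared buffer pool.
# -B sets the number of shared disk buffers; 2048 here is an
# illustrative value, and the data directory path is a placeholder.
postmaster -B 2048 -D /usr/local/pgsql/data
```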
On Fri, 19 May 2000, Matthias Urlichs wrote:
[... Matthias' message with the vacuum output quoted in full, trimmed ...]
... which helped. A lot.
Thanks, everybody. The first quick benchmark run I did afterwards states
that PostgreSQL is now only half as fast as MySQL, instead of the factor
of 30 seen previously, on the MySQL benchmark test. ;-)
Wow, shock of shocks ... MySQL has more inaccuracies in their docs? *grin*
... which helped. A lot.
Thanks, everybody. The first quick benchmark run I did afterwards states
that PostgreSQL is now only half as fast as MySQL, instead of the factor
of 30 seen previously, on the MySQL benchmark test. ;-)
while (horse == DEAD) {
beat();
}
... Anyway ... I can see this being true (MySQL being twice as fast as
PostgreSQL), however I don't think that MySQL being faster than PostgreSQL
was ever up for debate. When you take an RDBMS and strip out a huge amount of
features, of course you're going to get a faster end product. It's just not
nearly as safe, feature-rich or easy to work with (from a programmer's
standpoint).
I looked at MySQL to use for my applications, for all of ten seconds ... To
code in and around, MySQL just isn't a usable RDBMS for me, and I can hardly
see how it's useful for anyone doing the kind of programming I do ...
What it is very good for is something like RADIUS/POP3 authentication; I
use it at my ISP to keep all my user authentication in one place. However,
the only thing I cared about there was speed, and there are all of two
things I ever do to that database: I SELECT (once every auth request) and
occasionally I INSERT and possibly UPDATE. That, coupled with the fact that
there are only two to three things in the database per user (username,
password and domain for POP3 auth), means it's just not a very complicated
thing to do ... I use a SQL backend because it's very easy to maintain and I
can easily write software to manipulate the data held in the tables -- that's
all.
With the other applications I and my company write, it's a totally different
story. I just don't see how a person can write any kind of a larger
application and not need all the features MySQL lacks...
I like MySQL for certain things -- however I've never considered "MySQL vs
PostgreSQL" -- they're just two totally different databases for totally
different uses IMHO.
-Mitch
Hi,
The Hermit Hacker:
Thanks, everybody. The first quick benchmark run I did afterwards states
that PostgreSQL is now only half as fast as MySQL, instead of the factor
of 30 seen previously, on the MySQL benchmark test. ;-)
Wow, shock of shocks ... MySQL has more inaccuracies in their docs? *grin*
No, that factor of 30 was my result after running the benchmark for the
first time. Presumably, unless I skip the large_number_of_tables test,
it'll be just as slow the second time around.
The MySQL people probably didn't dig deeper into PostgreSQL's innards.
They don't seem to think it's their job to find out exactly why their
benchmark runs so slowly on some other databases, and I don't particularly
fault them for that attitude.
The PostgreSQL community has an attitude too, after all.
One of these might be to answer "you must have had fsync turned on"
whenever somebody reports a way-too-slow benchmark. In this case,
that's definitely not true.
Another attitude of the PostgreSQL developers might be to answer "run
VACUUM" whenever somebody reports performance problems. That answer is
not helpful at all WRT this benchmark, because the user who caused the
problem ("test", in my case) isn't permitted to run VACUUM on the
pg_index table.
The alternate solution would be for the backend to notice "Gee, I just
scanned a whole heap of what turned out to be empty space in this here
pg_index file, maybe it would be a good idea to call vacuum() on it."
Or, if that doesn't work, increase the buffer for holding its content.
Anyway, I fully expect to have a more reasonable benchmark result by
tomorrow, and the MySQL guys will get a documentation update. Which they
_will_ put in the next update's documentation file. Trust me. ;-)
At 02:04 PM 5/19/00 +0200, you wrote:
Well, it's reasonable that you can't keep an index on the table which
states what the indices are. ;-)
... on the other hand, Apple's HFS file system stores all the information
about the on-disk locations of its files in, you guessed it, a B-Tree
which is saved on disk as an (invisible) file.
Thus, the thing stores the information on where its own sectors are
located inside itself.
To escape this catch-22 situation, the location of the first three
extents (which is usually all it takes anyway) is stored elsewhere.
Possibly, something like this would work with postgres too.
This is one of several things we did at Illustra to make the backend
run faster. I did the design and implementation, but it was a few
years ago, so the details are hazy. Here's what I remember.
We had to solve three problems:
First, you had to be able to run initdb and bootstrap the system
without the index on pg_index in place. As I recall, we had to
carefully order the creation of the first several tables to make
that work, but it wasn't rocket science.
Second, when the index on pg_index gets created, you need to update
it with tuples that describe it. This is really just the same as
hard-coding the pg_attribute attribute entries into pg_attribute --
ugly, but not that bad.
Third, we had to abstract a lot of the hard-coded table scans in
the bowels of the system to call a routine that checked for the
existence of an index on the system table, and used it. In order
for the index on pg_index to get used, its reldesc had to be nailed
in the cache. Getting it there at startup was more hard-coded
ugliness, but you only had to do it one time.
The advantage is that you can then index a bunch more of the system
catalog tables, and on a bunch more attributes. That produced some
surprising speedups.
This was simple enough that I'm certain the same technique would
work in the current engine.
mike
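Mike's three steps can be modeled abstractly: insert rows into pg_index without any index on it, then build the index so that it covers its own describing row, then prefer the index for lookups. A toy Python model of that bootstrap order follows; the names are borrowed from the thread for flavor, and nothing here is Illustra or Postgres code.

```python
class Catalog:
    """Toy model: pg_index lists every index; it may itself be indexed."""
    def __init__(self):
        self.pg_index = []        # rows: (index_name, indexed_table)
        self.pg_index_idx = None  # table -> rows lookup, once built
        self.scans = 0            # full sequential scans performed

    def create_index(self, name, table):
        # Problem 1: this works even before pg_index has its own index.
        self.pg_index.append((name, table))
        if self.pg_index_idx is not None:
            self.pg_index_idx.setdefault(table, []).append((name, table))

    def build_index_on_pg_index(self):
        # Problem 2: describe the new index inside pg_index *before*
        # building the lookup, so the index covers its own row.
        self.create_index("pg_index_idx", "pg_index")
        idx = {}
        for name, table in self.pg_index:
            idx.setdefault(table, []).append((name, table))
        self.pg_index_idx = idx

    def indexes_on(self, table):
        # Problem 3: prefer the index when it exists, else scan.
        if self.pg_index_idx is not None:
            return self.pg_index_idx.get(table, [])
        self.scans += 1
        return [(n, t) for n, t in self.pg_index if t == table]

cat = Catalog()
cat.create_index("bench1_index_", "bench1")   # pre-bootstrap insertion
cat.build_index_on_pg_index()                 # the index indexes itself
print(cat.indexes_on("pg_index"))  # -> [('pg_index_idx', 'pg_index')]
```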
"Matthias Urlichs" <smurf@noris.net> writes:
Nonono, the 1000 read() calls are triggered by a simple INSERT or UPDATE
call. They actually scan the pg_index table of the benchmark database.
Ohh ... pg_index is the culprit! OK, I know exactly where that's coming
from: the planner is looking around to see what indexes might be
interesting for planning the query. Several comments here:
1. Probably we ought to try to bypass most of the planning process for
a simple INSERT ... VALUES. (I thought I had fixed that, but apparently
it's not getting short-circuited soon enough, if the search for indexes
is still happening.)
2. The search is not using either an index or a cache IIRC. Needs to
be fixed but there may be no suitable index present in 7.0.
3. I have been toying with the notion of having relcache entries store
information about the indexes associated with the table, so that the
planner wouldn't have to search through pg_index at all. The trouble
with doing that is getting the info updated when an index is added or
dropped; haven't quite figured out how to do that...
regards, tom lane
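Tom's point 3, getting cached index lists invalidated when an index is added or dropped, is a classic cache-coherence problem; one common answer is to flush the affected cache entry whenever DDL touches the table. A toy sketch of that idea (not the actual relcache code; all names are illustrative):

```python
class RelCacheEntry:
    def __init__(self, table, indexes):
        self.table = table
        self.indexes = list(indexes)   # cached list of the table's indexes

class RelCache:
    """Toy relcache: entries carry the index list; DDL invalidates them."""
    def __init__(self):
        self.entries = {}
        self.loads = 0                 # catalog lookups actually performed

    def lookup(self, table, catalog):
        if table not in self.entries:
            self.loads += 1            # (re)build the entry from the catalog
            self.entries[table] = RelCacheEntry(table, catalog.get(table, []))
        return self.entries[table]

    def invalidate(self, table):
        # Called whenever CREATE INDEX / DROP INDEX touches `table`.
        self.entries.pop(table, None)

catalog = {"bench1": ["bench1_index_"]}
cache = RelCache()
for _ in range(100):
    cache.lookup("bench1", catalog)    # planner asks repeatedly; one load

catalog["bench1"].append("bench1_index_1")   # CREATE INDEX ...
cache.invalidate("bench1")                   # ... sends an invalidation
entry = cache.lookup("bench1", catalog)
print(cache.loads, entry.indexes)  # -> 2 ['bench1_index_', 'bench1_index_1']
```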