MySQL and PostgreSQL speed comparison
Hi,
I wrote a piece of benchmarking code, just to test my classes, and was
surprised by the speed differences.
So one more entry to the flame war (or the start of a new one) about which
one is faster, PostgreSQL or MySQL.
Well, I expected MySQL to be the faster one, but not by this much.
Inserts on MySQL : 0.71sec/1000 rows
Inserts on PostgreSQL: 10.78sec/1000 rows (15 times slower?)
Inserts on PostgreSQL*: 1.59sec/1000 rows (2 times slower?)
Modify on MySQL : 0.67sec/1000 rows
Modify on PostgreSQL: 10.20sec/1000 rows (15 times slower?)
Modify on PostgreSQL*: 1.61sec/1000 rows (2 times slower?)
Delete on MySQL : 1.04sec/1000 rows
Delete on PostgreSQL: 20.40sec/1000 rows (almost 20 times slower?)
Delete on PostgreSQL*: 7.20sec/1000 rows (7 times slower?)
Searches were almost the same (MySQL was faster on some, PostgreSQL on
others), and sorting and reading sorted entries from the database were the
same. But insert/modify/delete were another story.
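The kind of timing loop behind these numbers can be sketched roughly as follows. This is a hypothetical re-creation using Python's sqlite3 module as a stand-in backend; the original test used the author's own C++ classes against MySQL and PostgreSQL, so table name `t` and the row payload are illustrative only.

```python
import sqlite3
import time

# Stand-in database; the original benchmark talked to MySQL/PostgreSQL servers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")

start = time.perf_counter()
for i in range(1000):
    # One INSERT per row, as in the "sec/1000 rows" figures above.
    conn.execute("INSERT INTO t (id, payload) VALUES (?, ?)", (i, "row %d" % i))
conn.commit()
elapsed = time.perf_counter() - start

print("Inserts: %.2fsec/1000 rows" % elapsed)
```

The same loop shape, with UPDATE and DELETE statements swapped in, would produce the modify and delete figures.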
"PostgreSQL*" is postgres whith queries inside transactions. But as long as
transactions are broken in PostgreSQL you cant use them in real life (if a
query fails inside a transactions block, PostgreSQL "RollBack"s the whole
transaction block, and thats broken. You can not convince me of anything
else).
Then I thought that maybe it would even up if I made more than one
simultaneous call. So I rewrote the utility so that it forked itself
several times. With
PostgreSQL I could not try the test with transactions activated
(transactions are broken in PostgreSQL, and the test shows it clearly).
PostgreSQL maxed out my CPU with 5 connections, MySQL used around 75% with
20 connections. At five connections MySQL was 5 times faster, with 20
connections it was 4 times faster.
I do not claim that this is accurate; maybe my classes are broken or
something, or the test might be totally wrong. But *I think* it simulates
quite well an ordinary webserver running the database locally (on the same
server as the www-server).
The setup is:
PII 450MHz with 256MByte memory.
Linux Redhat 6.0 (almost out of box).
MySQL, latest .rpm (a few weeks ago).
PostgreSQL, from CVS tree (HEAD, a few weeks ago).
MySQL on a SCSI disk.
PostgreSQL on an IDE disk. I moved the "data" dir to the SCSI disk and
tested. Surprise, surprise, it was slower! Well, PostgreSQL was as nice as
MySQL towards the CPU when it was on the SCSI disk.
Used gcc to compile PostgreSQL, passing only --prefix to ./configure.
If you would like to run the test (or view the code), download the DBA-Test
and AFW packages from my site (www.comder.com). No fancy configure scripts
exist, so you have to modify the code to make it run on your system.
Comments? Reasons for the result? What was wrong with the test?
I do not want to start a flame war. Just need help to get PostgreSQL up to
speed or MySQL to support sub-selects.
// Jarmo
PS. Posted this to MySQL and PostgreSQL list. Want to hear both sides. DS.
Jarmo Paavilainen wrote:
[ . . . ]
"PostgreSQL*" is postgres whith queries inside transactions. But as long as
transactions are broken in PostgreSQL you cant use them in real life (if a
query fails inside a transactions block, PostgreSQL "RollBack"s the whole
transaction block, and thats broken. You can not convince me of anything
else).
What do you think _should_ happen when a query fails inside a transaction block? (I am not
trying to convince you of anything, just being curious.)
Regards,
Frank
Well I expected MySQL to be the faster one, but this much.
Inserts on MySQL : 0.71sec/1000 rows
Inserts on PostgreSQL: 10.78sec/1000 rows (15 times slower?)
Inserts on PostgreSQL*: 1.59sec/1000 rows (2 times slower?)
Modify on MySQL : 0.67sec/1000 rows
Modify on PostgreSQL: 10.20sec/1000 rows (15 times slower?)
Modify on PostgreSQL*: 1.61sec/1000 rows (2 times slower?)
Delete on MySQL : 1.04sec/1000 rows
Delete on PostgreSQL: 20.40sec/1000 rows (almost 20 times slower?)
Delete on PostgreSQL*: 7.20sec/1000 rows (7 times slower?)
Search were almost the same (MySQL were faster on some, PostgreSQL on
some),
sorting and reading sorted entries from dba was the same. But
insert/modify/delete.
To me, all this is pointing toward the possibility that you haven't
switched off fsync. This will make a MASSIVE difference to insert/update
queries. Read the docs on how to do this, and what the implications are.
And in case you cannot be bothered, add the "-o -F" parameters (IIRC) to
your postgres startup line in the postgres startup script in
/etc/rc.d/init.d.
Then try running the benchmark again and re-post the results. On a machine
with that much memory, allowing proper caching will make a huge difference.
I think MySQL does that by default, but PostgreSQL tries to be cautious and
flushes its disk cache buffers after every query. This should even
things out quite a lot.
"PostgreSQL*" is postgres whith queries inside transactions. But as long
as
transactions are broken in PostgreSQL you cant use them in real life (if
a
query fails inside a transactions block, PostgreSQL "RollBack"s the whole
transaction block, and thats broken. You can not convince me of anything
else).
They are not as functionally complete as they could be, I'll give you that.
But if you are sticking to good programming (and this applies to more than
just SQL) practices, you should make sure that your code behaves properly
and checks for things before going in headlong. It can be slower, but it
is a lot cleaner. In the same way you check for a zero return when using
malloc in C, and clean up all compiler warnings, or run-time warnings in
Perl, you should consider doing, for example, a SELECT query to make sure
that the records are/aren't already there before inserting them or updating
them.
Just MHO. Yes, it is slightly slower, but it does work, and it is a lot
neater than filling up the error logs with all sorts of garbage.
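The check-before-write style recommended here can be sketched as below. This is a hypothetical illustration using Python's sqlite3 as a stand-in (table `users` and function `upsert` are invented for the example), and it remains subject to the SELECT/INSERT race that comes up later in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY, visits INTEGER)")

def upsert(name):
    # "Look before you leap": SELECT first, then INSERT or UPDATE,
    # instead of letting a failed INSERT land in the error log.
    row = conn.execute("SELECT visits FROM users WHERE name = ?", (name,)).fetchone()
    if row is None:
        conn.execute("INSERT INTO users (name, visits) VALUES (?, 1)", (name,))
    else:
        conn.execute("UPDATE users SET visits = ? WHERE name = ?", (row[0] + 1, name))
    conn.commit()

upsert("jarmo")
upsert("jarmo")
print(conn.execute("SELECT visits FROM users WHERE name = 'jarmo'").fetchone()[0])  # 2
```

With several concurrent writers, another client can slip an INSERT in between the SELECT and the INSERT, which is exactly the window Jarmo complains about further down.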
Then I thought that maybe it would even up if I made more than one simul.
call. So I rewrote the utility so that it forked itself several times.
With
PostgreSQL I could not try the test with transactions activated
(transactions are broken in PostgreSQL, and the test shows it clearly).
PostgreSQL maxed out my CPU with 5 connections, MySQL used around 75%
with
20 connections. At five connections MySQL was 5 times faster, with 20
connections it was 4 times faster.
[snip]
MySQL on a SCSI disk.
PostgreSQL on a IDE disk. I moved the "data" dir to the SCSI disk and
tested. Suprise suprise it was slower! Well PostgreSQL was as nice as
MySQL towards the CPU when it was on the SCSI disk.
I thought the CPU hit was strange. This explains it...
Re-try the test with the fsync() disabled and re-post the results. I'm
interested to learn of your findings.
* Jarmo Paavilainen <netletter@comder.com> [001229 04:23] wrote:
Hi,
I wrote a piece of benchmarking, just to test my classes, and was suprised
of the speed diffs.
So one more entry to the flame war (or the start of a new one) about which
one is faster, PostgreSQL or MySQL.
Well I expected MySQL to be the faster one, but this much.
Inserts on MySQL : 0.71sec/1000 rows
Inserts on PostgreSQL: 10.78sec/1000 rows (15 times slower?)
Inserts on PostgreSQL*: 1.59sec/1000 rows (2 times slower?)
Modify on MySQL : 0.67sec/1000 rows
Modify on PostgreSQL: 10.20sec/1000 rows (15 times slower?)
Modify on PostgreSQL*: 1.61sec/1000 rows (2 times slower?)
Delete on MySQL : 1.04sec/1000 rows
Delete on PostgreSQL: 20.40sec/1000 rows (almost 20 times slower?)
Delete on PostgreSQL*: 7.20sec/1000 rows (7 times slower?)
Search were almost the same (MySQL were faster on some, PostgreSQL on some),
sorting and reading sorted entries from dba was the same. But
insert/modify/delete.
"PostgreSQL*" is postgres whith queries inside transactions. But as long as
transactions are broken in PostgreSQL you cant use them in real life (if a
query fails inside a transactions block, PostgreSQL "RollBack"s the whole
transaction block, and thats broken. You can not convince me of anything
else).
Well, I'm not going to try to convince you because you seem to have
made up your mind already, however for anyone else watching there's
nothing majorly broken with the 'all or nothing' approach in
postgresql, in fact it's very handy.
The all or nothing approach doesn't happen if a query fails to
modify or return any results, only if there's a genuine error in
the code, like inserting duplicate values into a column that should
be unique, or if you somehow send malformed sql to the server
mid-transaction. This is actually a pretty convenient feature because
it prevents programmer mistakes from proceeding to trash more data,
and backs them out.
The fact that MySQL doesn't support transactions at all severely
limits its utility for applications that need data consistency; it
also makes it very dangerous to try any new queries on a database
because one can't just issue rollbacks after a test run.
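The all-or-nothing behaviour being defended here can be illustrated with a small sketch, again using Python's sqlite3 as a stand-in: when the whole block is rolled back on a genuine error, the earlier statements in it are backed out too, so the mistake cannot leave half-applied data behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")

try:
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute("INSERT INTO t VALUES (1)")
        conn.execute("INSERT INTO t VALUES (1)")  # duplicate key -> genuine error
except sqlite3.IntegrityError:
    pass

# The first INSERT was backed out along with the failing one.
print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 0
```

Without transactions there is no equivalent of that automatic back-out; a test run that goes wrong simply leaves its rows behind, which is Alfred's point about MySQL.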
Then I thought that maybe it would even up if I made more than one simul.
call. So I rewrote the utility so that it forked itself several times. With
PostgreSQL I could not try the test with transactions activated
(transactions are broken in PostgreSQL, and the test shows it clearly).
PostgreSQL maxed out my CPU with 5 connections, MySQL used around 75% with
20 connections. At five connections MySQL was 5 times faster, with 20
connections it was 4 times faster.
I do not claim that this is accurate, maybe my classes are broken or
something, or the test might be totally wrong. But *I think* it simulates
quite well a ordinary webserver running the dba locally (on the same server
as the www-server).
The setup is:
PII 450MHz with 256MByte memory.
Linux Redhat 6.0 (almost out of box).
MySQL, latest .rpm (a few weeks ago).
PostgreSQL, from CVS tree (HEAD, a few weeks ago).
MySQL on a SCSI disk.
PostgreSQL on a IDE disk. I moved the "data" dir to the SCSI disk and
tested. Suprise suprise it was slower! Well PostgreSQL was as nice as MySQL
towards the CPU when it was on the SCSI disk.
Used gcc to compile PostgreSQL, using only the --prefix when
./configur(ing).
If you like to run the test (or view the code), download DBA-Test and AFW
package from my site (www.comder.com). No fancy configure scripts exist so
you have to modify the code to make it run on your system.
Comments? Reasons for the result? What was wrong with the test?
A lot of things went wrong here. First off, you didn't contact the
developers to let them know ahead of time and discuss tuning the
system. Both the MySQL and PostgreSQL developers deserve a chance
to recommend tuning for your application/bench, or to ask that you
delay your bench until bug X or Y is addressed.
I also think that while updates and inserts are important (they
sure are for us) you admit that Postgresql achieves the same speed
for queries as MySQL when doing searches.
Most sites that I know of are dynamic content and perform
selects for the most part.
Some other flaws:
You have an admitted imbalance with the disk systems but don't go
into any details.
You probably didn't tune postgresql worth a damn. I don't see any
mention of you raising the amount of shared memory allocated to
postgresql. I also imagine you may have run the test many times
on Postgresql without vacuuming the database?
Telling both development communities:
MySQL, latest .rpm (a few weeks ago).
PostgreSQL, from CVS tree (HEAD, a few weeks ago).
doesn't tell us much, maybe there's some bug in the code that
needed work?
I do not want to start a flame war. Just need help to get PostgreSQL up to
speed or MySQL to support sub-selects.
I think your time would be better spent working on actually
implementing the features you want rather than posting broken and
biased benchmarks that do more harm than good.
bye,
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
Hello,
one possible behaviour would be to roll back the transaction to the last
savepoint, which was set before the current statement (not the
transaction!) began. In this case I could commit all changes which have
passed without an error. I think this is the default case in Oracle -
is this compliant with the SQL standard?
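The savepoint semantics Jens describes can be sketched with SQLite, which implements SAVEPOINT / ROLLBACK TO (PostgreSQL at the time of this thread did not). A hypothetical Python illustration; the savepoint name `before_dup` is invented:

```python
import sqlite3

# isolation_level=None gives manual transaction control in sqlite3.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")
conn.execute("SAVEPOINT before_dup")
try:
    conn.execute("INSERT INTO t VALUES (1)")  # duplicate key -> error
except sqlite3.IntegrityError:
    conn.execute("ROLLBACK TO before_dup")    # undo only the failed statement
conn.execute("COMMIT")

# The first INSERT survives; only the failing statement was rolled back.
print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 1
```

This is exactly the "roll back to the last savepoint, commit everything that passed" behaviour Jens asks about.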
Regards, Jens
Frank Joerdens schrieb:
Jarmo Paavilainen wrote:
[ . . . ]"PostgreSQL*" is postgres whith queries inside transactions. But as long as
transactions are broken in PostgreSQL you cant use them in real life (if a
query fails inside a transactions block, PostgreSQL "RollBack"s the whole
transaction block, and thats broken. You can not convince me of anything
else).What do you think _should_ happen when a query fails inside a transaction block? (I am not
trying to convince you of anything, just being curious.)Regards,
Frank
=============================================
Jens Hartwig
---------------------------------------------
debis Systemhaus GEI mbH
10875 Berlin
Tel. : +49 (0)30 2554-3282
Fax : +49 (0)30 2554-3187
Mobil : +49 (0)170 167-2648
E-Mail : jhartwig@debis.com
=============================================
But isn't it recommended to run the server with fsync? If so, you shouldn't
disable it on a benchmark then.
Adam Lang
Systems Engineer
Rutgers Casualty Insurance Company
http://www.rutgersinsurance.com
----- Original Message -----
From: "Gordan Bobic" <gordan@freeuk.com>
To: "PostgreSQL General" <pgsql-general@postgresql.org>
Sent: Friday, December 29, 2000 7:31 AM
Subject: Re: [GENERAL] MySQL and PostgreSQL speed compare
[ . . . ]
Just curious, what kind of tables did you set up in MySQL? My understanding is
that if you want to use the BerkeleyDB transaction support in the 3.23
beta/gamma/whatever, you need to create your tables as BerkeleyDB type rather
than ISAM. (And then the new row-level locking feature from NuSphere adds a
third table type, but that's another story...)
So if you wanted to do a true apples-to-apples comparison of databases that
supported transactions, you'd use BerkeleyDB tables in MySQL. If they were ISAM
tables, then we're just back at the same old "speed versus data integrity" flame
war that has always been the base of the MySQL/Postgres comparison.
Of course, if they *were* Berkeley tables and you still got those results, then
we'll need to dig a little deeper ;-)
Regards,
Ned
Jarmo Paavilainen wrote:
[ . . . ]
* Adam Lang <aalang@rutgersinsurance.com> [001229 05:47] wrote:
But isn't it recommended to run the server with fsync? If so, you shouldn't
disable it on a benchmark then.
Actually, if he ran Postgresql with WAL enabled, fsync shouldn't
make much of a difference.
But isn't it recommended to run the server with fsync? If so, you
shouldn't
disable it on a benchmark then.
Recommended for what? If you have a system that has to be up 24/7, then you
should have:
1) Sound, working hardware
2) UPS
3) Failovers / RAID
4) Backups
Provided that:
(you don't get a hardware failure, OR you have implemented 3)),
AND
(you don't get a power cut OR you have implemented 2)),
then there is no danger that I can see in having the OS handle your disk
caching. If you have a hardware caching controller, then the whole fsync
issue is slightly more academic, but disabling fsync will generally
improve performance.
IMHO, if you don't trust your hardware and arrangements enough to feel
happy with switching off fsync, then you shouldn't be using that setup in a
mission critical application. It's OK for a development system where you
expect the system to be trashed several times a week, but I am not sure
that fsync should be high enough on the list of priorities in a mission
critical system to be worth worrying about too much.
Feel free to disagree...
Regards.
Gordan
Hi,
...
"PostgreSQL*" is postgres whith queries inside transactions.
But as long as transactions are broken in PostgreSQL you cant use them
in real
life (if a query fails inside a transactions block, PostgreSQL
"RollBack"s
the whole transaction block, and thats broken. You can not convince me
of anything
else).
What do you think _should_ happen when a query fails inside a
transaction block? (I am not trying to convince you of anything, just
being curious.)
If a query fails it should return an error, and let me decide whether to
roll back, change my query, or continue as if nothing happened.
A good example is this:
Make a SELECT to check if the row exists.
If it is not there, make an INSERT; if it is there, make an UPDATE (or
something totally different).
Everything is swell, but when several pids are trying to insert there is a
window of error between the SELECT and the INSERT. And the test program
triggered it frequently. What if there were a lot of inserts/updates before
and after the failing one (inside this transaction), and it would be OK if
this row was inserted by someone else? The database does not know about
that, *I do*, and can write my program in such a way.
How do you fix that in PostgreSQL? The only way I know of is not to use
transactions. Then if the INSERT fails you can try again with a SELECT to
check if the row has been inserted by someone else. And of course you would
need to build your own rollback function.
The "all or nothing approach" ala PostgreSQL is broken!
Nuff about transactions. I do not think I can convince you and you cant
convince me about that they are not. And why do you not check how the other
dbas has solved this. I bet they work as I describe.
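The try-the-INSERT-and-decide pattern Jarmo is asking for can be sketched like this. A hypothetical Python/sqlite3 illustration (function `insert_or_accept` is invented): outside a PostgreSQL-style transaction block, a failed INSERT leaves the client free to treat the duplicate as acceptable and carry on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")

def insert_or_accept(row_id, payload):
    # Try the INSERT first; if another writer got there in between,
    # treat the duplicate as acceptable and continue, as Jarmo wants.
    try:
        with conn:
            conn.execute("INSERT INTO t VALUES (?, ?)", (row_id, payload))
    except sqlite3.IntegrityError:
        pass  # row already there -- fine for this application

insert_or_accept(1, "first")
insert_or_accept(1, "second")  # duplicate; accepted, not fatal
print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 1
```

Inside one big PostgreSQL transaction of that era this recovery was impossible, because the first error aborted the whole block; with savepoints (discussed elsewhere in the thread) the same decision can be made per statement.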
// Jarmo
Hi,
...
Just curious, what kind of tables did you set up in MySQL? My
Ehh... there is more than one kind? I did not know. Still, with
transactions on PostgreSQL (unsafe method?), MySQL was 2 times as fast as
PostgreSQL. I will check this out and return to this list with the results.
...
tables, then we're just back at the same old "speed versus data
integrity" flame war that has always been the base of the
MySQL/Postgres comparison.
It's a question of compromising between speed and "integrity". *I think*
PostgreSQL should and could lower its "integrity" a bit (15 times slower!).
...
Of course, if they *were* Berkeley tables and you still got those
results, then we'll need to dig a little deeper ;-)
I do not know, but I will find out.
// Jarmo
...
I wrote a piece of benchmarking, just to test my classes, and
was suprised of the speed diffs.
...
Inserts on MySQL : 0.71sec/1000 rows
Inserts on PostgreSQL: 10.78sec/1000 rows (15 times slower?)
Inserts on PostgreSQL*: 1.59sec/1000 rows (2 times slower?)
* Adam Lang <aalang@rutgersinsurance.com> [001229 05:47] wrote:
But isn't it recommended to run the server with fsync? If so, you
shouldn't
disable it on a benchmark then.
Actually, if he ran Postgresql with WAL enabled, fsync shouldn't
make much of a difference.
What's WAL? Are you referring to autocommit? I will admit that autocommit
already improves performance so much that fsync() isn't going to matter
that much, but it will still make a noticeable improvement. It certainly
did on my system (~20%). FWIW, disabling autocommit, and introducing the
overhead of doing a SELECT for checking before EVERY INSERT and UPDATE,
made an improvement of about 2-3 times on my application... But different
applications require different things, so... YMMV...
Regards.
Gordan
Hi,
...
I wrote a piece of benchmarking, just to test my classes, and
was suprised of the speed diffs.
...
The fact that MySQL doesn't support transactions at all severly
limits its utility for applications that need data consistancy, it
I thought it did. Well, I need to test that before I'm going to say
anything more.
...
A lot of things went wrong here, first off you didn't contact the
developers to let them know ahead of time and discuss tuning the
system. Both the MySQL and Postgresql developers deserve a chance
to recommend tuneing for your application/bench or ask that you
delay your bench until bug X or Y is addressed.
I did not (and do not) take it so seriously. I do not (did not) claim that
the test is in any way useful; I just wanted people's responses (and did
get some).
...
Most sites are that I know of are dynamic content and perform
selects for the most part.
Yes but still.
...
You have an admitted inbalance with the disk systems but don't go
into any details.
Yes, that was sloppy. I thought of that after I started to write my email.
I did a quick test with both databases on SCSI (just moved PostgreSQL's
"data" directory to SCSI). But the result was even slower. Anyway, the
biggest difference between SCSI and IDE is throughput and CPU usage.
Throughput is not an issue here (small chunks of data), and CPU should not
be (PostgreSQL peaked the CPU with 20 connections on IDE and used 75% on
SCSI). It might be a bigger difference with more connections. Both my IDE
and SCSI disks are quite new, with fast seek times.
...
You probably didn't tune postgresql worth a damn. I don't see any
Neither did I tune MySQL. Neither do 90% of users.
...
mention of you raising the amount of shared memory allocated to
postgresql. I also imagine you may have run the test many times
on Postgresql without vacuuming the database?
The test program DROPs the tables and recreates them. I do not know if you
would still need to VACUUM the database. Anyway, I did run the test several
times without seeing any (big) differences.
...
I think your time would be better spent working on actually
impelementing the features you want rather than posting broken and
biased benchmarks that do more harm than good.
I do not think this was biased; maybe broken, but not biased. Actually I
use PostgreSQL, and all the (free?) databases I've installed have been
PostgreSQL. The code I'm using was first written for PostgreSQL, and the
reason I added MySQL support was that my ISP refused to install PostgreSQL.
I did the test just to see if my classes also worked on MySQL before
starting to port the rest of my code to MySQL (guess if I was surprised).
Implementing features... why? PostgreSQL has (almost) everything I need.
It's MySQL that would need some new features (sub-selects, views...).
Looking for the truth and nothing but the truth? Don't look for it among
humans. All they have are opinions.
// Jarmo
* Gordan Bobic <gordan@freeuk.com> [001229 07:39] wrote:
* Adam Lang <aalang@rutgersinsurance.com> [001229 05:47] wrote:
But isn't it recommended to run the server with fsync? If so, you
shouldn't
disable it on a benchmark then.
Actually, if he ran Postgresql with WAL enabled, fsync shouldn't
make much of a difference.
What's WAL? Are you referring to autocommit? I will admit that autocomit
already improves performance so much that fsync() isn't going to matter
that much, but it will still make a noticeable improvement. It certainly
did on my system (~20%). FWIW, disabling autocommint, and introducing the
overhead of doing a select for checking before EVERY INSERT and UPDATE,
made an improvement of about 2-3 times on my application... But, different
applications require different things, so... YMMV...
WAL is apparently something that orders writes in such a way that you
may lose data, but ordering is maintained such that if you have
transactions A, B and C (in that order) and you crash, you'll see
one of these:
1) A B and C
2) A and B
3) just A
With fsync on you should see A B and C, but WAL makes data recovery
a lot better.
(I think. :) )
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
Jarmo Paavilainen wrote:
[ . . . ]
The "all or nothing approach" ala PostgreSQL is broken!
Nuff about transactions. I do not think I can convince you and you cant
convince me about that they are not.
Ok. I won't try then.
- Frank
Gordan Bobic writes:
Actually, if he ran Postgresql with WAL enabled, fsync shouldn't
make much of a difference.
What's WAL?
Write Ahead Logging
It's a system that writes anything that it wants to do to the database to
a log file first and if the system crashes before the data is in the
database proper then it will replay the log on restart. (I guess you
could think of it as similar to a journalling file system.) That means
that fsync is never necessary on actual data files, only on the log files,
which are much smaller.
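As an aside, SQLite exposes the same write-ahead idea through its journal_mode pragma; a small illustration, assuming a file-backed database (this is SQLite's WAL, not PostgreSQL's, shown only to make the concept concrete):

```python
import os
import sqlite3
import tempfile

# WAL requires a file-backed database; an in-memory one would report "memory".
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)

# Switch to write-ahead logging: changes go to a log first and are
# replayed/checkpointed into the main file later, as Peter describes.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # wal
```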
--
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
Hi,
I've got a few tips on what to do (turn off fsync(), could be a broken
PostgreSQL from CVS), and a few hints on what's wrong with MySQL
(transactions not enabled by default). I'll check these out and return to
the list.
But first I want to comment on a few things (marked with >>> from
different emails).
Actually, if he ran Postgresql with WAL enabled, fsync shouldn't
make much of a difference.
WAL seems to be enabled by default. What WAL is good for I do not know. But
if I start PostgreSQL without the -S I see a lot of info about WAL this and
WAL that.
...
But isn't it recommended to run the server with fsync? If so,
you shouldn't disable it on a benchmark then.
I run both MySQL and PostgreSQL as they are (minimum switches, no tuning,
as default as can be). That is, MySQL as the .rpm installed it
(--datadir --pid-file --skip-locking) and PostgreSQL with -i -S -D. That's
the way most people would be running them anyway. And default should be
good enough for this test (simple queries, few rows (max 1000) per table).
...
Well I expected MySQL to be the faster one, but this much.
...
To me, all this is pointing toward the possibility that you haven't
switched of fsync. This will make a MASSIVE difference to insert/update
The idea was to run as recommended and as default as possible, but with the
latest (alpha/beta/development) code.
...
And in case you cannot be bothered, add the "-o -F" parameters (IIRC) to
...
flushes the it's disk cache bufferes after every query. This should even
things out quite a lot.
I'll test that. Even though it feels like tweaking PostgreSQL away from
what is considered safe by the PostgreSQL developers. If it were safe it
would be the default.
...
transaction block, and thats broken. You can not convince me of
anything else).
...
They are not as functionally complete as they could be, I'll give you
that.
Thanks, I think ;-)
What if I do a SELECT to check for a row, and then I do an INSERT, but
between the SELECT and the INSERT someone else inserted a row? NO, I do not
think that "good programming" will solve this.
Sir, thanks for sharing this with us. However, unless you can explain
why queries inside of transactions run faster than queries outside of
transactions, I would be inclined to mistrust the test. I haven't
I was surprised too. But the only difference is that I do a "BEGIN" before
I start inserting/modifying/deleting, and when I'm done I do a "COMMIT".
Everything between those is exactly the same. I've been told that MySQL
does not support transactions (by default), so there the test is broken.
And with PostgreSQL, well, something inside PostgreSQL is broken (it can't
be right that with transactions PostgreSQL is 10 times faster than
without).
...
interested to learn of your findings.
I'll update from CVS and rebuild PostgreSQL, and (try to) locate the CVS of
MySQL and build it locally. And make the recommended tweaks (no fsync() but
with WAL). I'll also make sure that transactions are supported. I'll also
add a test of rollbacks to my test program.
// Jarmo
"PostgreSQL*" is postgres whith queries inside transactions.
But as long as transactions are broken in PostgreSQL you cant
use them in real life (if a query fails inside a transactions
block, PostgreSQL "RollBack"s the whole transaction block, and
thats broken. You can not convince me of anything else).
Nevertheless you shouldn't use the word "broken" -:)
The right words are "savepoints are not implemented".
Vadim
Make a SELECT to check if the row exist.
If not it make a INSERT, or if its there it make an UPDATE (or something
totally different).
Everything is swell, but when several pids are trying to insert there is a
window of error between the SELECT and INSERT. And the test program
triggered it frequently. What if there were a lot of insert/updated before
and after the failing one (inside this transaction) and it would be ok if
this row was inserted by someone else. The dba does not know about that, *I
do* and can write my program in such a way.
How do you fix that in PostgreSQL! The only way I know of is not to use
transactions. Then if the INSERT fails you can try again with SELECT to
check if the row has been inserted by someone else. And ofcourse you would
need to build your own rollback function.
The "all or nothing approach" ala PostgreSQL is broken!
Nuff about transactions. I do not think I can convince you and you cant
convince me about that they are not. And why do you not check how the other
dbas has solved this. I bet they work as I describe.
Wouldn't it be smart to make the select and insert a transaction, and if it
fails, then start an update?
Rob Nelson
rdnelson@co.centre.pa.us
Sir, thanks for sharing this with us. However, unless you can explain
why queries inside of transactions run faster than queries outside of
transactions, I would be inclined to mistrust the test. I haven't
I was suprised too. But the only difference is that I do a "BEGIN" before I
start inserting/modifying/deleting and then when Im done I do a "COMMIT".
Everything between those are exactly the same. Ive been told that MySQL does
not support transactions (by default) so there the test is broken. And with
PostgreSQL, well something inside PostgreSQL is broken (it cant be right
that with transaction PostgreSQL is 10 times faster than without).
All PostgreSQL statements are in some transaction. If you're not using
explicit transactions (i.e., autocommit), then it's effectively wrapping
each statement in a transaction block of its own, so you're doing the
transaction start/end (including any necessary file access) <n> times
rather than once, which is probably most of the difference you're seeing.
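The per-statement overhead described above can be made visible with a small sketch, again using Python's sqlite3 on a file-backed database as a stand-in (absolute numbers will differ wildly from the PostgreSQL case, but the shape is the same: one transaction per statement versus one shared transaction):

```python
import os
import sqlite3
import tempfile
import time

path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path, isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")

t0 = time.perf_counter()
for i in range(1000):
    conn.execute("INSERT INTO t VALUES (?)", (i,))   # one transaction each
auto = time.perf_counter() - t0

t0 = time.perf_counter()
conn.execute("BEGIN")
for i in range(1000, 2000):
    conn.execute("INSERT INTO t VALUES (?)", (i,))   # one shared transaction
conn.execute("COMMIT")
batched = time.perf_counter() - t0

print("autocommit %.3fs, single transaction %.3fs" % (auto, batched))
```

This matches Jarmo's earlier observation that wrapping the whole run in BEGIN/COMMIT made PostgreSQL several times faster: the explicit transaction pays the start/end cost once instead of a thousand times.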