PostgreSQL Benchmarks
Hrm. I just saw that the PHP ADODB guy just published a bunch of database
benchmarks. It's fairly evident to me that benchmarking PostgreSQL on
Win32 isn't really fair:
http://php.weblogs.com/oracle_mysql_performance
*sigh*
Chris
Hrm. I just saw that the PHP ADODB guy just published a bunch of database
benchmarks. It's fairly evident to me that benchmarking PostgreSQL on
Win32 isn't really fair:
And why is the highly advocated transaction capable MySQL 4 not tested?
That's the problem, for every performance test they choose ISAM tables, and
when transactions are mentioned it's said "MySQL has transactions". But why
no benchmarks?
Regards,
Mario Weilguni
Import Notes
Resolved by subject fallback
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hrm. I just saw that the PHP ADODB guy just published a bunch of database
benchmarks. It's fairly evident to me that benchmarking PostgreSQL on
Win32 isn't really fair:
Is there anyone here that can contact them and get more details about
how the test was run? Even on Windows, I don't beleive that Postgres
should be quite as slow as indicated. I'd rather someone more familiar
with Windows than I take a stab at it.
- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200302110934
-----BEGIN PGP SIGNATURE-----
Comment: http://www.turnstep.com/pgp.html
iD8DBQE+SQoFvJuQZxSWSsgRAi5jAJ4li+qUDlxzM1r0XLZAjDlW1oA1ZwCgiOpT
0LgCKy9W5vG0nNcYWNJGkPM=
=Rrw5
-----END PGP SIGNATURE-----
On Tue, 2003-02-11 at 08:26, Christopher Kings-Lynne wrote:
Hrm. I just saw that the PHP ADODB guy just published a bunch of database
benchmarks. It's fairly evident to me that benchmarking PostgreSQL on
Win32 isn't really fair:http://php.weblogs.com/oracle_mysql_performance
*sigh*
How much of the performance difference is from the RDBMS, from the
middleware, and from the quality of implementation in the middleware.
While I'm not surprised that the the cygwin version of PostgreSQL is
slow, those results don't tell me anything about the quality of the
middleware interface between PHP and PostgreSQL. Does anyone know if we
can rule out some of the performance loss by pinning it to bad
middleware implementation for PostgreSQL?
Regards,
--
Greg Copeland <greg@copelandconsulting.net>
Copeland Computer Consulting
On Tue, 2003-02-11 at 08:31, Mario Weilguni wrote:
Hrm. I just saw that the PHP ADODB guy just published a bunch of database
benchmarks. It's fairly evident to me that benchmarking PostgreSQL on
Win32 isn't really fair:And why is the highly advocated transaction capable MySQL 4 not tested?
That's the problem, for every performance test they choose ISAM tables, and
when transactions are mentioned it's said "MySQL has transactions". But why
no benchmarks?
Insert Statement
Not using bind variables (MySQL and Oracle):
$DB->BeginTrans();
Using bind variables:
$DB->BeginTrans();
PL/SQL Insert Benchmark
Appears to not initiate a transaction. I'm assuming this is because
it's implicitly within a transaction? Oddly enough, I am seeing
explicit commits here.
It appears that the benchmarks are attempting to use transactions,
however, I have no idea if MySQL's HEAP supports them. For all I know,
transactions are being silently ignored.
Regards,
--
Greg Copeland <greg@copelandconsulting.net>
Copeland Computer Consulting
I've tested all the win32 versions of postgres I can get my hands on
(cygwin and not), and my general feeling is that they have problems with
insert performance with fsync() turned on, probably the fault of the os.
Select performance is not so much affected.
This is easily solved with transactions and other such things. Also
Postgres benefits from pl just like oracle.
May I make a suggestion that maybe it is time to start thinking about
tuning the default config file, IMHO its just a little bit too
conservative, and its hurting you in benchmarks being run by idiots, but
its still bad publicity. Any real database admin would know his test
are synthetic and not meaningful without having to look at the #s.
This is irritating me so much that I am going to put together a
benchmark of my own, a real world one, on (publicly available) real
world data. Mysql is a real dog in a lot of situations. The FCC
publishes a database of wireless transmitters that has tables with 10
million records in it. I'll pump that into pg, run some benchmarks,
real world queries, and we'll see who the faster database *really* is.
This is just a publicity issue, that's all. Its still annoying though.
I'll even run an open challenge to database admin to beat query
performance of postgres in such datasets, complex multi table joins,
etc. I'll even throw out the whole table locking issue and analyze
single user performance.
Merlin
_____________
How much of the performance difference is from the RDBMS, from the
middleware, and from the quality of implementation in the middleware.
While I'm not surprised that the the cygwin version of PostgreSQL is
slow, those results don't tell me anything about the quality of the
middleware interface between PHP and PostgreSQL. Does anyone know if we
can rule out some of the performance loss by pinning it to bad
middleware implementation for PostgreSQL?
Regards,
--
Greg Copeland <greg@copelandconsulting.net>
Copeland Computer Consulting
Import Notes
Resolved by subject fallback
"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
May I make a suggestion that maybe it is time to start thinking about
tuning the default config file, IMHO its just a little bit too
conservative,
It's a lot too conservative. I've been thinking for awhile that we
should adjust the defaults.
The original motivation for setting shared_buffers = 64 was so that
Postgres would start out-of-the-box on machines where SHMMAX is 1 meg
(64 buffers = 1/2 meg, leaving 1/2 meg for our other shared data
structures). At one time SHMMAX=1M was a pretty common stock kernel
setting. But our other data structures blew past the 1/2 meg mark
some time ago; at default settings the shmem request is now close to
1.5 meg. So people with SHMMAX=1M have already got to twiddle their
postgresql.conf settings, or preferably learn how to increase SHMMAX.
That means there is *no* defensible reason anymore for defaulting to
64 buffers.
We could retarget to try to stay under SHMMAX=4M, which I think is
the next boundary that's significant in terms of real-world platforms
(isn't that the default SHMMAX on some BSDen?). That would allow us
350 or so shared_buffers, which is better, but still not really a
serious choice for production work.
What I would really like to do is set the default shared_buffers to
1000. That would be 8 meg worth of shared buffer space. Coupled with
more-realistic settings for FSM size, we'd probably be talking a shared
memory request approaching 16 meg. This is not enough RAM to bother
any modern machine from a performance standpoint, but there are probably
quite a few platforms out there that would need an increase in their
stock SHMMAX kernel setting before they'd take it.
So what this comes down to is making it harder for people to get
Postgres running for the first time, versus making it more likely that
they'll see decent performance when they do get it running.
It's worth noting that increasing SHMMAX is not nearly as painful as
it was back when these decisions were taken. Most people have moved
to platforms where it doesn't even take a kernel rebuild, and we've
acquired documentation that tells how to do it on all(?) our supported
platforms. So I think it might be okay to expect people to do it.
The alternative approach is to leave the settings where they are, and
to try to put more emphasis in the documentation on the fact that the
factory-default settings produce a toy configuration that you *must*
adjust upward for decent performance. But we've not had a lot of
success spreading that word, I think. With SHMMMAX too small, you
do at least get a pretty specific error message telling you so.
Comments?
regards, tom lane
On Tue, 2003-02-11 at 10:20, Tom Lane wrote:
"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
May I make a suggestion that maybe it is time to start thinking about
tuning the default config file, IMHO its just a little bit too
conservative,It's a lot too conservative. I've been thinking for awhile that we
should adjust the defaults.The original motivation for setting shared_buffers = 64 was so that
Postgres would start out-of-the-box on machines where SHMMAX is 1 meg
(64 buffers = 1/2 meg, leaving 1/2 meg for our other shared data
structures). At one time SHMMAX=1M was a pretty common stock kernel
setting. But our other data structures blew past the 1/2 meg mark
some time ago; at default settings the shmem request is now close to
1.5 meg. So people with SHMMAX=1M have already got to twiddle their
postgresql.conf settings, or preferably learn how to increase SHMMAX.
That means there is *no* defensible reason anymore for defaulting to
64 buffers.We could retarget to try to stay under SHMMAX=4M, which I think is
the next boundary that's significant in terms of real-world platforms
(isn't that the default SHMMAX on some BSDen?). That would allow us
350 or so shared_buffers, which is better, but still not really a
serious choice for production work.What I would really like to do is set the default shared_buffers to
1000. That would be 8 meg worth of shared buffer space. Coupled with
more-realistic settings for FSM size, we'd probably be talking a shared
memory request approaching 16 meg. This is not enough RAM to bother
any modern machine from a performance standpoint, but there are probably
quite a few platforms out there that would need an increase in their
stock SHMMAX kernel setting before they'd take it.So what this comes down to is making it harder for people to get
Postgres running for the first time, versus making it more likely that
they'll see decent performance when they do get it running.It's worth noting that increasing SHMMAX is not nearly as painful as
it was back when these decisions were taken. Most people have moved
to platforms where it doesn't even take a kernel rebuild, and we've
acquired documentation that tells how to do it on all(?) our supported
platforms. So I think it might be okay to expect people to do it.The alternative approach is to leave the settings where they are, and
to try to put more emphasis in the documentation on the fact that the
factory-default settings produce a toy configuration that you *must*
adjust upward for decent performance. But we've not had a lot of
success spreading that word, I think. With SHMMMAX too small, you
do at least get a pretty specific error message telling you so.Comments?
I'd personally rather have people stumble trying to get PostgreSQL
running, up front, rather than allowing the lowest common denominator
more easily run PostgreSQL only to be disappointed with it and move on.
After it's all said and done, I would rather someone simply say, "it's
beyond my skill set", and attempt to get help or walk away. That seems
better than them being able to run it and say, "it's a dog", spreading
word-of-mouth as such after they left PostgreSQL behind. Worse yet,
those that do walk away and claim it performs horribly are probably
doing more harm to the PostgreSQL community than expecting someone to be
able to install software ever can.
Nutshell:
"Easy to install but is horribly slow."
or
"Took a couple of minutes to configure and it rocks!"
Seems fairly cut-n-dry to me. ;)
Regards,
--
Greg Copeland <greg@copelandconsulting.net>
Copeland Computer Consulting
On Tue, Feb 11, 2003 at 11:20:14AM -0500, Tom Lane wrote:
...
We could retarget to try to stay under SHMMAX=4M, which I think is
the next boundary that's significant in terms of real-world platforms
(isn't that the default SHMMAX on some BSDen?).
...
Assuming 1 page = 4k, and number of pages is correct in GENERIC kernel configs,
SHMMAX=4M for NetBSD (8M for i386, x86_64)
Cheers,
Patrick
Nutshell:
"Easy to install but is horribly slow."or
"Took a couple of minutes to configure and it rocks!"
Since when is it easy to install on win32?
The easiest way I know of is through Cygwin, then you have to worry about
installing the IPC service (an getting the right version too!) I've
installed versions 6.1 to 7.1, but I almost gave up on the windows install.
At least in 6.x you had very comprehensive installation guide with a TOC.
Versus the competition which are you going to choose if you're a wanna-be
DBA? The one with all he hoops to jump through, or the one that comes with a
setup.exe?
Now I actually am in support of making it more aggressive, but it should
wait until we too have a setup.exe for the native windows port. (Changing it
on *n*x platforms is of little benefit because most benchmarks seem to run
it on w32 anyway :-( )
Just my $.02. I reserve the right to be wrong.
-J
Tom Lane wrote:
<snip>
What I would really like to do is set the default shared_buffers to
1000. That would be 8 meg worth of shared buffer space. Coupled with
more-realistic settings for FSM size, we'd probably be talking a shared
memory request approaching 16 meg. This is not enough RAM to bother
any modern machine from a performance standpoint, but there are probably
quite a few platforms out there that would need an increase in their
stock SHMMAX kernel setting before they'd take it.
<snip>
Totally agree with this. We really, really, really, really need to get
the default to a point where we have _decent_ default performance.
The alternative approach is to leave the settings where they are, and
to try to put more emphasis in the documentation on the fact that the
factory-default settings produce a toy configuration that you *must*
adjust upward for decent performance. But we've not had a lot of
success spreading that word, I think. With SHMMMAX too small, you
do at least get a pretty specific error message telling you so.Comments?
Yep.
Here's an *unfortunately very common* scenario, that again
unfortunately, a _seemingly large_ amount of people fall for.
a) Someone decides to "benchmark" database XYZ vs PostgreSQL vs other
databases
b) Said benchmarking person knows very little about PostgreSQL, so they
install the RPM's, packages, or whatever, and "it works". Then they run
whatever benchmark they've downloaded, or designed, or whatever
c) PostgreSQL, being practically unconfigured, runs at the pace of a
slow, mostly-disabled snail.
d) Said benchmarking person gets better performance from the other
databases (also set to their default settings) and thinks "PostgreSQL
has lots of features, and it's free, but it's Too Slow".
Yes, this kind of testing shouldn't even _pretend_ to have any real
world credibility.
e) Said benchmarking person tells everyone they know, _and_ everyone
they meet about their results. Some of them even create nice looking or
profesional looking web pages about it.
f) People who know even _less_ than the benchmarking person hear about
the test, or read the result, and don't know any better than to believe
it at face value. So, they install whatever system was recommended.
g) Over time, the benchmarking person gets the hang of their chosen
database more and writes further articles about it, and doesn't
generally look any further afield than it for say... a couple of years.
By this time, they've already influenced a couple of thousand people
in the non-optimal direction.
h) Arrgh. With better defaults, our next release would _appear_ to be a
lot faster to quite a few people, just because they have no idea about
tuning.
So, as sad as this scenario is, better defaults will probably encourage
a lot more newbies to get involved, and that'll eventually translate
into a lot more experienced users, and a few more coders to assist. ;-)
Personally I'd be a bunch happier if we set the buffers so high that we
definitely have decent performance, and the people that want to run
PostgreSQL are forced to make the choice of either:
1) Adjust their system settings to allow PostgreSQL to run properly, or
2) Manually adjust the PostgreSQL settings to run memory-constrained
This way, PostgreSQL either runs decently, or they are _aware_ that
they're limiting it. That should cut down on the false benchmarks
(hopefully).
:-)
Regards and best wishes,
Justin Clift
regards, tom lane
--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi
A quick-'n'-dirty first step would be more comments in postgresql.conf. Most
of the lines are commented out which would imply "use the default" but the
default is not shown. (I realize this has the difficulty of defaults that
change depending upon how PostgreSQL was configured/compiled but perhaps
postgresql.conf could be built by the make process based on the configuration
options.)
If postgresql.conf were commented with recommendations it would probably be
all I need though perhaps a recommendation to edit that file should be
displayed at the conclusion of "make install".
Cheers,
Steve
Show quoted text
On Tuesday 11 February 2003 8:20 am, Tom Lane wrote:
"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
May I make a suggestion that maybe it is time to start thinking about
tuning the default config file, IMHO its just a little bit too
conservative,It's a lot too conservative. I've been thinking for awhile that we
should adjust the defaults.The original motivation for setting shared_buffers = 64 was so that
Postgres would start out-of-the-box on machines where SHMMAX is 1 meg
(64 buffers = 1/2 meg, leaving 1/2 meg for our other shared data
structures). At one time SHMMAX=1M was a pretty common stock kernel
setting. But our other data structures blew past the 1/2 meg mark
some time ago; at default settings the shmem request is now close to
1.5 meg. So people with SHMMAX=1M have already got to twiddle their
postgresql.conf settings, or preferably learn how to increase SHMMAX.
That means there is *no* defensible reason anymore for defaulting to
64 buffers.We could retarget to try to stay under SHMMAX=4M, which I think is
the next boundary that's significant in terms of real-world platforms
(isn't that the default SHMMAX on some BSDen?). That would allow us
350 or so shared_buffers, which is better, but still not really a
serious choice for production work.What I would really like to do is set the default shared_buffers to
1000. That would be 8 meg worth of shared buffer space. Coupled with
more-realistic settings for FSM size, we'd probably be talking a shared
memory request approaching 16 meg. This is not enough RAM to bother
any modern machine from a performance standpoint, but there are probably
quite a few platforms out there that would need an increase in their
stock SHMMAX kernel setting before they'd take it.So what this comes down to is making it harder for people to get
Postgres running for the first time, versus making it more likely that
they'll see decent performance when they do get it running.It's worth noting that increasing SHMMAX is not nearly as painful as
it was back when these decisions were taken. Most people have moved
to platforms where it doesn't even take a kernel rebuild, and we've
acquired documentation that tells how to do it on all(?) our supported
platforms. So I think it might be okay to expect people to do it.The alternative approach is to leave the settings where they are, and
to try to put more emphasis in the documentation on the fact that the
factory-default settings produce a toy configuration that you *must*
adjust upward for decent performance. But we've not had a lot of
success spreading that word, I think. With SHMMMAX too small, you
do at least get a pretty specific error message telling you so.Comments?
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
Tom Lane wrote:
"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
May I make a suggestion that maybe it is time to start thinking about
tuning the default config file, IMHO its just a little bit too
conservative,It's a lot too conservative. I've been thinking for awhile that we
should adjust the defaults.
One of the things I did on my Windows install was to have a number of
default configuration files, postgresql.conf.small,
postgresql.conf.medium, postgresql.conf.large.
Rather than choose one, in the "initdb" script, ask for or determine the
mount of shared memory, memory, etc.
Another pet peeve I have is forcing the configuration files to be in the
database directory. We had this argument in 7.1 days, and I submitted a
patch that allowed a configuration file to be specified as a command
line parameter. One of the things that Oracle does better is separating
the "configuration" from the data.
It is an easy patch to allow PostgreSQL to use a separate configuration
directory, and specify the data directory within the configuration file
(The way any logical application works), and, NO, symlinks are not a
solution, they are a kludge.
Justin Clift <justin@postgresql.org> writes:
Personally I'd be a bunch happier if we set the buffers so high that we
definitely have decent performance, and the people that want to run
PostgreSQL are forced to make the choice of either:
1) Adjust their system settings to allow PostgreSQL to run properly, or
2) Manually adjust the PostgreSQL settings to run memory-constrained
This way, PostgreSQL either runs decently, or they are _aware_ that
they're limiting it.
Yeah, that is the subtext here. If you can't increase SHMMAX then you
can always trim the postgresql.conf parameters --- but theoretically,
at least, you should then have a clue that you're running a
badly-configured setup ...
regards, tom lane
Tom, Justin,
What I would really like to do is set the default shared_buffers to
1000. That would be 8 meg worth of shared buffer space. Coupled with
more-realistic settings for FSM size, we'd probably be talking a shared
memory request approaching 16 meg. This is not enough RAM to bother
any modern machine from a performance standpoint, but there are probably
quite a few platforms out there that would need an increase in their
stock SHMMAX kernel setting before they'd take it.
What if we supplied several sample .conf files, and let the user choose which
to copy into the database directory? We could have a "high read
performance" profile, and a "transaction database" profile, and a
"workstation" profile, and a "low impact" profile. We could even supply a
Perl script that would adjust SHMMAX and SHMMALL on platforms where this can
be done from the command line.
--
Josh Berkus
Aglio Database Solutions
San Francisco
On Tue, 2003-02-11 at 12:10, Steve Crawford wrote:
A quick-'n'-dirty first step would be more comments in postgresql.conf. Most
This will not solve the issue with the large number of users who have no
interest in looking at the config file -- but are interested in
publishing their results.
--
Rod Taylor <rbt@rbt.ca>
PGP Key: http://www.rbt.ca/rbtpub.asc
Greg Copeland wrote:
I'd personally rather have people stumble trying to get PostgreSQL
running, up front, rather than allowing the lowest common denominator
more easily run PostgreSQL only to be disappointed with it and move on.After it's all said and done, I would rather someone simply say, "it's
beyond my skill set", and attempt to get help or walk away. That seems
better than them being able to run it and say, "it's a dog", spreading
word-of-mouth as such after they left PostgreSQL behind. Worse yet,
those that do walk away and claim it performs horribly are probably
doing more harm to the PostgreSQL community than expecting someone to be
able to install software ever can.
<RANT>
And that my friends is why PostgreSQL is still relatively obscure.
This attitude sucks. If you want a product to be used, you must put the
effort into making it usable.
It is a no-brainer to make the default configuration file suitable for
the majority of users. It is lunacy to create a default configuration
which provides poor performance for over 90% of the users, but which
allows the lowest common denominator to work.
A product must not perform poorly out of the box, period. A good product
manager would choose one of two possible configurations, (a) a high
speed fairly optimized system from the get-go, or (b) it does not run
unless you create the configuration file. Option (c) out of the box it
works like crap, is not an option.
This is why open source gets such a bad reputation. Outright contempt
for the user who may not know the product as well as those developing
it. This attitude really sucks and it turns people off. We want people
to use PostgreSQL, to do that we must make PostgreSQL usable. Usability
IS important.
</RANT>
Josh Berkus <josh@agliodbs.com> writes:
What if we supplied several sample .conf files, and let the user choose which
to copy into the database directory? We could have a "high read
performance" profile, and a "transaction database" profile, and a
"workstation" profile, and a "low impact" profile.
Uh ... do we have a basis for recommending any particular sets of
parameters for these different scenarios? This could be a good idea
in the abstract, but I'm not sure I know enough to fill in the details.
A lower-tech way to accomplish the same result is to document these
alternatives in postgresql.conf comments and encourage people to review
that file, as Steve Crawford just suggested. But first we need the raw
knowledge.
regards, tom lane
What if we supplied several sample .conf files, and let the user choose
which to copy into the database directory? We could have a "high read
Exactly my first thought when reading the proposal for a setting suited for
performance tests.
performance" profile, and a "transaction database" profile, and a
"workstation" profile, and a "low impact" profile. We could even supply a
And a .benchmark profile :-)
Perl script that would adjust SHMMAX and SHMMALL on platforms where this
can be done from the command line.
Or maybe configuration could be adjusted with ./configure if SHMMAX can be
determined at that point?
--
Kaare Rasmussen --Linux, spil,-- Tlf: 3816 2582
Kaki Data tshirts, merchandize Fax: 3816 2501
Howitzvej 75 Åben 12.00-18.00 Email: kar@kakidata.dk
2000 Frederiksberg Lørdag 12.00-16.00 Web: www.suse.dk
Josh Berkus wrote:
Tom, Justin,
<snip>
What if we supplied several sample .conf files, and let the user choose which
to copy into the database directory? We could have a "high read
performance" profile, and a "transaction database" profile, and a
"workstation" profile, and a "low impact" profile. We could even supply a
Perl script that would adjust SHMMAX and SHMMALL on platforms where this can
be done from the command line.
This might have value as the next step in the process of:
a) Are we going to have better defaults?
or
b) Let's stick with the current approach.
If we decide to go with better (changed) defaults, we may also be able
to figure out a way of having profiles that could optionally be chosen from.
As a longer term thought, it would be nice if the profiles weren't just
hard-coded example files, but more of:
pg_autotune --setprofile=xxx
Or similar utility, and it did all the work. Named profiles being one
capability, and other tuning measurements (i.e. cpu costings, disk
performance profiles, etc) being the others.
Regards and best wishes,
Justin Clift
--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi