Benchmark-Comparison PostGreSQL vs. SQL Server
Hi, this is my first question here, and also, it's somewhat delicate. So please be patient.
My question is, CAN PostGreSQL perform in the SQL Server area when it comes to speed?
In other words, are there explanations for the results I found (see below)?
Thanks,
Robert
-----
Background:
1. I read people were using PostGreSQL with TeraBytes of data sometimes, or thousands of users. These are things that could easily break SQL Server. - So I thought PostGreSQL might be similar fast to SQL Server.
2. I did some tests:
Windows XP SP2
Several GIGs free harddisk, ~400 MB free RAM
Java 1.5 / JDBC
PostGreSQL 8.0 beta (through Windows Installer), default configuration, default driver
SQL Server 2000 SP3a, default configuration, JDTS driver
Tablespaces of both databases on the same partition
Write-Test: Creating tables (slightly modified TCP-W benchmark)
Read-Test: Simple SELECT statements on all tables, returning the first 1000 rows (cursor variants: read-only and non-locking, resp. updatable and locking)
Results:
Writing: SQL Server 25 times faster.
Reading: SQL Server 100 times faster.
Robert Soeding wrote:
Hi, this is my first question here, and also, it's somewhat delicate.
So please be patient.My question is, CAN PostGreSQL perform in the SQL Server area when it
comes to speed? In other words, are there explanations for the
results I found (see below)?
Faster in some cases, slower in others in my experience. Oh, and
publishing performance comparisons with another database might be in
breach of your ms-sql server licencing.
Thanks, Robert
----- Background: 1. I read people were using PostGreSQL with
TeraBytes of data sometimes, or thousands of users. These are things
that could easily break SQL Server. - So I thought PostGreSQL might
be similar fast to SQL Server.
Some people have very large installations. This obviously isn't on
Windows, and not necessarily on x86 hardware.
2. I did some tests: Windows XP SP2 Several GIGs free harddisk, ~400
MB free RAM Java 1.5 / JDBC PostGreSQL 8.0 beta (through Windows
Installer), default configuration, default driver SQL Server 2000
SP3a, default configuration, JDTS driver Tablespaces of both
databases on the same partition Write-Test: Creating tables (slightly
modified TCP-W benchmark) Read-Test: Simple SELECT statements on all
tables, returning the first 1000 rows (cursor variants: read-only and
non-locking, resp. updatable and locking)Results: Writing: SQL Server 25 times faster. Reading: SQL Server 100
times faster.
The figures sound wrong. The Windows port isn't likely to be as fast as
the *nix versions (certainly not yet) but those figures don't match for
my experience with PG on Linux.
Unfortunately, although you provide a lot of information, almost none of
it tells us what the problem is. So -
1. What configuration changes have you made?
2. How many concurrent connections was this?
3. Were you selecting 1000 rows (LIMIT 1000), selecting all the rows
(and only fetching 1000) or actually defining an SQL cursor.
4. What was the load on the machine - CPU or DISK peaking?
5. What was the RAM usage like?
--
Richard Huxton
Archonet Ltd
1. What configuration changes have you made?
None, both installations are default configured.
2. How many concurrent connections was this?
One.
3. Were you selecting 1000 rows (LIMIT 1000), selecting all the rows
(and only fetching 1000) or actually defining an SQL cursor.
I used "LIMIT 1000", resp. "TOP 1000" statements.
4. What was the load on the machine - CPU or DISK peaking?
5. What was the RAM usage like?
CPU (< 10%) and RAM (<10%) usage were very low.
- I guess (as you mentioned below) it's the NT file system.
When running PostgreSQL queries I can hear the harddisk buzzing, but not with SQL Server queries.
On the other hand, if an application has to "fight" against the file system, I would suppose it to increase RAM and CPU usage significantly.
-----Original Message-----
From: Richard Huxton [mailto:dev@archonet.com]
Sent: Thursday, November 25, 2004 12:34 PM
To: Robert Soeding
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Benchmark-Comparison PostGreSQL vs. SQL Server
Robert Soeding wrote:
Hi, this is my first question here, and also, it's somewhat delicate.
So please be patient.My question is, CAN PostGreSQL perform in the SQL Server area when it
comes to speed? In other words, are there explanations for the results
I found (see below)?
Faster in some cases, slower in others in my experience. Oh, and publishing performance comparisons with another database might be in breach of your ms-sql server licencing.
Thanks, Robert
----- Background: 1. I read people were using PostGreSQL with
TeraBytes of data sometimes, or thousands of users. These are things
that could easily break SQL Server. - So I thought PostGreSQL might be
similar fast to SQL Server.
Some people have very large installations. This obviously isn't on Windows, and not necessarily on x86 hardware.
2. I did some tests: Windows XP SP2 Several GIGs free harddisk, ~400
MB free RAM Java 1.5 / JDBC PostGreSQL 8.0 beta (through Windows
Installer), default configuration, default driver SQL Server 2000
SP3a, default configuration, JDTS driver Tablespaces of both databases
on the same partition Write-Test: Creating tables (slightly modified
TCP-W benchmark) Read-Test: Simple SELECT statements on all tables,
returning the first 1000 rows (cursor variants: read-only and
non-locking, resp. updatable and locking)Results: Writing: SQL Server 25 times faster. Reading: SQL Server 100
times faster.
The figures sound wrong. The Windows port isn't likely to be as fast as the *nix versions (certainly not yet) but those figures don't match for my experience with PG on Linux.
Unfortunately, although you provide a lot of information, almost none of it tells us what the problem is. So - 1. What configuration changes have you made?
2. How many concurrent connections was this?
3. Were you selecting 1000 rows (LIMIT 1000), selecting all the rows (and only fetching 1000) or actually defining an SQL cursor.
4. What was the load on the machine - CPU or DISK peaking?
5. What was the RAM usage like?
--
Richard Huxton
Archonet Ltd
Import Notes
Resolved by subject fallback
Robert Soeding wrote:
1. What configuration changes have you made?
None, both installations are default configured.
You'll want to do at least some tuning on PG. Try the URL below for a
quick introduction - just the basic stuff is a good start.
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
2. How many concurrent connections was this?
One.
OK. That keeps things nice and simple.
3. Were you selecting 1000 rows (LIMIT 1000), selecting all the
rows (and only fetching 1000) or actually defining an SQL cursor.I used "LIMIT 1000", resp. "TOP 1000" statements.
OK. The reason I asked was that if you ask for 1000000 rows, then PG
will find them all and return them all in a bunch. Most other servers
return the first row once it's available.
4. What was the load on the machine - CPU or DISK peaking? 5. What
was the RAM usage like?CPU (< 10%) and RAM (<10%) usage were very low.
- I guess (as you mentioned below) it's the NT file system. When
running PostgreSQL queries I can hear the harddisk buzzing, but not
with SQL Server queries.
If you're seeing hard-disk activity that means the data isn't cached.
On the other hand, if an application has to "fight" against the file
system, I would suppose it to increase RAM and CPU usage
significantly.
It shouldn't be fighting the file system, but it does use it, and rely
on it for caching (rather than bypassing your filesystem cache). For
MS-SQL server I'm guessing you're allocating a lot of memory to SQL
server and not much to the file-cache. For PG you'll want it the other
way around.
--
Richard Huxton
Archonet Ltd
On Thursday 25 Nov 2004 6:22 pm, Richard Huxton wrote:
On the other hand, if an application has to "fight" against the file
system, I would suppose it to increase RAM and CPU usage
significantly.It shouldn't be fighting the file system, but it does use it, and rely
on it for caching (rather than bypassing your filesystem cache). For
MS-SQL server I'm guessing you're allocating a lot of memory to SQL
server and not much to the file-cache. For PG you'll want it the other
way around.
Do we have some sort of document about how caching on windows works? It is
very simple on linux and BSDs but for other OSs, I haven't seen many
suggestions.
I am sure OS specific hints would help OP a lot.
Shridhar
Shridhar Daithankar wrote:
Do we have some sort of document about how caching on windows works? It is
very simple on linux and BSDs but for other OSs, I haven't seen many
suggestions.I am sure OS specific hints would help OP a lot.
Microsoft's own resources are good for this sort of stuff (apologies if
URLs wrap):
http://www.microsoft.com/technet/prodtechnol/windows2000serv/maintain/optimize/wperfch7.mspx
http://www.microsoft.com/resources/documentation/Windows/2000/server/reskit/en-us/Default.asp?url=/resources/documentation/Windows/2000/server/reskit/en-us/core/fnec_evl_ACKS.asp
The simplest setting is the simple application/file-sharing switch
detailed in article #2 (which was introduced in NT4 iirc).
--
Richard Huxton
Archonet Ltd
hi,
ON.KG wrote:
New question:
i have tables like
table_20041124,
table_20041125,
etc...i'm trying to make function (for example):
=====================================
CREATE FUNCTION get_count(text, text)
RETURNS int2 AS '
DECLARE
cnt int4;
BEGIN
SELECT INTO cnt COUNT(*)
FROM table_$1 -- That doesn't work
WHERE key = $2;RETURN cnt;
END;'
LANGUAGE 'plpgsql';
=====================================call this function by:
=====================================
SELECT get_count("20041124", "something");
=====================================string in funstion - FROM table_$1
how could i get a final correct table name here?
You can use execute for dynamic sql.
CREATE FUNCTION get_count(text, text) RETURNS int2 AS '
declare rec record;
begin
for rec in execute ''select COUNT(*) as num from table_''||$1||''
where key=''''||$2'''' '';
loop
return rec.num;
end loop;
return;
end;
PS: anyway, you want returns int2 , but you declared int4 :)
C.
Import Notes
Reply to msg id not found: 15933163937.20041125194221@on.kg
Hi,
Am Donnerstag, den 25.11.2004, 19:42 +0300 schrieb ON.KG:
New question:
i have tables like
table_20041124,
table_20041125,
etc...i'm trying to make function (for example):
=====================================
CREATE FUNCTION get_count(text, text)
RETURNS int2 AS '
DECLARE
cnt int4;
BEGIN
SELECT INTO cnt COUNT(*)
FROM table_$1 -- That doesn't work
WHERE key = $2;RETURN cnt;
END;'
LANGUAGE 'plpgsql';
=====================================call this function by:
=====================================
SELECT get_count("20041124", "something");
=====================================string in funstion - FROM table_$1
how could i get a final correct table name here?
You need to build your query in your function
as a big string and pass it to EXECUTE
(See
http://www.postgresql.org/docs/current/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN
and
http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-RECORDS-ITERATING
)
however encoding data into table names does not sound so elegant
for me - why not make it a real column?
The advantage would be you'd have a real query and let postgres
compile a plan for it - which is not possible for execute.
If you are concerned on index size you can always use partial
indices based on your "table number".
Regards
Tino
Import Notes
Reply to msg id not found: 15933163937.20041125194221@on.kg
New question:
i have tables like
table_20041124,
table_20041125,
etc...
i'm trying to make function (for example):
=====================================
CREATE FUNCTION get_count(text, text)
RETURNS int2 AS '
DECLARE
cnt int4;
BEGIN
SELECT INTO cnt COUNT(*)
FROM table_$1 -- That doesn't work
WHERE key = $2;
RETURN cnt;
END;'
LANGUAGE 'plpgsql';
=====================================
call this function by:
=====================================
SELECT get_count("20041124", "something");
=====================================
string in funstion - FROM table_$1
how could i get a final correct table name here?
Thanx!
On Thu, 25 Nov 2004 11:36:33 +0100
"Robert Soeding" <robert.soeding@lisocon.de> wrote:
Hi, this is my first question here, and also, it's somewhat delicate.
So please be patient.
My question is, CAN PostGreSQL perform in the SQL Server area when it
comes to speed? In other words, are there explanations for the
results I found (see below)?
Thanks,
Robert-----
Background:
1. I read people were using PostGreSQL with TeraBytes of data
sometimes, or thousands of users. These are things that could easily
break SQL Server. - So I thought PostGreSQL might be similar fast to
SQL Server. 2. I did some tests: Windows XP SP2
Several GIGs free harddisk, ~400 MB free RAM
Java 1.5 / JDBC
PostGreSQL 8.0 beta (through Windows Installer), default
configuration, default driver SQL Server 2000 SP3a, default
configuration,
Here's your problem right here. You're never going to get a fair
comparison unless you tune the crap out of *both* contenders.
PostgreSQL's default configuration is extremely conservative to allow
it to be run on very limited resources.
JDTS driver Tablespaces of both databases on the same
partition Write-Test: Creating tables (slightly modified TCP-W
benchmark) Read-Test: Simple SELECT statements on all tables,
returning the first 1000 rows (cursor variants: read-only and
non-locking, resp. updatable and locking) Results:
Writing: SQL Server 25 times faster.
Reading: SQL Server 100 times faster.
--
Russ
Background:
1. I read people were using PostGreSQL with TeraBytes of data
sometimes, or thousands of users. These are things that could easily
break SQL Server. - So I thought PostGreSQL might be similar fast to
SQL Server. 2. I did some tests: Windows XP SP2
Several GIGs free harddisk, ~400 MB free RAM
Java 1.5 / JDBC
PostGreSQL 8.0 beta (through Windows Installer), default
configuration, default driver SQL Server 2000 SP3a, default
configuration,
SQL Server like mysql, is very 'noob friendly'...it doesn't require a
lot of configuration to be quick is forgiving of lousy access
strategies to the database (these are good things btw).
PostgreSQL 8.0b was still using old style win32 fsync which maps to
flushfilebuffers()...this is an an ultra conservative way of flushing
ot disk so this is not apples to apples comparison. That said, SQL
Server will be very hard to beat with write performance on windows
because it is very tuned to the win32 api and has very efficient disk
access. In the mean time for a fairer test you have to turn fsync off
in postgresql.conf and you definately want to be running pg 8.1 which
has tons of improvements for win32 and a more efficient
syncer...although in windows for production database I would highly
suggest targeting a caching raid controller.
I am right now in the development planning phase of a PostgreSQL
databse which has to scale well into the terabyte range. I can tell
you without hesitation that the database can handle it and deserves
the reputation of a workhorse that it has. Administrating pg is a bit
harder to get into but well help you understand what a database is
really all about and how to make your data access elegant and
efficient.
Merlin