SQL Server performing much better?!?!

Started by Christian Cabaneroabout 24 years ago12 messagesgeneral
Jump to latest
#1Christian Cabanero
chumpboy@yahoo.com

Hi,

I'm having a little bit of trouble figuring out a query and why it takes so
long to execute. As a preface, I have the same exact database (or at least
very closely duplicated) set up in micro$oft SQL Server, with the same
schema, data and the same indexes, etc. When I run this query on a dinky
Windows2000 machine (my laptop) it runs relatively fast and comes back in 7
seconds. I have postgreSQL set up on a sun solaris box with 1 GB of memory
and it takes over 25 seconds!

One caveat is that it's a shared managed server run by verio (VPS). But I
still figure it's beafier than my puny laptop! Both databases contain the
same data and the same indexes. I've even tried bumping up the sort_mem up
from 512 to 16384 (16 MB?). The result set that comes back is 8604 rows.
But even if i put a limit of 5 it still takes a long time to run.

Also, I've observed that when I run this query in the PostgreSQL database it
pegs the CPU on the server at 100%, I don't see any iowait or anything,
memory usage doesn't jump up or anything and the swap in use stays the same
(all monitored from top). From the plan it looks like it's using indexes
where appropriate and when it does do a table scan the cost doesn't seem to
be that much. The big operations seem to be when merge joins are performed.
I've included both the query and the plan bellow.

Any help would be HUGELY appreciated as I'm in the embarrasing situation
where SQL Server is heavily out performing PostgreSQL running on a sun box.
Heck, it's even running faster on my laptop running cygwin configured in the
same way! PLEASE HELP!

Thanks!
Christian

====================================================
Query:
====================================================

SELECT
a.user_id, b.sample_id
FROM
user_company a,
sample_manufacturer b,
samples c
WHERE
a.company_id = b.manufacturer_id AND
b.sample_id = c.sample_id AND
c.sample_state = 1
UNION
SELECT
a.user_id, b.sample_id
FROM
user_company a,
samples b,
users c
WHERE
a.company_id = b.sample_manufacturer_id AND
b.sample_state = 1 AND
b.sample_author_id = c.user_id AND
NOT EXISTS
(
SELECT
p.territory_id
FROM
territories p,
territory_ranges q,
manufacturer_territories r
WHERE
r.manufacturer_id = b.sample_manufacturer_id AND
r.assignment_flag = 2 AND
r.territory_id = p.territory_id AND
p.territory_id = q.territory_id AND p.type IN (1, 2)
AND
c.zip BETWEEN q.start_value AND q.end_value
)
limit 5

====================================================
Plan:
====================================================

Limit (cost=26137.82..26138.07 rows=5 width=33) (actual
time=25895.91..25896.01 rows=5 loops=1)
-> Unique (cost=26137.82..26251.99 rows=2283 width=33) (actual
time=25895.90..25895.98 rows=6 loops=1)
-> Sort (cost=26137.82..26137.82 rows=22834 width=33) (actual
time=25895.88..25895.90 rows=13 loops=1)
-> Append (cost=98.67..24484.74 rows=22834 width=33) (actual
time=82.38..25102.67 rows=19265 loops=1)
-> Subquery Scan *SELECT* 1 (cost=98.67..342.06
rows=16937 width=20) (actual time=82.37..609.69 rows=17098 loops=1)
-> Merge Join (cost=98.67..342.06 rows=16937
width=20) (actual time=82.35..434.78 rows=17098 loops=1)
-> Index Scan using
ix_user_company_company_id on user_company a (cost=0.00..26.85 rows=1168
width=8) (actual time=0.41..11.83 rows=373 loops=1)
-> Sort (cost=98.67..98.67 rows=763
width=12) (actual time=79.78..108.86 rows=17098 loops=1)
-> Hash Join (cost=33.09..62.16
rows=763 width=12) (actual time=26.18..58.97 rows=769 loops=1)
-> Seq Scan on
sample_manufacturer b (cost=0.00..15.69 rows=769 width=8) (actual
time=0.12..13.54 rows=769 loops=1)
-> Hash (cost=31.59..31.59
rows=602 width=4) (actual time=23.93..23.93 rows=0 loops=1)
-> Seq Scan on samples c
(cost=0.00..31.59 rows=602 width=4) (actual time=0.17..18.48 rows=602
loops=1)
-> Subquery Scan *SELECT* 2 (cost=24038.44..24142.68
rows=5897 width=33) (actual time=24381.93..24456.97 rows=2167 loops=1)
-> Merge Join (cost=24038.44..24142.68 rows=5897
width=33) (actual time=24381.92..24433.35 rows=2167 loops=1)
-> Index Scan using
ix_user_company_company_id on user_company a (cost=0.00..26.85 rows=1168
width=8) (actual time=0.06..10.58 rows=373 loops=1)
-> Sort (cost=24038.44..24038.44 rows=301
width=25) (actual time=24380.15..24383.39 rows=2165 loops=1)
-> Merge Join (cost=0.00..24026.05
rows=301 width=25) (actual time=1346.68..24378.12 rows=97 loops=1)
-> Index Scan using
ix_samples_author_id on samples b (cost=0.00..42.12 rows=602 width=12)
(actual time=0.46..32.05 rows=602 loops=1)
-> Index Scan using users_pkey
on users c (cost=0.00..60.58 rows=1144 width=13) (actual time=0.40..57.70
rows=1419 loops=1)
SubPlan
-> Nested Loop
(cost=0.00..53.30 rows=1 width=12) (actual time=40.27..40.27 rows=1
loops=602)
-> Nested Loop
(cost=0.00..48.43 rows=2 width=8) (actual time=40.18..40.18 rows=1
loops=602)
-> Index Scan
using ix_manufacturer_terr_combo3 on manufacturer_territories r
(cost=0.00..19.80 rows=5 width=4) (actual time=0.11..0.91 rows=27 loops=602)
-> Index Scan
using ix_territory_ranges_combo_1 on territory_ranges q (cost=0.00..5.71
rows=1 width=4) (actual time=1.46..1.46 rows=0 loops=15995)
-> Index Scan using
territories_pkey on territories p (cost=0.00..3.02 rows=1 width=4) (actual
time=0.08..0.08 rows=1 loops=505)
Total runtime: 25915.75 msec

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

#2Christian Cabanero
chumpboy@yahoo.com
In reply to: Christian Cabanero (#1)
Re: SQL Server performing much better?!?!

Good point, here's the info I should've also provided:

- 7.2, installed from the source.
- Solaris 2.6
- Only compilation option changed was --prefix to install to a different
directory
- all config settings were left at default, but i also experimented with
upping sort_mem but that's it.
- to start all i do is use this command:
postmaster -i -S -D /usr/local/pgsql/data >& /usr/var/log/pgsql.log

That's pretty much it, obviously nothing to fancy.

Again, any help would be incredibly appreciated. Right now I've got the
small company I work for that used to be completely a Micro$oft shop totally
sold on unix and more importantly open-source, but this performance problem
sure isn't helping my little "campaign." ;-)

Thanks,
Christian

-----Original Message-----
From: fred@panda.baobab.home [mailto:fred@panda.baobab.home]On Behalf Of
Frederic Saincy
Sent: Tuesday, March 19, 2002 8:37 PM
To: Christian Cabanero
Subject: Re: [GENERAL] SQL Server performing much better?!?!

Hi,

Maybe you should post on the list

- your postgresql version (there are some bugs in older optimizer for
some sql syntax )
- your solaris version
- your compilation options
- your config files
- your starting options

...

I can't help right now, maybe someone could

Bye.

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

#3Neil Conway
neilc@samurai.com
In reply to: Christian Cabanero (#2)
Re: SQL Server performing much better?!?!

On Wed, 2002-03-20 at 01:50, Christian Cabanero wrote:

Good point, here's the info I should've also provided:
[snip]

I don't think many people on the list are MS SQL experts -- IMO, the
best we can do is concentrate on improving PostgreSQL performance.

To that end, can you tell us:

(1) the query that is slow in Postgres but fast in MS SQL

(2) the definition of any tables, indexes, views, etc used by the query
in #1.

(3) the output of EXPLAIN for the query in #1.

Also, have you run VACUUM ANALYZE?

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC

#4Masaru Sugawara
rk73@sea.plala.or.jp
In reply to: Christian Cabanero (#1)
Re: SQL Server performing much better?!?!

On Tue, 19 Mar 2002 18:06:44 -0800
"Christian Cabanero" <chumpboy@yahoo.com> wrote:

(all monitored from top). From the plan it looks like it's using indexes
where appropriate and when it does do a table scan the cost doesn't seem to
be that much. The big operations seem to be when merge joins are performed.
I've included both the query and the plan bellow.

In case of using EXISTS clauses, it seems to be sometimes the case
that the results, for which those who execute queries are all agog, are not
returned quickly.

Regards,
Masaru Sugawara

#5Fernando Lozano
fsl@centroin.com.br
In reply to: Masaru Sugawara (#4)
Re: SQL Server performing much better?!?!

Hi Christian,

I'm having a little bit of trouble figuring out a query and why it takes so
long to execute. As a preface, I have the same exact database (or at least
very closely duplicated) set up in micro$oft SQL Server, with the same
schema, data and the same indexes, etc. When I run this query on a dinky
Windows2000 machine (my laptop) it runs relatively fast and comes back in 7
seconds. I have postgreSQL set up on a sun solaris box with 1 GB of memory
and it takes over 25 seconds!

Heck, it's even running faster on my laptop running cygwin configured in the
same way! PLEASE HELP!

If your laptop runs faster using PostgreSQL under cygwin there should be a problem on your solaris server configuration and not postgresql. PosgreSQL on cygwin is not considered production ready so I guess it is not surprise being slower than MS SQL Server, but the solaris machine should be much faster.

Hint: Do you run vacuum frequentily?

[s, Fernando Lozano

#6Ian Harding
ianh@tpchd.org
In reply to: Fernando Lozano (#5)
Re: SQL Server performing much better?!?!

MS SQL Server is a very good database. It has hundreds of well paid programmers working on it. It makes decisions for you about things that PostgreSQL will not.

I migrated from MSSQL Server to PostgreSQL and found that I had done a lot of things 'wrong' or at least inefficiently and MSSQL Server let me get away with it, PostgreSQL did not.

PostgreSQL will not probably ever outperform MSSQL in all areas. PostgreSQL will always outperform MSSQL Server in price, stability, portability, and support.

Ian A. Harding
Programmer/Analyst II
Tacoma-Pierce County Health Department
(253) 798-3549
mailto: iharding@tpchd.org

"Christian Cabanero" <chumpboy@yahoo.com> 03/19/02 06:06PM >>>

Hi,

I'm having a little bit of trouble figuring out a query and why it takes so
long to execute. As a preface, I have the same exact database (or at least
very closely duplicated) set up in micro$oft SQL Server, with the same
schema, data and the same indexes, etc. When I run this query on a dinky
Windows2000 machine (my laptop) it runs relatively fast and comes back in 7
seconds. I have postgreSQL set up on a sun solaris box with 1 GB of memory
and it takes over 25 seconds!

One caveat is that it's a shared managed server run by verio (VPS). But I
still figure it's beafier than my puny laptop! Both databases contain the
same data and the same indexes. I've even tried bumping up the sort_mem up
from 512 to 16384 (16 MB?). The result set that comes back is 8604 rows.
But even if i put a limit of 5 it still takes a long time to run.

Also, I've observed that when I run this query in the PostgreSQL database it
pegs the CPU on the server at 100%, I don't see any iowait or anything,
memory usage doesn't jump up or anything and the swap in use stays the same
(all monitored from top). From the plan it looks like it's using indexes
where appropriate and when it does do a table scan the cost doesn't seem to
be that much. The big operations seem to be when merge joins are performed.
I've included both the query and the plan bellow.

Any help would be HUGELY appreciated as I'm in the embarrasing situation
where SQL Server is heavily out performing PostgreSQL running on a sun box.
Heck, it's even running faster on my laptop running cygwin configured in the
same way! PLEASE HELP!

Thanks!
Christian

====================================================
Query:
====================================================

SELECT
a.user_id, b.sample_id
FROM
user_company a,
sample_manufacturer b,
samples c
WHERE
a.company_id = b.manufacturer_id AND
b.sample_id = c.sample_id AND
c.sample_state = 1
UNION
SELECT
a.user_id, b.sample_id
FROM
user_company a,
samples b,
users c
WHERE
a.company_id = b.sample_manufacturer_id AND
b.sample_state = 1 AND
b.sample_author_id = c.user_id AND
NOT EXISTS
(
SELECT
p.territory_id
FROM
territories p,
territory_ranges q,
manufacturer_territories r
WHERE
r.manufacturer_id = b.sample_manufacturer_id AND
r.assignment_flag = 2 AND
r.territory_id = p.territory_id AND
p.territory_id = q.territory_id AND p.type IN (1, 2)
AND
c.zip BETWEEN q.start_value AND q.end_value
)
limit 5

====================================================
Plan:
====================================================

Limit (cost=26137.82..26138.07 rows=5 width=33) (actual
time=25895.91..25896.01 rows=5 loops=1)
-> Unique (cost=26137.82..26251.99 rows=2283 width=33) (actual
time=25895.90..25895.98 rows=6 loops=1)
-> Sort (cost=26137.82..26137.82 rows=22834 width=33) (actual
time=25895.88..25895.90 rows=13 loops=1)
-> Append (cost=98.67..24484.74 rows=22834 width=33) (actual
time=82.38..25102.67 rows=19265 loops=1)
-> Subquery Scan *SELECT* 1 (cost=98.67..342.06
rows=16937 width=20) (actual time=82.37..609.69 rows=17098 loops=1)
-> Merge Join (cost=98.67..342.06 rows=16937
width=20) (actual time=82.35..434.78 rows=17098 loops=1)
-> Index Scan using
ix_user_company_company_id on user_company a (cost=0.00..26.85 rows=1168
width=8) (actual time=0.41..11.83 rows=373 loops=1)
-> Sort (cost=98.67..98.67 rows=763
width=12) (actual time=79.78..108.86 rows=17098 loops=1)
-> Hash Join (cost=33.09..62.16
rows=763 width=12) (actual time=26.18..58.97 rows=769 loops=1)
-> Seq Scan on
sample_manufacturer b (cost=0.00..15.69 rows=769 width=8) (actual
time=0.12..13.54 rows=769 loops=1)
-> Hash (cost=31.59..31.59
rows=602 width=4) (actual time=23.93..23.93 rows=0 loops=1)
-> Seq Scan on samples c
(cost=0.00..31.59 rows=602 width=4) (actual time=0.17..18.48 rows=602
loops=1)
-> Subquery Scan *SELECT* 2 (cost=24038.44..24142.68
rows=5897 width=33) (actual time=24381.93..24456.97 rows=2167 loops=1)
-> Merge Join (cost=24038.44..24142.68 rows=5897
width=33) (actual time=24381.92..24433.35 rows=2167 loops=1)
-> Index Scan using
ix_user_company_company_id on user_company a (cost=0.00..26.85 rows=1168
width=8) (actual time=0.06..10.58 rows=373 loops=1)
-> Sort (cost=24038.44..24038.44 rows=301
width=25) (actual time=24380.15..24383.39 rows=2165 loops=1)
-> Merge Join (cost=0.00..24026.05
rows=301 width=25) (actual time=1346.68..24378.12 rows=97 loops=1)
-> Index Scan using
ix_samples_author_id on samples b (cost=0.00..42.12 rows=602 width=12)
(actual time=0.46..32.05 rows=602 loops=1)
-> Index Scan using users_pkey
on users c (cost=0.00..60.58 rows=1144 width=13) (actual time=0.40..57.70
rows=1419 loops=1)
SubPlan
-> Nested Loop
(cost=0.00..53.30 rows=1 width=12) (actual time=40.27..40.27 rows=1
loops=602)
-> Nested Loop
(cost=0.00..48.43 rows=2 width=8) (actual time=40.18..40.18 rows=1
loops=602)
-> Index Scan
using ix_manufacturer_terr_combo3 on manufacturer_territories r
(cost=0.00..19.80 rows=5 width=4) (actual time=0.11..0.91 rows=27 loops=602)
-> Index Scan
using ix_territory_ranges_combo_1 on territory_ranges q (cost=0.00..5.71
rows=1 width=4) (actual time=1.46..1.46 rows=0 loops=15995)
-> Index Scan using
territories_pkey on territories p (cost=0.00..3.02 rows=1 width=4) (actual
time=0.08..0.08 rows=1 loops=505)
Total runtime: 25915.75 msec

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

#7Masaru Sugawara
rk73@sea.plala.or.jp
In reply to: Christian Cabanero (#1)
Re: SQL Server performing much better?!?!

On Tue, 19 Mar 2002 18:06:44 -0800
"Christian Cabanero" <chumpboy@yahoo.com> wrote:

SELECT
a.user_id, b.sample_id
FROM
user_company a,
samples b,
users c
WHERE
a.company_id = b.sample_manufacturer_id AND
b.sample_state = 1 AND
b.sample_author_id = c.user_id AND
NOT EXISTS
(
SELECT
p.territory_id
FROM
territories p,
territory_ranges q,
manufacturer_territories r
WHERE
r.manufacturer_id = b.sample_manufacturer_id AND
r.assignment_flag = 2 AND
r.territory_id = p.territory_id AND
p.territory_id = q.territory_id AND p.type IN (1, 2)
AND
c.zip BETWEEN q.start_value AND q.end_value
)

If query 1, 2 don't return rows so much and query 2 isn't slow,
using EXCEPT ALL might work faster than using NOT EXISTS.
Thus, could you show us the results of EXPLAIN ANALYZE
they'll return respectively ?

-- query 1.
SELECT
COUNT(*) -- a.user_id, b.sample_id
FROM
user_company a,
samples b,
users c
WHERE
a.company_id = b.sample_manufacturer_id AND
b.sample_state = 1 AND
b.sample_author_id = c.user_id
;

-- query 2.
SELECT
COUNT(*) -- a.user_id, b.sample_id
FROM
user_company a,
samples b,
users c
territories p,
territory_ranges q,
manufacturer_territories r
WHERE
a.company_id = b.sample_manufacturer_id AND
b.sample_state = 1 AND
b.sample_author_id = c.user_id AND
r.manufacturer_id = b.sample_manufacturer_id AND
r.assignment_flag = 2 AND
r.territory_id = p.territory_id AND
p.territory_id = q.territory_id AND
p.type IN (1, 2) AND
c.zip BETWEEN q.start_value AND q.end_value
;

Regards,
Masaru Sugawara

#8Mark Kirkwood
mark.kirkwood@catalyst.net.nz
In reply to: Masaru Sugawara (#7)
Re: SQL Server performing much better?!?!

Hi Christian,

sold on unix and more importantly open-source, but this performance
problem
sure isn't helping my little "campaign." ;-)

No doubt !

um.. looks like Solaris is the issue here - If you can persuade your
server management company to play ball, how about :

1/ Change Platform

Use an Intel box with Linux or Freebsd, you can be assured of _vastly_
better performance with Postgresql in either case. (or...)

2/ Change OS

Run Linux on the Sparc.

Even if your management company will not help you out, the option 1/ is
pretty inexpensive... and you get a _more_ fully open solution (OS as
well)

best of luck

Mark

#9Thomas Lockhart
lockhart@fourpalms.org
In reply to: Mark Kirkwood (#8)
Re: SQL Server performing much better?!?!

(I haven't followed the thread closely, but would like to comment anyway
;)

sold on unix and more importantly open-source, but this performance
problem sure isn't helping my little "campaign." ;-)

um.. looks like Solaris is the issue here - If you can persuade your
server management company to play ball, how about :
1/ Change Platform
2/ Change OS

That is a pretty drastic set of alternatives. Solaris can do just fine
as a PostgreSQL platform. Certainly 2GHz Intel hotboxes can spin more
cpu cycles than a 200MHz Solaris box (I don't know the specifics of your
hosting environment), and if you are just flat out asking for more than
the box can do then swapping hosts is an alternative. But unless you
have exhausted the possibilities for tuning on your existing box you
likely still have some things you can do to make the most of what you
have.

Good luck!

- Thomas

#10Martín Marqués
martin@bugs.unl.edu.ar
In reply to: Thomas Lockhart (#9)
Re: SQL Server performing much better?!?!

On S�b 23 Mar 2002 10:32, Thomas Lockhart wrote:

(I haven't followed the thread closely, but would like to comment anyway
;)

Niether did I. :-)

sold on unix and more importantly open-source, but this performance
problem sure isn't helping my little "campaign." ;-)

um.. looks like Solaris is the issue here - If you can persuade your
server management company to play ball, how about :
1/ Change Platform
2/ Change OS

That is a pretty drastic set of alternatives. Solaris can do just fine
as a PostgreSQL platform. Certainly 2GHz Intel hotboxes can spin more

Yes, PostgreSQL works pretty good on Solaris/SPARC, which is my main DB
platform.
But at this moment Solaris is a almost dead platform (last release, 9, was
the last, and there will be no more), and the hardware is too expensive. If
you ask your SUN dealer next year what they can offer you, they will probably
tell you that they have these nice Intel servers with SUN/Linux inside. :-P

So, and to finish, I don't see them as so drastic, but more like realistic.
:-)

Saludos... :-)

--
Porqu� usar una base de datos relacional cualquiera,
si pod�s usar PostgreSQL?
-----------------------------------------------------------------
Mart�n Marqu�s | mmarques@unl.edu.ar
Programador, Administrador, DBA | Centro de Telematica
Universidad Nacional
del Litoral
-----------------------------------------------------------------

#11Mark Kirkwood
mark.kirkwood@catalyst.net.nz
In reply to: Thomas Lockhart (#9)
Re: SQL Server performing much better?!?!

On Sun, 2002-03-24 at 01:32, Thomas Lockhart wrote:

(I haven't followed the thread closely, but would like to comment anyway
;)

No problem... thats probably what I did ... :-)

That is a pretty drastic set of alternatives. Solaris can do just fine
as a PostgreSQL platform. Certainly 2GHz Intel hotboxes can spin more
cpu cycles than a 200MHz Solaris box (I don't know the specifics of your
hosting environment), and if you are just flat out asking for more than
the box can do then swapping hosts is an alternative. But unless you
have exhausted the possibilities for tuning on your existing box you
likely still have some things you can do to make the most of what you
have.

Good luck!

- Thomas

Agreed...I currently work with Solaris and its generally quite good...
but there seems to be a definite performance problem with Postgresql on
Solaris (see a previous thread with title "Solaris Performance").

I am planning to profile a test case query that I have, so the problem
can be identified...(soon hopefully)

Cheers

Mark

#12Holger Marzen
holger@marzen.de
In reply to: Mark Kirkwood (#11)
Re: SQL Server performing much better?!?!

On 25 Mar 2002, Mark kirkwood wrote:

Agreed...I currently work with Solaris and its generally quite good...
but there seems to be a definite performance problem with Postgresql on
Solaris (see a previous thread with title "Solaris Performance").

Solaris is slow when writing to ufs filesystem in default configuration.
I guess that postgresql writes out temporary data. The writes can
probably be speed up when writing to tempfs or by disabling the journal.

Writing data can probably speed up with the mount option "nologging".
Reading data can probably speed up with the mount option "noatime".

Can someone try this?

--
PGP/GPG Key-ID:
http://blackhole.pca.dfn.de:11371/pks/lookup?op=get&amp;search=0xB5A1AFE1