Oracle vs PostgreSQL in real life

Started by Jean-Paul ARGUDO · almost 24 years ago · 37 messages
#1Jean-Paul ARGUDO
jean-paul.argudo@idealx.com

Okay...

I'm very skeptical today.

I'm doing a study on replacing Oracle 8.0 on NT4 with a RedHat 7.2/PG 7.2
setup.

The customer gave me things to migrate, like Oracle Pro*C scripts that I
migrated successfully with ECPG. There was other code with CONNECT BY
statements; thanks to the OpenACS guys, I migrated those CONNECT BY
statements too.

Finally, I carefully EXPLAINed all the queries and tuned what I could; I hope
everything has been done properly.

The "test" is a big batch that computes things in the database. Here are the
timings of both Oracle and PG (7.2):

Oracle on NT 4: 45 minutes to run, 1200 tps (yes, one thousand two hundred
tps)

Linux Red Hat 7.2 with PostgreSQL 7.2: hours to run (extrapolated, 45 hours),
80 tps (eighty tps).

Tests were made on the same machine, a Pentium 3 600 MHz with 256 MB RAM and
RAID 5.

We formatted the server and installed the Linux stack afterwards.

So what do you think of SUCH a difference between Oracle/NT and Linux/PG?

I feel very bad in front of the customer, reporting a 1:15 ratio between
Oracle/NT and Linux/PostgreSQL, since I'm a real PG fan and a DBA in a French
Open Source company...

Thanks a lot for your support.

:-(((

--
Jean-Paul ARGUDO

#2mlw
markw@mohawksoft.com
In reply to: Jean-Paul ARGUDO (#1)
Re: Oracle vs PostgreSQL in real life

Jean-Paul ARGUDO wrote:

[... snip ...]

The "test" is a big batch that computes things in the database. Here are the
timings of both Oracle and PG (7.2):

Oracle on NT 4: 45 minutes to run, 1200 tps (yes, one thousand two hundred
tps)

Linux Red Hat 7.2 with PostgreSQL 7.2: hours to run (extrapolated, 45 hours),
80 tps (eighty tps).

Wow! That is huge. Ok, let me ask some questions:

Did you do a "vacuum analyze" on the tables?
If you did not analyze the tables, it may be using table scans instead of
indexes. That would make a huge difference. Also, it may choose poorly between
hash joins and merge joins.

Did you tune "buffers" in postgresql.conf?
If you have too few buffers, you will get no caching effect on the queries.
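
For example, a quick check along these lines (a generic sketch; the table and
column names are made up, substitute your own):

--------------------------------
-- refresh the planner's statistics, then check whether an index is used
VACUUM ANALYZE t01;
EXPLAIN UPDATE t01 SET val = 'bum' WHERE i = 42;
-- "Index Scan using ..." is what you want to see; a "Seq Scan" on a
-- large table here would explain a huge slowdown
--------------------------------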

#3Justin Clift
justin@postgresql.org
In reply to: Jean-Paul ARGUDO (#1)
Re: Oracle vs PostgreSQL in real life

Hi Jean-Paul,

I know you've probably done this, but I'll ask just in case.

Did you tune the memory of the PostgreSQL server configuration?

i.e. the postgresql.conf file?

If so, what are the values you changed from default?

:-)

Regards and best wishes,

Justin Clift


--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi

#4Doug McNaught
doug@wireboard.com
In reply to: Jean-Paul ARGUDO (#1)
Re: Oracle vs PostgreSQL in real life

Jean-Paul ARGUDO <jean-paul.argudo@idealx.com> writes:

Okay...

I'm very skeptical today.

Did you adjust the shared buffers and other tuning settings for
Postgres?

-Doug
--
Let us cross over the river, and rest under the shade of the trees.
--T. J. Jackson, 1863

#5Alex Avriette
a_avriette@acs.org
In reply to: Doug McNaught (#4)
Re: Oracle vs PostgreSQL in real life

The "test" is a big batch that computes things in the database. Here are the
timings of both Oracle and PG (7.2):

Oracle on NT 4: 45 minutes to run, 1200 tps (yes, one thousand two hundred
tps)

Linux Red Hat 7.2 with PostgreSQL 7.2: hours to run (extrapolated, 45 hours),
80 tps (eighty tps).

---

Jean-Paul, I think the problem here is not having postgres configured
properly. I am in a similar situation here where we are migrating data from
postgres into oracle. Postgres has been as much as 40x faster than Oracle in
many situations here. Note also that our oracle instance is on a quad
processor Sun 280R, and our postgres 'instance' is on a p3/1ghz. Iterating
over 440,000 xml 'text' fields in oracle takes about 4 days. In postgres it
takes 8 hours. Iterating over a 3.5M row table is just inconceivable for
oracle, and I do it in postgres all the time.

My suspicion is that our oracle instance is not tuned very well, and the
code that is manipulating the database (in this case perl) is much smarter
for postgres (we have separate developers to do perl-oracle interfaces).

Postgres is a fantastic, fast database. But you really must configure it,
and code intelligently to use it.

-alex

#6Marc Lavergne
mlavergne-pub@richlava.com
In reply to: Jean-Paul ARGUDO (#1)
Re: Oracle vs PostgreSQL in real life

There is probably an explanation, but "computes things" doesn't give us much
information to go on. Do you think you could boil this down to a
test case? Also, expand on what the batch file does, the size of the
database, and which interface you are using. I'm sure people would like
to help, but there simply isn't enough information to derive any
conclusions here.

Cheers,

Marc


#7Jean-Paul ARGUDO
jean-paul.argudo@idealx.com
In reply to: Alex Avriette (#5)
Re: Oracle vs PostgreSQL in real life

Okay,

To answer many replies (thanks!), I'll try to put more details:

* DELL server
P3 600 MHz
256 MB RAM
RAID 5

* kernel

Linux int2412 2.4.9-21SGI_XFS_1.0.2 #1 Thu Feb 7 16:50:37 CET 2002 i686 unknown

with the aacraid-cox driver, because the standard aacraid had poor performance
with this server (at first we thought it was a RAID 5 problem)

* postgresql.conf: here are _all_ the uncommented parameters:

tcpip_socket = true
max_connections = 16
port = 5432

shared_buffers = 19000 # 2*max_connections, min 16
max_fsm_relations = 200 # min 10, fsm is free space map
max_fsm_pages = 12000 # min 1000, fsm is free space map
max_locks_per_transaction = 256 # min 10
wal_buffers = 24 # min 4

sort_mem = 8192 # min 32
vacuum_mem = 8192 # min 1024

wal_debug = 0 # range 0-16

fsync = true

silent_mode = true
log_connections = false
log_timestamp = false
log_pid = false

debug_level = 0 # range 0-16

debug_print_query = false
debug_print_parse = false
debug_print_rewritten = false
debug_print_plan = false
debug_pretty_print = false
show_parser_stats = false
show_planner_stats = false
show_executor_stats = false
show_query_stats = false

transform_null_equals = true

* /proc parameters:

/proc/sys/kernel/shmmax => 184217728 (more than 130M)
/proc/sys/kernel/shmall => 184217728

* We made a bunch of vmstat logs too, and plotted graphs from them to help us
understand, all in a PostScript file made with GNU graph... This is very
interesting, but as I don't know if attachments are authorized here, please
tell me if I can post it too. It shows swap in/out, memory, I/O, etc.

Thanks for your support!

--
Jean-Paul ARGUDO

#8Jean-Paul ARGUDO
jean-paul.argudo@idealx.com
In reply to: Marc Lavergne (#6)
Re: Oracle vs PostgreSQL in real life

The batch was originally written in Pro*C for Oracle under Windows NT.

We translated it thanks to the fabulous ecpg client interface.

I posted the details that were asked for earlier; I hope this will help with
some deeper analysis.

Thanks for your remarks.

--
Jean-Paul ARGUDO

#9Hannu Krosing
hannu@tm.ee
In reply to: Jean-Paul ARGUDO (#1)
Re: Oracle vs PostgreSQL in real life

On Wed, 2002-02-27 at 16:46, Jean-Paul ARGUDO wrote:

Okay...

I'm very skeptical today.

I'm doing a study on replacing Oracle 8.0 on NT4 with a RedHat 7.2/PG 7.2
setup.

The customer gave me things to migrate, like Oracle Pro*C scripts that I
migrated successfully with ECPG. There was other code with CONNECT BY
statements; thanks to the OpenACS guys, I migrated those CONNECT BY
statements too.

Finally, I carefully EXPLAINed all the queries and tuned what I could; I hope
everything has been done properly.

What was the postgresql.conf set to ?

The "test" is a big batch that computes things in the database.

Could you run this batch in smaller chunks to see if PG is slow from the
start, or does it slow down as it goes?

Here are the timings of both Oracle and PG (7.2) :

Oracle on NT 4: 45 minutes to run, 1200 tps (yes, one thousand two hundred
tps)

Linux Red Hat 7.2 with PostgreSQL 7.2: hours to run (extrapolated, 45 hours),
80 tps (eighty tps).

What kind of tps are these?

I.e. what does each 't' do?

-------------
Hannu

#10Jean-Paul ARGUDO
jean-paul.argudo@idealx.com
In reply to: Hannu Krosing (#9)
Re: Oracle vs PostgreSQL in real life

What was the postgresql.conf set to ?

I put parameters in another mail, please watch for it.

The "test" is a big batch that computes things in the database.

Could you run this batch in smaller chunks to see if PG is slow from the
start, or does it slow down as it goes?

The batch starts really fast, but past 2 minutes it begins to slow down
dramatically, and it just keeps getting slower and slower.

Linux Red Hat 7.2 with PostgreSQL 7.2: hours to run (extrapolated, 45 hours),
80 tps (eighty tps).

What kind of tps are these?

Here's what we have in output:

This is the WINDOWS NT4 / Oracle 8.0 output when the batch has completely finished:

Time : 00:47:50

Transaction : 25696
Item : 344341
Transaction (in milliseconds) : 111
Item (in milliseconds) : 8

Errors : 0
Warnings : 0
PLU not found : 0
NOM not found : 0
Alloc NOM : 739
Free NOM : 739
Error 1555 : 0

Read : 2093582
Write : 1772364
Read/Write : 3865946

Free memory (RAM) : 117396 KB / 261548 KB

PLU SELECT : 344341
NOM SELECT : 1377364
T04 SELECT : 1840
T01 INSERT : 593
T01 UPDATE : 1376771
T02 INSERT : 28810
T02 UPDATE : 315531
T03 INSERT : 41199
T13 INSERT : 9460
RJT INSERT : 0
RJT SELECT : 0

--------------------
Beware: "Transaction" does not mean a database transaction here; a
"transaction" contains one or more "items", in the context of the
application/database.

As for real DML statements: 3,865,946 queries done in 47 min 50 secs. (The
queries are spread across many tables; see the detailed counts below the
"Free memory..." line; each table name is 3 letters long.)

That's 1,347 queries per second... ouch!

This is the output of the Linux Red Hat 7.2 / PostgreSQL 7.2 port of the
Pro*C program.

As you'll understand, it is not the COMPLETE batch; we had to stop it:

Time : 00:16:26

Transaction : 750
Item : 7391
Transaction (ms) : 1314
Item (ms) : 133

Errors : 1
Warnings : 0
PLU not found : 0
NOM not found : 0
Alloc NOM : 739
Free NOM : 0
Error 1555 : 0

Read : 45127.000
Write : 37849.000
Read/Write : 82976.000

PLU SELECT : 7391
NOM SELECT : 29564
T04 SELECT : 31
T01 INSERT : 378
T01 UPDATE : 29186
T02 INSERT : 3385
T02 UPDATE : 4006
T03 INSERT : 613
T13 INSERT : 281
RJT INSERT : 0
RJT SELECT : 0

--------------------

You see: we have 82,976 queries in 16 min 26 seconds; that's

84 queries per second

definitely no match for Oracle :-((

This is very bad for us, because if this customer kicks Oracle out for PG it
could be really fantastic; this customer has a lot of influence on the
business...

Thanks to all of you for helping me so much.
--
Jean-Paul ARGUDO

#11Hannu Krosing
hannu@krosing.net
In reply to: Jean-Paul ARGUDO (#10)
Re: Oracle vs PostgreSQL in real life

On Wed, 2002-02-27 at 23:21, Jean-Paul ARGUDO wrote:

What was the postgresql.conf set to ?

I put parameters in another mail, please watch for it.

The "test" is a big batch that computes stuffs in the database.

Could you run this batch in smaller chunks to see if PG is slow from the
start or does it slow down as it goes ?

The batch starts really fast, but past 2 minutes it begins to slow down dramatically

This usually means that it is a good time to pause the batch and do a
"vacuum analyze" (or just "analyze" in 7.2).

In 7.2 you can probably do the vacuum analyze in parallel, but it will
likely run faster when the other backends are stopped.

and it just keeps getting slower and slower

Linux Red Hat 7.2 with PostgreSQL 7.2: hours to run (extrapolated, 45 hours),
80 tps (eighty tps).

What kind of tps are these?

Here's what we have in output:

This is the WINDOWS NT4 / Oracle 8.0 output when the batch has completely finished:

Time : 00:47:50

Transaction : 25696
Item : 344341
Transaction (in milliseconds) : 111
Item (in milliseconds) : 8

Errors : 0
Warnings : 0
PLU not found : 0
NOM not found : 0
Alloc NOM : 739
Free NOM : 739
Error 1555 : 0

Read : 2093582
Write : 1772364
Read/Write : 3865946

Free memory (RAM) : 117396 Ko / 261548 Ko

Assuming these must be interpreted as TABLE COMMAND : COUNT

PLU SELECT : 344341
NOM SELECT : 1377364
T04 SELECT : 1840
T01 INSERT : 593
T01 UPDATE : 1376771

This means that for postgres, with no vacuum in between, you will in fact
have a 1.3M-tuple table to search for only 593 live rows.

Running a (parallel) vacuum or vacuum full, and possibly even a reindex,
will help a lot.

T02 INSERT : 28810
T02 UPDATE : 315531

Here we have only a 10:1 ratio of dead to live records, all of which
unfortunately have to be checked for visibility in postgres.

T03 INSERT : 41199
T13 INSERT : 9460
RJT INSERT : 0
RJT SELECT : 0

--------------------
Beware: "Transaction" does not mean a database transaction here; a
"transaction" contains one or more "items", in the context of the
application/database.

I don't know ECPG very well, but are you sure that you are not running in
autocommit mode, i.e. that each command is not run in its own
transaction?

At the other end of the spectrum: are you possibly running all the queries
in one transaction?

As for real DML statements: 3,865,946 queries done in 47 min 50 secs. (The
queries are spread across many tables; see the detailed counts below the
"Free memory..." line; each table name is 3 letters long.)

That's 1,347 queries per second... ouch!

How complex are these queries?

If much time is spent by the backend on optimizing (vs. executing), then you
could win by rewriting some of these as PL/pgSQL or C procedures that do a
prepare/execute using SPI and use a stored plan.

This is the output of the Linux Red Hat 7.2 / PostgreSQL 7.2 port of the
Pro*C program.

As you'll understand, it is not the COMPLETE batch; we had to stop it:

Can you run VACUUM ANALYZE and continue ?

Time : 00:16:26

Transaction : 750
Item : 7391
Transaction (ms) : 1314
Item (ms) : 133

Errors : 1
Warnings : 0
PLU not found : 0
NOM not found : 0
Alloc NOM : 739
Free NOM : 0
Error 1555 : 0

Read : 45127.000
Write : 37849.000
Read/Write : 82976.000

PLU SELECT : 7391
NOM SELECT : 29564
T04 SELECT : 31
T01 INSERT : 378

Was the T01 table empty at the start (does it have 378 rows) ?

T01 UPDATE : 29186

Could you get a plan for an update on T01 at this point?

Does it look OK?

Can you make it faster by manipulating the enable_xxx variables?
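
For example, something like this in psql (a generic illustration; the column
names are made up):

--------------------------------
SET enable_seqscan TO off;  -- steer the planner away from seq scans
EXPLAIN UPDATE t01 SET val = 'bum' WHERE i = 42;
SET enable_seqscan TO on;   -- restore the default afterwards
--------------------------------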

T02 INSERT : 3385
T02 UPDATE : 4006
T03 INSERT : 613
T13 INSERT : 281
RJT INSERT : 0
RJT SELECT : 0

--------------------

You see: we have 82,976 queries in 16 min 26 seconds; that's

84 queries per second

definitely no match for Oracle :-((

Was oracle out-of-the-box, or did you (or someone else) tune it too?

This is very bad for us, because if this customer kicks Oracle out for PG it
could be really fantastic; this customer has a lot of influence on the
business...

--------------
Hannu

#12Hannu Krosing
hannu@krosing.net
In reply to: Jean-Paul ARGUDO (#10)
Re: Oracle vs PostgreSQL in real life

On Wed, 2002-02-27 at 23:21, Jean-Paul ARGUDO wrote:

What was the postgresql.conf set to ?

I put parameters in another mail, please watch for it.

The "test" is a big batch that computes things in the database.

Could you run this batch in smaller chunks to see if PG is slow from the
start, or does it slow down as it goes?

The batch starts really fast, but past 2 minutes it begins to slow down
dramatically, and it just keeps getting slower and slower.

I did a small test run on my home computer (Celeron 350, IDE disks,
untuned 7.2 on RH 7.2).

I made a small table (int, text) with a primary key on the int column and
filled it with the values 1-512.

Then I ran a python script that updated 10000 random rows in batches of
10 updates.

The first run took

a) 1:28 - 112 tps

as it used seq scans.

Then I ran VACUUM ANALYZE, and the next runs were

1. 24 sec - 416 tps
2. 43 sec - 232 tps
3. 71 sec - 140 tps

Then I tried the same query and ran vacuum manually in another window
every 5 sec.

The result was similar to run 1 - 24.5 sec.

Running vacuum every 10 sec slowed it to 25.1 sec; running it every 3 sec,
to 24.3 sec. Running vacuum in a tight loop slowed the test down to 30.25
sec.

-------------------------
Hannu

#13Dann Corbit
DCorbit@connx.com
In reply to: Hannu Krosing (#12)
Re: Oracle vs PostgreSQL in real life

-----Original Message-----
From: Marc Lavergne [mailto:mlavergne-pub@richlava.com]
Sent: Wednesday, February 27, 2002 9:41 AM
To: Jean-Paul ARGUDO
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Oracle vs PostgreSQL in real life

There is probably an explanation, but "computes things" doesn't give us much
information to go on. Do you think you could boil this down to a
test case? Also, expand on what the batch file does, the size of the
database, and which interface you are using. I'm sure people would like
to help, but there simply isn't enough information to derive any
conclusions here.

----------------------------------------------------------------------

This seems a very important test case. If possible, and if the client will
allow it, perhaps the relevant pieces of the schema could be published
to some ftp site along with the relevant C code. Then we could populate
the tables with dummy data and run the same tests. One of two things
will happen (I predict):

1. Someone will find a way to make it run fast.
OR
2. Someone will offer an improvement to PostgreSQL so that it can do as
well as or better than Oracle for this application.

Without understanding the problem, we end up guessing.

#14mlw
markw@mohawksoft.com
In reply to: Jean-Paul ARGUDO (#1)
Re: Oracle vs PostgreSQL in real life

Jean-Paul ARGUDO wrote:

This is the output of the Linux Red Hat 7.2 / PostgreSQL 7.2 port of the
Pro*C program.

As you'll understand, it is not the COMPLETE batch; we had to stop it:

Time : 00:16:26

Transaction : 750
Item : 7391
Transaction (ms) : 1314
Item (ms) : 133

Errors : 1
Warnings : 0
PLU not found : 0
NOM not found : 0
Alloc NOM : 739
Free NOM : 0
Error 1555 : 0

Read : 45127.000
Write : 37849.000
Read/Write : 82976.000

PLU SELECT : 7391
NOM SELECT : 29564
T04 SELECT : 31
T01 INSERT : 378
T01 UPDATE : 29186

Are you updating 29186 records in a table here? If so, is this table used in
the following queries?

T02 INSERT : 3385
T02 UPDATE : 4006

Ditto here, is T02 updated and then used in subsequent queries?

T03 INSERT : 613
T13 INSERT : 281
RJT INSERT : 0
RJT SELECT : 0

Are these queries run in this order, or are the inserts/updates/selects
intermingled?

A judicious vacuum on a couple of the tables may help.

Also, I noticed you had 19000 buffers. I did some experimentation with buffers
and found that more is not always better. Depending on the nature of your
database, 2048~4096 seems to be a sweet spot for some of the stuff that I do.
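
To put numbers on that (illustrative values only, not a recommendation from
this thread; each buffer is one 8 kB page):

--------------------------------
shared_buffers = 19000  # 19000 * 8 kB = ~148 MB, most of a 256 MB machine
shared_buffers = 4096   #  4096 * 8 kB =   32 MB, leaves RAM for the OS cache
--------------------------------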

Again, have you "analyzed" the database? PostgreSQL will do badly if you have
not analyzed. (Oracle also benefits from analyzing, depending on the nature of
the data.)

Have you done an "explain" on the queries used in your batch? You may be able
to see what's going on.

#15Matthew T. O'Connor
matthew@rh71.postgresql.org
In reply to: Jean-Paul ARGUDO (#7)
Re: Oracle vs PostgreSQL in real life

shared_buffers = 19000 # 2*max_connections, min 16

This number sounds too high: 19000 buffers * 8 kB is about 148 MB, so if you
only have 256 MB of RAM this is using over 150 MB of it. Are you swapping a
lot? What is the load on the server while it's running?

#16Jean-Paul ARGUDO
jean-paul.argudo@idealx.com
In reply to: Hannu Krosing (#11)
Re: Oracle vs PostgreSQL in real life

Hi all,

As I wrote before, it is an ECPG program that runs with bad performance.
I turned trace/notice/debug mode back on on the server.

Here is an example of what the debug output keeps printing:

the c... names are CURSORS

it seems that on every commit, the cursor is closed

[... snip ...]
NOTICE: Closing pre-existing portal "csearcht04"
NOTICE: Closing pre-existing portal "csearcht30"
NOTICE: Closing pre-existing portal "cfindplu"
NOTICE: Closing pre-existing portal "cfindplu"
NOTICE: Closing pre-existing portal "cfindplu"
NOTICE: Closing pre-existing portal "cfindplu"
NOTICE: Closing pre-existing portal "cfindplu"
NOTICE: Closing pre-existing portal "cfindplu"
NOTICE: Closing pre-existing portal "cfindplu"
NOTICE: Closing pre-existing portal "cfindplu"
NOTICE: Closing pre-existing portal "csearcht04"
NOTICE: Closing pre-existing portal "csearcht30"
NOTICE: Closing pre-existing portal "cfindplu"
NOTICE: Closing pre-existing portal "cfindplu"
NOTICE: Closing pre-existing portal "cfindplu"
NOTICE: Closing pre-existing portal "cfindplu"
NOTICE: Closing pre-existing portal "cfindplu"
NOTICE: Closing pre-existing portal "cfindplu"
NOTICE: Closing pre-existing portal "cfindplu"
[... snip ...]

the c... names are CURSORS

it seems that on every commit, the cursor is closed... and re-opened with new
variable values

BTW, as many of you asked: the queries are VERY simple, and there are only a
few of them. Each query works on one table at a time; no joins, for example.
Only massive bulk work with CURSORS.

Is there any way to avoid this closing/re-opening of cursors?
Any tips on the best way to port cursors?

thanks in advance.

PS: I am currently testing vacuums inside the batch: pausing the data
manipulation, running a vacuum analyze, and continuing the processing.

--
Jean-Paul ARGUDO

#17Hannu Krosing
hannu@itmeedia.ee
In reply to: Jean-Paul ARGUDO (#1)
Re: Oracle vs PostgreSQL in real life

----- Original Message -----
From: "Jean-Paul ARGUDO" <jean-paul.argudo@idealx.com>

it seems that on every commit, the cursor is closed... and re-opened with new
variable values

I think that currently the only way to reuse query plans would be migrating
some of your logic to the backend and using SPI prepared statements.

BTW, as many of you asked: the queries are VERY simple, and there are only a
few of them. Each query works on one table at a time; no joins, for example.
Only massive bulk work with CURSORS.

Again, can't some of it be moved to the backend, either using PL/pgSQL or C
(or pltcl, plperl, plpython ;)?
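
For instance, a minimal sketch (not your application code; it assumes the
plpgsql language is installed with createlang, and uses a hypothetical table
t01 like the one in my tests). Plans for queries inside a plpgsql function
are prepared once per backend and then reused on later calls:

--------------------------------
CREATE FUNCTION bump_item(integer) RETURNS integer AS '
BEGIN
    -- planned on the first call, then the plan is cached
    UPDATE t01 SET val = ''bum'' WHERE i = $1;
    RETURN 1;
END;
' LANGUAGE 'plpgsql';

SELECT bump_item(42);
--------------------------------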

PS: I am currently testing vacuums inside the batch: pausing the data
manipulation, running a vacuum analyze, and continuing the processing.

Starting with 7.2 you can also run both analyze and plain vacuum in
parallel with the main app.

You will most likely need to run analyze once after the tables are more or
less filled, and then a parallel vacuum every 5-30 sec to keep the tables
from growing too big. You could limit the vacuum to only those tables that
see a lot of updating (or delete/insert).
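
E.g. even a trivial shell loop would do (sketched for illustration; the
database and table names are hypothetical):

--------------------------------
while true; do
    psql -c "VACUUM t01" mydb   # plain vacuum on the hot table only
    sleep 10
done
--------------------------------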

-----------
Hannu

#18Hannu Krosing
hannu@itmeedia.ee
In reply to: Jean-Paul ARGUDO (#1)
Re: Oracle vs PostgreSQL in real life

----- Original Message -----
From: "Jean-Paul ARGUDO" <jean-paul.argudo@idealx.com>

PS: I am currently testing vacuums inside the batch: pausing the data
manipulation, running a vacuum analyze, and continuing the processing.

I ran a small test (Pg 7.2, RH 7.1, Athlon 850, 512MB) that created a small
table of 2 fields with a primary key on the first, filled it with 768 values,
and then ran the following script:

--------------------------------
#!/usr/bin/python

import pg, random
con = pg.connect()
q = "update t01 set val='bum' where i = %s"
for trx in range(5000):
    con.query('begin')
    for cmd in range(20):
        rn = random.randint(1,768)
        con.query(q % rn)
    con.query('commit')
--------------------------------

When run as-is it averaged 152 updates/sec:
[hannu@taru hannu]$ time ./abench.py

real 10m55.034s
user 0m27.270s
sys 0m4.700s

After doing a vacuum full, I ran it together with a parallel process
that was a simple loop sleeping 5 sec and then doing a vacuum:

--------------------------------
#!/usr/bin/python

import time, pg

con = pg.connect()

for trx in range(5000):
    for cmd in range(20):
        time.sleep(5)
        print 'vacuum'
        con.query('vacuum')
print 'done!'
--------------------------------

The same script now runs at an average of 917 updates/sec:
[hannu@taru hannu]$ time ./abench.py

real 1m48.416s
user 0m16.840s
sys 0m3.300s

So here we have a case where the new vacuum can really save the day!

I also tried other vacuuming intervals, and it seems that ~4 sec is the best
for this case.

Here are the test results (interval - time - updates per sec):

2 sec  - 1:53.5  - 881
3 sec  - 1:49.6  - 912
4 sec  - 1:48.0  - 925
5 sec  - 1:48.4  - 922
6 sec  - 1:49.7  - 911
10 sec - 1:56.8  - 856
no vac - 10:55.0 - 152
--------------
Hannu

#19Jean-Paul ARGUDO
jean-paul.argudo@idealx.com
In reply to: Hannu Krosing (#17)
Re: Oracle vs PostgreSQL in real life

it seems that on every commit, the cursor is closed... and re-opened with new
variable values

I think that currently the only way to reuse query plans would be migrating
some of your logic to the backend and using SPI prepared statements.

BTW, as many of you asked: the queries are VERY simple, and there are only a
few of them. Each query works on one table at a time; no joins, for example.
Only massive bulk work with CURSORS.

Again, can't some of it be moved to the backend, either using PL/pgSQL or C
(or pltcl, plperl, plpython ;)?

OK.

We read all of "Chapter 21. Server Programming Interface", the SPI doc.

This seems _really_ interesting; it reminds me of outline statements in
Oracle.

So:

1) Where can I find some sample code? Can SPI statements be called from/into
ecpg?

2) If prepared statements and stored execution plans exist, why can't they be
used from any client interface or from plain SQL?

3) You tell us we can "move some queries to the backend": do you mean we would
get better performance with stored functions in plpgsql?

Thanks a lot Hannu, I promise to stop soon with questions :-)

This is _so_ important for us..

Best regards & wishes.

--
Jean-Paul ARGUDO

#20Karel Zak
zakkr@zf.jcu.cz
In reply to: Jean-Paul ARGUDO (#19)
Re: Oracle vs PostgreSQL in real life

On Thu, Feb 28, 2002 at 02:39:47PM +0100, Jean-Paul ARGUDO wrote:

2) If prepared statements and stored execution plans exist, why can't they be
used from any client interface or from plain SQL?

There is an "execute an already parsed query plan" facility in the SPI layer
only. PostgreSQL has no SQL interface for this -- except my experimental
patch for 7.0 (I sometimes think about porting it to the latest PostgreSQL
release, but I don't have a compelling motivation to do it...)

3) You tell us we can "move some queries to the backend": do you mean we would
get better performance with stored functions in plpgsql?

You needn't use plpgsql only; you can use C/C++, Tcl, Perl, or Python.
IMHO the best performance is C + SPI + "stored execution plan".

But don't forget that the path of a query through PostgreSQL is not just the
parser. "Execute an already parsed query plan" pays off if you run a
query really often and the query spends a long time in the parser....
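
In outline it looks like this (a rough sketch only, not code from this
thread; t01/val/i are the hypothetical names used in the tests above):

--------------------------------
/* prepare the query once, save the plan outside SPI's per-call
 * memory, then execute it repeatedly with new parameters */
#include "postgres.h"
#include "executor/spi.h"
#include "catalog/pg_type.h"

static void *update_plan = NULL;

void
run_update(int32 key)
{
    Oid   argtypes[1] = { INT4OID };
    Datum values[1];

    SPI_connect();
    if (update_plan == NULL)
    {
        void *plan = SPI_prepare("UPDATE t01 SET val = 'bum' WHERE i = $1",
                                 1, argtypes);
        update_plan = SPI_saveplan(plan);   /* keep it past SPI_finish() */
    }
    values[0] = Int32GetDatum(key);
    SPI_execp(update_plan, values, NULL, 0);   /* NULL: no NULL arguments */
    SPI_finish();
}
--------------------------------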

Karel

--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/

C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz

#21bpalmer
bpalmer@crimelabs.net
In reply to: Alex Avriette (#5)
Re: Oracle vs PostgreSQL in real life

Note also that our oracle instance is on a quad
processor Sun 280R, and our postgres 'instance' is on a p3/1ghz.

A 280R is a 2-way system, not 4-way (hence the 2 in 280).

- Brandon

----------------------------------------------------------------------------
c: 646-456-5455 h: 201-798-4983
b. palmer, bpalmer@crimelabs.net pgp:crimelabs.net/bpalmer.pgp5

#22Hannu Krosing
hannu@tm.ee
In reply to: Karel Zak (#20)
Re: Oracle vs PostgreSQL in real life

On Thu, 2002-02-28 at 15:58, Karel Zak wrote:

On Thu, Feb 28, 2002 at 02:39:47PM +0100, Jean-Paul ARGUDO wrote:

2) If prepared statements and stored execution plans exist, why can't they be
used from any client interface or from plain SQL?

There is an "execute an already parsed query plan" facility in the SPI layer
only. PostgreSQL has no SQL interface for this -- except my experimental
patch for 7.0 (I sometimes think about porting it to the latest PostgreSQL
release, but I don't have a compelling motivation to do it...)

I did some testing

5000*20 runs of update on non-existing key

(send query+parse+optimise+update 0 rows)

[hannu@taru abench]$ time ./abench.py 2>/dev/null

real 0m38.992s
user 0m6.590s
sys 0m1.860s

5000*20 runs of update on random existing key

(send query+parse+optimise+update 1 row)

[hannu@taru abench]$ time ./abench.py 2>/dev/null

real 1m48.380s
user 0m17.330s
sys 0m2.940s

the backend wall-clock time for the first is 39.0 - 6.6 = 32.4
the backend wall-clock time for the second is 108.4 - 17.3 = 91.1

so roughly 1/3 of the time is spent on

communication + parse + optimize + locate

and 2/3 on actually updating the tuples.

If we could save half of the parse/optimise time by saving query plans, the
backend performance would go up from 100000/91.1 = 1097 to
100000/(91.1-16.2) = 1335 updates/sec.

------------------

As an ad hoc test for parsing-planning-optimising costs I did the
following

backend time for "explain update t01 set val='bum'"
30.0 - 5.7 = 24.3

[hannu@taru abench]$ time ./abench.py 2>/dev/null

real 0m30.038s
user 0m5.660s
sys 0m2.800s

backend time for "explain update t01 set val='bum' where i = %s"
39.8 - 8.0 = 31.8

[hannu@taru abench]$ time ./abench.py 2>/dev/null

real 0m39.883s
user 0m8.000s
sys 0m2.620s

so adding "where i=n" to a query made
(parse + plan + show plan) run 1.3 times slower.

Some of it must be communication overhead, but surely
some is parsing/planning/optimising time.

--------------
Hannu

#23Antonio Sergio de Mello e Souza
In reply to: Jean-Paul ARGUDO (#1)
Re: Oracle vs PostgreSQL in real life

Jean-Paul ARGUDO wrote:

As I wrote before, it is an ECPG program that runs with bad performance.
I turned trace/notice/debug mode back on on the server.

Here is an example of what the debug output keeps printing:

the c... names are CURSORS

it seems that on every commit, the cursor is closed

[... snip ...]
NOTICE: Closing pre-existing portal "csearcht04"

...

NOTICE: Closing pre-existing portal "cfindplu"
NOTICE: Closing pre-existing portal "cfindplu"
NOTICE: Closing pre-existing portal "cfindplu"
[... snip ...]

the c... names are CURSORS

it seems that on every commit, the cursor is closed... and re-opened with new
variable values

By default, Postgres executes transactions in autocommit mode.
This means that each statement is executed in its own transaction, and a
commit is performed at the end of the statement, which is much slower than
executing all statements inside a begin ... commit block.
To disable autocommit mode you have to compile the ECPG program with
the -t option.
I hope that helps.

Regards,

Antonio Sergio

#24Michael Meskes
meskes@postgresql.org
In reply to: Jean-Paul ARGUDO (#7)
Re: Oracle vs PostgreSQL in real life

On Wed, Feb 27, 2002 at 06:44:53PM +0100, Jean-Paul ARGUDO wrote:

To answer many replies (thanks!), I'll try to put more details:
...
Linux int2412 2.4.9-21SGI_XFS_1.0.2 #1 Thu Feb 7 16:50:37 CET 2002 i686 unknown

But you know that kernels up to 2.4.10 had huge problems with virtual
memory, don't you? I'd recommend testing it either on 2.4.17 (which seems to
run stably for me) or, if you want to be sure and do not need SMP, on
2.2.20.

Michael
--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!

#25Michael Meskes
meskes@postgresql.org
In reply to: Jean-Paul ARGUDO (#16)
Re: Oracle vs PostgreSQL in real life

On Thu, Feb 28, 2002 at 10:32:48AM +0100, Jean-Paul ARGUDO wrote:

As I wrote before, it is an ECPG program that runs with bad performance.
...
it seems that on every commit, the cursor is closed

Cursors shouldn't be closed, but prepared statements are deallocated on each
commit. AFAIK this is what the standard says.

Michael
--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!

#26Michael Meskes
meskes@postgresql.org
In reply to: Antonio Sergio de Mello e Souza (#23)
Re: Oracle vs PostgreSQL in real life

On Thu, Feb 28, 2002 at 01:18:29PM -0500, Antonio Sergio de Mello e Souza wrote:

By default, Postgres executes transactions in autocommit mode.

That of course is true.

To disable autocommit mode you have to compile the ECPG program with
the -t option.

That unfortunately is not. It's just the opposite: ecpg by default uses
the Oracle way and issues a BEGIN after each COMMIT automatically. Thus you
only have to specify COMMIT every now and then to end the transaction. If
you use "-t" or SET AUTOCOMMIT ON, then you run in the normal PostgreSQL
environment and each command runs inside its own transaction; to manually
start and end transactions in that mode you have to use "-t" (resp. EXEC SQL
SET AUTOCOMMIT ON) and then issue a BEGIN yourself.
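
To illustrate (a minimal sketch, not the customer's code; the table and
column names are made up):

--------------------------------
/* under ecpg's default mode an implicit BEGIN follows each COMMIT,
 * so a single COMMIT per batch groups all the updates into one
 * transaction instead of one transaction per statement */
EXEC SQL BEGIN DECLARE SECTION;
int key;
EXEC SQL END DECLARE SECTION;

for (key = 0; key < 1000; key++)
{
    EXEC SQL UPDATE t01 SET val = 'bum' WHERE i = :key;
}
EXEC SQL COMMIT;   /* one commit for the whole batch of 1000 updates */
--------------------------------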

Michael
--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!

#27Karel Zak
zakkr@zf.jcu.cz
In reply to: Hannu Krosing (#22)
Re: Oracle vs PostgreSQL in real life

On Thu, Feb 28, 2002 at 06:21:34PM +0200, Hannu Krosing wrote:

On Thu, 2002-02-28 at 15:58, Karel Zak wrote:

On Thu, Feb 28, 2002 at 02:39:47PM +0100, Jean-Paul ARGUDO wrote:

2) If prepared statements and stored execution plans exist, why can't they be
used from any client interface or from plain SQL?

There is an "execute an already parsed query plan" facility in the SPI layer
only. PostgreSQL has no SQL interface for this -- except my experimental
patch for 7.0 (I sometimes think about porting it to the latest PostgreSQL
release, but I don't have a compelling motivation to do it...)

I did some testing

[... snip ...]

the backend wall-clock time for the first is 39.0 - 6.6 = 32.4
the backend wall-clock time for the second is 108.4 - 17.3 = 91.1

so roughly 1/3 of the time is spent on

communication + parse + optimize + locate

and 2/3 on actually updating the tuples.

If we could save half of the parse/optimise time by saving query plans, the
backend performance would go up from 100000/91.1 = 1097 to
100000/(91.1-16.2) = 1335 updates/sec.

It depends on the proportion between time-in-parser and time-in-executor. If
your query spends a lot of time in the parser and optimizer, then a query
plan cache is interesting for you. Because PostgreSQL has dynamic functions
and operators, the time in the parser can be quite significant for some
queries.

We have a good notion of total query times now (for example from
benchmark tests), but we have no real statistics about the path of a query
through the backend. How much time does a query spend in the parser, how
long in the optimizer or executor? (Maybe use profiling, but I'm not sure
about that.) All my suggestions for memory management were based on the
output of control messages I wrote into mmgr, and for example Tom was
surprised by the frequent realloc usage. What I want to say is: we need more
and more data from the code, or else we can't optimize it well ;-)

suggestion: "TODO: solid path-of-query time profiling for developers" :-)

Karel
--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/

C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz

#28Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Karel Zak (#27)
Re: Oracle vs PostgreSQL in real life

As I wrote before, it is an ECPG program that runs with bad performance.
...
it seems that on every commit, the cursor is closed

Cursors shouldn't be closed, but prepared statements are deallocated on each
commit. AFAIK this is what the standard says.

Wow, this sure sounds completely bogus to me.
Imho CURSORS opened inside a transaction (after BEGIN WORK) are supposed to
be closed (at least with autocommit=yes).

I do not think COMMIT is supposed to do anything with a prepared
statement. That is what EXEC SQL FREE :statementid is for.

That would then match e.g. Informix esql/c.

Andreas

#29Michael Tiemann
tiemann@redhat.com
In reply to: Alex Avriette (#5)
Re: Oracle vs PostgreSQL in real life

The number 2.4.9-21 corresponds to the (Red Hat) kernel I'm running right now.
Yes, 2.4.X as released from kernel.org had huge problems with virtual memory
(for virtually all values of X), but many of these problems have been addressed
by keeping the kernel relatively frozen and just working on VM problems (which
is one of the things we've been doing at Red Hat). I'm not saying we've got it
totally nailed just yet, but I want to present the view that some branches of
the Linux kernel *have* been given the attention they need to avoid some of the
well-known problems that linux.org kernels are (essentially, through Linus's
law) designed to find.

M

Michael Meskes wrote:


On Wed, Feb 27, 2002 at 06:44:53PM +0100, Jean-Paul ARGUDO wrote:

To answer many replies (thanks!), I'll try to put more details:
...
Linux int2412 2.4.9-21SGI_XFS_1.0.2 #1 Thu Feb 7 16:50:37 CET 2002 i686 unknown

But you know that kernels up to 2.4.10 had huge problems with virtual
memory, don't you? I'd recommend testing it either on 2.4.17 (which seems to
run stably for me) or, if you want to be sure and do not need SMP, on
2.2.20.

Michael

#30Antonio Sergio de Mello e Souza
In reply to: Jean-Paul ARGUDO (#1)
Re: Oracle vs PostgreSQL in real life

Michael Meskes wrote:

On Thu, Feb 28, 2002 at 01:18:29PM -0500, Antonio Sergio de Mello e Souza wrote:

By default, Postgres executes transactions in autocommit mode.

That of course is true.

To disable the autocommit mode you have to compile the ECPG script with
the -t option.

That unfortunately is not. It's just the opposite: ecpg by default uses
the Oracle way and issues a BEGIN after each COMMIT automatically. Thus you
only have to specify COMMIT every now and then to end the transaction. If
you use "-t" or SET AUTOCOMMIT ON, then you run in the normal PostgreSQL
environment and each command runs inside its own transaction; to manually
start and end transactions in that mode you have to use "-t" (resp. EXEC SQL
SET AUTOCOMMIT ON) and then issue a BEGIN yourself.

Many thanks for the explanation! Sorry for the wrong advice... :-(

Regards,

Antonio Sergio

#31Jean-Paul ARGUDO
jean-paul.argudo@IDEALX.com
In reply to: mlw (#2)
Re: Oracle vs PostgreSQL in real life : NEWS!!!

Oracle on NT 4: 45 minutes to run, 1200 tps (yes, one thousand two hundred
tps)

Linux Red Hat 7.2 with PostgreSQL 7.2: hours to run (extrapolated, 45 hours),
80 tps (eighty tps).

Well... Where to start?

We work as a team of two; the other member is a senior C/C++ coder. He
mailed me a remark about the datatypes in the database. Here is what he sent
me:

Our database has various datatypes; here is a count of column datatypes
across all tables:

197 numeric(x)
19 numeric(x,2)
2 varchar(x)
61 char(x)
36 datetime

He asked me about numeric(x) and questioned me about how PG manages
the NUMERIC types.

I gave him a pointer to "numeric.c" in the PG sources.

I analyzed this source and found that NUMERIC types are much more
expensive than a simple INTEGER.

I really fell off my chair :-( I was sure that, PG being as good as it is,
when NUMERIC(x) columns are declared they would be translated to INTEGER
(int2, 4 or 8, whatever...).

So I made a pg_dump of the current database and did some perl
replacements of NUMERIC(x,0) with INTEGER.
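
Something along these lines (a hypothetical reconstruction, not the exact
commands or database names we used):

--------------------------------
pg_dump mydb > dump.sql
# NUMERIC(x,0) and NUMERIC(x) columns carry no decimals, so map them to INTEGER
perl -pe 's/numeric\(\d+,0\)/integer/gi; s/numeric\(\d+\)/integer/gi' \
    dump.sql > dump-int.sql
createdb mydb_int && psql mydb_int < dump-int.sql
--------------------------------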

I reloaded the database and launched the processing; the results are REALLY
IMPRESSIVE. Here is what I have

((it is a port of Oracle/WinNT stuff to PostgreSQL/Red Hat 7.2)):

                     Oracle          PG 7.2, NUMERICs   PG 7.2, INTEGERs
--------------------------------------------------------------------------
sample CONNECT BY
query ported to PG   350 ms          750 ms             569 ms
(thanks to the
OpenACS code!)
--------------------------------------------------------------------------
sample "big" query
with CONNECT BYs     3 min 30 s      8 min 40 s         5 min 1 s
and many sub-queries
--------------------------------------------------------------------------
Big batch
treatment queries    1300 queries/s  80 queries/s       250 queries/s

Pro*C ported         45 min to run   ~4 to 6 DAYS       not yet
to ECPG                              to run             tested fully

Ratio                1:1             1:21               not yet...
                                     (21 times slower!)
--------------------------------------------------------------------------
((but this batch will also be rewritten in pure C + libpq + SPI,
so we think we'll have better results again))

So as you see, DATA TYPES are REALLY important, as I wrote in a
techdocs article (I should have thought of this earlier).

What next?

I'll keep you informed of what's going on with this Oracle/WinNT to PG/Linux
port :-))

And we thank you _very_ much for all the help you gave us.

Best regards and Wishes,

--
Jean-Paul ARGUDO

#32Michael Meskes
meskes@postgresql.org
In reply to: Antonio Sergio de Mello e Souza (#30)
Re: Oracle vs PostgreSQL in real life

On Fri, Mar 01, 2002 at 10:14:34AM -0500, Antonio Sergio de Mello e Souza wrote:

Many thanks for the explanation! Sorry for the wrong advice... :-(

No problem. I have to apologize for the lack of docs. :-)

Michael
--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!

#33D'Arcy J.M. Cain
darcy@druid.net
In reply to: Jean-Paul ARGUDO (#31)
Re: Oracle vs PostgreSQL in real life : NEWS!!!

On March 1, 2002 01:44 pm, Jean-Paul ARGUDO wrote:

I analyzed this source and found that NUMERIC types are much more
expensive than a simple INTEGER.

I really fell off my chair :-( I was sure that, PG being as good as it is,
when NUMERIC(x) columns are declared they would be translated to INTEGER
(int2, 4 or 8, whatever...).

So I made a pg_dump of the current database and did some perl
replacements of NUMERIC(x,0) with INTEGER.

I reloaded the database and launched the processing; the results are REALLY
IMPRESSIVE. Here is what I have:

Any chance you can try it with the MONEY type? It does use integers to
store the data. It isn't really designed for general numeric use but it
would be interesting to see how it fares.

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.
#34Doug McNaught
doug@wireboard.com
In reply to: Jean-Paul ARGUDO (#1)
Re: Oracle vs PostgreSQL in real life : NEWS!!!

"D'Arcy J.M. Cain" <darcy@druid.net> writes:

Any chance you can try it with the MONEY type? It does use integers to
store the data. It isn't really designed for general numeric use but it
would be interesting to see how it fares.

I think the MONEY type is deprecated...

-Doug
--
Let us cross over the river, and rest under the shade of the trees.
--T. J. Jackson, 1863

#35D'Arcy J.M. Cain
darcy@druid.net
In reply to: Doug McNaught (#34)
Re: Oracle vs PostgreSQL in real life : NEWS!!!

On March 2, 2002 09:02 am, Doug McNaught wrote:

"D'Arcy J.M. Cain" <darcy@druid.net> writes:

Any chance you can try it with the MONEY type? It does use integers to
store the data. It isn't really designed for general numeric use but it
would be interesting to see how it fares.

I think the MONEY type is deprecated...

I keep hearing that but I use it heavily and I hope it never goes away. The
NUMERIC type is nice but I still think that the MONEY type works well for
certain things. I bet it will be shown to be more efficient. Certainly
it has limitations but within those limitations it works well.

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.
#36Hannu Krosing
hannu@krosing.net
In reply to: Jean-Paul ARGUDO (#31)
Re: Oracle vs PostgreSQL in real life : NEWS!!!

On Fri, 2002-03-01 at 23:44, Jean-Paul ARGUDO wrote:

Oracle on NT 4: 45 minutes to run, 1200 tps (yes, one thousand two hundred
tps)

Linux Red Hat 7.2 with PostgreSQL 7.2: hours to run (extrapolated, 45 hours),
80 tps (eighty tps).

Well... Where to start?

We work as a team of two; the other member is a senior C/C++ coder. He
mailed me a remark about the datatypes in the database. Here is what he sent
me:

Our database has various datatypes; here is a count of column datatypes
across all tables:

197 numeric(x)
19 numeric(x,2)
2 varchar(x)
61 char(x)
36 datetime

He asked me about numeric(x) and questioned me about how PG manages
the NUMERIC types.

I gave him a pointer to "numeric.c" in the PG sources.

I analyzed this source and found that NUMERIC types are much more
expensive than a simple INTEGER.

I really fell off my chair :-( I was sure that, PG being as good as it is,
when NUMERIC(x) columns are declared they would be translated to INTEGER
(int2, 4 or 8, whatever...).

Postgres does not do any silent type replacing based on data type max
length.

So I made a pg_dump of the current database and did some perl
replacements of NUMERIC(x,0) with INTEGER.

I reloaded the database and launched the processing; the results are REALLY
IMPRESSIVE. Here is what I have:

((it is a port of Oracle/WinNT stuff to PostgreSQL/Red Hat 7.2)):

                     Oracle          PG 7.2, NUMERICs   PG 7.2, INTEGERs
--------------------------------------------------------------------------
sample CONNECT BY
query ported to PG   350 ms          750 ms             569 ms
(thanks to the
OpenACS code!)

Did you rewrite your CONNECT BY queries as recursive functions or did
you use varbit tree position pointers ?

--------------------------------------------------------------------------
sample "big" query
with CONNECT BYs     3 min 30 s      8 min 40 s         5 min 1 s
and many sub-queries

Could you give more information on this query? I suspect this can be
made at least as fast as Oracle :)

--------------------------------------------------------------------------
Big batch
treatment queries    1300 queries/s  80 queries/s       250 queries/s

Pro*C ported         45 min to run   ~4 to 6 DAYS       not yet
to ECPG                              to run             tested fully

Ratio                1:1             1:21               not yet...
                                     (21 times slower!)

Did you run concurrent vacuum for both PG results ?

From my limited testing it seems that such vacuum is absolutely needed
for big batches of mostly updates.

And btw 45min * 21 = 15h45 not 4-6 days :)

--------------------------------------------------------------------------
((but this batch will also be rewritten in pure C + libpq + SPI,
so we think we'll have better results again))

You probably will get better results :)

I reran my test (5000 transactions of 20 updates on a random unique key
between 1 and 768, with a concurrent vacuum running every 4 sec), moving
the inner loop of 20 random updates to the server, both without SPI prepared
statements and then using prepared statements.

Test hardware: Athlon 850, IDE, 512MB RAM
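
The server-side function looked roughly like this (a reconstruction of the
approach, not the exact script; SD caches the plan made by plpy.prepare() so
later calls reuse it through plpy.execute()):

--------------------------------
CREATE FUNCTION update20() RETURNS int4 AS '
import random
if not SD.has_key("plan"):
    SD["plan"] = plpy.prepare(
        "UPDATE t01 SET val = ''bum'' WHERE i = $1", ["int4"])
for n in range(20):
    plpy.execute(SD["plan"], [random.randint(1, 768)])
return 20
' LANGUAGE 'plpython';
--------------------------------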

update of random row i=1..768,
all queries sent from the client: 2:02 = 820 updates/sec
[hannu@taru abench]$ time ./abench.py
real 2m1.522s
user 0m20.260s
sys 0m3.320s
[hannu@taru abench]$ time ./abench.py
real 2m2.320s
user 0m19.830s
sys 0m3.490s

using plpython without prepared statements: 1:35 = 1052 updates/sec
[hannu@taru abench]$ time ./abenchplpy2.py
real 1m34.587s
user 0m1.280s
sys 0m0.400s
[hannu@taru abench]$ time ./abenchplpy2.py
real 1m36.919s
user 0m1.350s
sys 0m0.450s

using plpython with SPI prepared statements: 1:06.3 = 1503 updates/sec
[hannu@taru abench]$ time ./abenchplpy.py
real 1m6.134s
user 0m1.400s
sys 0m0.720s
[hannu@taru abench]$ time ./abenchplpy.py
real 1m7.186s
user 0m1.580s
sys 0m0.570s

plpython no-op run with SPI prepared statements
(update where i=1024, which matches no row): 0:17.65 = 5666 non-updates/sec
[hannu@taru abench]$ time ./abenchplpy.py
real 0m17.650s
user 0m0.990s
sys 0m0.290s

So as you see, DATA TYPES are REALLY important, as I wrote in a
techdocs article (I should have thought of this earlier).

Yes, they are.

But running a concurrent vacuum is _much_ more important if the number of
updates is much bigger than the number of records (thanks Tom!)

------------------
Hannu

#37Michael Meskes
meskes@postgresql.org
In reply to: Jean-Paul ARGUDO (#31)
Re: Oracle vs PostgreSQL in real life : NEWS!!!

On Fri, Mar 01, 2002 at 07:44:10PM +0100, Jean-Paul ARGUDO wrote:

((but this batch will also be rewritten in pure C + libpq + SPI,
so we think we'll have better results again))

You mean instead of using ecpg? I'd really be interested in the results of
this.

Michael
--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!