doc/FAQ* files are hosed
Comparing the current-CVS FAQ files with those in the REL6_4 branch
is depressing --- almost every one is messed up in one branch or the
other.
The main FAQ is newer in the 6.4 branch than in the current branch; the
6.4 version seems to have numerous small fixes that are not in the main
branch. FAQ_CVS is also newer in 6.4; it's been de-HTMLized in 6.4
(good) but not in current. FAQ_HPUX, FAQ_Irix, and FAQ_Linux are
completely trashed in the 6.4 branch --- all three have been overwritten
with some generic document.
I'd check in some fixes, but a look at the CVS logs makes me think that
part of the problem is too many cooks stirring the broth already.
Whoever is primary maintainer of these files probably should take
responsibility for getting the branches back in sync.
regards, tom lane
Comparing the current-CVS FAQ files with those in the REL6_4 branch
is depressing --- almost every one is messed up in one branch or the
other.
The main FAQ is newer in the 6.4 branch than in the current branch; the
6.4 version seems to have numerous small fixes that are not in the main
branch. FAQ_CVS is also newer in 6.4; it's been de-HTMLized in 6.4
(good) but not in current. FAQ_HPUX, FAQ_Irix, and FAQ_Linux are
completely trashed in the 6.4 branch --- all three have been overwritten
with some generic document.
I must have done that during the 6.4.2 release process. I decided to
grab them from the web site, but it looks like I got the wrong pages
somehow.
They get overwritten with every release, so I have another chance to try
it for 6.5.
This does not apply to the FAQ_CVS. That's someone else.
--
Bruce Momjian | http://www.op.net/~candle
maillist@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Bruce Momjian <maillist@candle.pha.pa.us> writes:
Comparing the current-CVS FAQ files with those in the REL6_4 branch
is depressing --- almost every one is messed up in one branch or the
other.
They get overwritten with every release, so I have another chance to try
it for 6.5.
Oh? Aren't the CVS versions considered the master copies? If not,
where *are* the master copies?
regards, tom lane
Hello -
I've been lurking on this list for a little while. I have just done a
little tinkering with Postgres 6.4.2, comparing it to Oracle and mySQL on
Linux. It would appear that just doing a lot of inserts (ie 40000) in a
loop is enormously slower in Postgres in two ways that I tried it.
One - using a loop in Java that makes a JDBC call to insert.
Two - using plpgsql as a comparison to PL/SQL.
Perhaps these are bad techniques and a C based proc would do better?
I really like the idea of an open source DB and am impressed with a lot I
see about PostgreSQL - however the speed difference is pretty bad -
4.5 minutes versus about 20 seconds.
Is that just the way it goes? Are there options that would make it faster
that are not the defaults?
Any reaction would be appreciated.
John Holland
John Holland wrote:
Hello -
I've been lurking on this list for a little while. I have just done a
little tinkering with Postgres 6.4.2, comparing it to Oracle and mySQL on
Linux. It would appear that just doing a lot of inserts (ie 40000) in a
loop is enormously slower in Postgres in two ways that I tried it.
One - using a loop in Java that makes a JDBC call to insert.
Two - using plpgsql as a comparison to PL/SQL.
Perhaps these are bad techniques and a C based proc would do better?
I really like the idea of an open source DB and am impressed with a lot I
see about PostgreSQL - however the speed difference is pretty bad -
4.5 minutes versus about 20 seconds.
Is that just the way it goes? Are there options that would make it faster
that are not the defaults?
Oracle uses chained transaction mode (i.e. all queries run in a
single transaction until an explicit COMMIT/ABORT), while MySQL has
no transactions at all...
PostgreSQL uses un-chained transaction mode: each query runs
in its own transaction - 40000 transaction commits !!!
Try to use BEGIN/END to run all inserts in single transaction
and please let us know results.
Vadim
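The batching pattern Vadim describes can be sketched outside PostgreSQL. Below is a minimal illustration using Python's built-in sqlite3 purely as a stand-in database; the table name and row counts are invented for the example.

```python
import sqlite3

def insert_autocommit(conn, n):
    # Autocommit mode: every INSERT is committed as its own
    # transaction, the slow pattern described above.
    for i in range(n):
        conn.execute("INSERT INTO t(v) VALUES (?)", (i,))

def insert_batched(conn, n):
    # One explicit transaction wrapped around all the INSERTs,
    # the BEGIN/END pattern being suggested.
    conn.execute("BEGIN")
    for i in range(n):
        conn.execute("INSERT INTO t(v) VALUES (?)", (i,))
    conn.execute("COMMIT")

# isolation_level=None leaves sqlite3 in autocommit mode unless we
# issue BEGIN ourselves, mimicking an un-chained transaction mode.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t(v INTEGER)")
insert_autocommit(conn, 1000)
insert_batched(conn, 1000)
print(conn.execute("SELECT count(*) FROM t").fetchone()[0])  # 2000
```

Timing the two functions against a disk-backed database (rather than :memory:) is what exposes the per-commit cost.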
Bruce Momjian <maillist@candle.pha.pa.us> writes:
Comparing the current-CVS FAQ files with those in the REL6_4 branch
is depressing --- almost every one is messed up in one branch or the
other.
They get overwritten with every release, so I have another chance to try
it for 6.5.
Oh? Aren't the CVS versions considered the master copies? If not,
where *are* the master copies?
The master copies are on the web site. In fact, people send e-mail to a
special account here, and they get transferred to the web site
automatically.
--
Bruce Momjian | http://www.op.net/~candle
maillist@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On Sun, 17 Jan 1999, Tom Lane wrote:
I tried this myself and found that wrapping BEGIN/END around a series of
INSERT statements didn't make much difference at all.
Using what API? Some APIs control autocommit for you, so execing
"BEGIN" and "END" commands may not do anything. You might want to check
your API for a way to set autocommit off. In fact, I would suggest that
you never use BEGIN and END except via the CLI.
Tom
On Sun, 17 Jan 1999, Tom Lane wrote:
Tom <tom@sdf.com> writes:
On Sun, 17 Jan 1999, Tom Lane wrote:
I tried this myself and found that wrapping BEGIN/END around a series of
INSERT statements didn't make much difference at all.
Using what API?
Sorry, I neglected to specify that it was psql (being driven by
hand-trimmed pg_dump scripts).
This also wouldn't give you any benefit from "prepare", which would
speed up repeated executions of the same statement. PostgreSQL does allow
statements to be prepared, right? If it doesn't, chances are COPY has a
way of doing it. It saves a lot of parsing and planning time.
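The prepare-once, execute-many idea can be sketched with Python's sqlite3, whose executemany() compiles the INSERT statement a single time and then only binds parameters per row. This is only an illustration of why skipping repeated parsing helps, not PostgreSQL behavior (6.4 had no user-level PREPARE):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(a INTEGER, b TEXT)")

rows = [(i, "row %d" % i) for i in range(5000)]

# The statement text is parsed and planned once; only the bound
# parameters change for each of the 5000 rows.
conn.executemany("INSERT INTO t(a, b) VALUES (?, ?)", rows)
conn.commit()

print(conn.execute("SELECT count(*) FROM t").fetchone()[0])  # 5000
```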
Some APIs control autocommit for you, so execing
"BEGIN" and "END" commands may not do anything.
That's not the case for psql. If it were, I would have measured no
difference in speed, rather than a small difference.
regards, tom lane
Tom
Vadim Mikheev <vadim@krs.ru> writes:
John Holland wrote:
I've been lurking on this list for a little while. I have just done a
little tinkering with Postgres 6.4.2, comparing it to Oracle and mySQL on
Linux. It would appear that just doing a lot of inserts (ie 40000) in a
loop is enormously slower in Postgres in two ways that I tried it.
One - using a loop in Java that makes a JDBC call to insert.
Two - using plpgsql as a comparison to PL/SQL.
I really like the idea of an open source DB and am impressed with a lot I
see about PostgreSQL - however the speed difference is pretty bad -
4.5 minutes versus about 20 seconds.
Try to use BEGIN/END to run all inserts in single transaction
and please let us know results.
I tried this myself and found that wrapping BEGIN/END around a series of
INSERT statements didn't make much difference at all.
On an HP 715 (75MHz RISC box, not very heavy iron by current standards),
I inserted about 13000 records into an initially-empty table having 38
columns (just because it's data I had handy...). I timed it at:
Individual INSERT statements (as generated by pg_dump -d):
33 inserts/sec
Same with BEGIN/END wrapped around 'em:
34 inserts/sec
Single COPY statement (as generated by pg_dump without -d):
1400 inserts/sec
This was for a simple table with no indexes. In reality, this table
type has four b-tree indexes on different columns in our normal usage.
The COPY speed dropped to 325 inserts/sec when I had the indexes in
place. I didn't bother trying the INSERTs that way.
The conventional wisdom is that you should use COPY for bulk loading,
and this result supports it...
John didn't say what hardware he's using, so I don't know how comparable
my result is to his 150 inserts/sec --- that might have been on a table
with many fewer columns, or maybe his machine is just faster.
As for where the time is going: "top" showed that the time for the
INSERT ops was almost all going into backend CPU time. My guess is
that most of the cost is in parsing and planning the INSERT statements.
Pre-planned statement skeletons might help, but I think the real answer
will be to find a way to avoid constructing and parsing SQL statements
for simple operations like INSERT. (One reason I'm interested in the
idea of a CORBA interface is that it might help here.)
My guess is that Oracle and mySQL have APIs that allow the construction
of an SQL INSERT command to be bypassed, and that's why they show up
better on this operation.
regards, tom lane
Tom <tom@sdf.com> writes:
On Sun, 17 Jan 1999, Tom Lane wrote:
I tried this myself and found that wrapping BEGIN/END around a series of
INSERT statements didn't make much difference at all.
Using what API?
Sorry, I neglected to specify that it was psql (being driven by
hand-trimmed pg_dump scripts).
Some APIs control autocommit for you, so execing
"BEGIN" and "END" commands may not do anything.
That's not the case for psql. If it were, I would have measured no
difference in speed, rather than a small difference.
regards, tom lane
Tom Lane wrote:
Try to use BEGIN/END to run all inserts in single transaction
and please let us know results.I tried this myself and found that wrapping BEGIN/END around a series of
INSERT statements didn't make much difference at all.On an HP 715 (75MHz RISC box, not very heavy iron by current standards),
I inserted about 13000 records into an initially-empty table having 38
^^
columns (just because it's data I had handy...). I timed it at:
^^^^^^^
So much.
Individual INSERT statements (as generated by pg_dump -d):
33 inserts/sec
Same with BEGIN/END wrapped around 'em:
34 inserts/sec
Single COPY statement (as generated by pg_dump without -d):
1400 inserts/sec
...
John didn't say what hardware he's using, so I don't know how comparable
my result is to his 150 inserts/sec --- that might have been on a table
with many fewer columns, or maybe his machine is just faster.
As for where the time is going: "top" showed that the time for the
INSERT ops was almost all going into backend CPU time. My guess is
that most of the cost is in parsing and planning the INSERT statements.
Pre-planned statement skeletons might help, but I think the real answer
will be to find a way to avoid constructing and parsing SQL statements
for simple operations like INSERT. (One reason I'm interested in the
idea of a CORBA interface is that it might help here.)
My guess is that Oracle and mySQL have APIs that allow the construction
of an SQL INSERT command to be bypassed, and that's why they show up
better on this operation.
For a table with a single int4 column & 2000 INSERTs I have:
BEGIN/END: 3.5sec ~600insert/sec
Without: 151sec ~13insert/sec
(IDE disk!)
I guess that you don't use the -F flag..?
Vadim
I wrote:
As for where the time is going: "top" showed that the time for the
INSERT ops was almost all going into backend CPU time. My guess is
that most of the cost is in parsing and planning the INSERT
statements.
Having an idle hour this evening, I thought it'd be interesting to build
a backend with profiling enabled, so as to confirm or deny the above
guess. It seems that indeed a lot of time is being wasted, but where
it's being wasted might surprise you!
I profiled a backend built from this morning's current CVS sources,
being driven by a single psql run reading a series of exactly 5000
INSERT commands. The first few are:
INSERT INTO "fehistory_1" values (64,14,252,'D','-','-','s',1,4027,NULL,'S',1,4088,NULL,32,'Thu Jan 07 15:15:00 1999 EST','Fri Jan 08 08:30:38 1999 EST',4027,32,NULL,NULL,46,NULL);
INSERT INTO "fehistory_1" values (65,14,252,'P','-','-','S',1,4086,NULL,NULL,NULL,NULL,NULL,32,'Fri Jan 08 16:00:00 1999 EST',NULL,NULL,NULL,NULL,NULL,45,NULL);
INSERT INTO "fehistory_1" values (66,14,135,'P','-','-','S',1,13619,NULL,'s',1,12967,NULL,100,'Fri Jan 08 02:00:00 1999 EST',NULL,NULL,NULL,NULL,NULL,44,NULL);
and it goes on for a long while in that vein :-). Total runtime was
278.09 CPU seconds in the backend, 307 seconds elapsed realtime. The
table being inserted into has no indexes and was initially empty.
The profiler is GNU gprof, which provides both flat and structured
profile data. The top few entries in the flat profile are:
% cumulative self self total
time seconds seconds calls ms/call ms/call name
34.96 97.21 97.21 _mcount
7.18 117.19 19.98 8047816 0.00 0.00 tree_cut
6.56 135.43 18.24 962187 0.02 0.02 recv
3.55 145.31 9.88 8234278 0.00 0.00 tree_insert
3.49 155.02 9.71 4292947 0.00 0.01 malloc
3.31 164.22 9.20 1965056 0.00 0.01 SearchSysCache
3.24 173.23 9.01 4291775 0.00 0.01 free
2.44 180.01 6.78 6171667 0.00 0.00 tree_concatenate
2.17 186.04 6.03 8234069 0.00 0.00 tree_delete
1.07 189.02 2.98 10250 0.29 0.29 write
1.05 191.94 2.92 _psort_cmp
0.95 194.57 2.63 2983301 0.00 0.00 newNode
0.92 197.14 2.57 4251762 0.00 0.00 OrderedElemPop
0.88 199.60 2.46 2746172 0.00 0.00 comphash
0.82 201.88 2.28 4252495 0.00 0.01 AllocSetAlloc
0.77 204.02 2.14 1965056 0.00 0.00 SearchSysCacheTuple
0.73 206.05 2.03 265000 0.01 0.01 yylex
0.69 207.96 1.91 $$remU
0.68 209.86 1.90 4252500 0.00 0.00 OrderedElemPushHead
0.62 211.58 1.72 4247473 0.00 0.00 palloc
0.60 213.26 1.68 25004 0.07 0.07 send
0.56 214.81 1.55 1965056 0.00 0.00 CatalogCacheComputeHashIndex
0.55 216.33 1.52 5000 0.30 1.07 yyparse
0.54 217.82 1.49 4251762 0.00 0.01 AllocSetFree
0.53 219.29 1.47 4252494 0.00 0.00 MemoryContextAlloc
0.51 220.71 1.42 4252500 0.00 0.00 OrderedElemPushInto
0.49 222.07 1.36 3430844 0.00 0.00 strncmp
0.48 223.41 1.34 4205327 0.00 0.00 OrderedSetGetHead
0.47 224.73 1.32 elog
0.45 225.99 1.26 1964976 0.00 0.00 DLRemove
0.43 227.19 1.20 2510857 0.00 0.00 strcmp
0.43 228.38 1.19 4200327 0.00 0.00 OrderedElemGetBase
0.42 229.56 1.18 4245740 0.00 0.01 PortalHeapMemoryAlloc
0.38 230.62 1.06 4252500 0.00 0.00 OrderedElemPush
0.37 231.66 1.04 4205327 0.00 0.00 AllocSetGetFirst
0.36 232.66 1.00 1965145 0.00 0.00 DLAddHead
_mcount is part of the profiling apparatus and can be ignored (although
the fact that it's so high up implies that we're executing an awful lot
of short routine calls, because mcount runs once per entry and exit of
profilable routines). tree_cut, tree_insert, tree_concatenate, and
tree_delete are subroutines of malloc/free. Altogether, malloc/free
and friends accounted for 61.39 seconds out of the 180.88 non-overhead
CPU seconds in this run.
In other words, we're spending a third of our time mallocing and freeing
memory. A tad high, what?
Actually, it's worse than that, because AllocSetAlloc,
PortalHeapMemoryAlloc, AllocSetFree, and all of the OrderedElemXXX
routines represent our own bookkeeping layer atop malloc/free.
That's another 18.66 seconds spent in these top routines, which means
that we are real close to expending half the backend's runtime on
memory bookkeeping. This needs work.
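The direction these numbers point at (dropping a whole region of memory at once instead of freeing millions of chunks one at a time) can be sketched as a toy arena allocator. The class below is invented for illustration and is not PostgreSQL's AllocSet code:

```python
class Arena:
    """Bump-pointer allocator: allocations carve space out of large
    blocks, and reset() discards all blocks at once instead of
    freeing every chunk individually."""
    BLOCK = 8192

    def __init__(self):
        self.blocks = []
        self.used = 0           # bytes used in the current block

    def alloc(self, n):
        # Start a fresh block when the current one can't fit n bytes.
        if not self.blocks or self.used + n > self.BLOCK:
            self.blocks.append(bytearray(self.BLOCK))
            self.used = 0
        off = self.used
        self.used += n
        return memoryview(self.blocks[-1])[off:off + n]

    def reset(self):
        # End-of-transaction cleanup: one cheap operation per block,
        # not one free() per allocation.
        n = len(self.blocks)
        self.blocks.clear()
        self.used = 0
        return n

arena = Arena()
for _ in range(100000):
    arena.alloc(32)             # per-tuple scratch allocations
blocks_freed = arena.reset()    # "commit": drop everything at once
print(blocks_freed)             # 391 blocks for 100000 allocations
```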
The other thing that jumps out here is the unreasonably high position of
recv(), which is called 962187 times. The script being read by psql was
only 957186 characters. Evidently we're invoking a kernel recv() call
once per character read from the frontend. I suspect this is an
inefficiency introduced by Magnus Hagander's recent rewrite of backend
libpq (see, I told you there was a reason for using stdio ;-)). We're
gonna have to do something about that, though it's not as critical as
the memory-allocation issue. It also appears that send() is now being
invoked multiple times per backend reply, which is going to create
inefficiencies outside the backend (ie, multiple packets per reply).
On a test case with a lot of SELECTs that would show up more than it
does here.
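The fix being implied, reading in large chunks into a user-space buffer rather than issuing one recv() per character, can be sketched stand-alone (a pipe plus a small reader class, not the actual backend libpq code):

```python
import os

class BufferedReader:
    def __init__(self, fd, bufsize=8192):
        self.fd = fd
        self.bufsize = bufsize
        self.buf = b""
        self.pos = 0
        self.syscalls = 0       # number of real read() calls issued

    def getchar(self):
        # Refill the buffer in one big read when it runs dry.
        if self.pos >= len(self.buf):
            self.buf = os.read(self.fd, self.bufsize)
            self.pos = 0
            self.syscalls += 1
            if not self.buf:
                return None     # EOF
        c = self.buf[self.pos:self.pos + 1]
        self.pos += 1
        return c

r, w = os.pipe()
os.write(w, b"x" * 10000)       # pretend this is the frontend's query text
os.close(w)

reader = BufferedReader(r)
nchars = 0
while reader.getchar() is not None:
    nchars += 1
os.close(r)

# 10000 characters consumed with only a handful of kernel reads,
# instead of one system call per character.
print(nchars, reader.syscalls)
```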
Moving on to the dynamic profile, the most interesting items are:
index % time self children called name
-----------------------------------------------
0.05 118.06 1/1 DoBackend [7]
[8] 65.3 0.05 118.06 1 PostgresMain [8]
0.04 57.64 5000/5000 pg_exec_query [9]
0.01 39.83 5000/5000 CommitTransactionCommand [11]
0.02 18.86 5001/5001 ReadCommand [27]
0.01 1.18 5000/5000 StartTransactionCommand [118]
0.01 0.36 5001/5001 ReadyForQuery [190]
0.04 0.00 5001/5001 EnableNotifyInterrupt [385]
0.04 0.00 5000/5000 strspn [388]
0.00 0.01 1/1 InitPostgres [495]
0.00 0.00 5000/548573 strlen [172]
0.00 0.00 5001/93349 memset [366]
0.00 0.00 1/1 read_pg_options [601]
0.00 0.00 2/2 pq_putint [637]
0.00 0.00 1/15002 pq_putnchar [123]
0.00 0.00 2/10013 getenv [241]
0.00 0.00 1/1 SetPgUserName [683]
0.00 0.00 1/1 pq_init [687]
0.00 0.00 1/1 proc_exit [704]
0.00 0.00 1/1 pq_close [721]
0.00 0.00 1/5002 getpgusername [521]
0.00 0.00 1/2 FindExec [722]
0.00 0.00 5/8 getopt [728]
0.00 0.00 5001/5001 DisableNotifyInterrupt [761]
0.00 0.00 9/34 pqsignal [806]
0.00 0.00 2/5 atoi [837]
0.00 0.00 1/2 SetProcessingMode [868]
0.00 0.00 1/10002 TransactionFlushEnabled [749]
0.00 0.00 1/1 sigsetjmp [934]
("self" is the amount of time (CPU sec) spent directly in the named function,
while "children" is the amount of time spent in its callees, pro-rated
by the number of calls. For example, strlen was actually measured to
use 0.47 sec in the whole run, but since PostgresMain called it less
than 1% of all the times it was called, PostgresMain gets blamed for
less than 0.01 sec of that total.)
The interesting thing here is the relative times indicated for
pg_exec_query, CommitTransactionCommand, and ReadCommand. We already
found out why ReadCommand is so slow (recv() per character) ... but why
is CommitTransactionCommand so high, when I already discovered that
doing only one commit doesn't help much? Patience, we'll get there.
-----------------------------------------------
0.03 57.61 5000/5000 pg_exec_query [9]
[10] 31.9 0.03 57.61 5000 pg_exec_query_dest [10]
0.09 39.12 5000/5000 pg_parse_and_plan [13]
0.00 17.37 5000/5000 ProcessQuery [36]
0.02 0.60 5000/5000 SetQuerySnapshot [159]
0.01 0.34 5000/5000 CommandCounterIncrement [196]
0.02 0.05 10000/4291775 free [22]
-----------------------------------------------
0.09 39.12 5000/5000 pg_exec_query_dest [10]
[13] 21.7 0.09 39.12 5000 pg_parse_and_plan [13]
0.02 19.28 5000/5000 parser [26]
0.03 17.63 5000/5000 planner [34]
0.00 1.98 5000/5000 QueryRewrite [90]
0.02 0.05 10000/4292947 malloc [21]
0.02 0.05 10000/4291775 free [22]
0.00 0.02 5000/497779 lappend [91]
0.01 0.00 5000/5000 IsAbortedTransactionBlockState [529]
0.00 0.00 5000/72779 length [342]
In other words, parsing, planning, and executing an INSERT each take
about the same amount of time.
-----------------------------------------------
0.01 39.83 5000/5000 PostgresMain [8]
[11] 22.0 0.01 39.83 5000 CommitTransactionCommand [11]
0.02 39.81 5000/5000 CommitTransaction [12]
-----------------------------------------------
0.02 39.81 5000/5000 CommitTransactionCommand [11]
[12] 22.0 0.02 39.81 5000 CommitTransaction [12]
0.00 38.12 5000/5000 AtCommit_Memory [16]
0.01 1.10 5000/5000 RecordTransactionCommit [122]
0.00 0.30 5000/5000 AtCommit_Locks [211]
0.02 0.10 5000/5000 AtEOXact_portals [286]
0.03 0.07 5000/5000 DestroyTempRels [305]
0.03 0.00 5000/5000 _lo_commit [425]
0.01 0.00 5000/5000 AtCommit_Notify [522]
0.01 0.00 5000/5000 CloseSequences [523]
0.01 0.00 5000/5000 RelationPurgeLocalRelation [531]
0.00 0.00 5000/10000 AtCommit_Cache [750]
-----------------------------------------------
0.00 38.12 5000/5000 CommitTransaction [12]
[16] 21.1 0.00 38.12 5000 AtCommit_Memory [16]
0.01 38.11 5000/5000 EndPortalAllocMode [17]
0.00 0.00 5000/10552 MemoryContextSwitchTo [747]
-----------------------------------------------
0.01 38.11 5000/5000 AtCommit_Memory [16]
[17] 21.1 0.01 38.11 5000 EndPortalAllocMode [17]
0.54 37.55 5000/5000 AllocSetReset [18]
0.01 0.00 5000/10000 PortalHeapMemoryGetVariableMemory [449]
0.01 0.00 5000/5000 FixedStackPop [528]
0.00 0.00 5000/51434 MemoryContextFree [367]
-----------------------------------------------
0.54 37.55 5000/5000 EndPortalAllocMode [17]
[18] 21.1 0.54 37.55 5000 AllocSetReset [18]
1.47 32.51 4200327/4251762 AllocSetFree [20]
1.04 2.53 4205327/4205327 AllocSetGetFirst [64]
In other words, essentially *all* of the CPU time spent in
CommitTransaction is spent freeing memory. That's probably why
ganging the transactions doesn't help --- it's the same number of
memory blocks getting allocated and freed.
-----------------------------------------------
0.02 19.28 5000/5000 pg_parse_and_plan [13]
[26] 10.7 0.02 19.28 5000 parser [26]
0.02 13.88 5000/5000 parse_analyze [41]
1.52 3.81 5000/5000 yyparse [54]
0.01 0.01 5000/5000 init_io [474]
0.00 0.01 5000/420496 pstrdup [124]
0.01 0.00 5000/5000 parser_init [533]
Thomas might know why parse_analyze is taking so much time compared to
the rest of the parsing machinery...
I won't bother the list with the rest of the profile, although I'll be
happy to send it to anyone who wants to see all of it. Our immediate
performance problems seem pretty clear, however: time to rethink memory
management.
regards, tom lane
On Sun, Jan 17, 1999 at 09:08:28PM -0500, John Holland wrote:
I've been lurking on this list for a little while. I have just done a
little tinkering with Postgres 6.4.2, comparing it to Oracle and mySQL on
Linux. It would appear that just doing a lot of inserts (ie 40000) in a
loop is enormously slower in Postgres in two ways that I tried it.
One - using a loop in Java that makes a JDBC call to insert.
Two - using plpgsql as a comparison to PL/SQL.
Did you use the '-F' option? I tried something similar from C and found
similar results when calling fsync() after each write. Once I used -F it was
way faster and comparable to Oracle.
Perhaps these are bad techniques and a C based proc would do better?
Feel free to try src/interfaces/ecpg/test/perftest.c
Michael
--
Michael Meskes | Go SF 49ers!
Th.-Heuss-Str. 61, D-41812 Erkelenz | Go Rhein Fire!
Tel.: (+49) 2431/72651 | Use Debian GNU/Linux!
Email: Michael.Meskes@gmx.net | Use PostgreSQL!
On Sun, Jan 17, 1999 at 07:23:15PM -0800, Tom wrote:
This also wouldn't give you any benefit from "prepare", which would
speed up repeated executions of the same statement. PostgreSQL does allow
statements to be prepared, right? If it doesn't, chances are COPY has a
Since when? I thought we're still looking for someone to implement it, i.e.
at least the PREPARE statement.
Michael
--
Michael Meskes | Go SF 49ers!
Th.-Heuss-Str. 61, D-41812 Erkelenz | Go Rhein Fire!
Tel.: (+49) 2431/72651 | Use Debian GNU/Linux!
Email: Michael.Meskes@gmx.net | Use PostgreSQL!
The other thing that jumps out here is the unreasonably high position of
recv(), which is called 962187 times. The script being read by psql was
only 957186 characters. Evidently we're invoking a kernel recv() call
once per character read from the frontend. I suspect this is an
inefficiency introduced by Magnus Hagander's recent rewrite of backend
libpq (see, I told you there was a reason for using stdio ;-)). We're
gonna have to do something about that, though it's not as critical as
the memory-allocation issue.
Could be because of that. I noticed that the backend calls pq_getchar() a
_lot_ of times, looping for reading a single character. It did that before
too. The difference was that pq_getchar() called fgetc() then, and calls
recv() now.
I don't know, maybe recv() is more expensive than fgetc()? But I really
can't see any reason it should be called more often now than before.
An interesting fact is that pq_getchar() doesn't show up at all. Could be
because it's fast, but still executed many times, right? Or it could be that
the 'inner loops' in pq_getchar(), pq_peekchar(), or pqGetNBytes() don't
work as expected. On my system (Linux 2.2), I only get one recv() call for
each entry into these functions - as it should be - might it be different on
yours?
Ok, so I give up, perhaps we need a buffer after all :-)
//Magnus
Could be because of that. I noticed that the backend calls pq_getchar() a
_lot_ of times, looping for reading a single character. It did that before
too. The difference was that pq_getchar() called fgetc() then, and calls
recv() now.
I don't know, maybe recv() is more expensive than fgetc()? But I really
can't see any reason it should be called more often now than before.
An interesting fact is that pq_getchar() doesn't show up at all. Could be
because it's fast, but still executed many times, right? Or it could be that
the 'inner loops' in pq_getchar(), pq_peekchar(), or pqGetNBytes() don't
work as expected. On my system (Linux 2.2), I only get one recv() call for
each entry into these functions - as it should be - might it be different on
yours?
It is very possible that fgetc() is a macro on your platform. See
/usr/include/stdio.h. If so, it has no function call overhead.
On BSD/OS, it used to be a macro, but now with threads, it is not. They
have a macro version, but is under a different name.
--
Bruce Momjian | http://www.op.net/~candle
maillist@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Tom Lane wrote:
The other thing that jumps out here is the unreasonably high position of
recv(), which is called 962187 times. The script being read by psql was
only 957186 characters. Evidently we're invoking a kernel recv() call
once per character read from the frontend. I suspect this is an
inefficiency introduced by Magnus Hagander's recent rewrite of backend
libpq (see, I told you there was a reason for using stdio ;-)).
At least part of the problem is also the fe-be protocol itself, maybe
not in the backend receiving side, but certainly while front-end is
receiving.
The main problem is that you must very often wait for a specific end
character (instead of sending first the length and then reading the
required amount in one chunk), and it requires some trickery to do it
efficiently without reading each character separately.
I once did the fe-be protocol in python (an interpreted scripting
language, see http://www.python.org), it was for v6.2. I was quite
amazed by the baroqueness of the protocol - it uses a mix of three
techniques: 1. wait for an EOT char, 2. receive a Pascal-style string,
and 3. get a record consisting of a mix of 1 and 2.
We're
gonna have to do something about that, though it's not as critical as
the memory-allocation issue. It also appears that send() is now being
invoked multiple times per backend reply, which is going to create
inefficiencies outside the backend (ie, multiple packets per reply).
On a test case with a lot of SELECTs that would show up more than it
does here.
I am sure that we need to change the protocol sometime soon, if we
want to increase the performance. I have contemplated something like
the X-window protocol, that seems to be quite well designed.
And of course we need some sort of CLI that can do prepared statements
and that can use binary fields (not only the file interface).
Currently we have some of it, but only in SPI.
Unfortunately most inserts are not done using SPI :(
It may be the CORBA interface that somebody may be working on, or
it may be something simpler.
---------------------
Hannu Krosing
I tried to do this.....
What I've done is create a plpgsql function that does the inserts.
There is a begin/end in the function of course...I also tried doing it
with a begin work before calling the function in psql, and it did seem
that this worked as it should - ie a rollback work would get rid of the
40000 inserts.
Still, it took about 3.5 minutes, and Oracle does it in about 30 seconds -
a factor of 7.....
On the plus side for Postgres, trying to find some queries against the
data for other speed tests, I found that it seemed to respond quickly -
not much thinking to do with one table of integers I guess. Actually I
couldn't come up with a query that would take very long - need another
table I suppose.
I found that UPDATE TEST SET A = B -1 took only 20 seconds for 40000
records.
The real plus is that Oracle couldn't do that transaction because it was
running out of rollback segment space - something I have fixed in the past
but I recall being a pain to fix - need to make bigger rollback datafiles
or whatever.
Any suggested speed tests would be appreciated.
Try to use BEGIN/END to run all inserts in single transaction
and please let us know results.
Vadim
Tom Lane wrote
Vadim Mikheev <vadim@krs.ru> writes:
John Holland wrote:
I've been lurking on this list for a little while. I have just done a
little tinkering with Postgres 6.4.2, comparing it to Oracle and mySQL on
Linux. It would appear that just doing a lot of inserts (ie 40000) in a
loop is enormously slower in Postgres in two ways that I tried it.
One - using a loop in Java that makes a JDBC call to insert.
Two - using plpgsql as a comparison to PL/SQL.
I really like the idea of an open source DB and am impressed with a lot I
see about PostgreSQL - however the speed difference is pretty bad -
4.5 minutes versus about 20 seconds.
Try to use BEGIN/END to run all inserts in single transaction
and please let us know results.
I tried this myself and found that wrapping BEGIN/END around a series of
INSERT statements didn't make much difference at all.
On an HP 715 (75MHz RISC box, not very heavy iron by current standards),
I inserted about 13000 records into an initially-empty table having 38
columns (just because it's data I had handy...). I timed it at:
Individual INSERT statements (as generated by pg_dump -d):
33 inserts/sec
Same with BEGIN/END wrapped around 'em:
34 inserts/sec
Single COPY statement (as generated by pg_dump without -d):
1400 inserts/sec
This was for a simple table with no indexes. In reality, this table
type has four b-tree indexes on different columns in our normal usage.
The COPY speed dropped to 325 inserts/sec when I had the indexes in
place. I didn't bother trying the INSERTs that way.
The conventional wisdom is that you should use COPY for bulk loading,
and this result supports it...
John didn't say what hardware he's using, so I don't know how comparable
my result is to his 150 inserts/sec --- that might have been on a table
with many fewer columns, or maybe his machine is just faster.
As for where the time is going: "top" showed that the time for the
INSERT ops was almost all going into backend CPU time. My guess is
that most of the cost is in parsing and planning the INSERT statements.
Pre-planned statement skeletons might help, but I think the real answer
will be to find a way to avoid constructing and parsing SQL statements
for simple operations like INSERT. (One reason I'm interested in the
idea of a CORBA interface is that it might help here.)
My guess is that Oracle and mySQL have APIs that allow the construction
of an SQL INSERT command to be bypassed, and that's why they show up
better on this operation.
Oracle has something called direct path loading. It is very fast, but
fussy. You need to drop indexes, do the direct path load, and restore
the indexes. If you do a direct path load with indexes, or stop
halfway (? was a while ago ?) the table gets stuck in a funny mode
and you have to drop/recreate it. See the sqlload manual for
details.
We used this, plus segmenting the database, plus ping-ponging between
two tables[1] on one big project. Worked great except they cancelled
the project two weeks before we were supposed to go on-line. Still
bitter about that one.
-- cary
[1]: I.e. have two tables, a and b. Load b while accesses are going to
a, then when b is all loaded and indexed, have the apps switch to looking
up using b and load the data into a. I think we used views or something
to do the switch.
Magnus Hagander <mha@sollentuna.net> writes:
I don't know, maybe recv() is more expensive than fgetc()?
Vastly.
recv() is a kernel call. You have the overhead of getting control into
the kernel, normally several times more expensive than a function call;
of passing parameters back and forth from user space to kernel space
(for example, the kernel will probably have to translate and range-check
the buffer pointer you pass it, to ensure you can't fool the kernel into
scribbling on some other process's memory); of verifying that the
descriptor number you pass is open and you have permission to read it;
and of finding the associated data buffer. Plus the scheduler may
run to reconsider whether to give control back to you, or switch off to
another user process. Etc etc etc.
fgetc() is a plain C function that normally just has to fetch the next
byte out of a buffer that's already been read into your address space
--- that is, when you're using stdio, you pay all the above-described
kernel interaction overhead once per bufferload, not once per character.
If you use getc(), which is allowed to be a macro, you don't even pay
the function-call overhead; that form is probably less than a dozen
instructions, except when the buffer is empty. Judging by the profile
numbers, recv()'ing a single character takes close to 1400 instructions
on my system.
An interesting fact is that pq_getchar() doesn't show up at all. Could be
because it's fast, but still executed many times, right?
Right, it doesn't run long enough to get itself into the top functions.
It's there though --- the dynamic profile shows:
-----------------------------------------------
                0.03   18.83    5001/5001       ReadCommand [27]
[28]    10.4    0.03   18.83    5001            SocketBackend [28]
                0.00   18.73    5000/5000       pq_getstr [30]
                0.01    0.09    5001/5001       pq_getnchar [303]
-----------------------------------------------
                0.00   18.73    5000/5000       SocketBackend [28]
[30]    10.4    0.00   18.73    5000            pq_getstr [30]
                0.11   18.62    5000/5000       pqGetString [29]
-----------------------------------------------
                0.11   18.62    5000/5000       pq_getstr [30]
[29]    10.4    0.11   18.62    5000            pqGetString [29]
                0.47   18.15    957186/957186   pq_getchar [31]
-----------------------------------------------
                0.47   18.15    957186/957186   pqGetString [29]
[31]    10.3    0.47   18.15    957186          pq_getchar [31]
               18.15    0.00    957186/962187   recv [32]
-----------------------------------------------
                0.09    0.00    5001/962187     pqGetNBytes [315]
               18.15    0.00    957186/962187   pq_getchar [31]
[32]    10.1   18.24    0.00    962187          recv [32]
-----------------------------------------------
In the old code with fgetc(), the execution time of fgetc() was probably
not much worse than pq_getchar --- ie, about half a second not 18
seconds for this test sequence...
What we need to do here is to re-introduce the buffering ability of
stdio into backend libpq. If you compare the current frontend libpq,
you'll notice that it reads or writes the socket a bufferload at a time,
not a character at a time.
regards, tom lane