low performance

Started by Andreas Wernitznigalmost 25 years ago8 messagesbugs

andreas@insilico.com

almost 25 years ago

I am running the precomplied binary of Postgreql 7.1.2 on a Redhat 7.1 (on a Dual Celeron System with 256MB, kernel 2.4.4 and 2.4.5) System.
(The installation of the new 7.1.3 doesn't seem to solve the problem)

I am connecting to the DB with a Perl Program (using Perl 5.6.0 with DBD-Pg-1.01 and DBI-1.19).
The program inserts some million rows into a db with about 30 tables. The processing takes (if everyting works fine) about 10 hours to complete. Usually the my Perl-Script and the database share the available CPU time 50:50.
But sometimes the database is very slow eating up most (>98%) of the available CPU time.
(Of cause I know VACUUM and VACUUM ANALYZE, this is not the problem).

The only thing that seems to help then, is killing the perl script, stopping postgresql, running "ipcclean", and start again from the beginning. If it works from the beginning, the database is ususally very fast until all data are processed.

But if someone else connects (using psql), sometimes the database gets very slow until it is using all the CPU time.

There are no error messages at postgres-startup.
I already increased the number of buffers to 2048 (doesn't help)

I cannot reproduce these problems, sometimes the db is fast, sometimes very slow. The perl script doesn't seem to be the problem, because I wrote all SQL Commands to a file and processed them later ("psql dbname postgres < SQL-File").
Same thing: sometimes slow sometimes fast.

Andreas

Andreas Wernitznig

andreas@insilico.com

almost 25 years ago

In reply to: Andreas Wernitznig (#1)

Re: low performance

I am aware of the performance drawbacks because of indices and triggers. In fact I have a trigger and an index on the most populated table.
It is not possible in my case to remove the primary keys during insert, because the database structure and foreign keys validate my data during import.

The problem is, that sometimes the performance is good, and sometimes the database is awfully slow.
If it is slow, postgres is eating up all CPU time and it takes at least 150 times longer to insert the data.
I don't know why and what to do against that.

Andreas

On Mon, 20 Aug 2001 19:39:31 -0400
Jonas Lindholm <jlindholm@rcn.com> wrote:

Show quoted text

Do you have any index on the tables ? Any triggers ?

If you want to insert 1 million rows you should drop the index, insert the data and then recreate the index.
You should also try the COPY command to insert the data.

You should also avoid having anyone to connect to the database when you insert a lot of rows, and 1 million rows are a lot of rows for any database.

I've been able to insert, in one table, 17 million record in ~3 hours on a Compaq SMP 750 Mhz with 512MB
by dropping the index, using several COPY commands at the same time loading different parts of the data and then creating the index again.
At the time of the inserts no other processes than the COPY's was connected to the database.

/Jonas Lindholm

Andreas Wernitznig wrote:

I am running the precomplied binary of Postgreql 7.1.2 on a Redhat 7.1 (on a Dual Celeron System with 256MB, kernel 2.4.4 and 2.4.5) System.
(The installation of the new 7.1.3 doesn't seem to solve the problem)

I am connecting to the DB with a Perl Program (using Perl 5.6.0 with DBD-Pg-1.01 and DBI-1.19).
The program inserts some million rows into a db with about 30 tables. The processing takes (if everyting works fine) about 10 hours to complete. Usually the my Perl-Script and the database share the available CPU time 50:50.
But sometimes the database is very slow eating up most (>98%) of the available CPU time.
(Of cause I know VACUUM and VACUUM ANALYZE, this is not the problem).

The only thing that seems to help then, is killing the perl script, stopping postgresql, running "ipcclean", and start again from the beginning. If it works from the beginning, the database is ususally very fast until all data are processed.

But if someone else connects (using psql), sometimes the database gets very slow until it is using all the CPU time.

There are no error messages at postgres-startup.
I already increased the number of buffers to 2048 (doesn't help)

I cannot reproduce these problems, sometimes the db is fast, sometimes very slow. The perl script doesn't seem to be the problem, because I wrote all SQL Commands to a file and processed them later ("psql dbname postgres < SQL-File").
Same thing: sometimes slow sometimes fast.

Andreas

Tom Lane

tgl@sss.pgh.pa.us

almost 25 years ago

In reply to: Andreas Wernitznig (#2)

Re: Re: low performance

Andreas Wernitznig <andreas@insilico.com> writes:

I am aware of the performance drawbacks because of indices and
triggers. In fact I have a trigger and an index on the most populated
table. It is not possible in my case to remove the primary keys
during insert, because the database structure and foreign keys
validate my data during import.

Foreign keys eh?

The problem is, that sometimes the performance is good, and sometimes
the database is awfully slow. If it is slow, postgres is eating up
all CPU time and it takes at least 150 times longer to insert the
data. I don't know why and what to do against that.

We found some foreign-key-related performance problems not long ago,
and it could be you're happening on another one. However there's not
enough info here to figure it out. I can offer you two alternatives:

1. Compile up the backend with profiling enabled (if you're using gcc
then "make PROFILE=-pg clean all" in src/backend should do the trick).
Collect profiles for both a "normal" and a "slow" run and send them in.

2. Develop a self-contained example that exhibits the problem, and send
it along for someone else to profile.

regards, tom lane

Andreas Wernitznig

andreas@insilico.com

almost 25 years ago

In reply to: Tom Lane (#3)

Re: Re: low performance

I took option 1 and managed to create a profile of a slow and a fast run:

The frequent functions of the FAST run:

% cumulative self self total
time seconds seconds calls Ts/call Ts/call name
0.00 0.00 0.00 15725437 0.00 0.00 AllocSetAlloc
0.00 0.00 0.00 15380742 0.00 0.00 MemoryContextAlloc
0.00 0.00 0.00 11296700 0.00 0.00 ExecEvalExpr
0.00 0.00 0.00 8276639 0.00 0.00 newNode
0.00 0.00 0.00 5430717 0.00 0.00 MemoryContextSwitchTo
0.00 0.00 0.00 4492641 0.00 0.00 LockBuffer
0.00 0.00 0.00 4425642 0.00 0.00 AllocSetFree
0.00 0.00 0.00 4356571 0.00 0.00 pfree
0.00 0.00 0.00 3873174 0.00 0.00 pq_getbyte
0.00 0.00 0.00 3799725 0.00 0.00 appendStringInfoChar

The frequent functions of the SLOW run:

% cumulative self self total
time seconds seconds calls Ts/call Ts/call name
0.00 0.00 0.00 27832819 0.00 0.00 ExecEvalExpr
0.00 0.00 0.00 19040887 0.00 0.00 AllocSetAlloc
0.00 0.00 0.00 18976313 0.00 0.00 MemoryContextAlloc
0.00 0.00 0.00 18722462 0.00 0.00 LockBuffer
0.00 0.00 0.00 18684832 0.00 0.00 MemoryContextSwitchTo
0.00 0.00 0.00 18442039 0.00 0.00 pg_detoast_datum
0.00 0.00 0.00 16947638 0.00 0.00 AllocSetFree
0.00 0.00 0.00 16934648 0.00 0.00 pfree
0.00 0.00 0.00 9716164 0.00 0.00 SpinAcquire
0.00 0.00 0.00 9716164 0.00 0.00 SpinRelease

Since these files are to big for a posting, I have put the whole profile files on:
ftp://ftp.insilico.com/out.fast.gz
ftp://ftp.insilico.com/out.slow.gz

I don't know why the time column and number of seconds is zero in all the cases.
I am using the Redhat 7.1 binutils (binutils-2.10.91.0.2-3).

On Tue, 21 Aug 2001 17:38:23 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

Andreas Wernitznig <andreas@insilico.com> writes:

I am aware of the performance drawbacks because of indices and
triggers. In fact I have a trigger and an index on the most populated
table. It is not possible in my case to remove the primary keys
during insert, because the database structure and foreign keys
validate my data during import.

Foreign keys eh?

The problem is, that sometimes the performance is good, and sometimes
the database is awfully slow. If it is slow, postgres is eating up
all CPU time and it takes at least 150 times longer to insert the
data. I don't know why and what to do against that.

We found some foreign-key-related performance problems not long ago,
and it could be you're happening on another one. However there's not
enough info here to figure it out. I can offer you two alternatives:

1. Compile up the backend with profiling enabled (if you're using gcc
then "make PROFILE=-pg clean all" in src/backend should do the trick).
Collect profiles for both a "normal" and a "slow" run and send them in.

2. Develop a self-contained example that exhibits the problem, and send
it along for someone else to profile.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

Tom Lane

tgl@sss.pgh.pa.us

almost 25 years ago

In reply to: Andreas Wernitznig (#4)

Re: Re: low performance

Andreas Wernitznig <andreas@insilico.com> writes:

I took option 1 and managed to create a profile of a slow and a fast run:

It's difficult to compare these profiles, because they seem to be taken
over very different numbers of queries --- did you let the "fast" run
process more queries than the "slow" one?

However, I think what is happening is that some queries are being done
as indexscans in the fast case and seqscans in the slow case. The
ratio of ExecIndexScan calls to ExecSeqScan calls is vastly different
in the two profiles.

It looks like you are doing vacuums fairly frequently, so I speculate
that the statistics gathered by vacuum are changing just enough to alter
the planner's choice between indexscan and seqscan for some
often-executed query type.  Evidently the planner is guessing the costs
to be nearly the same, so a small change in stats might tip the choice
--- but in reality the costs are quite a bit different, thus you observe
fast and slow behavior.

The next step should be to get EXPLAIN results for the queries used
by your application in both fast and slow states. This will help us
narrow down where the planner's misprediction is occurring.

regards, tom lane

Tom Lane

tgl@sss.pgh.pa.us

almost 25 years ago

In reply to: Andreas Wernitznig (#1)

Re: Re: low performance

Andreas Wernitznig <andreas@insilico.com> writes:

To make it more comparable I have made two additional runs, a slow and
a fast one with exactly the same number of inserts (about 20500) and
put it on our ftp server:

However, I think what is happening is that some queries are being done
as indexscans in the fast case and seqscans in the slow case. The
ratio of ExecIndexScan calls to ExecSeqScan calls is vastly different
in the two profiles.

Does the new profiles proof that assumption ?

Yes, see for yourself:
def.fast:
0.00 0.00 0.00 22481 0.00 0.00 ExecSeqScan
0.00 0.00 0.00 20161 0.00 0.00 ExecIndexScan
def.slow:
0.00 0.01 0.00 41940 0.00 0.00 ExecSeqScan
0.00 0.01 0.00 702 0.00 0.00 ExecIndexScan

So there are about 19500 queries that are being done as indexscans in
one case and seqscans in the other.

If I run "vacuum" and "vacuum analyze" on an empty database, the
following run will be a SLOW one.

The whole point of vacuum analyze is to give the planner some statistics
about the contents of the tables. Vacuum analyze when a table is empty
is useless (even counterproductive, if the table shortly thereafter
becomes large --- the planner will still think it is empty).

regards, tom lane

Import Notes

Reply to msg id not found: 20010828171601.2fbf28c0.andreas@insilico.com

Andreas Wernitznig

andreas@insilico.com

almost 25 years ago

In reply to: Tom Lane (#6)

Re: Re: low performance

Yes, I understand very clearly what you mean.

Maybe my mails were to confused, that's why I try to explain my problem once more:

step 1. An empty table with a primary key (=index key) where an "explain" tells me, that a Seq Scan is used to SELECT a special row.
step 2. Then I start to fill data into that table.
step 3. Then I run a vacuum analyze to update the planner statistics.
step 4. I run an "EXPLAIN select * from <mytable> where <pk-column> = 999;"
step 5. Then I fill in additional data.

What I expect is, that from step 5 on the pk-trigger (I don't know if this mechanism that checks for uniqueness is really a trigger) uses the Index to check for possible double entries.
Although "EXPLAIN" in step 4 pretend to use an Index Scan the data insert becomes slower and slower (>98% of the processor is used by a postmaster). All these steps are done with a single connection (postmaster).

The only way to make it faster after step 3 is to close that connection (and stop that postmaster thread with it) and establish a new one.
It seems like the planner (at least for pk checking) of an *established* connection to a database doesn't receive the information gained from "vacuum analyze".

Greetings
Andreas

On Mon, 03 Sep 2001 12:26:39 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

Andreas Wernitznig <andreas@insilico.com> writes:

To make it more comparable I have made two additional runs, a slow and
a fast one with exactly the same number of inserts (about 20500) and
put it on our ftp server:

However, I think what is happening is that some queries are being done
as indexscans in the fast case and seqscans in the slow case. The
ratio of ExecIndexScan calls to ExecSeqScan calls is vastly different
in the two profiles.

Does the new profiles proof that assumption ?

Yes, see for yourself:
def.fast:
0.00 0.00 0.00 22481 0.00 0.00 ExecSeqScan
0.00 0.00 0.00 20161 0.00 0.00 ExecIndexScan
def.slow:
0.00 0.01 0.00 41940 0.00 0.00 ExecSeqScan
0.00 0.01 0.00 702 0.00 0.00 ExecIndexScan

So there are about 19500 queries that are being done as indexscans in
one case and seqscans in the other.

If I run "vacuum" and "vacuum analyze" on an empty database, the
following run will be a SLOW one.

The whole point of vacuum analyze is to give the planner some statistics
about the contents of the tables. Vacuum analyze when a table is empty
is useless (even counterproductive, if the table shortly thereafter
becomes large --- the planner will still think it is empty).

regards, tom lane

Tom Lane

tgl@sss.pgh.pa.us

almost 25 years ago

In reply to: Andreas Wernitznig (#1)

Re: Re: low performance

Andreas Wernitznig <andreas@insilico.com> writes:

The only way to make it faster after step 3 is to close that connection (and stop that postmaster thread with it) and establish a new one.
It seems like the planner (at least for pk checking) of an *established* connection to a database doesn't receive the information gained from "vacuum analyze".

Ah. Yes, you are quite right: the queries used for PK checking are
planned just once when first executed, and thereafter the plans are
cached and reused for the life of that backend. No doubt you are seeing
continued use of a no-longer-appropriate PK plan.

We have a TODO item to be smarter about dropping cached plans when
relevant context changes, but it's not done yet.

regards, tom lane

Import Notes

Reply to msg id not found: 20010903205038.40b4bcbd.andreas@insilico.com