UPDATE performance degradation (6.5.1)

Started by Oleg Bartunov, over 26 years ago, 9 messages, pgsql-hackers
#1 Oleg Bartunov
oleg@sai.msu.su

Hi,

after I got DBI logging working, I ran several tests and noticed performance
degradation when sequentially updating *one* row.

I have 5 processes updating the same row. I use
LOCK TABLE hits IN SHARE ROW EXCLUSIVE MODE
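For reference, each request does something roughly like the following (a sketch; the column names are those of the hits table, and the msg_id value is just an example):

```sql
BEGIN;
LOCK TABLE hits IN SHARE ROW EXCLUSIVE MODE;
UPDATE hits
   SET count = count + 1,
       last_access = now()
 WHERE msg_id = 1463;
COMMIT;
```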

When I run 200 requests I get about 16 req/sec, which is quite enough
for my purposes. I expected the same speed if I just increased the number
of requests, but it decreases: for 2000 requests I get about 10 req/sec,
and for 20,000 about 2.5 req/sec!
I see no reason for such performance degradation; as it stands there is
no way to use postgres for logging on a 24*7*365 Web site. This is
probably a very specific case, where several processes update only one row,
but again, I see no reason for such big degradation.
The hits table itself contains only 1 row!
I'll try to eliminate httpd and perl from my test bench to test only
postgres. I don't have such a tool right now; probably someone has
already done this? What tool can I use for testing concurrent updates?
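Lacking such a tool, a driver that spawns N clients and measures requests per second can be sketched in a few lines of Python (hypothetical; the do_update() stub would be replaced by a real DBI call):

```python
# Minimal concurrent-update benchmark skeleton (an illustration, not an
# existing tool).  do_update() is a stub standing in for a real database
# round trip such as: UPDATE hits SET count = count + 1 WHERE msg_id = ...
import threading
import time

def do_update():
    time.sleep(0.001)        # stand-in for the real UPDATE round trip

def worker(n_requests):
    for _ in range(n_requests):
        do_update()

def benchmark(n_clients=5, n_requests=40):
    threads = [threading.Thread(target=worker, args=(n_requests,))
               for _ in range(n_clients)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.time() - start
    return n_clients * n_requests / elapsed   # requests per second

print("%.1f req/sec" % benchmark())
```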

Regards,
Oleg

This is my home machine: Linux 2.2.10, postgres 6.5.1.
Load is about 2-2.5.

Typical output of ps:

11:21[om]:/usr/local/apache/logs>psg disc
1036 ? S 24:17 /usr/local/pgsql/bin/postgres localhost httpd discovery LOCK
1040 ? R 24:09 /usr/local/pgsql/bin/postgres localhost httpd discovery idle
1042 ? S 24:02 /usr/local/pgsql/bin/postgres localhost httpd discovery LOCK
1044 ? R 23:51 /usr/local/pgsql/bin/postgres localhost httpd discovery idle
1046 ? S 23:49 /usr/local/pgsql/bin/postgres localhost httpd discovery LOCK
1048 ? S 23:47 /usr/local/pgsql/bin/postgres localhost httpd discovery LOCK

I see only one process with SELECT; this is what I expected when using
SHARE ROW EXCLUSIVE MODE. Right?

_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#2 Oleg Bartunov
oleg@sai.msu.su
In reply to: Oleg Bartunov (#1)
Re: [HACKERS] UPDATE performance degradation (6.5.1)

Probably I have found the problem. After running my test, which became
very slow, I looked at /usr/local/pgsql/data/base/discovery:

-rw------- 1 postgres users 5070848 Jul 27 16:14 hits
-rw------- 1 postgres users 1409024 Jul 27 16:14 hits_pkey

This is for a table with one row, after a lot of updates.
Too much. VACUUM ANALYZE on this table was good medicine!
Is this a design problem?

Regards,
Oleg


#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Oleg Bartunov (#2)
Re: [HACKERS] UPDATE performance degradation (6.5.1)

Oleg Bartunov <oleg@sai.msu.su> writes:

Probably I found the problem. After running my test, which became
very slow I looked at /usr/local/pgsql/data/base/discovery

-rw------- 1 postgres users 5070848 Jul 27 16:14 hits
-rw------- 1 postgres users 1409024 Jul 27 16:14 hits_pkey

This is for table with one row after a lot of updates.
Too much. vacuum analyze this table was a good medicine !

If the table contains only one row, why are you bothering with an
index on it?

Is this a design problem ?

Only that space in tables and indexes can't be re-used until vacuum.
I'm not sure if there's any good way around that or not...

regards, tom lane

#4 Oleg Bartunov
oleg@sai.msu.su
In reply to: Tom Lane (#3)
Re: [HACKERS] UPDATE performance degradation (6.5.1)

On Tue, 27 Jul 1999, Tom Lane wrote:


Oleg Bartunov <oleg@sai.msu.su> writes:

Probably I found the problem. After running my test, which became
very slow I looked at /usr/local/pgsql/data/base/discovery

-rw------- 1 postgres users 5070848 Jul 27 16:14 hits
-rw------- 1 postgres users 1409024 Jul 27 16:14 hits_pkey

This is for table with one row after a lot of updates.
Too much. vacuum analyze this table was a good medicine !

If the table contains only one row, why are you bothering with an
index on it?

This table with one row is just for testing. In production it will
contain many thousands of msg_ids. I haven't yet tested what happens
if I populate the table with thousands of rows, but I can imagine how
long updating it will take. Ooh.

Is this a design problem ?

Only that space in tables and indexes can't be re-used until vacuum.
I'm not sure if there's any good way around that or not...

So, I need a cron job to vacuum the database. I'm curious how mysql works
so fast and has no problems in a Web environment. I know some sites with
mysql logging and millions of updates every day.
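Such a cron job could be as simple as this crontab entry (paths as used elsewhere in this thread; the schedule is just an example):

```
# vacuum the discovery database once an hour
0 * * * * /usr/local/pgsql/bin/psql -q -c 'VACUUM ANALYZE' discovery
```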

Oleg

18:54[om]:/usr/local/apache/comps/discovery/db>psql discovery -c 'select * from hits'
msg_id|count|first_access |last_access
------+-----+----------------------------+----------------------------
1463|44417|Tue 27 Jul 10:30:18 1999 MSD|Tue 27 Jul 18:44:31 1999 MSD
123|58814|Mon 26 Jul 22:54:54 1999 MSD|Tue 27 Jul 10:29:54 1999 MSD
4| 219|Mon 26 Jul 22:48:48 1999 MSD|Mon 26 Jul 22:49:02 1999 MSD
2| 418|Mon 26 Jul 22:47:28 1999 MSD|Mon 26 Jul 22:48:12 1999 MSD
1| 211|Mon 26 Jul 22:46:44 1999 MSD|Mon 26 Jul 22:47:09 1999 MSD
13| 1|Sat 24 Jul 23:56:57 1999 MSD|
1464| 1|Tue 27 Jul 18:17:51 1999 MSD|
(7 rows)

and after vacuum analyze:

-rw------- 1 postgres users 8192 Jul 27 18:54 hits
-rw------- 1 postgres users 1703936 Jul 27 18:54 hits_pkey

Why is hits_pkey so big? I have only 7 rows in the table.



#5 Andreas Zeugswetter
andreas.zeugswetter@telecom.at
In reply to: Oleg Bartunov (#4)
Re: [HACKERS] UPDATE performance degradation (6.5.1)

This is for table with one row after a lot of updates.
Too much. vacuum analyze this table was a good medicine !
Is this a design problem ?

In PostgreSQL an update always adds a new row to the table.
The old rows are eliminated by vacuum; that is the whole business of vacuum.
There has been some discussion of implementing row reuse,
but that is a major task.
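This non-overwriting behavior can be illustrated with a toy model (Python, purely illustrative; this is not the actual storage code):

```python
# Toy model of PostgreSQL's non-overwriting storage (illustration only):
# UPDATE appends a new tuple version and marks the old one dead;
# VACUUM reclaims the space held by dead tuples.

class ToyTable:
    def __init__(self):
        self.tuples = []                   # list of [value, is_dead]

    def insert(self, value):
        self.tuples.append([value, False])

    def update(self, old_value, new_value):
        for t in self.tuples:
            if not t[1] and t[0] == old_value:
                t[1] = True                # old version becomes a dead tuple
                self.tuples.append([new_value, False])
                return

    def live_rows(self):
        return [t[0] for t in self.tuples if not t[1]]

    def size(self):                        # physical size grows on every update
        return len(self.tuples)

    def vacuum(self):                      # reclaim space of dead tuples
        self.tuples = [t for t in self.tuples if not t[1]]

hits = ToyTable()
hits.insert(0)
for i in range(1000):                      # 1000 updates of the single row
    hits.update(i, i + 1)

print(hits.live_rows())                    # one live row: [1000]
print(hits.size())                         # 1001 tuple versions on "disk"
hits.vacuum()
print(hits.size())                         # back to 1
```

This is exactly the pattern seen above: a one-row table whose file grows with every update until vacuum runs.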

Andreas

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andreas Zeugswetter (#5)
Re: [HACKERS] UPDATE performance degradation (6.5.1)

Oleg Bartunov <oleg@sai.msu.su> writes:

and after vacuum analyze:
-rw------- 1 postgres users 8192 Jul 27 18:54 hits
-rw------- 1 postgres users 1703936 Jul 27 18:54 hits_pkey
Why hits_pkey is so big ? I have only 7 rows in the table.

Looks like vacuum reclaims the extra space in the table itself,
but does not do so with indexes. Ugh.

I've thought for some time that vacuum ought to drop and rebuild
indexes instead of trying to update them. This might be another
reason for doing that...
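In the meantime, the index can be rebuilt by hand after vacuum (a sketch; it assumes hits_pkey is a unique index on msg_id and that the implicit primary-key index can be dropped and recreated this way):

```sql
DROP INDEX hits_pkey;
CREATE UNIQUE INDEX hits_pkey ON hits (msg_id);
```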

regards, tom lane

#7 Oleg Bartunov
oleg@sai.msu.su
In reply to: Andreas Zeugswetter (#5)
Re: [HACKERS] UPDATE performance degradation (6.5.1)

On Tue, 27 Jul 1999, Zeugswetter Andreas IZ5 wrote:


This is for table with one row after a lot of updates.
Too much. vacuum analyze this table was a good medicine !
Is this a design problem ?

In PostgreSQL an update always adds a new row to the table.
The old rows are eliminated by vacuum; that is the whole business of vacuum.
There has been some discussion of implementing row reuse,
but that is a major task.

Ok, now I understand the size of the table. But what about the index file?
Why is it so big? Look, I just did DELETE FROM hits and VACUUM ANALYZE:

om:/usr/local/pgsql/data/base/discovery$ l hits*
-rw------- 1 postgres users 0 Jul 27 19:14 hits
-rw------- 1 postgres users 2015232 Jul 27 19:14 hits_pkey

after 6500 updates:

om:/usr/local/pgsql/data/base/discovery$ l hits*
-rw------- 1 postgres users 344064 Jul 27 19:23 hits
-rw------- 1 postgres users 2097152 Jul 27 19:23 hits_pkey

and it took a lot of time. I also populated the hits table with 10,000
rows and ran the same test. It was incredibly slow.

It seems the index file is not affected by vacuum analyze!
Could we consider this a bug?

Regards,

Oleg



#8 Oleg Bartunov
oleg@sai.msu.su
In reply to: Tom Lane (#6)
Re: [HACKERS] UPDATE performance degradation (6.5.1)

On Tue, 27 Jul 1999, Tom Lane wrote:


Oleg Bartunov <oleg@sai.msu.su> writes:

and after vacuum analyze:
-rw------- 1 postgres users 8192 Jul 27 18:54 hits
-rw------- 1 postgres users 1703936 Jul 27 18:54 hits_pkey
Why hits_pkey is so big ? I have only 7 rows in the table.

Looks like vacuum reclaims the extra space in the table itself,
but does not do so with indexes. Ugh.

And do we consider this a bug? How much would fixing vacuum
improve the poor performance?

I just rebuilt my table without indices and performance increased
a lot. But this is undesirable because it will slow down my application.
I'll try dbm files for logging instead of postgres. What a shame :-)

regards,
Oleg



#9 Michael Robinson
robinson@netrinsics.com
In reply to: Oleg Bartunov (#8)
Re: [HACKERS] UPDATE performance degradation (6.5.1)

Oleg Bartunov <oleg@sai.msu.su> writes:

So, I need a cron job to vacuum the database. I'm curious how mysql works
so fast and has no problem in Web environment. I know some sites with
mysql logging and millions of updates every day.

The mysql FAQ explains this in detail. The short answer is that mysql
has been highly optimized for a small subset of possible RDBMS
applications by eliminating support for many important RDBMS features
(transactions, referential integrity, etc.).

Not only is mysql faster than postgres on, e.g., simple web logging, it
is also much faster than any commercial RDBMS, such as Oracle, Sybase, etc.

In reality, mysql is little more than a flat-file database with an SQL
query interface. But if that's all you need for your application, then
there is no reason not to use it. It's what my hosting service uses, and
I've learned to live with it for simple Web stuff.

-Michael Robinson