High activity short table and locks

Started by Guillaume Bogover 17 years ago6 messagesgeneral
Jump to latest
#1Guillaume Bog
guibog@gmail.com

Hi!

We have been using postgresql since a while without problems. But now I find
we experience some slowness and the weird thing is that it seems to happen
because of a very short table (less than 200 lines), called "lockers" (see
below it's structure).

This table is accessed very often by a lot of different sessions, with
SELECT, UPDATE and DELETE statements. We keep in this table some shared
state about our application users, so we make sure they don't work together
on the same things, thus the name.

Maybe we do something really wrong. Maybe we even should not use a database
for this kind of persistence (we tried a simple file but it was much worse).
We have improved the code by merging many little SELECTs into one bigger
with all the results needed. This ridiculously short table should hold in
cache memory, I'd suppose, so why is it so slow?

Today I found something that could help me to find an answer: by running the
"locks" sql below I have above 100 lines of results, with many locks
detected.

I have added the indexes sometime ago because, to my great surprise, it did
really improve the speed. I just tried to remove them on the fly and it was
worse, *but* I had less locks.

It seems I'm a bit stuck here. I'd appreciate some help. My main general
question is "how to handle very small but hot status table that has to be
updated every 30 seconds by 100 different persons, read and updated from
many sides, and also joined with some more common tables (i.e. much larger
but less hot)"

Thanks for your help. Please find below some tech info.

Table structure

Column | Type |
Modifiers
-----------------------+--------------------------+--------------------------------------------------------
l_id | integer | not null default
nextval('lockers_l_id_seq'::regclass)
l_xref_u_id | integer |
l_type | character varying |
l_what | character varying |
l_status | character varying |
l_tech_modification | timestamp with time zone | not null default now()
l_tech_creation | timestamp with time zone | not null default now()
l_tech_deleted | boolean | not null default false
l_status_modification | timestamp with time zone |
l_comment | character varying |
l_csl | character varying |
Indexes:
"lockers_pkey" PRIMARY KEY, btree (l_id)
"lockers_l_csl_idx" btree (l_csl)
"lockers_l_type_idx" btree (l_type)
"lockers_l_what_idx" btree (l_what)
"lockers_l_xref_u_id_idx" btree (l_xref_u_id)

Lockers SQL

SELECT pg_stat_activity.datname,pg_class.relname,pg_locks.transaction,
pg_locks.mode, pg_locks.granted,
pg_stat_activity.usename, substr(pg_stat_activity.current_query,1,20),
to_char(pg_stat_activity.query_start,'HH24:MI'),
to_char(age(now(),pg_stat_activity.query_start),'HH24:MI') AS "age",
pg_stat_activity.procpid
FROM pg_stat_activity,pg_locks LEFT OUTER JOIN pg_class ON
(pg_locks.relation = pg_class.oid)
WHERE pg_locks.pid=pg_stat_activity.procpid
ORDER BY query_start;

Sample result of lockers SQL

vf_cn2fr | lockers | 468474452 | RowExclusiveLock
| t | vf_cn2fr | UPDATE lockers SET l | 15:50 | 00:00 | 30395
vf_cn2fr | lockers | 468474458 | AccessShareLock
| t | vf_cn2fr | SELECT * FROM locker | 15:50 | 00:00 | 30414
vf_cn2fr | lockers_l_csl_idx | 468474458 | AccessShareLock
| t | vf_cn2fr | SELECT * FROM locker | 15:50 | 00:00 | 30414
vf_cn2fr | lockers_l_what_idx | 468474458 | AccessShareLock
| t | vf_cn2fr | SELECT * FROM locker | 15:50 | 00:00 | 30414
vf_cn2fr | lockers_l_type_idx | 468474458 | AccessShareLock
| t | vf_cn2fr | SELECT * FROM locker | 15:50 | 00:00 | 30414
vf_cn2fr | lockers_l_xref_u_id_idx | 468474458 | AccessShareLock
| t | vf_cn2fr | SELECT * FROM locker | 15:50 | 00:00 | 30414
vf_cn2fr | lockers_pkey | 468474458 | AccessShareLock
| t | vf_cn2fr | SELECT * FROM locker | 15:50 | 00:00 | 30414
vf_cn2fr | | 468474458 | ExclusiveLock
| t | vf_cn2fr | SELECT * FROM locker | 15:50 | 00:00 | 30414
vf_cn2fr | pg_locks | 468474459 | AccessShareLock
| t | postgres | SELECT pg_stat_activ | 15:50 | 00:00 | 18008
vf_cn2fr | pg_database | 468474459 | AccessShareLock
| t | postgres | SELECT pg_stat_activ | 15:50 | 00:00 | 18008
vf_cn2fr | pg_authid_oid_index | 468474459 | AccessShareLock
| t | postgres | SELECT pg_stat_activ | 15:50 | 00:00 | 18008
vf_cn2fr | pg_class | 468474459 | AccessShareLock
| t | postgres | SELECT pg_stat_activ | 15:50 | 00:00 | 18008
vf_cn2fr | | 468474459 | ExclusiveLock
| t | postgres | SELECT pg_stat_activ | 15:50 | 00:00 | 18008
vf_cn2fr | pg_class_relname_nsp_index | 468474459 | AccessShareLock
| t | postgres | SELECT pg_stat_activ | 15:50 | 00:00 | 18008
vf_cn2fr | pg_class_oid_index | 468474459 | AccessShareLock
| t | postgres | SELECT pg_stat_activ | 15:50 | 00:00 | 18008
vf_cn2fr | pg_authid | 468474459 | AccessShareLock
| t | postgres | SELECT pg_stat_activ | 15:50 | 00:00 | 18008
vf_cn2fr | pg_authid_rolname_index | 468474459 | AccessShareLock
| t | postgres | SELECT pg_stat_activ | 15:50 | 00:00 | 18008
vf_cn2fr | pg_stat_activity | 468474459 | AccessShareLock
| t | postgres | SELECT pg_stat_activ | 15:50 | 00:00 | 18008
vf_cn2fr | pg_database_datname_index | 468474459 | AccessShareLock
| t | postgres | SELECT pg_stat_activ | 15:50 | 00:00 | 18008
vf_cn2fr | pg_database_oid_index | 468474459 | AccessShareLock
| t | postgres | SELECT pg_stat_activ | 15:50 | 00:00 | 18008

#2Richard Huxton
dev@archonet.com
In reply to: Guillaume Bog (#1)
Re: High activity short table and locks

Guillaume Bog wrote:

It seems I'm a bit stuck here. I'd appreciate some help. My main general
question is "how to handle very small but hot status table that has to be
updated every 30 seconds by 100 different persons, read and updated from
many sides, and also joined with some more common tables (i.e. much larger
but less hot)"

Remove all indexes except the one backing the primary-key. Run a VACUUM
FULL and REINDEX or CLUSTER the table. Vacuum *very frequently* -
you'll want custom values in pg_autovacuum. Add indexes back one at a
time to see what's really necessary. If you can keep the dead rows to a
reasonable level, I'd have thought you could get by without indexes.

You might want to consider setting synchronous_commit=off for updates to
the table. I'm assuming the information in the table isn't vital in the
event of a system crash, and that could reduce WAL activity if you're
limited by disk bandwidth.

--
Richard Huxton
Archonet Ltd

#3Guillaume Bog
guibog@gmail.com
In reply to: Richard Huxton (#2)
Re: High activity short table and locks

On Wed, Jul 23, 2008 at 4:50 PM, Richard Huxton <dev@archonet.com> wrote:

Guillaume Bog wrote:

It seems I'm a bit stuck here. I'd appreciate some help. My main general
question is "how to handle very small but hot status table that has to be
updated every 30 seconds by 100 different persons, read and updated from
many sides, and also joined with some more common tables (i.e. much larger
but less hot)"

Remove all indexes except the one backing the primary-key. Run a VACUUM
FULL and REINDEX or CLUSTER the table. Vacuum *very frequently* - you'll
want custom values in pg_autovacuum. Add indexes back one at a time to see
what's really necessary. If you can keep the dead rows to a reasonable
level, I'd have thought you could get by without indexes.

Yes, such a small table, very frequently updated, would suggest no index at
all. I understand that I may have many dead rows, that would explain that
fact that indexes do speed up the selects. We have a daily VACUUM FULL on
the database, but it may be not enough. I'll check tomorrow if a verboze
vacuum tell me that many rows are dead. The problem is that I need to work
directly on production server, as everything goes very well when there is
not enough people actually working.

You might want to consider setting synchronous_commit=off for updates to
the table. I'm assuming the information in the table isn't vital in the
event of a system crash, and that could reduce WAL activity if you're
limited by disk bandwidth.

Yes, this table's data is very short lived and can be lost without problem
in case of a crash. I could even have no WAL at all for this table if it is
possible. In my mind, this data should be stored and modified in a fixed
chunk of RAM and never go to hard-drive.

Thanks for your help. I'll try your suggestions tomorrow.

#4Richard Huxton
dev@archonet.com
In reply to: Guillaume Bog (#1)
Re: High activity short table and locks

Guillaume Bog wrote:

On Wed, Jul 23, 2008 at 11:17 PM, Richard Huxton <dev@archonet.com> wrote:

I tried a vacuum full and had to stop it as it was blocking the server for
too long. Below is the partial results I got. It seems you are right:
enormous amount of dead space and rows. I did the same vacuum later and it
seems to have improved a lot the performance. I need to check again
tomorrow.

We don't have autovacuum, but as it seems autovacuum cannot target a
specific table, I may prefer do it by cron every minute, as you suggest.

There's a pg_autovacuum system table that lets you tune things
table-by-table. See the manual for details. In your case, a manual
vacuum once a minute will be a huge step forward anyway.

vf_cn2fr=# VACUUM FULL VERBOSE lockers ;
INFO: vacuuming "public.lockers"
INFO: "lockers": found 4228421 removable, 107 nonremovable row versions in
64803 pages

Well, that table is at least 1000 times larger than it needs to be.

If you've run a VACUUM FULL, you'll want to run REINDEX on all the
indexes on that table too.

64803 pages containing 512643700 free bytes are potential move destinations.

Ouch! that's a 500MB table holding 100 live rows.

You could fiddle around setting up ramdisks and pointing tablespaces there,
but I'm not sure it's worth it.

If it is possible to have no WAL at all on this table, I'd prefer to try it.
It seems completely useless and is probably taking a fair amount of i/o.

It's a bit early to be sure if the solution is there, but I feel you already
did throw some good light on my dark path, I have to thank you for that.

Afraid not. The synchronous_commit setting can reduce the disk I/O though.

--
Richard Huxton
Archonet Ltd

#5Guillaume Bog
guibog@gmail.com
In reply to: Richard Huxton (#4)
Re: High activity short table and locks

Hello,

My first impression is that vacuuming the offending table very often helps a
lot. I'm doing it by hand for now but I will have a cronjob for this. By the
way, it seems I don't need thoses indexes anymore. Thanks a lot for your
helpful advices.

On Thu, Jul 24, 2008 at 6:19 PM, Richard Huxton <dev@archonet.com> wrote:

Show quoted text

Guillaume Bog wrote:

On Wed, Jul 23, 2008 at 11:17 PM, Richard Huxton <dev@archonet.com>
wrote:

I tried a vacuum full and had to stop it as it was blocking the server for
too long. Below is the partial results I got. It seems you are right:
enormous amount of dead space and rows. I did the same vacuum later and it
seems to have improved a lot the performance. I need to check again
tomorrow.

We don't have autovacuum, but as it seems autovacuum cannot target a
specific table, I may prefer do it by cron every minute, as you suggest.

There's a pg_autovacuum system table that lets you tune things
table-by-table. See the manual for details. In your case, a manual vacuum
once a minute will be a huge step forward anyway.

vf_cn2fr=# VACUUM FULL VERBOSE lockers ;

INFO: vacuuming "public.lockers"
INFO: "lockers": found 4228421 removable, 107 nonremovable row versions
in
64803 pages

Well, that table is at least 1000 times larger than it needs to be.

If you've run a VACUUM FULL, you'll want to run REINDEX on all the indexes
on that table too.

64803 pages containing 512643700 free bytes are potential move

destinations.

Ouch! that's a 500MB table holding 100 live rows.

You could fiddle around setting up ramdisks and pointing tablespaces

there,
but I'm not sure it's worth it.

If it is possible to have no WAL at all on this table, I'd prefer to try
it.
It seems completely useless and is probably taking a fair amount of i/o.

It's a bit early to be sure if the solution is there, but I feel you
already
did throw some good light on my dark path, I have to thank you for that.

Afraid not. The synchronous_commit setting can reduce the disk I/O though.

--
Richard Huxton
Archonet Ltd

#6Richard Huxton
dev@archonet.com
In reply to: Guillaume Bog (#5)
Re: High activity short table and locks

Guillaume Bog wrote:

Hello,

My first impression is that vacuuming the offending table very often helps a
lot. I'm doing it by hand for now but I will have a cronjob for this. By the
way, it seems I don't need thoses indexes anymore. Thanks a lot for your
helpful advices.

Excellent! Oh, when you set up your cron-job, it's probably better to
vacuum this sort of table too often rather than not enough.

--
Richard Huxton
Archonet Ltd