update functions locking tables

Started by Clodoaldo Pintoover 20 years ago15 messagesgeneral
Jump to latest
#1Clodoaldo Pinto
clodoaldo.pinto@gmail.com

I have almost completed one web site migration from mysql to pgsql. It
is already running totally pgsql.

But there is one last conversion problem. Most of the queries use
tables populated every 3 hours.

In mysql, for the site to not be unavailable during updating i was
creating temporary tables, populating them, dropping the main tables
and then renaming the temp tables to the main tables. The updating is
not trivial (for me) and needs some coding effort to be done.

Since pgsql has MVCC I wanted to eliminate the table rotation step and
use a transaction to update the tables. But what is happening is that
the plpgsql update functions are locking the tables and this is what
the web clients are getting (from ps ax):

26873 ? S 0:18 postgres: user database 127.0.0.1(38946) SELECT waiting
23973 ? S 0:22 postgres: user database 127.0.0.1(43396) SELECT waiting
31212 ? S 0:02 postgres: user database 127.0.0.1(60619) SELECT waiting
21601 ? S 0:00 postgres: user database 127.0.0.1(46770) SELECT waiting

I've been trying for 3 days to figure out what is happening to no
avail. What am i missing about transactions and MVCC? What could make
a plpgsql update function lock a table? The indexes are default btree.
Otherwise the functions are behaving exactly as expected.

Regards, Clodoaldo Pinto

#2Michael Fuhr
mike@fuhr.org
In reply to: Clodoaldo Pinto (#1)
Re: update functions locking tables

On Mon, Aug 29, 2005 at 09:41:21PM -0300, Clodoaldo Pinto wrote:

26873 ? S 0:18 postgres: user database 127.0.0.1(38946) SELECT waiting
23973 ? S 0:22 postgres: user database 127.0.0.1(43396) SELECT waiting
31212 ? S 0:02 postgres: user database 127.0.0.1(60619) SELECT waiting
21601 ? S 0:00 postgres: user database 127.0.0.1(46770) SELECT waiting

I've been trying for 3 days to figure out what is happening to no
avail. What am i missing about transactions and MVCC? What could make
a plpgsql update function lock a table? The indexes are default btree.

In general, writers shouldn't block readers. Have you examined
pg_locks? Do you know exactly what the blocked queries are, or can
you find out from pg_stat_activity (stats_command_string must be
enabled)? Are you doing any explicit locking (LOCK statement)?

--
Michael Fuhr

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Clodoaldo Pinto (#1)
Re: update functions locking tables

Clodoaldo Pinto <clodoaldo.pinto@gmail.com> writes:

I've been trying for 3 days to figure out what is happening to no
avail. What am i missing about transactions and MVCC? What could make
a plpgsql update function lock a table?

What is the function doing to the table, exactly? DDL changes generally
take exclusive locks ...

regards, tom lane

#4Clodoaldo Pinto
clodoaldo.pinto@gmail.com
In reply to: Michael Fuhr (#2)
Re: update functions locking tables

2005/8/29, Michael Fuhr <mike@fuhr.org>:

In general, writers shouldn't block readers. Have you examined
pg_locks? Do you know exactly what the blocked queries are, or can
you find out from pg_stat_activity (stats_command_string must be
enabled)? Are you doing any explicit locking (LOCK statement)?

This is one of the blocked queries:
select count (*) from times_producao where pontos_0 - pontos_7 > 0;

These selects were done during the updating:

select * from pg_locks;
relation | database | transaction | pid | mode | granted
----------+----------+-------------+-------+---------------------+---------
1813938 | 1813868 | | 7040 | AccessShareLock | t
1813938 | 1813868 | | 7040 | RowExclusiveLock | t
1813938 | 1813868 | | 7040 | ShareLock | t
1813938 | 1813868 | | 7040 | AccessExclusiveLock | t
1813939 | 1813868 | | 7040 | AccessShareLock | t
1813939 | 1813868 | | 7040 | RowExclusiveLock | t
1813939 | 1813868 | | 7040 | ShareLock | t
1813939 | 1813868 | | 7040 | AccessExclusiveLock | t
1813914 | 1813868 | | 24012 | AccessShareLock | f
1813892 | 1813868 | | 7040 | AccessShareLock | t
1813892 | 1813868 | | 7040 | RowExclusiveLock | t
1813914 | 1813868 | | 7040 | AccessShareLock | t
1813914 | 1813868 | | 7040 | RowExclusiveLock | t
1813914 | 1813868 | | 7040 | ShareLock | t
1813914 | 1813868 | | 7040 | AccessExclusiveLock | t
1813896 | 1813868 | | 7040 | AccessShareLock | t
16839 | 1813868 | | 12751 | AccessShareLock | t
2314110 | 1813868 | | 26871 | AccessShareLock | f
1813914 | 1813868 | | 26844 | AccessShareLock | f
| | 288553 | 26844 | ExclusiveLock | t
| | 288561 | 24012 | ExclusiveLock | t
| | 288548 | 7040 | ExclusiveLock | t
| | 288558 | 26871 | ExclusiveLock | t
1813914 | 1813868 | | 31212 | AccessShareLock | f
2314110 | 1813868 | | 7040 | AccessShareLock | t
2314110 | 1813868 | | 7040 | RowExclusiveLock | t
2314110 | 1813868 | | 7040 | ShareLock | t
2314110 | 1813868 | | 7040 | AccessExclusiveLock | t
| | 288556 | 31212 | ExclusiveLock | t
| | 288562 | 12751 | ExclusiveLock | t
1813887 | 1813868 | | 7040 | AccessShareLock | t
2314112 | 1813868 | | 7040 | ShareLock | t
2314112 | 1813868 | | 7040 | AccessExclusiveLock | t
1813907 | 1813868 | | 7040 | AccessShareLock | t
1813911 | 1813868 | | 7040 | AccessShareLock | t
(35 rows)

select *
from pg_stat_user_tables as a
inner join pg_locks as b
on a.relid = b.relation
;

relid | schemaname | relname | seq_scan | seq_tup_read |
idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del |
relation | database | transaction | pid | mode |
granted
---------+------------+-------------------+----------+--------------+----------+---------------+-----------+-----------+-----------+----------+----------+-------------+-------+---------------------+---------
1813914 | public | usuarios_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 1813914
| 1813868 | | 24012 | AccessShareLock | f
1813892 | public | last_date | 0 | 0 |
| | 0 | 0 | 0 | 1813892
| 1813868 | | 7040 | AccessShareLock | t
1813892 | public | last_date | 0 | 0 |
| | 0 | 0 | 0 | 1813892
| 1813868 | | 7040 | RowExclusiveLock | t
1813914 | public | usuarios_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 1813914
| 1813868 | | 7040 | AccessShareLock | t
1813914 | public | usuarios_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 1813914
| 1813868 | | 7040 | RowExclusiveLock | t
1813914 | public | usuarios_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 1813914
| 1813868 | | 7040 | ShareLock | t
1813914 | public | usuarios_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 1813914
| 1813868 | | 7040 | AccessExclusiveLock | t
1813896 | public | times | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 1813896
| 1813868 | | 7040 | AccessShareLock | t
2314110 | public | times_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 2314110
| 1813868 | | 31383 | AccessShareLock | f
1813914 | public | usuarios_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 1813914
| 1813868 | | 12351 | AccessShareLock | f
2314110 | public | times_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 2314110
| 1813868 | | 26871 | AccessShareLock | f
1813914 | public | usuarios_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 1813914
| 1813868 | | 26844 | AccessShareLock | f
2314110 | public | times_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 2314110
| 1813868 | | 24021 | AccessShareLock | f
1813914 | public | usuarios_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 1813914
| 1813868 | | 31212 | AccessShareLock | f
2314110 | public | times_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 2314110
| 1813868 | | 7040 | AccessShareLock | t
2314110 | public | times_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 2314110
| 1813868 | | 7040 | RowExclusiveLock | t
2314110 | public | times_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 2314110
| 1813868 | | 7040 | ShareLock | t
2314110 | public | times_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 2314110
| 1813868 | | 7040 | AccessExclusiveLock | t
1813914 | public | usuarios_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 1813914
| 1813868 | | 26872 | AccessShareLock | f
1813887 | public | datas | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 1813887
| 1813868 | | 7040 | AccessShareLock | t
1813914 | public | usuarios_producao | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 1813914
| 1813868 | | 8875 | AccessShareLock | f
1813907 | public | usuarios | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 1813907
| 1813868 | | 7040 | AccessShareLock | t
1813911 | public | usuarios_indice | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 1813911
| 1813868 | | 7040 | AccessShareLock | t
(23 rows)

I had just enabled stats_command_string and in about 15 hours i will
be able to post pg_stat_activity.

Regards, Clodoaldo Pinto

#5Clodoaldo Pinto
clodoaldo.pinto@gmail.com
In reply to: Tom Lane (#3)
Re: update functions locking tables

2005/8/29, Tom Lane <tgl@sss.pgh.pa.us>:

What is the function doing to the table, exactly? DDL changes generally
take exclusive locks ...

This is the transaction:

begin;
select update_last_date();
truncate times_producao;
select kstime(), insert_times_producao(), kstime();
select kstime(), update_ranking_times(), kstime();
truncate usuarios_producao;
select kstime(), insert_usuarios_producao(), kstime();
analyze usuarios_producao;
select kstime(), update_ranking_usuarios(), kstime();
select kstime(), update_ranking_usuarios_time(), kstime();
select kstime(), update_team_active_members(), kstime();
commit;

This is one of the functions:

CREATE OR REPLACE FUNCTION update_ranking_usuarios()
RETURNS void AS
$BODY$declare
linha record;
rank integer;
begin
rank := 0;
for linha in
select usuario
from usuarios_producao
order by pontos_0 desc, pontos_7 desc, pontos_24 desc
loop
rank := rank + 1;
update usuarios_producao
set rank_0 = rank
where usuario = linha.usuario
;
end loop;
-- ----------------------------------------------------------
rank := 0;
for linha in
select usuario
from usuarios_producao
order by pontos_0 + (pontos_7 / 7) desc, pontos_0 desc
loop
rank := rank + 1;
update usuarios_producao
set rank_24 = rank
where usuario = linha.usuario
;
end loop;
-- ----------------------------------------------------------
rank := 0;
for linha in
select usuario
from usuarios_producao
order by pontos_0 + pontos_7 desc, pontos_0 desc
loop
rank := rank + 1;
update usuarios_producao
set rank_7 = rank
where usuario = linha.usuario
;
end loop;
-- ----------------------------------------------------------
rank := 0;
for linha in
select usuario
from usuarios_producao
order by pontos_0 + (pontos_7 * 30 / 7) desc, pontos_0 desc
loop
rank := rank + 1;
update usuarios_producao
set rank_30 = rank
where usuario = linha.usuario
;
end loop;
return;
end;$BODY$
LANGUAGE 'plpgsql' VOLATILE STRICT;

There is no DDL inside the functions.

Regards, Clodoaldo Pinto

#6Michael Fuhr
mike@fuhr.org
In reply to: Clodoaldo Pinto (#5)
Re: update functions locking tables

On Tue, Aug 30, 2005 at 08:39:52AM -0300, Clodoaldo Pinto wrote:

begin;
select update_last_date();
truncate times_producao;

TRUNCATE acquires an AccessExclusiveLock, which conflicts with all
other lock types. Locks are held until the transaction completes,
so once this lock is acquired no other transactions will be able
to access the table until this transaction commits or rolls back.

DELETE is slower than TRUNCATE but it won't block readers in other
transactions.

--
Michael Fuhr

#7Clodoaldo Pinto
clodoaldo.pinto@gmail.com
In reply to: Michael Fuhr (#6)
Re: update functions locking tables

2005/8/30, Michael Fuhr <mike@fuhr.org>:

TRUNCATE acquires an AccessExclusiveLock, which conflicts with all
other lock types. Locks are held until the transaction completes,
so once this lock is acquired no other transactions will be able
to access the table until this transaction commits or rolls back.

DELETE is slower than TRUNCATE but it won't block readers in other
transactions.

I think it is of great help. I will change it and let you know what happened.

Regards, Clodoaldo Pinto

#8Bruce Momjian
bruce@momjian.us
In reply to: Clodoaldo Pinto (#5)
Re: update functions locking tables

Clodoaldo Pinto <clodoaldo.pinto@gmail.com> writes:

2005/8/29, Tom Lane <tgl@sss.pgh.pa.us>:

What is the function doing to the table, exactly? DDL changes generally
take exclusive locks ...

This is the transaction:

begin;
select update_last_date();
truncate times_producao;

I think truncate takes a table lock.
Just change it to "delete from times_producao".

Also, if consider doing a "vacuum full" or "cluster" after the batch job to
clear up the free space (not in a large transaction). That will still take a
table lock but it may be a small enough downtime to be worth the speed
increase the rest of the day.

--
greg

#9Clodoaldo Pinto
clodoaldo.pinto@gmail.com
In reply to: Bruce Momjian (#8)
Re: update functions locking tables

30 Aug 2005 09:10:51 -0400, Greg Stark <gsstark@mit.edu>:

I think truncate takes a table lock.
Just change it to "delete from times_producao".

Thanks, i will try it.

Also, if consider doing a "vacuum full" or "cluster" after the batch job to
clear up the free space (not in a large transaction). That will still take a
table lock but it may be a small enough downtime to be worth the speed
increase the rest of the day.

I'm already doing a vacuum (not full) once a day.

A vacuum full or a cluster is totally out of reach since each take
about one hour. The biggest table is 170 million rows long.

Regards, Clodoaldo Pinto

#10Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Clodoaldo Pinto (#9)
Re: update functions locking tables

On Tue, Aug 30, 2005 at 10:18:20AM -0300, Clodoaldo Pinto wrote:

30 Aug 2005 09:10:51 -0400, Greg Stark <gsstark@mit.edu>:

Also, if consider doing a "vacuum full" or "cluster" after the batch job to
clear up the free space (not in a large transaction). That will still take a
table lock but it may be a small enough downtime to be worth the speed
increase the rest of the day.

I'm already doing a vacuum (not full) once a day.

A vacuum full or a cluster is totally out of reach since each take
about one hour.

Even if you cluster/vacuum only the just-loaded table?

The biggest table is 170 million rows long.

I hope this is not the one you are loading daily ...

--
Alvaro Herrera <alvherre[]alvh.no-ip.org> Architect, www.EnterpriseDB.com
"El destino baraja y nosotros jugamos" (A. Schopenhauer)

#11Bruce Momjian
bruce@momjian.us
In reply to: Clodoaldo Pinto (#9)
Re: update functions locking tables

Clodoaldo Pinto <clodoaldo.pinto@gmail.com> writes:

I'm already doing a vacuum (not full) once a day.

A vacuum full or a cluster is totally out of reach since each take
about one hour. The biggest table is 170 million rows long.

Well a regular vacuum will mark the free space for reuse. If you insert or
update any records the new ones will go into those spots. Make sure you set
the fsm_* parameters high enough to cover all the updates and inserts for the
entire day (or repeat the vacuum periodically even if there are no deletes or
updates going on to create more free space).

You should realize that what's going on here is that the old records are still
in your table, marked as deleted. So any sequential scan will take twice as
long as otherwise. I think even index scans could take twice as long too
depending on the distribution of values.

I'm not saying that's untenable. If all your queries are fast enough then
you're set and it's just a cost of having no downtime.

--
greg

#12Clodoaldo Pinto
clodoaldo.pinto@gmail.com
In reply to: Alvaro Herrera (#10)
Re: update functions locking tables

2005/8/30, Alvaro Herrera <alvherre@alvh.no-ip.org>:

On Tue, Aug 30, 2005 at 10:18:20AM -0300, Clodoaldo Pinto wrote:

A vacuum full or a cluster is totally out of reach since each take
about one hour.

Even if you cluster/vacuum only the just-loaded table?

No, that would be much faster. The biggest just updated is about 600
thousand rows. I will consider it.

The biggest table is 170 million rows long.

I hope this is not the one you are loading daily ...

I load daily 8 times 700+ thousand rows.

#13Clodoaldo Pinto
clodoaldo.pinto@gmail.com
In reply to: Bruce Momjian (#11)
Re: update functions locking tables

30 Aug 2005 10:35:31 -0400, Greg Stark <gsstark@mit.edu>:

Well a regular vacuum will mark the free space for reuse. If you insert or
update any records the new ones will go into those spots. Make sure you set
the fsm_* parameters high enough to cover all the updates and inserts for the
entire day (or repeat the vacuum periodically even if there are no deletes or
updates going on to create more free space).

I will check those fsm_* parameters.

Show quoted text

You should realize that what's going on here is that the old records are still
in your table, marked as deleted. So any sequential scan will take twice as
long as otherwise. I think even index scans could take twice as long too
depending on the distribution of values.

I'm not saying that's untenable. If all your queries are fast enough then
you're set and it's just a cost of having no downtime.

--
greg

#14Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Clodoaldo Pinto (#4)
Re: update functions locking tables

On Tue, Aug 30, 2005 at 08:13:15AM -0300, Clodoaldo Pinto wrote:

2005/8/29, Michael Fuhr <mike@fuhr.org>:

In general, writers shouldn't block readers. Have you examined
pg_locks? Do you know exactly what the blocked queries are, or can
you find out from pg_stat_activity (stats_command_string must be
enabled)? Are you doing any explicit locking (LOCK statement)?

This is one of the blocked queries:
select count (*) from times_producao where pontos_0 - pontos_7 > 0;

FWIW, that where clause might be more efficient as
WHERE pontos_0 > pontos_7. Some databases would be able to use indexes
to answer that (not sure if PostgreSQL could), plus it removes an
operator. It also seems to be cleaner code to me. :)
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com 512-569-9461

#15Clodoaldo Pinto
clodoaldo.pinto@gmail.com
In reply to: Jim Nasby (#14)
Re: update functions locking tables

2005/8/30, Jim C. Nasby <jnasby@pervasive.com>:

FWIW, that where clause might be more efficient as
WHERE pontos_0 > pontos_7. Some databases would be able to use indexes
to answer that (not sure if PostgreSQL could), plus it removes an
operator. It also seems to be cleaner code to me. :)
--

Done, thanks.

Regards, Clodoaldo Pinto