Deadlock Problem

Started by Matthias Schmittabout 22 years ago5 messagesgeneral
Jump to latest
#1Matthias Schmitt
freak002@mmp.lu

Hello,

I am in trouble with more and more deadlock problems. We are
programming a web application with multiple users editing content at
the same time. Multiple times a day PostgreSQL runs into a deadlock,
which can only be resolved by killing some of the clients. Here is an
example from the pg_locks table:

SELECT l.mode, l.granted, l.pid, l.transaction, d.datname, r.relname  
FROM pg_locks l, pg_database d, pg_class r WHERE d.oid=l.database AND  
r.oid=l.relation;
        mode       | granted |  pid  | transaction |      datname        
|         relname
------------------+---------+-------+-------------+-------------------- 
+--------------------------
  AccessShareLock  | t       | 12708 |             | p247_website_1_1_0  
| pg_locks
  AccessShareLock  | t       | 12708 |             | p247_website_1_1_0  
| pg_class
  AccessShareLock  | t       | 12714 |             | p247_website_1_1_0  
| upd_dependency_id
  AccessShareLock  | t       | 12714 |             | p247_website_1_1_0  
| preset
  AccessShareLock  | t       | 12726 |             | p247_website_1_1_0  
| preset
  AccessShareLock  | t       | 12714 |             | p247_website_1_1_0  
| file
  AccessShareLock  | t       | 12726 |             | p247_website_1_1_0  
| node_pkey
  AccessShareLock  | t       | 12714 |             | p247_website_1_1_0  
| descr_node
  AccessShareLock  | t       | 12726 |             | p247_website_1_1_0  
| systemuser_usergroup_rel
  AccessShareLock  | t       | 12726 |             | p247_website_1_1_0  
| permgroup_permission_rel
  AccessShareLock  | t       | 12726 |             | p247_website_1_1_0  
| usergroup_permgroup_rel
  AccessShareLock  | t       | 12726 |             | p247_website_1_1_0  
| account_permission_grant
  AccessShareLock  | t       | 12726 |             | p247_website_1_1_0  
| permission
  AccessShareLock  | t       | 12726 |             | p247_website_1_1_0  
| account_permgroup_grant
  AccessShareLock  | t       | 12714 |             | p247_website_1_1_0  
| environment
  AccessShareLock  | t       | 12726 |             | p247_website_1_1_0  
| environment
  AccessShareLock  | t       | 12726 |             | p247_website_1_1_0  
| systemuser
  AccessShareLock  | t       | 12726 |             | p247_website_1_1_0  
| account
  AccessShareLock  | t       | 12714 |             | p247_website_1_1_0  
| account
  AccessShareLock  | t       | 12714 |             | p247_website_1_1_0  
| nodetype
  AccessShareLock  | t       | 12726 |             | p247_website_1_1_0  
| nodetype
  AccessShareLock  | t       | 12714 |             | p247_website_1_1_0  
| upd_templatefile
  RowExclusiveLock | t       | 12726 |             | p247_website_1_1_0  
| upd_template
  AccessShareLock  | t       | 12726 |             | p247_website_1_1_0  
| upd_template
  RowExclusiveLock | t       | 12714 |             | p247_website_1_1_0  
| upd_template
  AccessShareLock  | t       | 12714 |             | p247_website_1_1_0  
| upd_template
  AccessShareLock  | t       | 12714 |             | p247_website_1_1_0  
| upd_pagevalue
  AccessShareLock  | t       | 12714 |             | p247_website_1_1_0  
| descr_upd_page
  RowExclusiveLock | t       | 12714 |             | p247_website_1_1_0  
| upd_page
  AccessShareLock  | t       | 12714 |             | p247_website_1_1_0  
| upd_page
  RowExclusiveLock | t       | 12726 |             | p247_website_1_1_0  
| upd_page
  AccessShareLock  | t       | 12726 |             | p247_website_1_1_0  
| upd_page
  RowExclusiveLock | t       | 12714 |             | p247_website_1_1_0  
| upd_dependency
  AccessShareLock  | t       | 12714 |             | p247_website_1_1_0  
| upd_dependency
  RowExclusiveLock | t       | 12726 |             | p247_website_1_1_0  
| node
  AccessShareLock  | t       | 12726 |             | p247_website_1_1_0  
| node
  RowExclusiveLock | t       | 12714 |             | p247_website_1_1_0  
| node
  RowShareLock     | t       | 12714 |             | p247_website_1_1_0  
| node
  AccessShareLock  | t       | 12714 |             | p247_website_1_1_0  
| node
  AccessShareLock  | t       | 12714 |             | p247_website_1_1_0  
| descr_node_en
(40 rows)

I expected deadlocks to be reported by PostgreSQL after the configured
timeout with an error message, but this one is just hanging up all
clients.

We are running PostgreSQL 7.3.6. Client software is written in Perl
using the DBI interface.

I would really appreciate any comments where to search for the problem.

Matthias Schmitt

magic moving pixel s.a. Phone: +352 54 75 75 - 0
Technoport Schlassgoart Fax : +352 54 75 75 - 54
66, rue de Luxembourg URL : http://www.mmp.lu
L-4221 Esch-sur-Alzette

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthias Schmitt (#1)
Re: Deadlock Problem

Matthias Schmitt <freak002@mmp.lu> writes:

I am in trouble with more and more deadlock problems. We are
programming a web application with multiple users editing content at
the same time. Multiple times a day PostgreSQL runs into a deadlock,
which can only be resolved by killing some of the clients. Here is an
example from the pg_locks table:

All of the rows you showed us have granted=t. No blockage is evident,
let alone any deadlock.

regards, tom lane

#3Matthias Schmitt
freak002@mmp.lu
In reply to: Tom Lane (#2)
Re: Deadlock Problem

Matthias Schmitt <freak002@mmp.lu> writes:

I am in trouble with more and more deadlock problems. We are
programming a web application with multiple users editing content at
the same time. Multiple times a day PostgreSQL runs into a deadlock,
which can only be resolved by killing some of the clients. Here is an
example from the pg_locks table:

All of the rows you showed us have granted=t. No blockage is evident,
let alone any deadlock.

Hello,

we tried to reduce the possible error sources. So we logged the last
statements sent to the database and were able to reproduce our problem
with psql alone. I did the following in two psql shell environments:

shell no 1:

CREATE TABLE the_test (
id int4 PRIMARY KEY,
name varchar(32)
);

insert into the_test values (1, 'hello world');

begin;
update the_test set name = 'still alive' where id = 1;

To keep the transaction open I did not issue any commit or rollback
command.

shell no 2:

begin;
update the_test set name = 'still alive' where id = 1;

The second shell hangs now forever. The pg_locks table shows:

select * from pg_locks;
relation | database | transaction | pid | mode | granted
----------+----------+-------------+-------+------------------+---------
1980976 | 1980969 | | 16034 | AccessShareLock | t
16757 | 1 | | 16100 | AccessShareLock | t
| | 762472 | 16036 | ExclusiveLock | t
| | 762473 | 16034 | ExclusiveLock | t
| | 762472 | 16034 | ShareLock | f
1980974 | 1980969 | | 16036 | AccessShareLock | t
1980974 | 1980969 | | 16036 | RowExclusiveLock | t
| | 762478 | 16100 | ExclusiveLock | t
1980974 | 1980969 | | 16034 | AccessShareLock | t
1980974 | 1980969 | | 16034 | RowExclusiveLock | t
(10 rows)

In our applications it is possible that multiple records of different
tables are updated in different sequences, depending on the task to
fulfill. Shouldn't a time-out error resolve those problems?

Thank you.

Matthias Schmitt

magic moving pixel s.a. Phone: +352 54 75 75 - 0
Technoport Schlassgoart Fax : +352 54 75 75 - 54
66, rue de Luxembourg URL : http://www.mmp.lu
L-4221 Esch-sur-Alzette

#4Andrew Sullivan
ajs@crankycanuck.ca
In reply to: Matthias Schmitt (#3)
Re: Deadlock Problem

On Tue, Mar 16, 2004 at 09:50:43AM +0100, Matthias Schmitt wrote:

begin;
update the_test set name = 'still alive' where id = 1;

To keep the transaction open I did not issue any commit or rollback
command.

shell no 2:

begin;
update the_test set name = 'still alive' where id = 1;

The second shell hangs now forever. The pg_locks table shows:

First, you haven't tried to COMMIT anywhere. Nothing will happen in
that case. For all you know, one of these is going to ROLLBACK and
resolve the lock on its own. The second one you issue is just going
to sit there, sure.

You actually haven't describe a deadlock here. This is just a
straightforward wait-and-see lock for transaction 2: that update
statement will indeed wait forever until it sees what 1 has done.
You'd only get a deadlock in case transaction 2 first did something
that T1 _next_ had to depend on. You really need two conflicting
locks for a deadlock to happen.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
The fact that technology doesn't work is no bar to success in the marketplace.
--Philip Greenspun

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthias Schmitt (#3)
Re: Deadlock Problem

Matthias Schmitt <freak002@mmp.lu> writes:

I did the following in two psql shell environments:

shell no 1:

CREATE TABLE the_test (
id int4 PRIMARY KEY,
name varchar(32)
);

insert into the_test values (1, 'hello world');

begin;
update the_test set name = 'still alive' where id = 1;

To keep the transaction open I did not issue any commit or rollback
command.

shell no 2:

begin;
update the_test set name = 'still alive' where id = 1;

The second shell hangs now forever.

Well, of course. It has to wait to see if the previous update of the
row commits or not, so that it knows which version of the row to start
from. (In this trivial case it doesn't really matter, but in more
complex cases such as where different fields are being updated, it
does.)

This is *not* a deadlock, however, as transaction 1 is free to make
progress. The fact that you've got a client holding an open transaction
and not doing anything is a client-side design error, not a deadlock.

Shouldn't a time-out error resolve those problems?

Sure, and it works fine:

regression=# set statement_timeout TO 10000;
SET
regression=# begin;
BEGIN
regression=# update the_test set name = 'still alive' where id = 1;
-- about ten seconds elapse, then:
ERROR: canceling query due to user request
regression=#

regards, tom lane