Deadlock? idle in transaction
A customer's machine hangs from time to time. All we could find so far is
that postgres seems to be in state "idle in transaction":
postgres 19317 0.0 0.3 8168 392 ? S Oct05 0:00 /usr/lib/postgresql/bin/postmaster -D /var/lib/postgres/data
postgres 19983 0.0 0.8 8932 1020 ? S Oct05 0:01 postgres: postgres rabatt 192.168.50.222 idle in transaction
postgres 21005 0.0 0.0 3484 4 ? S Oct06 0:00 /usr/lib/postgresql/bin/psql -t -q -d template1
postgres 21014 0.0 0.7 8892 952 ? S Oct06 0:01 postgres: postgres rabatt [local] VACUUM waiting
postgres 21833 0.0 0.4 3844 572 ? S Oct06 0:00 /usr/lib/postgresql/bin/pg_dump rabatt
postgres 21841 0.0 1.2 9716 1564 ? S Oct06 0:00 postgres: postgres rabatt [local] COPY waiting
postgres 22135 0.0 0.9 8856 1224 ? S Oct06 0:00 postgres: postgres rabatt 192.168.50.223 idle in transaction waiting
I'm not sure what's happening here and I have no remote access to the
machine myself. Any idea what could be the reason for this?
There may be some client processes running at the time the dump and the
vacuum commands are issued that have an open transaction doing nothing. That
is the just issued a BEGIN command. Thinking about it run some inserts at
the very same time, although that's not likely.
Any hints are appreciated. Thanks in advance.
Michael
--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!
On Thu, 11 Oct 2001, Michael Meskes wrote:
A customer's machine hangs from time to time. All we could find so far is
that postgres seems to be in state "idle in transaction":postgres 19317 0.0 0.3 8168 392 ? S Oct05 0:00 /usr/lib/postgresql/bin/postmaster -D /var/lib/postgres/data
postgres 19983 0.0 0.8 8932 1020 ? S Oct05 0:01 postgres: postgres rabatt 192.168.50.222 idle in transaction
postgres 21005 0.0 0.0 3484 4 ? S Oct06 0:00 /usr/lib/postgresql/bin/psql -t -q -d template1
postgres 21014 0.0 0.7 8892 952 ? S Oct06 0:01 postgres: postgres rabatt [local] VACUUM waiting
postgres 21833 0.0 0.4 3844 572 ? S Oct06 0:00 /usr/lib/postgresql/bin/pg_dump rabatt
postgres 21841 0.0 1.2 9716 1564 ? S Oct06 0:00 postgres: postgres rabatt [local] COPY waiting
postgres 22135 0.0 0.9 8856 1224 ? S Oct06 0:00 postgres: postgres rabatt 192.168.50.223 idle in transaction waitingI'm not sure what's happening here and I have no remote access to the
machine myself. Any idea what could be the reason for this?There may be some client processes running at the time the dump and the
vacuum commands are issued that have an open transaction doing nothing. That
is the just issued a BEGIN command. Thinking about it run some inserts at
the very same time, although that's not likely.Any hints are appreciated. Thanks in advance.
Well, it'd be likely to get in this state if the first transaction grabbed
any write locks and then sat on them without committing or doing any more
commands, since the vacuum would wait on that and the rest of the
transactions will probably wait on the vacuum. Is that a possible
situation?
i've had similar problems before. Looks like some thing is in a transaction,
blocked on something else. Then vacuum comes in, locks half the tables, and
then gets stuck on a table that the transaction has modified. Now most of
your other transactions will block forever. Then the connection limit for
postgres will be hit. Then you can't connect to postgres at all.
Basically, its a death spiral starting from something in a transaction
blocking forever on an external command. Nothing postgres itself can do
about. Of course, this is just my conjecture based on the info provided.
-rchit
-----Original Message-----
From: Michael Meskes [mailto:meskes@postgresql.org]
Sent: Thursday, October 11, 2001 2:29 AM
To: PostgreSQL Hacker
Subject: [HACKERS] Deadlock? idle in transaction
A customer's machine hangs from time to time. All we could find so far is
that postgres seems to be in state "idle in transaction":
postgres 19317 0.0 0.3 8168 392 ? S Oct05 0:00
/usr/lib/postgresql/bin/postmaster -D /var/lib/postgres/data
postgres 19983 0.0 0.8 8932 1020 ? S Oct05 0:01 postgres:
postgres rabatt 192.168.50.222 idle in transaction
postgres 21005 0.0 0.0 3484 4 ? S Oct06 0:00
/usr/lib/postgresql/bin/psql -t -q -d template1
postgres 21014 0.0 0.7 8892 952 ? S Oct06 0:01 postgres:
postgres rabatt [local] VACUUM waiting
postgres 21833 0.0 0.4 3844 572 ? S Oct06 0:00
/usr/lib/postgresql/bin/pg_dump rabatt
postgres 21841 0.0 1.2 9716 1564 ? S Oct06 0:00 postgres:
postgres rabatt [local] COPY waiting
postgres 22135 0.0 0.9 8856 1224 ? S Oct06 0:00 postgres:
postgres rabatt 192.168.50.223 idle in transaction waiting
I'm not sure what's happening here and I have no remote access to the
machine myself. Any idea what could be the reason for this?
There may be some client processes running at the time the dump and the
vacuum commands are issued that have an open transaction doing nothing. That
is the just issued a BEGIN command. Thinking about it run some inserts at
the very same time, although that's not likely.
Any hints are appreciated. Thanks in advance.
Michael
--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
Import Notes
Resolved by subject fallback
Michael Meskes <meskes@postgresql.org> writes:
A customer's machine hangs from time to time. All we could find so far is
that postgres seems to be in state "idle in transaction":
You evidently have some client applications holding open transactions
that have locks on some tables. That's not a deadlock --- at least,
it's not Postgres' fault. The VACUUM is waiting to get exclusive access
to some table that's held by one of these clients, and the COPY is
probably queued up behind the VACUUM.
regards, tom lane
On Thu, Oct 11, 2001 at 01:09:25PM -0700, Stephan Szabo wrote:
Well, it'd be likely to get in this state if the first transaction grabbed
any write locks and then sat on them without committing or doing any more
commands, since the vacuum would wait on that and the rest of the
transactions will probably wait on the vacuum. Is that a possible
situation?
Maybe. The first transaction should not sit on any lock, but I have to dig
through the sources to be sure it really does not. Also I wonder if this
could happen through normal operation:
Task 1:
begin
acquire lock in table A
acquire lock in table B
commit
Task 2 (vacuum):
lock table B
lock table A
Could this force the situation too?
If so the easy workaround would be to run vacuum when there is no other
process accessing the DB.
Michael
--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!
On Thu, Oct 11, 2001 at 03:15:36PM -0700, Rachit Siamwalla wrote:
then gets stuck on a table that the transaction has modified. Now most of
your other transactions will block forever. Then the connection limit for
postgres will be hit. Then you can't connect to postgres at all.
Really? I do not know the way the backend handles locks, but couldn't it
detect such a deadlock and cancel a transaction? Something like this:
task 1 locks table A
task 2 locks table B
task 1 locks table B
task 2 tries to lock table A
Of course the last call creates the deadlock. Would it be possible to just
cancel task 2 in this case? Or do I miss something obvious?
Michael
--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!
On Thu, Oct 11, 2001 at 08:26:48PM -0400, Tom Lane wrote:
You evidently have some client applications holding open transactions
Okay, I know where to look for that. Thanks.
that have locks on some tables. That's not a deadlock --- at least,
It is no deadlock if the transaction holding the lock remains idle and does
nothing. But I cannot imagine how this could happen.
What happens if there is a real deadlock, i.e. the transaction holding the
lock tries to lock a table vacuum already locked? Ah, I just checked and
rendered my last mail useless. It appears the backend does correctly detect
the deadlock and kill one transaction.
it's not Postgres' fault. The VACUUM is waiting to get exclusive access
to some table that's held by one of these clients, and the COPY is
probably queued up behind the VACUUM.
So the reason is that the transaction does hold a lock but does not advance
any further?
Michael
--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!
On Fri, 12 Oct 2001, Michael Meskes wrote:
On Thu, Oct 11, 2001 at 01:09:25PM -0700, Stephan Szabo wrote:
Well, it'd be likely to get in this state if the first transaction grabbed
any write locks and then sat on them without committing or doing any more
commands, since the vacuum would wait on that and the rest of the
transactions will probably wait on the vacuum. Is that a possible
situation?Maybe. The first transaction should not sit on any lock, but I have to dig
through the sources to be sure it really does not. Also I wonder if this
could happen through normal operation:Task 1:
begin
acquire lock in table A
acquire lock in table B
commitTask 2 (vacuum):
lock table B
lock table ACould this force the situation too?
Do you mean like task1 has gotten the A lock, and then task 2 gets the B
and then task1 tries to get B and task2 tries to get A? I *think*
(without ever looking at the code, and going on messages from here) that
would probably kick off the deadlock alert since you're trying to grab
a lock from a process which is waiting for a lock you hold.
If so the easy workaround would be to run vacuum when there is no other
process accessing the DB.
Well, fortunately it sounds like in 7.2 we'll have much less of this in
the first place since the normal uses of vacuum will be happier with
sharing.
Also note that an uncommitted select statement will lock the table and
prevent vacuum from running. It isn't just inserts/updates that will
lock and cause vacuum to block, but selects as well. This got me in the
past. (Of course this is all fixed in 7.2 with the new vacuum
functionality that doesn't require exclusive locks on the tables).
thanks,
--Barry
Michael Meskes wrote:
Show quoted text
On Thu, Oct 11, 2001 at 08:26:48PM -0400, Tom Lane wrote:
You evidently have some client applications holding open transactions
Okay, I know where to look for that. Thanks.
that have locks on some tables. That's not a deadlock --- at least,
It is no deadlock if the transaction holding the lock remains idle and does
nothing. But I cannot imagine how this could happen.What happens if there is a real deadlock, i.e. the transaction holding the
lock tries to lock a table vacuum already locked? Ah, I just checked and
rendered my last mail useless. It appears the backend does correctly detect
the deadlock and kill one transaction.it's not Postgres' fault. The VACUUM is waiting to get exclusive access
to some table that's held by one of these clients, and the COPY is
probably queued up behind the VACUUM.So the reason is that the transaction does hold a lock but does not advance
any further?Michael
On Fri, Oct 12, 2001 at 11:29:08AM -0700, Stephan Szabo wrote:
Do you mean like task1 has gotten the A lock, and then task 2 gets the B
and then task1 tries to get B and task2 tries to get A? I *think*
(without ever looking at the code, and going on messages from here) that
would probably kick off the deadlock alert since you're trying to grab
a lock from a process which is waiting for a lock you hold.
I checked it and yes, it kicks off the deadlock alert. The idle in
transaction problem is not a deadlock but a transaction that simply does
not proceed.
In our case we believe to have found the reason. There was one user who
accessed the database via M$ Access and was allowed to write. And this user
looked into a table and then let this query open while doing other work.
Since he's able to change data I would guess that the query is internally
realized as a cursor select for update which of course locks. With Access
doing nothing but displaying the data the transaction certainly is idle.
That's it.
Michael
--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!