REINDEX deadlock - Postgresql -9.1

Started by Anoop Kabout 13 years ago19 messagesgeneral
Jump to latest
#1Anoop K
anoopk6@gmail.com

We are hitting a situation where REINDEX is resulting in postgresql to go
to dead lock state* for ever*. On debugging the issue we found that
3 connections are going in to some dead lock state.

1. *idle in transaction *
2. *REINDEX waiting *
3. *SELECT waiting *

All these connections are made in the same minute. Once in deadlock state
we are not able to make new connections to db.(So not able to view pg_locks
also). New connections appears as *'startup waiting*' in *ps* output.
Initially we suspected *<idle in transaction>* is the result of not closing
a connection. But it seems it got stuck after creating a connection and is
not able to proceed.

Any clues ..

Thanks
Anoop

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Anoop K (#1)
Re: REINDEX deadlock - Postgresql -9.1

Anoop K wrote:

We are hitting a situation where REINDEX is resulting in postgresql to go to dead lock state for ever.
On debugging the issue we found that
3 connections are going in to some dead lock state.

1. idle in transaction
2. REINDEX waiting
3. SELECT waiting

All these connections are made in the same minute. Once in deadlock state we are not able to make new
connections to db.(So not able to view pg_locks also). New connections appears as 'startup waiting' in
ps output. Initially we suspected <idle in transaction> is the result of not closing a connection. But
it seems it got stuck after creating a connection and is not able to proceed.

Any clues ..

Check the contents of pg_locks:
What locks does the "idle in transaction" session hold?
Who holds the locks that block SELECT, REINDEX and new connections?

Turn on log_statement='all' to see what the "idle in transaction"
session did since it started.

Yours,
Laurenz Albe

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Anoop K
anoopk6@gmail.com
In reply to: Laurenz Albe (#2)
Re: REINDEX deadlock - Postgresql -9.1

I have the setup in problem state. But I am not able to make psql
connections to view the lock details.
psql connections are hanging. Is there any other info which can be
collected in this state ?

Also we don't know the steps to reproduce the issue.

On Thu, Feb 7, 2013 at 1:23 PM, Albe Laurenz <laurenz.albe@wien.gv.at>wrote:

Show quoted text

Anoop K wrote:

We are hitting a situation where REINDEX is resulting in postgresql to

go to dead lock state for ever.

On debugging the issue we found that
3 connections are going in to some dead lock state.

1. idle in transaction
2. REINDEX waiting
3. SELECT waiting

All these connections are made in the same minute. Once in deadlock

state we are not able to make new

connections to db.(So not able to view pg_locks also). New connections

appears as 'startup waiting' in

ps output. Initially we suspected <idle in transaction> is the result of

not closing a connection. But

it seems it got stuck after creating a connection and is not able to

proceed.

Any clues ..

Check the contents of pg_locks:
What locks does the "idle in transaction" session hold?
Who holds the locks that block SELECT, REINDEX and new connections?

Turn on log_statement='all' to see what the "idle in transaction"
session did since it started.

Yours,
Laurenz Albe

#4Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Anoop K (#3)
Re: REINDEX deadlock - Postgresql -9.1

On Thu, Feb 7, 2013 at 2:08 PM, Anoop K <anoopk6@gmail.com> wrote:

I have the setup in problem state. But I am not able to make psql
connections to view the lock details.
psql connections are hanging. Is there any other info which can be collected
in this state ?

Try attaching each process involved in the deadlock to gdb and print
the call stack. That may or may not be useful, but given your
situation I wonder if you have a deadlock at LWLock level. Do you have
any external module installed ? Or any triggers written in C ?

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Scott Marlowe
scott.marlowe@gmail.com
In reply to: Anoop K (#3)
Re: REINDEX deadlock - Postgresql -9.1

It sounds like you're running out of connections. Have you tried
connecting as postgres? It has 2 or 3 superuser connections reserved
by default.

On Thu, Feb 7, 2013 at 1:38 AM, Anoop K <anoopk6@gmail.com> wrote:

I have the setup in problem state. But I am not able to make psql
connections to view the lock details.
psql connections are hanging. Is there any other info which can be collected
in this state ?

Also we don't know the steps to reproduce the issue.

On Thu, Feb 7, 2013 at 1:23 PM, Albe Laurenz <laurenz.albe@wien.gv.at>
wrote:

Anoop K wrote:

We are hitting a situation where REINDEX is resulting in postgresql to
go to dead lock state for ever.
On debugging the issue we found that
3 connections are going in to some dead lock state.

1. idle in transaction
2. REINDEX waiting
3. SELECT waiting

All these connections are made in the same minute. Once in deadlock
state we are not able to make new
connections to db.(So not able to view pg_locks also). New connections
appears as 'startup waiting' in
ps output. Initially we suspected <idle in transaction> is the result of
not closing a connection. But
it seems it got stuck after creating a connection and is not able to
proceed.

Any clues ..

Check the contents of pg_locks:
What locks does the "idle in transaction" session hold?
Who holds the locks that block SELECT, REINDEX and new connections?

Turn on log_statement='all' to see what the "idle in transaction"
session did since it started.

Yours,
Laurenz Albe

--
To understand recursion, one must first understand recursion.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#6Anoop K
anoopk6@gmail.com
In reply to: Scott Marlowe (#5)
Re: REINDEX deadlock - Postgresql -9.1

We did run out of conns as our processes which tried to connect (over few
days) got hung in '*startup waiting state'. *Even superuser conns are also
over.

Thanks
Anoop

On Thu, Feb 7, 2013 at 3:37 PM, Scott Marlowe <scott.marlowe@gmail.com>wrote:

Show quoted text

It sounds like you're running out of connections. Have you tried
connecting as postgres? It has 2 or 3 superuser connections reserved
by default.

On Thu, Feb 7, 2013 at 1:38 AM, Anoop K <anoopk6@gmail.com> wrote:

I have the setup in problem state. But I am not able to make psql
connections to view the lock details.
psql connections are hanging. Is there any other info which can be

collected

in this state ?

Also we don't know the steps to reproduce the issue.

On Thu, Feb 7, 2013 at 1:23 PM, Albe Laurenz <laurenz.albe@wien.gv.at>
wrote:

Anoop K wrote:

We are hitting a situation where REINDEX is resulting in postgresql to
go to dead lock state for ever.
On debugging the issue we found that
3 connections are going in to some dead lock state.

1. idle in transaction
2. REINDEX waiting
3. SELECT waiting

All these connections are made in the same minute. Once in deadlock
state we are not able to make new
connections to db.(So not able to view pg_locks also). New connections
appears as 'startup waiting' in
ps output. Initially we suspected <idle in transaction> is the result

of

not closing a connection. But
it seems it got stuck after creating a connection and is not able to
proceed.

Any clues ..

Check the contents of pg_locks:
What locks does the "idle in transaction" session hold?
Who holds the locks that block SELECT, REINDEX and new connections?

Turn on log_statement='all' to see what the "idle in transaction"
session did since it started.

Yours,
Laurenz Albe

--
To understand recursion, one must first understand recursion.

#7Anoop K
anoopk6@gmail.com
In reply to: Pavan Deolasee (#4)
Re: REINDEX deadlock - Postgresql -9.1

I dont have C triggers. I can attach gdb and get stacktrace. Wondering if
it will take the processes out of problem state.

Thanks
Anoop

On Thu, Feb 7, 2013 at 3:33 PM, Pavan Deolasee <pavan.deolasee@gmail.com>wrote:

Show quoted text

On Thu, Feb 7, 2013 at 2:08 PM, Anoop K <anoopk6@gmail.com> wrote:

I have the setup in problem state. But I am not able to make psql
connections to view the lock details.
psql connections are hanging. Is there any other info which can be

collected

in this state ?

Try attaching each process involved in the deadlock to gdb and print
the call stack. That may or may not be useful, but given your
situation I wonder if you have a deadlock at LWLock level. Do you have
any external module installed ? Or any triggers written in C ?

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee

#8Scott Marlowe
scott.marlowe@gmail.com
In reply to: Anoop K (#6)
Re: REINDEX deadlock - Postgresql -9.1

So have you tried connecting as a superuser?

On Thu, Feb 7, 2013 at 3:19 AM, Anoop K <anoopk6@gmail.com> wrote:

We did run out of conns as our processes which tried to connect (over few
days) got hung in 'startup waiting state'. Even superuser conns are also
over.

Thanks
Anoop

On Thu, Feb 7, 2013 at 3:37 PM, Scott Marlowe <scott.marlowe@gmail.com>
wrote:

It sounds like you're running out of connections. Have you tried
connecting as postgres? It has 2 or 3 superuser connections reserved
by default.

On Thu, Feb 7, 2013 at 1:38 AM, Anoop K <anoopk6@gmail.com> wrote:

I have the setup in problem state. But I am not able to make psql
connections to view the lock details.
psql connections are hanging. Is there any other info which can be
collected
in this state ?

Also we don't know the steps to reproduce the issue.

On Thu, Feb 7, 2013 at 1:23 PM, Albe Laurenz <laurenz.albe@wien.gv.at>
wrote:

Anoop K wrote:

We are hitting a situation where REINDEX is resulting in postgresql
to
go to dead lock state for ever.
On debugging the issue we found that
3 connections are going in to some dead lock state.

1. idle in transaction
2. REINDEX waiting
3. SELECT waiting

All these connections are made in the same minute. Once in deadlock
state we are not able to make new
connections to db.(So not able to view pg_locks also). New
connections
appears as 'startup waiting' in
ps output. Initially we suspected <idle in transaction> is the result
of
not closing a connection. But
it seems it got stuck after creating a connection and is not able to
proceed.

Any clues ..

Check the contents of pg_locks:
What locks does the "idle in transaction" session hold?
Who holds the locks that block SELECT, REINDEX and new connections?

Turn on log_statement='all' to see what the "idle in transaction"
session did since it started.

Yours,
Laurenz Albe

--
To understand recursion, one must first understand recursion.

--
To understand recursion, one must first understand recursion.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#9Anoop K
anoopk6@gmail.com
In reply to: Scott Marlowe (#8)
Re: REINDEX deadlock - Postgresql -9.1

Actually some of our processes connect as superuser. So even that is over
and is in hung state.

On Thu, Feb 7, 2013 at 4:29 PM, Scott Marlowe <scott.marlowe@gmail.com>wrote:

Show quoted text

So have you tried connecting as a superuser?

On Thu, Feb 7, 2013 at 3:19 AM, Anoop K <anoopk6@gmail.com> wrote:

We did run out of conns as our processes which tried to connect (over few
days) got hung in 'startup waiting state'. Even superuser conns are also
over.

Thanks
Anoop

On Thu, Feb 7, 2013 at 3:37 PM, Scott Marlowe <scott.marlowe@gmail.com>
wrote:

It sounds like you're running out of connections. Have you tried
connecting as postgres? It has 2 or 3 superuser connections reserved
by default.

On Thu, Feb 7, 2013 at 1:38 AM, Anoop K <anoopk6@gmail.com> wrote:

I have the setup in problem state. But I am not able to make psql
connections to view the lock details.
psql connections are hanging. Is there any other info which can be
collected
in this state ?

Also we don't know the steps to reproduce the issue.

On Thu, Feb 7, 2013 at 1:23 PM, Albe Laurenz <laurenz.albe@wien.gv.at

wrote:

Anoop K wrote:

We are hitting a situation where REINDEX is resulting in postgresql
to
go to dead lock state for ever.
On debugging the issue we found that
3 connections are going in to some dead lock state.

1. idle in transaction
2. REINDEX waiting
3. SELECT waiting

All these connections are made in the same minute. Once in deadlock
state we are not able to make new
connections to db.(So not able to view pg_locks also). New
connections
appears as 'startup waiting' in
ps output. Initially we suspected <idle in transaction> is the

result

of
not closing a connection. But
it seems it got stuck after creating a connection and is not able

to

proceed.

Any clues ..

Check the contents of pg_locks:
What locks does the "idle in transaction" session hold?
Who holds the locks that block SELECT, REINDEX and new connections?

Turn on log_statement='all' to see what the "idle in transaction"
session did since it started.

Yours,
Laurenz Albe

--
To understand recursion, one must first understand recursion.

--
To understand recursion, one must first understand recursion.

#10Scott Marlowe
scott.marlowe@gmail.com
In reply to: Anoop K (#9)
Re: REINDEX deadlock - Postgresql -9.1

Processes should always connect by some other role with suspendable
superuser connections for situations like this. Do your processes
really need superuser access all the time? If you could turn it off
for a bit you could get into your database and troubleshoot from there
first. Not being able to connect to your db because you ran out of
superuser connections is a bad thing.

On Thu, Feb 7, 2013 at 4:00 AM, Anoop K <anoopk6@gmail.com> wrote:

Actually some of our processes connect as superuser. So even that is over
and is in hung state.

On Thu, Feb 7, 2013 at 4:29 PM, Scott Marlowe <scott.marlowe@gmail.com>
wrote:

So have you tried connecting as a superuser?

On Thu, Feb 7, 2013 at 3:19 AM, Anoop K <anoopk6@gmail.com> wrote:

We did run out of conns as our processes which tried to connect (over
few
days) got hung in 'startup waiting state'. Even superuser conns are also
over.

Thanks
Anoop

On Thu, Feb 7, 2013 at 3:37 PM, Scott Marlowe <scott.marlowe@gmail.com>
wrote:

It sounds like you're running out of connections. Have you tried
connecting as postgres? It has 2 or 3 superuser connections reserved
by default.

On Thu, Feb 7, 2013 at 1:38 AM, Anoop K <anoopk6@gmail.com> wrote:

I have the setup in problem state. But I am not able to make psql
connections to view the lock details.
psql connections are hanging. Is there any other info which can be
collected
in this state ?

Also we don't know the steps to reproduce the issue.

On Thu, Feb 7, 2013 at 1:23 PM, Albe Laurenz
<laurenz.albe@wien.gv.at>
wrote:

Anoop K wrote:

We are hitting a situation where REINDEX is resulting in
postgresql
to
go to dead lock state for ever.
On debugging the issue we found that
3 connections are going in to some dead lock state.

1. idle in transaction
2. REINDEX waiting
3. SELECT waiting

All these connections are made in the same minute. Once in
deadlock
state we are not able to make new
connections to db.(So not able to view pg_locks also). New
connections
appears as 'startup waiting' in
ps output. Initially we suspected <idle in transaction> is the
result
of
not closing a connection. But
it seems it got stuck after creating a connection and is not able
to
proceed.

Any clues ..

Check the contents of pg_locks:
What locks does the "idle in transaction" session hold?
Who holds the locks that block SELECT, REINDEX and new connections?

Turn on log_statement='all' to see what the "idle in transaction"
session did since it started.

Yours,
Laurenz Albe

--
To understand recursion, one must first understand recursion.

--
To understand recursion, one must first understand recursion.

--
To understand recursion, one must first understand recursion.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#11Anoop K
anoopk6@gmail.com
In reply to: Scott Marlowe (#10)
Re: REINDEX deadlock - Postgresql -9.1

I will try. Here are the gdb stacktraces of hung processes.

<idle in transaction>

#0 0x00007fbdfaceb3e2 in recv () from /lib64/libc.so.6
#1 0x000000000058bde6 in secure_read ()
#2 0x000000000059697b in ?? ()
#3 0x0000000000596d7b in pq_getbyte ()
#4 0x00000000006334af in PostgresMain ()
#5 0x00000000005f4d69 in ?? ()
#6 0x00000000005f7501 in PostmasterMain ()
#7 0x0000000000598df0 in main ()

REINDEX:
-------
#0 0x00007fbdfacecca7 in semop () from /lib64/libc.so.6
#1 0x00000000005e8927 in PGSemaphoreLock ()
#2 0x0000000000624821 in ProcSleep ()
#3 0x000000000062145c in ?? ()
#4 0x0000000000622c6a in LockAcquireExtended ()
#5 0x0000000000620518 in LockRelationOid ()
#6 0x000000000046efe5 in relation_open ()
#7 0x0000000000476bc3 in index_open ()
#8 0x00000000004b8b14 in reindex_index ()
#9 0x00000000004b8f42 in reindex_relation ()
#10 0x000000000052d223 in ReindexDatabase ()
#11 0x00000000006347f7 in ?? ()
#12 0x000000000063580d in ?? ()
#13 0x0000000000635f92 in PortalRun ()
#14 0x00000000006325db in ?? ()
#15 0x0000000000633553 in PostgresMain ()
#16 0x00000000005f4d69 in ?? ()
#17 0x00000000005f7501 in PostmasterMain ()
#18 0x0000000000598df0 in main ()

SELECT BLOCKING
---------------

#0 0x00007fbdfacecca7 in semop () from /lib64/libc.so.6
#1 0x00000000005e8927 in PGSemaphoreLock ()
#2 0x0000000000624821 in ProcSleep ()
#3 0x000000000062145c in ?? ()
#4 0x0000000000622c6a in LockAcquireExtended ()
#5 0x0000000000620518 in LockRelationOid ()
#6 0x000000000046efe5 in relation_open ()
#7 0x0000000000476bc3 in index_open ()
#8 0x00000000005e47d1 in get_relation_info ()
#9 0x00000000005e67e9 in build_simple_rel ()
#10 0x00000000005cf6e2 in add_base_rels_to_query ()
#11 0x00000000005cf6e2 in add_base_rels_to_query ()
#12 0x00000000005cf733 in add_base_rels_to_query ()
#13 0x00000000005d0286 in query_planner ()
#14 0x00000000005d1dab in ?? ()
#15 0x00000000005d3866 in subquery_planner ()
#16 0x00000000005d3b20 in standard_planner ()
#17 0x00000000006315fa in pg_plan_query ()
#18 0x00000000006316e4 in pg_plan_queries ()
#19 0x00000000006326f2 in ?? ()
#20 0x0000000000633553 in PostgresMain ()
#21 0x00000000005f4d69 in ?? ()
#22 0x00000000005f7501 in PostmasterMain ()
#23 0x0000000000598df0 in main ()

NEW CONN
--------

(gdb) bt
#0 0x00007fbdfacecca7 in semop () from /lib64/libc.so.6
#1 0x00000000005e8927 in PGSemaphoreLock ()
#2 0x0000000000624821 in ProcSleep ()
#3 0x000000000062145c in ?? ()
#4 0x0000000000622c6a in LockAcquireExtended ()
#5 0x0000000000620518 in LockRelationOid ()
#6 0x000000000046efe5 in relation_open ()
#7 0x0000000000476bc3 in index_open ()
#8 0x00000000006dd185 in InitCatCachePhase2 ()
#9 0x00000000006e74ed in InitCatalogCachePhase2 ()
#10 0x00000000006e549a in RelationCacheInitializePhase3 ()
#11 0x00000000006fba51 in InitPostgres ()
#12 0x0000000000633153 in PostgresMain ()
#13 0x00000000005f4d69 in ?? ()
#14 0x00000000005f7501 in PostmasterMain ()
#15 0x0000000000598df0 in main ()

On Thu, Feb 7, 2013 at 4:37 PM, Scott Marlowe <scott.marlowe@gmail.com>wrote:

Show quoted text

Processes should always connect by some other role with suspendable
superuser connections for situations like this. Do your processes
really need superuser access all the time? If you could turn it off
for a bit you could get into your database and troubleshoot from there
first. Not being able to connect to your db because you ran out of
superuser connections is a bad thing.

On Thu, Feb 7, 2013 at 4:00 AM, Anoop K <anoopk6@gmail.com> wrote:

Actually some of our processes connect as superuser. So even that is over
and is in hung state.

On Thu, Feb 7, 2013 at 4:29 PM, Scott Marlowe <scott.marlowe@gmail.com>
wrote:

So have you tried connecting as a superuser?

On Thu, Feb 7, 2013 at 3:19 AM, Anoop K <anoopk6@gmail.com> wrote:

We did run out of conns as our processes which tried to connect (over
few
days) got hung in 'startup waiting state'. Even superuser conns are

also

over.

Thanks
Anoop

On Thu, Feb 7, 2013 at 3:37 PM, Scott Marlowe <

scott.marlowe@gmail.com>

wrote:

It sounds like you're running out of connections. Have you tried
connecting as postgres? It has 2 or 3 superuser connections reserved
by default.

On Thu, Feb 7, 2013 at 1:38 AM, Anoop K <anoopk6@gmail.com> wrote:

I have the setup in problem state. But I am not able to make psql
connections to view the lock details.
psql connections are hanging. Is there any other info which can be
collected
in this state ?

Also we don't know the steps to reproduce the issue.

On Thu, Feb 7, 2013 at 1:23 PM, Albe Laurenz
<laurenz.albe@wien.gv.at>
wrote:

Anoop K wrote:

We are hitting a situation where REINDEX is resulting in
postgresql
to
go to dead lock state for ever.
On debugging the issue we found that
3 connections are going in to some dead lock state.

1. idle in transaction
2. REINDEX waiting
3. SELECT waiting

All these connections are made in the same minute. Once in
deadlock
state we are not able to make new
connections to db.(So not able to view pg_locks also). New
connections
appears as 'startup waiting' in
ps output. Initially we suspected <idle in transaction> is the
result
of
not closing a connection. But
it seems it got stuck after creating a connection and is not

able

to
proceed.

Any clues ..

Check the contents of pg_locks:
What locks does the "idle in transaction" session hold?
Who holds the locks that block SELECT, REINDEX and new

connections?

Turn on log_statement='all' to see what the "idle in transaction"
session did since it started.

Yours,
Laurenz Albe

--
To understand recursion, one must first understand recursion.

--
To understand recursion, one must first understand recursion.

--
To understand recursion, one must first understand recursion.

#12Scott Marlowe
scott.marlowe@gmail.com
In reply to: Anoop K (#11)
Re: REINDEX deadlock - Postgresql -9.1

Note that if those processes use persistent connections you'll need to
restart them to free up the connections. In the meantime you can use
the step of renaming your superuser account. Then cutting all
superuser conns and turning off superuser of postgres user
temporarily. I wouldn't do this all night or anything, as autovac
usually connects as postgres to do its thing.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#13Scott Marlowe
scott.marlowe@gmail.com
In reply to: Anoop K (#1)
Re: REINDEX deadlock - Postgresql -9.1

On Wed, Feb 6, 2013 at 11:55 PM, Anoop K <anoopk6@gmail.com> wrote:

We are hitting a situation where REINDEX is resulting in postgresql to go to
dead lock state for ever. On debugging the issue we found that
3 connections are going in to some dead lock state.

idle in transaction
REINDEX waiting
SELECT waiting

All these connections are made in the same minute. Once in deadlock state we
are not able to make new connections to db.(So not able to view pg_locks
also). New connections appears as 'startup waiting' in ps output. Initially
we suspected <idle in transaction> is the result of not closing a
connection. But it seems it got stuck after creating a connection and is not
able to proceed.

This may or may not be a deadlock. Unless you've got a circle it's
not a deadlock, it's just a "cascading lock overloading your
connection limit" failure. You can get these with slony and vacuums
and ddl. Say I want to run a DDL script. Someone is running vacuum
(could be autovac process). I run slony execute to run ddl and it
waits with hard table locks, and all the updates stall behind that.
Your db then runs out of connections. What we need to know is what
that idle in transaction is just sitting there waiting to do, which is
usually a combination of db state and application state.

As a short term fix you can set some reasonable statement level
timeout on the reindex's connection, user or database. If no reindex
ever takes more than a minute and you give it 5 minutes and check the
logs for it you can see how often it fails (once every month or once
every minute you check for a while etc then you could set that user's
connect. If that user is the superuser things become problematic.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#14Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Anoop K (#11)
Re: REINDEX deadlock - Postgresql -9.1

Anoop K <anoopk6@gmail.com> wrote:

I will try. Here are the gdb stacktraces of hung processes.

<idle in transaction>

Have you tried `kill -SIGTERM` on the "idle in transaction" pid?

-Kevin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#15Anoop K
anoopk6@gmail.com
In reply to: Scott Marlowe (#13)
Re: REINDEX deadlock - Postgresql -9.1

In an attempt to get access, I ended up killing a postgres process and the
whole thing *recovered from hang* state. Now don't have more data points to
debug.

I feel the trigger is the connection in <*idle in transaction>* state. On
examining the application side(Java) stacktrace, I found that other end is
trying to make connection.

at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at
org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:143)
at
org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:112)
at
org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:71)
at org.postgresql.core.PGStream.ReceiveChar(PGStream.java:272)
at
org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:269)
at
org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:106)
at
org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:64)
at
org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:123)
at
org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:28)
at
org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:20)
at
org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:30)
at
org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:22)
at org.postgresql.Driver.makeConnection(Driver.java:391)
at org.postgresql.Driver.connect(Driver.java:265)

On Thu, Feb 7, 2013 at 4:52 PM, Scott Marlowe <scott.marlowe@gmail.com>wrote:

Show quoted text

On Wed, Feb 6, 2013 at 11:55 PM, Anoop K <anoopk6@gmail.com> wrote:

We are hitting a situation where REINDEX is resulting in postgresql to

go to

dead lock state for ever. On debugging the issue we found that
3 connections are going in to some dead lock state.

idle in transaction
REINDEX waiting
SELECT waiting

All these connections are made in the same minute. Once in deadlock

state we

are not able to make new connections to db.(So not able to view pg_locks
also). New connections appears as 'startup waiting' in ps output.

Initially

we suspected <idle in transaction> is the result of not closing a
connection. But it seems it got stuck after creating a connection and is

not

able to proceed.

This may or may not be a deadlock. Unless you've got a circle it's
not a deadlock, it's just a "cascading lock overloading your
connection limit" failure. You can get these with slony and vacuums
and ddl. Say I want to run a DDL script. Someone is running vacuum
(could be autovac process). I run slony execute to run ddl and it
waits with hard table locks, and all the updates stall behind that.
Your db then runs out of connections. What we need to know is what
that idle in transaction is just sitting there waiting to do, which is
usually a combination of db state and application state.

As a short term fix you can set some reasonable statement level
timeout on the reindex's connection, user or database. If no reindex
ever takes more than a minute and you give it 5 minutes and check the
logs for it you can see how often it fails (once every month or once
every minute you check for a while etc then you could set that user's
connect. If that user is the superuser things become problematic.

#16Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Anoop K (#15)
Re: REINDEX deadlock - Postgresql -9.1

On Thu, Feb 7, 2013 at 8:19 PM, Anoop K <anoopk6@gmail.com> wrote:

In an attempt to get access, I ended up killing a postgres process and the
whole thing recovered from hang state. Now don't have more data points to
debug.

Sorry, I was going to ask what REINDEX was really indexing ? System
tables ? ISTM that the idle in transaction connection was holding some
kind of a heavy weight lock on one of the catalog tables and that may
be causing all other transactions to just wait. For example, I can
reproduce this by doing the following:

Session 1:
BEGIN;
REINDEX TABLE pg_class;
<stay idle in transaction>

Session 2:
REINDEX TABLE pg_attribute;
<will hang>

Try starting a new Session 3:
<will hung>

The stack traces of these processes will look similar to what you
posted. And as soon as you end the transaction on the first session,
everything will proceed.

You may want to look at your application code and see if you're
causing this kind of deadlock (or livelock, not sure what is a better
term to describe this situation)

Thanks,
Pavan
--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavan Deolasee (#16)
Re: REINDEX deadlock - Postgresql -9.1

Pavan Deolasee <pavan.deolasee@gmail.com> writes:

Sorry, I was going to ask what REINDEX was really indexing ? System
tables ?

The stack trace for the REINDEX process includes ReindexDatabase(), so
if it was running as a superuser it would be trying to reindex system
catalogs too. We don't actually know that the particular table it's
working on at the moment is a system catalog, but that seems like a
fairly good guess. The process that's blocked in startup is definitely
blocked on somebody's exclusive lock (or at least exclusive lock
request) on a system catalog index, and there are not that many
operations besides REINDEX that would take out such a lock.

I'm guessing that something holds a lock (maybe only AccessShareLock)
on a system catalog index, and REINDEX is blocked trying to get
exclusive lock on that index, and then all incoming processes are
queuing up behind REINDEX's request, since they'll all be trying
to open the same set of catcache-supporting indexes.

ISTM that the idle in transaction connection was holding some
kind of a heavy weight lock on one of the catalog tables and that may
be causing all other transactions to just wait.

It doesn't need to have been an exclusive lock to block REINDEX.
I suspect this theory is correct otherwise, because if it were a
true deadlock the deadlock detector should have noticed it. If it's
just "everybody is blocked behind that idle transaction", the deadlock
detector will not think that it should do anything about it.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#18Anoop K
anoopk6@gmail.com
In reply to: Tom Lane (#17)
Re: REINDEX deadlock - Postgresql -9.1

REINDEX was for the whole database. It seems REINDEX was blocked by the <*idle
in transaction*> process.

What we are not able to explain is how that connection went in to <*idle in
transaction*> state. The app stacktrace confirms that app (JDBC) is trying
to open a connection. We do close connection after use.
So can't think how transaction went in to idle state.

Thanks
Anoop

On Fri, Feb 8, 2013 at 12:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

Pavan Deolasee <pavan.deolasee@gmail.com> writes:

Sorry, I was going to ask what REINDEX was really indexing ? System
tables ?

The stack trace for the REINDEX process includes ReindexDatabase(), so
if it was running as a superuser it would be trying to reindex system
catalogs too. We don't actually know that the particular table it's
working on at the moment is a system catalog, but that seems like a
fairly good guess. The process that's blocked in startup is definitely
blocked on somebody's exclusive lock (or at least exclusive lock
request) on a system catalog index, and there are not that many
operations besides REINDEX that would take out such a lock.

I'm guessing that something holds a lock (maybe only AccessShareLock)
on a system catalog index, and REINDEX is blocked trying to get
exclusive lock on that index, and then all incoming processes are
queuing up behind REINDEX's request, since they'll all be trying
to open the same set of catcache-supporting indexes.

ISTM that the idle in transaction connection was holding some
kind of a heavy weight lock on one of the catalog tables and that may
be causing all other transactions to just wait.

It doesn't need to have been an exclusive lock to block REINDEX.
I suspect this theory is correct otherwise, because if it were a
true deadlock the deadlock detector should have noticed it. If it's
just "everybody is blocked behind that idle transaction", the deadlock
detector will not think that it should do anything about it.

regards, tom lane

#19Scott Marlowe
scott.marlowe@gmail.com
In reply to: Anoop K (#18)
Re: REINDEX deadlock - Postgresql -9.1

You might want to consider adding a pooler like pgbouncer to the
equation so that the pooler is what runs out of connections and not
the database. Then you could at least get into it to fix things.

On Thu, Feb 7, 2013 at 9:04 PM, Anoop K <anoopk6@gmail.com> wrote:

REINDEX was for the whole database. It seems REINDEX was blocked by the
<idle in transaction> process.

What we are not able to explain is how that connection went in to <idle in
transaction> state. The app stacktrace confirms that app (JDBC) is trying to
open a connection. We do close connection after use.
So can't think how transaction went in to idle state.

Thanks
Anoop

On Fri, Feb 8, 2013 at 12:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Pavan Deolasee <pavan.deolasee@gmail.com> writes:

Sorry, I was going to ask what REINDEX was really indexing ? System
tables ?

The stack trace for the REINDEX process includes ReindexDatabase(), so
if it was running as a superuser it would be trying to reindex system
catalogs too. We don't actually know that the particular table it's
working on at the moment is a system catalog, but that seems like a
fairly good guess. The process that's blocked in startup is definitely
blocked on somebody's exclusive lock (or at least exclusive lock
request) on a system catalog index, and there are not that many
operations besides REINDEX that would take out such a lock.

I'm guessing that something holds a lock (maybe only AccessShareLock)
on a system catalog index, and REINDEX is blocked trying to get
exclusive lock on that index, and then all incoming processes are
queuing up behind REINDEX's request, since they'll all be trying
to open the same set of catcache-supporting indexes.

ISTM that the idle in transaction connection was holding some
kind of a heavy weight lock on one of the catalog tables and that may
be causing all other transactions to just wait.

It doesn't need to have been an exclusive lock to block REINDEX.
I suspect this theory is correct otherwise, because if it were a
true deadlock the deadlock detector should have noticed it. If it's
just "everybody is blocked behind that idle transaction", the deadlock
detector will not think that it should do anything about it.

regards, tom lane

--
To understand recursion, one must first understand recursion.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general