Problem running "ALTER TABLE...", ALTER TABLE waiting

Started by Brian McNallyover 13 years ago11 messagesgeneral
Jump to latest
#1Brian McNally
bmcnally@uw.edu

Hello,

I'm running Postgres 9.0.2 on RHEL 5.6 x86_64 and trying to run the
following statement:

alter table samples add column in_esp1234 text;

When I do this the command just hangs. ps output indicates that it's
waiting:

postgres: postgres exomeSNP [local] ALTER TABLE waiting

When I look to see what might have that table locked I don't see anything:

select * from pg_locks where relation=(select oid from pg_class where
relname='samples');

I have a nearly identical server where this command works. It seems like
something has this table locked, but I haven't been able to find it. Any
ideas?

Thanks for any help,

--
Brian McNally

#2raghu ram
raghuchennuru@gmail.com
In reply to: Brian McNally (#1)
Re: Problem running "ALTER TABLE...", ALTER TABLE waiting

On Wed, Jul 18, 2012 at 1:24 AM, Brian McNally <bmcnally@uw.edu> wrote:

Hello,

I'm running Postgres 9.0.2 on RHEL 5.6 x86_64 and trying to run the
following statement:

alter table samples add column in_esp1234 text;

When I do this the command just hangs. ps output indicates that it's
waiting:

postgres: postgres exomeSNP [local] ALTER TABLE waiting

When I look to see what might have that table locked I don't see anything:

select * from pg_locks where relation=(select oid from pg_class where
relname='samples');

I have a nearly identical server where this command works. It seems like
something has this table locked, but I haven't been able to find it. Any
ideas?

Please find more information using below query:

\x

SELECT
waiting.locktype AS waiting_locktype,
waiting.relation::regclass AS waiting_table,
waiting_stm.current_query AS waiting_query,
waiting.mode AS waiting_mode,
waiting.pid AS waiting_pid,
other.locktype AS other_locktype,
other.relation::regclass AS other_table,
other_stm.current_query AS other_query,
other.mode AS other_mode,
other.pid AS other_pid,
other.granted AS other_granted
FROM
pg_catalog.pg_locks AS waiting
JOIN
pg_catalog.pg_stat_activity AS waiting_stm
ON (
waiting_stm.procpid = waiting.pid
)
JOIN
pg_catalog.pg_locks AS other
ON (
(
waiting."database" = other."database"
AND waiting.relation = other.relation
)
OR waiting.transactionid = other.transactionid
)
JOIN
pg_catalog.pg_stat_activity AS other_stm
ON (
other_stm.procpid = other.pid
)
WHERE
NOT waiting.granted
AND
waiting.pid <> other.pid;

--

Thanks & Regards,

Raghu Ram

EnterpriseDB Corporation

Blog:http://raghurc.blogspot.in/

#3Brian McNally
bmcnally@uw.edu
In reply to: raghu ram (#2)
Re: Problem running "ALTER TABLE...", ALTER TABLE waiting

Hi Raghu,

I don't get any rows returned back from that query. I'm running it while
connected to the DB in question. Am I supposed to substitute values for
any of the variables in the query?

--
Brian McNally

Show quoted text

On 07/17/2012 07:23 PM, raghu ram wrote:

On Wed, Jul 18, 2012 at 1:24 AM, Brian McNally <bmcnally@uw.edu
<mailto:bmcnally@uw.edu>> wrote:

Hello,

I'm running Postgres 9.0.2 on RHEL 5.6 x86_64 and trying to run the
following statement:

alter table samples add column in_esp1234 text;

When I do this the command just hangs. ps output indicates that it's
waiting:

postgres: postgres exomeSNP [local] ALTER TABLE waiting

When I look to see what might have that table locked I don't see
anything:

select * from pg_locks where relation=(select oid from pg_class
where relname='samples');

I have a nearly identical server where this command works. It seems
like something has this table locked, but I haven't been able to
find it. Any ideas?

Please find more information using below query:

\x

SELECT
waiting.locktype AS waiting_locktype,
waiting.relation::regclass AS waiting_table,
waiting_stm.current_query AS waiting_query,
waiting.mode AS waiting_mode,
waiting.pid AS waiting_pid,
other.locktype AS other_locktype,
other.relation::regclass AS other_table,
other_stm.current_query AS other_query,
other.mode AS other_mode,
other.pid AS other_pid,
other.granted AS other_granted
FROM
pg_catalog.pg_locks AS waiting
JOIN
pg_catalog.pg_stat_activity AS waiting_stm
ON (
waiting_stm.procpid = waiting.pid
)
JOIN
pg_catalog.pg_locks AS other
ON (
(
waiting."database" = other."database"
AND waiting.relation = other.relation
)
OR waiting.transactionid = other.transactionid
)
JOIN
pg_catalog.pg_stat_activity AS other_stm
ON (
other_stm.procpid = other.pid
)
WHERE
NOT waiting.granted
AND
waiting.pid <> other.pid;

--

Thanks & Regards,

Raghu Ram

EnterpriseDB Corporation

Blog:http://raghurc.blogspot.in/

#4Sergey Konoplev
sergey.konoplev@postgresql-consulting.com
In reply to: Brian McNally (#3)
Re: Problem running "ALTER TABLE...", ALTER TABLE waiting

On Wed, Jul 18, 2012 at 9:42 PM, Brian McNally <bmcnally@uw.edu> wrote:

I don't get any rows returned back from that query. I'm running it while
connected to the DB in question. Am I supposed to substitute values for any

Try this one. Are there any COPY queries (current_query) or long
lasting transactions (age)?

SELECT
now() - pg_stat_activity.xact_start AS age, pg_stat_activity.datname,
pg_stat_activity.procpid, pg_stat_activity.usename, pg_stat_activity.waiting,
pg_stat_activity.query_start, pg_stat_activity.client_addr,
pg_stat_activity.client_port, pg_stat_activity.current_query
FROM pg_stat_activity
WHERE pg_stat_activity.xact_start IS NOT NULL
ORDER BY pg_stat_activity.xact_start;

Ensure that your ALTER still hangs before executing this query.

of the variables in the query?

--
Brian McNally

On 07/17/2012 07:23 PM, raghu ram wrote:

On Wed, Jul 18, 2012 at 1:24 AM, Brian McNally <bmcnally@uw.edu
<mailto:bmcnally@uw.edu>> wrote:

Hello,

I'm running Postgres 9.0.2 on RHEL 5.6 x86_64 and trying to run the
following statement:

alter table samples add column in_esp1234 text;

When I do this the command just hangs. ps output indicates that it's
waiting:

postgres: postgres exomeSNP [local] ALTER TABLE waiting

When I look to see what might have that table locked I don't see
anything:

select * from pg_locks where relation=(select oid from pg_class
where relname='samples');

I have a nearly identical server where this command works. It seems
like something has this table locked, but I haven't been able to
find it. Any ideas?

Please find more information using below query:

\x

SELECT
waiting.locktype AS waiting_locktype,
waiting.relation::regclass AS waiting_table,
waiting_stm.current_query AS waiting_query,
waiting.mode AS waiting_mode,
waiting.pid AS waiting_pid,
other.locktype AS other_locktype,
other.relation::regclass AS other_table,
other_stm.current_query AS other_query,
other.mode AS other_mode,
other.pid AS other_pid,
other.granted AS other_granted
FROM
pg_catalog.pg_locks AS waiting
JOIN
pg_catalog.pg_stat_activity AS waiting_stm
ON (
waiting_stm.procpid = waiting.pid
)
JOIN
pg_catalog.pg_locks AS other
ON (
(
waiting."database" = other."database"
AND waiting.relation = other.relation
)
OR waiting.transactionid = other.transactionid
)
JOIN
pg_catalog.pg_stat_activity AS other_stm
ON (
other_stm.procpid = other.pid
)
WHERE
NOT waiting.granted
AND
waiting.pid <> other.pid;

--

Thanks & Regards,

Raghu Ram

EnterpriseDB Corporation

Blog:http://raghurc.blogspot.in/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Sergey Konoplev

a database architect, software developer at PostgreSQL-Consulting.com
http://www.postgresql-consulting.com

Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +79160686204

#5Sergey Konoplev
sergey.konoplev@postgresql-consulting.com
In reply to: Brian McNally (#1)
Re: Problem running "ALTER TABLE...", ALTER TABLE waiting

Hi Brian,

On Wed, Aug 1, 2012 at 10:21 PM, Brian McNally <bmcnally@uw.edu> wrote:

I was able to upgrade Postgres to 9.0.8 today and the ALTER TABLE... command
still hangs. Stracing the hung PID doesn't reveal much:

[root@gvsdb-dev ~]# strace -fp 13107
Process 13107 attached - interrupt to quit
semop(843382828, 0x7fffd9671ab0, 1

ps still shows the same waiting process:

postgres 13107 13067 0 11:14 ? 00:00:00 postgres: postgres exomeSNP
[local] ALTER TABLE waiting

Can you attach to the hanging process with gdb and show a backtrace?

http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD

--
Brian McNally

On 07/20/2012 12:06 AM, Sergey Konoplev wrote:

On Fri, Jul 20, 2012 at 10:42 AM, Brian McNally <bmcnally@uw.edu> wrote:

Thanks for the help. I don't get any results from that query either
though.

Okay, it looks like a bug for me.

What I would do is to upgrade Pg to the latest minor release 9.0.8.
Probably this issue has already been solved.

If it will not help run strace -p <hanging_pid> and show its output.

--
Sergey Konoplev

a database architect, software developer at PostgreSQL-Consulting.com
http://www.postgresql-consulting.com

Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +79160686204

#6Brian McNally
bmcnally@uw.edu
In reply to: Sergey Konoplev (#5)
Re: Problem running "ALTER TABLE...", ALTER TABLE waiting

It took me a little while but here it is. Let me know if this is helpful
or not. I'm not sure if I need more -debuginfo packages installed:

===
[root@gvsdb-dev tmp]# gdb /usr/pgsql-9.0/bin/postmaster 1160
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-32.el5_6.2)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html&gt;
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/&gt;...
Reading symbols from /usr/pgsql-9.0/bin/postmaster...Reading symbols
from /usr/lib/debug/usr/pgsql-9.0/bin/postgres.debug...done.
done.
Attaching to program: /usr/pgsql-9.0/bin/postmaster, process 1160
Reading symbols from /usr/lib64/libxslt.so.1...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib64/libxslt.so.1
Reading symbols from /usr/lib64/libxml2.so.2...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib64/libxml2.so.2
Reading symbols from /lib64/libpam.so.0...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libpam.so.0
Reading symbols from /lib64/libssl.so.6...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libssl.so.6
Reading symbols from /lib64/libcrypto.so.6...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libcrypto.so.6
Reading symbols from /usr/lib64/libgssapi_krb5.so.2...(no debugging
symbols found)...done.
Loaded symbols for /usr/lib64/libgssapi_krb5.so.2
Reading symbols from /lib64/libcrypt.so.1...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libcrypt.so.1
Reading symbols from /lib64/libdl.so.2...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /usr/lib64/libldap-2.3.so.0...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib64/libldap-2.3.so.0
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /usr/lib64/libkrb5.so.3...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib64/libkrb5.so.3
Reading symbols from /lib64/libcom_err.so.2...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libcom_err.so.2
Reading symbols from /usr/lib64/libz.so.1...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib64/libz.so.1
Reading symbols from /lib64/libaudit.so.0...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libaudit.so.0
Reading symbols from /usr/lib64/libk5crypto.so.3...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib64/libk5crypto.so.3
Reading symbols from /usr/lib64/libkrb5support.so.0...(no debugging
symbols found)...done.
Loaded symbols for /usr/lib64/libkrb5support.so.0
Reading symbols from /lib64/libkeyutils.so.1...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libkeyutils.so.1
Reading symbols from /lib64/libresolv.so.2...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libresolv.so.2
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols
found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /usr/lib64/liblber-2.3.so.0...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib64/liblber-2.3.so.0
Reading symbols from /usr/lib64/libsasl2.so.2...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib64/libsasl2.so.2
Reading symbols from /lib64/libselinux.so.1...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libselinux.so.1
Reading symbols from /lib64/libsepol.so.1...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libsepol.so.1

warning: no loadable sections found in added symbol-file system-supplied
DSO at 0x7fffd96e4000
0x000000378f8d5497 in semop () from /lib64/libc.so.6
(gdb) bt
#0 0x000000378f8d5497 in semop () from /lib64/libc.so.6
#1 0x00000000005bc1c3 in PGSemaphoreLock (sema=0x2b1e695789e8,
interruptOK=1 '\001') at pg_sema.c:420
#2 0x00000000005ec8a1 in ProcSleep (locallock=0x95e00d0,
lockMethodTable=<value optimized out>) at proc.c:973
#3 0x00000000005eb45c in WaitOnLock (locallock=0x95e00d0,
owner=0x95573b0) at lock.c:1223
#4 0x00000000005ebb8c in LockAcquireExtended (locktag=0x7fffd9671d40,
lockmode=8, sessionLock=<value optimized out>, dontWait=0 '\000',
reportMemoryError=1 '\001') at lock.c:848
#5 0x00000000005e988b in LockRelationOid (relid=17211, lockmode=8) at
lmgr.c:79
#6 0x0000000000467ee5 in relation_open (relationId=17211,
lockmode=-647554384) at heapam.c:906
#7 0x00000000004f057c in transformAlterTableStmt (stmt=0x9558c70,
queryString=0x95e3310 "alter table samples add column esp_race text;")
at parse_utilcmd.c:1948
#8 0x00000000005fae4c in standard_ProcessUtility (parsetree=0x95e3f48,
queryString=0x95e3310 "alter table samples add column esp_race
text;", params=0x0, isTopLevel=1 '\001', dest=0x95e4288,
completionTag=0x7fffd9672110 "") at utility.c:706
#9 0x00000000005f81e9 in PortalRunUtility (portal=0x9636c20,
utilityStmt=0x95e3f48, isTopLevel=1 '\001', dest=0x95e4288,
completionTag=0x7fffd9672110 "") at pquery.c:1191
#10 0x00000000005f9228 in PortalRunMulti (portal=0x9636c20, isTopLevel=1
'\001', dest=0x95e4288, altdest=0x95e4288,
completionTag=0x7fffd9672110 "") at pquery.c:1296
#11 0x00000000005f9c45 in PortalRun (portal=0x9636c20,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x95e4288,
altdest=0x95e4288,
completionTag=0x7fffd9672110 "") at pquery.c:822
#12 0x00000000005f6745 in exec_simple_query (query_string=0x95e3310
"alter table samples add column esp_race text;") at postgres.c:1060
#13 0x00000000005f6ff4 in PostgresMain (argc=<value optimized out>,
argv=<value optimized out>, username=<value optimized out>)
at postgres.c:3978
#14 0x00000000005c6e35 in ServerLoop () at postmaster.c:3565
#15 0x00000000005c7b3c in PostmasterMain (argc=5, argv=0x951dbb0) at
postmaster.c:1097
#16 0x00000000005714be in main (argc=5, argv=<value optimized out>) at
main.c:188
===

--
Brian McNally

Show quoted text

On 08/02/2012 05:57 AM, Sergey Konoplev wrote:

Hi Brian,

On Wed, Aug 1, 2012 at 10:21 PM, Brian McNally <bmcnally@uw.edu> wrote:

I was able to upgrade Postgres to 9.0.8 today and the ALTER TABLE... command
still hangs. Stracing the hung PID doesn't reveal much:

[root@gvsdb-dev ~]# strace -fp 13107
Process 13107 attached - interrupt to quit
semop(843382828, 0x7fffd9671ab0, 1

ps still shows the same waiting process:

postgres 13107 13067 0 11:14 ? 00:00:00 postgres: postgres exomeSNP
[local] ALTER TABLE waiting

Can you attach to the hanging process with gdb and show a backtrace?

http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD

--
Brian McNally

On 07/20/2012 12:06 AM, Sergey Konoplev wrote:

On Fri, Jul 20, 2012 at 10:42 AM, Brian McNally <bmcnally@uw.edu> wrote:

Thanks for the help. I don't get any results from that query either
though.

Okay, it looks like a bug for me.

What I would do is to upgrade Pg to the latest minor release 9.0.8.
Probably this issue has already been solved.

If it will not help run strace -p <hanging_pid> and show its output.

#7Sergey Konoplev
sergey.konoplev@postgresql-consulting.com
In reply to: Brian McNally (#6)
Re: Problem running "ALTER TABLE...", ALTER TABLE waiting

On Wed, Aug 8, 2012 at 3:03 AM, Brian McNally <bmcnally@uw.edu> wrote:

[root@gvsdb-dev tmp]# gdb /usr/pgsql-9.0/bin/postmaster 1160
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-32.el5_6.2)
(gdb) bt
#0 0x000000378f8d5497 in semop () from /lib64/libc.so.6
#1 0x00000000005bc1c3 in PGSemaphoreLock (sema=0x2b1e695789e8,
interruptOK=1 '\001') at pg_sema.c:420
#2 0x00000000005ec8a1 in ProcSleep (locallock=0x95e00d0,
lockMethodTable=<value optimized out>) at proc.c:973
#3 0x00000000005eb45c in WaitOnLock (locallock=0x95e00d0, owner=0x95573b0)
at lock.c:1223
#4 0x00000000005ebb8c in LockAcquireExtended (locktag=0x7fffd9671d40,
lockmode=8, sessionLock=<value optimized out>, dontWait=0 '\000',
reportMemoryError=1 '\001') at lock.c:848
#5 0x00000000005e988b in LockRelationOid (relid=17211, lockmode=8) at
lmgr.c:79
#6 0x0000000000467ee5 in relation_open (relationId=17211,
lockmode=-647554384) at heapam.c:906

What kernel version is on this machine and which one is on the server
where everything works fine? As I understand Red Hat is installed on
both of them, am I correct?

I have found several mentions of similar situations related to a
possible kernel bug.

#7 0x00000000004f057c in transformAlterTableStmt (stmt=0x9558c70,
queryString=0x95e3310 "alter table samples add column esp_race text;")
at parse_utilcmd.c:1948
#8 0x00000000005fae4c in standard_ProcessUtility (parsetree=0x95e3f48,
queryString=0x95e3310 "alter table samples add column esp_race text;",
params=0x0, isTopLevel=1 '\001', dest=0x95e4288,
completionTag=0x7fffd9672110 "") at utility.c:706
#9 0x00000000005f81e9 in PortalRunUtility (portal=0x9636c20,
utilityStmt=0x95e3f48, isTopLevel=1 '\001', dest=0x95e4288,
completionTag=0x7fffd9672110 "") at pquery.c:1191
#10 0x00000000005f9228 in PortalRunMulti (portal=0x9636c20, isTopLevel=1
'\001', dest=0x95e4288, altdest=0x95e4288,
completionTag=0x7fffd9672110 "") at pquery.c:1296
#11 0x00000000005f9c45 in PortalRun (portal=0x9636c20,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x95e4288,
altdest=0x95e4288,
completionTag=0x7fffd9672110 "") at pquery.c:822
#12 0x00000000005f6745 in exec_simple_query (query_string=0x95e3310 "alter
table samples add column esp_race text;") at postgres.c:1060
#13 0x00000000005f6ff4 in PostgresMain (argc=<value optimized out>,
argv=<value optimized out>, username=<value optimized out>)
at postgres.c:3978
#14 0x00000000005c6e35 in ServerLoop () at postmaster.c:3565
#15 0x00000000005c7b3c in PostmasterMain (argc=5, argv=0x951dbb0) at
postmaster.c:1097
#16 0x00000000005714be in main (argc=5, argv=<value optimized out>) at
main.c:188
===

--
Brian McNally

On 08/02/2012 05:57 AM, Sergey Konoplev wrote:

Hi Brian,

On Wed, Aug 1, 2012 at 10:21 PM, Brian McNally <bmcnally@uw.edu> wrote:

I was able to upgrade Postgres to 9.0.8 today and the ALTER TABLE...
command
still hangs. Stracing the hung PID doesn't reveal much:

[root@gvsdb-dev ~]# strace -fp 13107
Process 13107 attached - interrupt to quit
semop(843382828, 0x7fffd9671ab0, 1

ps still shows the same waiting process:

postgres 13107 13067 0 11:14 ? 00:00:00 postgres: postgres
exomeSNP
[local] ALTER TABLE waiting

Can you attach to the hanging process with gdb and show a backtrace?

http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD

--
Brian McNally

On 07/20/2012 12:06 AM, Sergey Konoplev wrote:

On Fri, Jul 20, 2012 at 10:42 AM, Brian McNally <bmcnally@uw.edu> wrote:

Thanks for the help. I don't get any results from that query either
though.

Okay, it looks like a bug for me.

What I would do is to upgrade Pg to the latest minor release 9.0.8.
Probably this issue has already been solved.

If it will not help run strace -p <hanging_pid> and show its output.

--
Sergey Konoplev

a database architect, software developer at PostgreSQL-Consulting.com
http://www.postgresql-consulting.com

Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +79160686204

#8Brian McNally
bmcnally@uw.edu
In reply to: Brian McNally (#1)
Re: Problem running "ALTER TABLE...", ALTER TABLE waiting

Ok, I'm running with all available updates and kernel 2.6.18-308.4.1.el5
but am still having the same problem.

--
Brian McNally

Show quoted text

On 08/07/2012 05:29 PM, Sergey Konoplev wrote:

On Wed, Aug 8, 2012 at 4:27 AM, Brian McNally <bmcnally@uw.edu> wrote:

RHEL 5.7 on both. 2.6.18-274.el5 on the system that works,
2.6.18-238.5.1.el5 on the system that doesn't. I will try upgrading.

Yeah, I think it is worth trying.

--
Brian McNally

On 08/07/2012 05:19 PM, Sergey Konoplev wrote:

On Wed, Aug 8, 2012 at 3:03 AM, Brian McNally <bmcnally@uw.edu> wrote:

[root@gvsdb-dev tmp]# gdb /usr/pgsql-9.0/bin/postmaster 1160
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-32.el5_6.2)
(gdb) bt
#0 0x000000378f8d5497 in semop () from /lib64/libc.so.6
#1 0x00000000005bc1c3 in PGSemaphoreLock (sema=0x2b1e695789e8,
interruptOK=1 '\001') at pg_sema.c:420
#2 0x00000000005ec8a1 in ProcSleep (locallock=0x95e00d0,
lockMethodTable=<value optimized out>) at proc.c:973
#3 0x00000000005eb45c in WaitOnLock (locallock=0x95e00d0,
owner=0x95573b0)
at lock.c:1223
#4 0x00000000005ebb8c in LockAcquireExtended (locktag=0x7fffd9671d40,
lockmode=8, sessionLock=<value optimized out>, dontWait=0 '\000',
reportMemoryError=1 '\001') at lock.c:848
#5 0x00000000005e988b in LockRelationOid (relid=17211, lockmode=8) at
lmgr.c:79
#6 0x0000000000467ee5 in relation_open (relationId=17211,
lockmode=-647554384) at heapam.c:906

What kernel version is on this machine and which one is on the server
where everything works fine? As I understand Red Hat is installed on
both of them, am I correct?

I have found several mentions of similar situations related to a
possible kernel bug.

#7 0x00000000004f057c in transformAlterTableStmt (stmt=0x9558c70,
queryString=0x95e3310 "alter table samples add column esp_race text;")
at parse_utilcmd.c:1948
#8 0x00000000005fae4c in standard_ProcessUtility (parsetree=0x95e3f48,
queryString=0x95e3310 "alter table samples add column esp_race
text;",
params=0x0, isTopLevel=1 '\001', dest=0x95e4288,
completionTag=0x7fffd9672110 "") at utility.c:706
#9 0x00000000005f81e9 in PortalRunUtility (portal=0x9636c20,
utilityStmt=0x95e3f48, isTopLevel=1 '\001', dest=0x95e4288,
completionTag=0x7fffd9672110 "") at pquery.c:1191
#10 0x00000000005f9228 in PortalRunMulti (portal=0x9636c20, isTopLevel=1
'\001', dest=0x95e4288, altdest=0x95e4288,
completionTag=0x7fffd9672110 "") at pquery.c:1296
#11 0x00000000005f9c45 in PortalRun (portal=0x9636c20,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x95e4288,
altdest=0x95e4288,
completionTag=0x7fffd9672110 "") at pquery.c:822
#12 0x00000000005f6745 in exec_simple_query (query_string=0x95e3310
"alter
table samples add column esp_race text;") at postgres.c:1060
#13 0x00000000005f6ff4 in PostgresMain (argc=<value optimized out>,
argv=<value optimized out>, username=<value optimized out>)
at postgres.c:3978
#14 0x00000000005c6e35 in ServerLoop () at postmaster.c:3565
#15 0x00000000005c7b3c in PostmasterMain (argc=5, argv=0x951dbb0) at
postmaster.c:1097
#16 0x00000000005714be in main (argc=5, argv=<value optimized out>) at
main.c:188
===

--
Brian McNally

On 08/02/2012 05:57 AM, Sergey Konoplev wrote:

Hi Brian,

On Wed, Aug 1, 2012 at 10:21 PM, Brian McNally <bmcnally@uw.edu> wrote:

I was able to upgrade Postgres to 9.0.8 today and the ALTER TABLE...
command
still hangs. Stracing the hung PID doesn't reveal much:

[root@gvsdb-dev ~]# strace -fp 13107
Process 13107 attached - interrupt to quit
semop(843382828, 0x7fffd9671ab0, 1

ps still shows the same waiting process:

postgres 13107 13067 0 11:14 ? 00:00:00 postgres: postgres
exomeSNP
[local] ALTER TABLE waiting

Can you attach to the hanging process with gdb and show a backtrace?

http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD

--
Brian McNally

On 07/20/2012 12:06 AM, Sergey Konoplev wrote:

On Fri, Jul 20, 2012 at 10:42 AM, Brian McNally <bmcnally@uw.edu>
wrote:

Thanks for the help. I don't get any results from that query either
though.

Okay, it looks like a bug for me.

What I would do is to upgrade Pg to the latest minor release 9.0.8.
Probably this issue has already been solved.

If it will not help run strace -p <hanging_pid> and show its output.

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Brian McNally (#8)
Re: Problem running "ALTER TABLE...", ALTER TABLE waiting

Brian McNally <bmcnally@uw.edu> writes:

Ok, I'm running with all available updates and kernel 2.6.18-308.4.1.el5
but am still having the same problem.

It's fairly clearly blocked on a lock ... have you looked into the
pg_locks view to see what is holding the lock?

(I'm wondering about a prepared transaction, myself. There isn't much
else that could hold a lock across a server restart.)

regards, tom lane

#10Sergey Konoplev
sergey.konoplev@postgresql-consulting.com
In reply to: Tom Lane (#9)
Re: Problem running "ALTER TABLE...", ALTER TABLE waiting

On Thu, Aug 9, 2012 at 4:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Brian McNally <bmcnally@uw.edu> writes:

Ok, I'm running with all available updates and kernel 2.6.18-308.4.1.el5
but am still having the same problem.

It's fairly clearly blocked on a lock ... have you looked into the
pg_locks view to see what is holding the lock?

There is a hidden part of the conversation. We have already checked
pg_locks and no locks were there. Moreover there were no other
activity except the ALTER. And this was the odd part.

(I'm wondering about a prepared transaction, myself. There isn't much
else that could hold a lock across a server restart.)

Are not they shown by pg_locks?

regards, tom lane

--
Sergey Konoplev

a database architect, software developer at PostgreSQL-Consulting.com
http://www.postgresql-consulting.com

Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +79160686204

#11Brian McNally
bmcnally@uw.edu
In reply to: Sergey Konoplev (#10)
Re: Problem running "ALTER TABLE...", ALTER TABLE waiting

In an interesting twist, we tried to drop the table in question and got
this:

-bash-3.2$ dropdb exomeSNP
dropdb: database removal failed: ERROR: database "exomeSNP" is being
accessed by other users
DETAIL: There are 1 prepared transaction(s) using the database.
-bash-3.2$

So, it seems that maybe there is prepared transaction that could be
holding up the ALTER TABLE. I'm a bit confused though since previous
attempts to find that didn't succeed. Maybe I just haven't used to
correct query yet.

--
Brian McNally

Show quoted text

On 08/09/2012 12:30 AM, Sergey Konoplev wrote:

On Thu, Aug 9, 2012 at 4:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Brian McNally <bmcnally@uw.edu> writes:

Ok, I'm running with all available updates and kernel 2.6.18-308.4.1.el5
but am still having the same problem.

It's fairly clearly blocked on a lock ... have you looked into the
pg_locks view to see what is holding the lock?

There is a hidden part of the conversation. We have already checked
pg_locks and no locks were there. Moreover there were no other
activity except the ALTER. And this was the odd part.

(I'm wondering about a prepared transaction, myself. There isn't much
else that could hold a lock across a server restart.)

Are not they shown by pg_locks?

regards, tom lane