lock AccessShareLock on object 0/1260/0 is already held

Started by Pavel Stehuleabout 13 years ago4 messages
#1Pavel Stehule
pavel.stehule@gmail.com

Hello

What is state of this issue?
http://archives.postgresql.org/pgsql-hackers/2011-09/msg00423.php

We detect this issue in our servers - we use 9.1.6 on Linux

locktype | database | relation | page | tuple | virtualxid |
transactionid | classid | objid | objsubid | virtualtransaction | pid
| mode | granted
----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------------+---------
relation | 5630403 | 1259 | | | |
| | | | 20/0 | 14960 |
AccessShareLock | t

Interesting of this issue - it is blocks DBI login

Exception thrown: DBI
connect('database=xxx;host=yyy;port=5432','username',...) failed:
FATAL: lock AccessShareLock on object 5630403/1259/0 is already
held#012

but I am able to login to this database from command line without problems

There is unfriendly error message - I cannot get information about
statement that exactly raise this fatal state.

The most old message that I found is

Dec 10 15:01:51 cl-pdwh-d01 postgres[32548]: [6-1] (2012-12-10
15:01:51 CET db_hw7ny3roi257xe886fvc5krqgosp6g23)
Dec 10 15:01:51 cl-pdwh-d01 postgres[32548]: [6-2] ERROR: canceling
statement due to user request
Dec 10 15:01:51 cl-pdwh-d01 postgres[32738]: [6-1] (2012-12-10
15:01:51 CET db_hw7ny3roi257xe886fvc5krqgosp6g23)
Dec 10 15:01:51 cl-pdwh-d01 postgres[32738]: [6-2] FATAL: lock
AccessShareLock on object 5630403/1259/0 is already held
Dec 10 15:01:51 cl-pdwh-d01 postgres[32739]: [6-1] (2012-12-10
15:01:51 CET db_hw7ny3roi257xe886fvc5krqgosp6g23)
Dec 10 15:01:51 cl-pdwh-d01 postgres[32739]: [6-2] FATAL: lock
AccessShareLock on object 5630403/1259/0 is already held
Dec 10 15:02:29 cl-pdwh-d01 postgres[957]: [6-1] (2012-12-10 15:02:29
CET db_hw7ny3roi257xe886fvc5krqgosp6g23)
Dec 10 15:02:29 cl-pdwh-d01 postgres[957]: [6-2] FATAL: lock
AccessShareLock on object 5630403/1259/0 is already held

Regards

Pavel Stehule

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#1)
Re: lock AccessShareLock on object 0/1260/0 is already held

Pavel Stehule <pavel.stehule@gmail.com> writes:

What is state of this issue?
http://archives.postgresql.org/pgsql-hackers/2011-09/msg00423.php

AFAICS we never identified the cause. It was pretty clear that
something was failing to clean up shared-memory lock data structures,
but not what that something was. The last productive suggestion was
to add process-exit-time logging of whether unreleased locks remain,
but if Dave ever did that, he didn't report back what he found.

Maybe you could add such logging on your machines.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#2)
Re: lock AccessShareLock on object 0/1260/0 is already held

2013/1/4 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

What is state of this issue?
http://archives.postgresql.org/pgsql-hackers/2011-09/msg00423.php

AFAICS we never identified the cause. It was pretty clear that
something was failing to clean up shared-memory lock data structures,
but not what that something was. The last productive suggestion was
to add process-exit-time logging of whether unreleased locks remain,
but if Dave ever did that, he didn't report back what he found.

probably I am able to find statement, that was canceled as last
"success" statement from our application logs. And probably it will be
LOCK ... or CREATE TABLE AS SELECT

Recheck on session end can help us to drop this leaked locks, but I
don't understand how it can help with finding with finding the cause?

Maybe you could add such logging on your machines.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#3)
1 attachment(s)
Re: lock AccessShareLock on object 0/1260/0 is already held

Hello

2013/1/4 Pavel Stehule <pavel.stehule@gmail.com>:

2013/1/4 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

What is state of this issue?
http://archives.postgresql.org/pgsql-hackers/2011-09/msg00423.php

AFAICS we never identified the cause. It was pretty clear that
something was failing to clean up shared-memory lock data structures,
but not what that something was. The last productive suggestion was
to add process-exit-time logging of whether unreleased locks remain,
but if Dave ever did that, he didn't report back what he found.

probably I am able to find statement, that was canceled as last
"success" statement from our application logs. And probably it will be
LOCK ... or CREATE TABLE AS SELECT

Recheck on session end can help us to drop this leaked locks, but I
don't understand how it can help with finding with finding the cause?

I am sending a patch that removes a described issue

we use this patch on our productions servers with some others diagnostics

+ 	 * GOODDATA
+ 	 *
+ 	 * we would to diagnose how much often is proces startup find
+ 	 * orphaned PGPROCLOCKS. Don't use ereport, because process is
+ 	 * not fully prepared in this moment.
+ 	 */
+ 	{
+ 		int		i;
+ 		bool			fix_orphaned_locks = false;
+
+ 		for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
+ 		{
+ 			if (!SHMQueueEmpty(&(MyProc->myProcLocks[i])))
+ 			{
+ 				fprintf(stderr, "ProcKill: found orphaned PGPROCLOCK proc:%p,
pid:%d\n", MyProc, MyProcPid);
+ 				fix_orphaned_locks = true;
+ 				break;
+ 			}
+ 		}
+
+ 		if (fix_orphaned_locks)
+ 		{
+ 			LockReleaseAll(DEFAULT_LOCKMETHOD, true);
+ 			/* Release transaction-level advisory locks */
+ 			LockReleaseAll(USER_LOCKMETHOD, false);
+ 			fix_orphaned_locks = false;
+
+ 			for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
+ 			{
+ 				if (!SHMQueueEmpty(&(MyProc->myProcLocks[i])))
+ 				{
+ 					fprintf(stderr, "ProcKill: found orphaned PGPROCLOCK proc:%p,
pid:%d again\n", MyProc, MyProcPid);
+ 					fix_orphaned_locks = true;
+ 					break;
+ 				}
+ 			}
+
+ 			if (fix_orphaned_locks)
+ 				fprintf(stderr, "ProcKill: orphaned PGPROCLOCK proc:%p, pid:%d
was removed\n", MyProc, MyProcPid);
+ 			else
+ 				fprintf(stderr, "ProcKill: orphaned PGPROCLOCK proc:%p, pid:%d
was not removed\n", MyProc, MyProcPid);
+ 		}
+ 	}
+

but it dosn't detect and removed any orphaned locks, so this code is
not necessary.

I can't to explain why moving initialization from proces to global
helps, but it just helps.

9.1 will be suppported next four years so maybe this patch can be
usefull for somebody or maybe can be merged

Regards

Pavel

Show quoted text

Maybe you could add such logging on your machines.

regards, tom lane

Attachments:

orphaned-locks-simple.patchapplication/octet-stream; name=orphaned-locks-simple.patchDownload
*** ./src/backend/storage/lmgr/proc.c.orig	2013-02-04 22:28:27.000000000 +0100
--- ./src/backend/storage/lmgr/proc.c	2013-03-15 14:40:33.044812917 +0100
***************
*** 158,164 ****
  InitProcGlobal(void)
  {
  	PGPROC	   *procs;
! 	int			i;
  	bool		found;
  
  	/* Create the ProcGlobal shared structure */
--- 158,165 ----
  InitProcGlobal(void)
  {
  	PGPROC	   *procs;
! 	int			i,
! 				 j;
  	bool		found;
  
  	/* Create the ProcGlobal shared structure */
***************
*** 202,207 ****
--- 203,211 ----
  		InitSharedLatch(&procs[i].procLatch);
  		procs[i].links.next = (SHM_QUEUE *) ProcGlobal->freeProcs;
  		ProcGlobal->freeProcs = &procs[i];
+ 
+ 		for (j = 0; j < NUM_LOCK_PARTITIONS; j++)
+ 			SHMQueueInit(&(procs[i].myProcLocks[j]));
  	}
  
  	/*
***************
*** 221,226 ****
--- 225,233 ----
  		InitSharedLatch(&procs[i].procLatch);
  		procs[i].links.next = (SHM_QUEUE *) ProcGlobal->autovacFreeProcs;
  		ProcGlobal->autovacFreeProcs = &procs[i];
+ 
+ 		for (j = 0; j < NUM_LOCK_PARTITIONS; j++)
+ 			SHMQueueInit(&(procs[i].myProcLocks[j]));
  	}
  
  	/*
***************
*** 232,237 ****
--- 239,247 ----
  		AuxiliaryProcs[i].pid = 0;		/* marks auxiliary proc as not in use */
  		PGSemaphoreCreate(&(AuxiliaryProcs[i].sem));
  		InitSharedLatch(&AuxiliaryProcs[i].procLatch);
+ 
+ 		for (j = 0; j < NUM_LOCK_PARTITIONS; j++)
+ 			SHMQueueInit(&(procs[i].myProcLocks[j]));
  	}
  
  	/* Create ProcStructLock spinlock, too */
***************
*** 247,253 ****
  {
  	/* use volatile pointer to prevent code rearrangement */
  	volatile PROC_HDR *procglobal = ProcGlobal;
- 	int			i;
  
  	/*
  	 * ProcGlobal should be set up already (if we are a backend, we inherit
--- 257,262 ----
***************
*** 337,344 ****
  	MyProc->lwWaitLink = NULL;
  	MyProc->waitLock = NULL;
  	MyProc->waitProcLock = NULL;
- 	for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
- 		SHMQueueInit(&(MyProc->myProcLocks[i]));
  	MyProc->recoveryConflictPending = false;
  
  	/* Initialize fields for sync rep */
--- 346,351 ----