Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

Started by Christian Kruse · about 12 years ago · 28 messages
#1 Christian Kruse
christian@2ndQuadrant.com
4 attachment(s)

Hi there,

I created two patches improving the log messages generated by
log_lock_waits. The first patch shows the process IDs holding a lock
we try to acquire (show_pids_in_lock_log.patch), sample output
(log_lock_waits=on required):

session 1: BEGIN; LOCK TABLE foo IN SHARE MODE;
session 2: BEGIN; LOCK TABLE foo IN SHARE MODE;
session 3: BEGIN; LOCK TABLE foo IN EXCLUSIVE MODE;

Output w/o patch:

LOG: process 13777 still waiting for ExclusiveLock on relation 16385 of database 16384 after 1000.030 ms

Output with patch:

LOG: process 13777 still waiting for ExclusiveLock on relation 16385 of database 16384 after 1000.030 ms
CONTEXT: processes owning lock: 13775, 13776

The second patch (show_table_name_and_tuple_in_lock_log.patch)
includes relation info (table name and OID) as well as some tuple
information (if available). Sample output (log_lock_waits=on required):

session 1:
CREATE TABLE foo (val integer);
INSERT INTO foo (val) VALUES (1);
BEGIN;
UPDATE foo SET val = 3;

session 2:
BEGIN;
UPDATE foo SET val = 2;

Output w/o patch:

LOG: process 24774 acquired ShareLock on transaction 696 after 11688.720 ms

Output with patch:

LOG: process 24774 acquired ShareLock on transaction 696 after 11688.720 ms
CONTEXT: relation name: foo (OID 16385)
tuple (ctid (0,1)): (1)

Regarding this patch I am not really sure where to put the
functions. Currently they are located in backend/storage/lmgr/lmgr.c
because XactLockTableWait() is located there, too. What do you think?

I also created two test specs for easy creation of the log output;
however, I was not able to provide an expected file since the process
IDs vary from test run to test run.
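
For illustration, a minimal isolation spec along these lines might look
like this sketch (step and table names are illustrative and not
necessarily those used in the attached spec files):

# provoke a log_lock_waits message; the PIDs in the output are not stable
setup    { CREATE TABLE foo (val integer); }
teardown { DROP TABLE foo; }

session "s1"
step "s1_begin"  { BEGIN; }
step "s1_share"  { LOCK TABLE foo IN SHARE MODE; }
step "s1_commit" { COMMIT; }

session "s2"
step "s2_begin"  { BEGIN; }
step "s2_excl"   { LOCK TABLE foo IN EXCLUSIVE MODE; }
step "s2_commit" { COMMIT; }

permutation "s1_begin" "s1_share" "s2_begin" "s2_excl" "s1_commit" "s2_commit"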

Regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

show_pids_in_lock_log.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 122afb2..fdfacdf 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -1195,13 +1195,19 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 		 */
 		if (log_lock_waits && deadlock_state != DS_NOT_YET_CHECKED)
 		{
-			StringInfoData buf;
+			StringInfoData buf,
+						buf1;
 			const char *modename;
 			long		secs;
 			int			usecs;
 			long		msecs;
+			SHM_QUEUE  *procLocks;
+			PROCLOCK   *proclock;
+			bool		first = true;
+			int			lockHoldersNum = 0;
 
 			initStringInfo(&buf);
+			initStringInfo(&buf1);
 			DescribeLockTag(&buf, &locallock->tag.lock);
 			modename = GetLockmodeName(locallock->tag.lock.locktag_lockmethodid,
 									   lockmode);
@@ -1211,10 +1217,42 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			msecs = secs * 1000 + usecs / 1000;
 			usecs = usecs % 1000;
 
+			LWLockAcquire(partitionLock, LW_SHARED);
+
+			procLocks = &(lock->procLocks);
+			proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
+											   offsetof(PROCLOCK, lockLink));
+
+			while (proclock)
+			{
+				if (MyProc->pid != proclock->tag.myProc->pid)
+				{
+					if (first)
+					{
+						appendStringInfo(&buf1, "%d",
+										 proclock->tag.myProc->pid);
+						first = false;
+					}
+					else
+						appendStringInfo(&buf1, ", %d",
+										 proclock->tag.myProc->pid);
+
+					lockHoldersNum++;
+				}
+
+				proclock = (PROCLOCK *) SHMQueueNext(procLocks, &proclock->lockLink,
+											   offsetof(PROCLOCK, lockLink));
+			}
+
+			LWLockRelease(partitionLock);
+
 			if (deadlock_state == DS_SOFT_DEADLOCK)
 				ereport(LOG,
 						(errmsg("process %d avoided deadlock for %s on %s by rearranging queue order after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errcontext(ngettext("process owning lock: %s",
+											  "processes owning lock: %s",
+											  lockHoldersNum), buf1.data))));
 			else if (deadlock_state == DS_HARD_DEADLOCK)
 			{
 				/*
@@ -1226,13 +1264,19 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				 */
 				ereport(LOG,
 						(errmsg("process %d detected deadlock while waiting for %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errcontext(ngettext("process owning lock: %s",
+											  "processes owning lock: %s",
+											  lockHoldersNum), buf1.data))));
 			}
 
 			if (myWaitStatus == STATUS_WAITING)
 				ereport(LOG,
 						(errmsg("process %d still waiting for %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errcontext(ngettext("process owning lock: %s",
+											  "processes owning lock: %s",
+											  lockHoldersNum), buf1.data))));
 			else if (myWaitStatus == STATUS_OK)
 				ereport(LOG,
 					(errmsg("process %d acquired %s on %s after %ld.%03d ms",
@@ -1252,7 +1296,10 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				if (deadlock_state != DS_HARD_DEADLOCK)
 					ereport(LOG,
 							(errmsg("process %d failed to acquire %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+							 (errcontext(ngettext("process owning lock: %s",
+												  "processes owning lock: %s",
+											  lockHoldersNum), buf1.data))));
 			}
 
 			/*
@@ -1262,6 +1309,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			deadlock_state = DS_NO_DEADLOCK;
 
 			pfree(buf.data);
+			pfree(buf1.data);
 		}
 	} while (myWaitStatus == STATUS_WAITING);
 
show_table_name_and_tuple_in_lock_log.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index f8545c1..cfa49c2 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2702,9 +2702,11 @@ l1:
 
 		if (infomask & HEAP_XMAX_IS_MULTI)
 		{
+			XactLockTableWaitSetupErrorContextCallback(relation, &tp);
 			/* wait for multixact */
 			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate,
 							NULL, infomask);
+			XactLockTableWaitCleanupErrorContextCallback();
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -2730,7 +2732,10 @@ l1:
 		else
 		{
 			/* wait for regular transaction to end */
+			XactLockTableWaitSetupErrorContextCallback(relation, &tp);
+
 			XactLockTableWait(xwait);
+			XactLockTableWaitCleanupErrorContextCallback();
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3254,9 +3259,11 @@ l2:
 			TransactionId update_xact;
 			int			remain;
 
+			XactLockTableWaitSetupErrorContextCallback(relation, &oldtup);
 			/* wait for multixact */
 			MultiXactIdWait((MultiXactId) xwait, mxact_status, &remain,
 							infomask);
+			XactLockTableWaitCleanupErrorContextCallback();
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3330,7 +3337,10 @@ l2:
 			else
 			{
 				/* wait for regular transaction to end */
+				XactLockTableWaitSetupErrorContextCallback(relation, &oldtup);
+
 				XactLockTableWait(xwait);
+				XactLockTableWaitCleanupErrorContextCallback();
 				LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 				/*
@@ -4398,7 +4408,11 @@ l3:
 										RelationGetRelationName(relation))));
 				}
 				else
+				{
+					XactLockTableWaitSetupErrorContextCallback(relation, tuple);
 					MultiXactIdWait((MultiXactId) xwait, status, NULL, infomask);
+					XactLockTableWaitCleanupErrorContextCallback();
+				}
 
 				/* if there are updates, follow the update chain */
 				if (follow_updates &&
@@ -4453,7 +4467,11 @@ l3:
 										RelationGetRelationName(relation))));
 				}
 				else
+				{
+					XactLockTableWaitSetupErrorContextCallback(relation, tuple);
 					XactLockTableWait(xwait);
+					XactLockTableWaitCleanupErrorContextCallback();
+				}
 
 				/* if there are updates, follow the update chain */
 				if (follow_updates &&
@@ -5139,9 +5157,14 @@ l4:
 
 					if (needwait)
 					{
+						XactLockTableWaitSetupErrorContextCallback(rel, &mytup);
+
 						LockBuffer(buf, BUFFER_LOCK_UNLOCK);
 						XactLockTableWait(members[i].xid);
 						pfree(members);
+
+						XactLockTableWaitCleanupErrorContextCallback();
+
 						goto l4;
 					}
 					if (res != HeapTupleMayBeUpdated)
@@ -5199,8 +5222,13 @@ l4:
 												 &needwait);
 				if (needwait)
 				{
+					XactLockTableWaitSetupErrorContextCallback(rel, &mytup);
+
 					LockBuffer(buf, BUFFER_LOCK_UNLOCK);
 					XactLockTableWait(rawxmax);
+
+					XactLockTableWaitCleanupErrorContextCallback();
+
 					goto l4;
 				}
 				if (res != HeapTupleMayBeUpdated)
diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c
index a452fea..7f79615 100644
--- a/src/backend/access/nbtree/nbtinsert.c
+++ b/src/backend/access/nbtree/nbtinsert.c
@@ -165,7 +165,11 @@ top:
 		{
 			/* Have to wait for the other guy ... */
 			_bt_relbuf(rel, buf);
+
+			XactLockTableWaitSetupErrorContextCallback(rel, NULL);
 			XactLockTableWait(xwait);
+			XactLockTableWaitCleanupErrorContextCallback();
+
 			/* start over... */
 			_bt_freestack(stack);
 			goto top;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 7ad9720..6e80e37 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2297,7 +2297,11 @@ IndexBuildHeapScan(Relation heapRelation,
 							 * Must drop the lock on the buffer before we wait
 							 */
 							LockBuffer(scan->rs_cbuf, BUFFER_LOCK_UNLOCK);
+
+							XactLockTableWaitSetupErrorContextCallback(heapRelation, heapTuple);
 							XactLockTableWait(xwait);
+							XactLockTableWaitCleanupErrorContextCallback();
+
 							goto recheck;
 						}
 					}
@@ -2343,7 +2347,11 @@ IndexBuildHeapScan(Relation heapRelation,
 							 * Must drop the lock on the buffer before we wait
 							 */
 							LockBuffer(scan->rs_cbuf, BUFFER_LOCK_UNLOCK);
+
+							XactLockTableWaitSetupErrorContextCallback(heapRelation, heapTuple);
 							XactLockTableWait(xwait);
+							XactLockTableWaitCleanupErrorContextCallback();
+
 							goto recheck;
 						}
 
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 6be17a9..2ee6f51 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1981,7 +1981,11 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
 			if (TransactionIdIsValid(SnapshotDirty.xmax))
 			{
 				ReleaseBuffer(buffer);
+
+				XactLockTableWaitSetupErrorContextCallback(relation, &tuple);
 				XactLockTableWait(SnapshotDirty.xmax);
+				XactLockTableWaitCleanupErrorContextCallback();
+
 				continue;		/* loop back to repeat heap_fetch */
 			}
 
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 39e3b2e..65aca6e 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1307,7 +1307,11 @@ retry:
 		if (TransactionIdIsValid(xwait))
 		{
 			index_endscan(index_scan);
+
+			XactLockTableWaitSetupErrorContextCallback(heap, tup);
 			XactLockTableWait(xwait);
+			XactLockTableWaitCleanupErrorContextCallback();
+
 			goto retry;
 		}
 
diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c
index a978172..74eddab 100644
--- a/src/backend/storage/lmgr/lmgr.c
+++ b/src/backend/storage/lmgr/lmgr.c
@@ -18,12 +18,14 @@
 #include "access/subtrans.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "access/htup_details.h"
 #include "catalog/catalog.h"
 #include "miscadmin.h"
 #include "storage/lmgr.h"
 #include "storage/procarray.h"
 #include "utils/inval.h"
-
+#include "utils/lsyscache.h"
+#include "mb/pg_wchar.h"
 
 /*
  * RelationInitLockInfo
@@ -533,6 +535,142 @@ ConditionalXactLockTableWait(TransactionId xid)
 	return true;
 }
 
+static struct {
+	Relation rel;
+	HeapTuple tuple;
+} XactLockTableWaitLockInfos;
+ErrorContextCallback XactLockTableWaitErrorContext;
+
+/*
+ * XactLockTableWaitErrorContextCallback
+ *		error context callback set up by
+ *		XactLockTableWaitSetupErrorContextCallback. It reports some
+ *		tuple information and the relation of a lock to acquire
+ *
+ */
+static void
+XactLockTableWaitErrorContextCallback(void *arg)
+{
+	StringInfoData buf;
+	int i;
+	bool write_comma = false;
+	int maxfieldlen = 30;
+
+	if (XactLockTableWaitLockInfos.rel != NULL)
+	{
+		errcontext("relation name: %s (OID %u)",
+				   RelationGetRelationName(XactLockTableWaitLockInfos.rel),
+				   RelationGetRelid(XactLockTableWaitLockInfos.rel));
+	}
+
+	if (XactLockTableWaitLockInfos.tuple != NULL)
+	{
+		/*
+		 * Can't produce an error message including the tuple if we're in a
+		 * possibly aborted transaction state; database access might not be safe.
+		 */
+		if (geterrlevel() >= ERROR)
+		{
+			errcontext("tuple ctid (%u,%u)",
+					   BlockIdGetBlockNumber(&(XactLockTableWaitLockInfos.tuple->t_self.ip_blkid)),
+					   XactLockTableWaitLockInfos.tuple->t_self.ip_posid);
+		}
+		else
+		{
+			TupleDesc desc = RelationGetDescr(XactLockTableWaitLockInfos.rel);
+			Datum *values = palloc(desc->natts * sizeof(*values));
+			bool *nulls = palloc(desc->natts * sizeof(*nulls));
+
+			heap_deform_tuple(XactLockTableWaitLockInfos.tuple, desc, values, nulls);
+
+			initStringInfo(&buf);
+			appendStringInfoChar(&buf, '(');
+
+			for (i = 0; i < desc->natts; i++)
+			{
+				char *val;
+				int vallen;
+
+				if (nulls[i])
+					val = "null";
+				else
+				{
+					Oid foutoid;
+					bool typisvarlena;
+
+					getTypeOutputInfo(desc->attrs[i]->atttypid,
+									  &foutoid, &typisvarlena);
+
+					val = OidOutputFunctionCall(foutoid, values[i]);
+				}
+
+				if (write_comma)
+					appendStringInfoString(&buf, ", ");
+				else
+					write_comma = true;
+
+				vallen = strlen(val);
+				if (vallen <= maxfieldlen)
+					appendStringInfoString(&buf, val);
+				else
+				{
+					vallen = pg_mbcliplen(val, vallen, maxfieldlen);
+					appendBinaryStringInfo(&buf, val, vallen);
+					appendStringInfoString(&buf, "...");
+				}
+			}
+
+			appendStringInfoChar(&buf, ')');
+
+			errcontext("tuple (ctid (%u,%u)): %s",
+					   BlockIdGetBlockNumber(&(XactLockTableWaitLockInfos.tuple->t_self.ip_blkid)),
+					   XactLockTableWaitLockInfos.tuple->t_self.ip_posid, buf.data);
+
+			pfree(buf.data);
+			pfree(values);
+			pfree(nulls);
+		}
+	}
+}
+
+/*
+ * XactLockTableWaitSetupErrorContextCallback
+ *		Sets up the error context callback for reporting tuple
+ *		information and relation for a lock to acquire
+ *
+ * Use this before calling XactLockTableWait() or MultiXactIdWait()
+ */
+void
+XactLockTableWaitSetupErrorContextCallback(Relation rel, HeapTuple tuple)
+{
+	/*
+	 * ensure that we haven't already set up the same callback
+	 */
+	Assert(XactLockTableWaitLockInfos.rel == NULL);
+
+	XactLockTableWaitLockInfos.rel = rel;
+	XactLockTableWaitLockInfos.tuple = tuple;
+
+	XactLockTableWaitErrorContext.callback = XactLockTableWaitErrorContextCallback;
+	XactLockTableWaitErrorContext.previous = error_context_stack;
+	error_context_stack = &XactLockTableWaitErrorContext;
+}
+
+/*
+ * XactLockTableWaitCleanupErrorContextCallback
+ *		Removes the error callback context
+ *
+ * Use this after calling XactLockTableWait() or MultiXactIdWait()
+ */
+void
+XactLockTableWaitCleanupErrorContextCallback(void)
+{
+	error_context_stack = error_context_stack->previous;
+
+	XactLockTableWaitLockInfos.rel = NULL;
+	XactLockTableWaitLockInfos.tuple = NULL;
+}
+
 /*
  * WaitForLockersMultiple
  *		Wait until no transaction holds locks that conflict with the given
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 65eb3bd..16e0410 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1204,6 +1204,23 @@ geterrposition(void)
 }
 
 /*
+ * geterrlevel --- return the currently set error level
+ *
+ * This is only intended for use in error callback subroutines, since there
+ * is no other place outside elog.c where the concept is meaningful.
+ */
+int
+geterrlevel(void)
+{
+	ErrorData  *edata = &errordata[errordata_stack_depth];
+
+	/* we don't bother incrementing recursion_depth */
+	CHECK_STACK_DEPTH();
+
+	return edata->elevel;
+}
+
+/*
  * getinternalerrposition --- same for internal error position
  *
  * This is only intended for use in error callback subroutines, since there
diff --git a/src/include/storage/lmgr.h b/src/include/storage/lmgr.h
index 1a8c018..ff97888 100644
--- a/src/include/storage/lmgr.h
+++ b/src/include/storage/lmgr.h
@@ -18,6 +18,7 @@
 #include "storage/itemptr.h"
 #include "storage/lock.h"
 #include "utils/rel.h"
+#include "access/htup.h"
 
 
 extern void RelationInitLockInfo(Relation relation);
@@ -57,6 +58,9 @@ extern void XactLockTableDelete(TransactionId xid);
 extern void XactLockTableWait(TransactionId xid);
 extern bool ConditionalXactLockTableWait(TransactionId xid);
 
+extern void XactLockTableWaitSetupErrorContextCallback(Relation rel, HeapTuple tuple);
+extern void XactLockTableWaitCleanupErrorContextCallback(void);
+
 /* Lock VXIDs, specified by conflicting locktags */
 extern void WaitForLockers(LOCKTAG heaplocktag, LOCKMODE lockmode);
 extern void WaitForLockersMultiple(List *locktags, LOCKMODE lockmode);
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 59d6f5e..3b627c4 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -224,6 +224,7 @@ extern int	err_generic_string(int field, const char *str);
 
 extern int	geterrcode(void);
 extern int	geterrposition(void);
+extern int  geterrlevel(void);
 extern int	getinternalerrposition(void);
 
 
lock-log-contains-pids.spec (text/plain; charset=us-ascii)
lock-log-contains-tuple-and-relation.spec (text/plain; charset=us-ascii)
#2 Simon Riggs
simon@2ndQuadrant.com
In reply to: Christian Kruse (#1)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

On 30 December 2013 19:52, Christian Kruse <christian@2ndquadrant.com> wrote:

I created two patches..

Patches are related but separate, so should be tracked on separate
threads. Please add them to the CF app also.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#3 Christian Kruse
christian@2ndquadrant.com
In reply to: Simon Riggs (#2)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

Hi,

On 31/12/13 08:48, Simon Riggs wrote:

I created two patches..

Patches are related but separate, so should be tracked on separate
threads.

[x] Done (in <20131231091244.GB25649@defunct.ch>)

Please add them to the CF app also.

[x] Done. I modified the existing commitfest entry
(<https://commitfest.postgresql.org/action/patch_view?id=1350>) and
the second, new one is located here:

<https://commitfest.postgresql.org/action/patch_view?id=1351>

Regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#4 Rajeev rastogi
rajeev.rastogi@huawei.com
In reply to: Christian Kruse (#1)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

On 31st December 2013, Christian Kruse wrote:

Hi there,

I created two patches improving the log messages generated by
log_lock_waits. The first patch shows the process IDs holding a lock we
try to acquire (show_pids_in_lock_log.patch), sample output
(log_lock_waits=on required):

session 1: BEGIN; LOCK TABLE foo IN SHARE MODE;
session 2: BEGIN; LOCK TABLE foo IN SHARE MODE;
session 3: BEGIN; LOCK TABLE foo IN EXCLUSIVE MODE;

Output w/o patch:

LOG: process 13777 still waiting for ExclusiveLock on relation 16385
of database 16384 after 1000.030 ms

Output with patch:

LOG: process 13777 still waiting for ExclusiveLock on relation 16385
of database 16384 after 1000.030 ms
CONTEXT: processes owning lock: 13775, 13776

I am reviewing this patch. The idea seems reasonable.
Following are my first-level observations:

1. I found an issue in the following scenario:

session 1 with process id X: BEGIN; LOCK TABLE foo IN SHARE MODE;
session 2 with process id Y: BEGIN; LOCK TABLE foo IN EXCLUSIVE MODE;
session 3 with process id Z: BEGIN; LOCK TABLE foo IN SHARE MODE;

On execution of LOCK in session 3, the log will display:
processes owning lock: X, Y
But Y does not actually own the lock yet; it is still waiting, just with higher priority.
This may mislead the user. Maybe we should change the message to convey the full picture,
i.e. which processes own the lock and which processes are already queued.

2. Can we give a better name to the new variable 'buf1'?

3. Do we need to take performance readings to see if there is any impact?

4. Do we require documentation?

Thanks and Regards,
Kumar Rajeev Rastogi


#5 Simon Riggs
simon@2ndQuadrant.com
In reply to: Rajeev rastogi (#4)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

On 22 January 2014 04:42, Rajeev rastogi <rajeev.rastogi@huawei.com> wrote:

On 31st December 2013, Christian Kruse Wrote:

Hi there,

I created two patches improving the log messages generated by
log_lock_waits. The first patch shows the process IDs holding a lock we
try to acquire (show_pids_in_lock_log.patch), sample output
(log_lock_waits=on required):

session 1: BEGIN; LOCK TABLE foo IN SHARE MODE;
session 2: BEGIN; LOCK TABLE foo IN SHARE MODE;
session 3: BEGIN; LOCK TABLE foo IN EXCLUSIVE MODE;

Output w/o patch:

LOG: process 13777 still waiting for ExclusiveLock on relation 16385
of database 16384 after 1000.030 ms

Output with patch:

LOG: process 13777 still waiting for ExclusiveLock on relation 16385
of database 16384 after 1000.030 ms
CONTEXT: processes owning lock: 13775, 13776

I am reviewing this patch. The idea seems reasonable.
Following are my first-level observations:

1. I found an issue in the following scenario:

session 1 with process id X: BEGIN; LOCK TABLE foo IN SHARE MODE;
session 2 with process id Y: BEGIN; LOCK TABLE foo IN EXCLUSIVE MODE;
session 3 with process id Z: BEGIN; LOCK TABLE foo IN SHARE MODE;

On execution of LOCK in session 3, the log will display:
processes owning lock: X, Y
But Y does not actually own the lock yet; it is still waiting, just with higher priority.
This may mislead the user. Maybe we should change the message to convey the full picture,
i.e. which processes own the lock and which processes are already queued.

Perhaps this?

CONTEXT: lock owner XXXX request queue XXX, XXX, XXX, etc

2. Can we give a better name to the new variable 'buf1'?

3. Do we need to take performance readings to see if there is any impact?

Don't think so. Diagnosing problems will help performance, not hinder it.

4. Do we require documentation?

Don't think so.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#6 Christian Kruse
christian@2ndquadrant.com
In reply to: Simon Riggs (#5)
1 attachment(s)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

Hi,

On 22/01/14 12:40, Simon Riggs wrote:

1. I found an issue in the following scenario:

session 1 with process id X: BEGIN; LOCK TABLE foo IN SHARE MODE;
session 2 with process id Y: BEGIN; LOCK TABLE foo IN EXCLUSIVE MODE;
session 3 with process id Z: BEGIN; LOCK TABLE foo IN SHARE MODE;

On execution of LOCK in session 3, the log will display:
processes owning lock: X, Y
But Y does not actually own the lock yet; it is still waiting, just with higher priority.
This may mislead the user. Maybe we should change the message to convey the full picture,
i.e. which processes own the lock and which processes are already queued.

Perhaps this?

CONTEXT: lock owner XXXX request queue XXX, XXX, XXX, etc

Fixed.

2. Can we give a better name to the new variable 'buf1'?

Fixed.

3. Do we need to take performance readings to see if there is any impact?

Don't think so. Diagnosing problems will help performance, not hinder it.

I agree. And this code path will only get executed when log_lock_waits
= on, which seems to me to be a debugging feature.
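
For anyone reproducing the output, both settings can be changed at run
time (superuser only); a minimal sketch:

-- log a message once a lock has been waited on for deadlock_timeout
SET log_lock_waits = on;
SET deadlock_timeout = '1s';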

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

show_pids_in_lock_log_v2.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index ee6c24c..588c4ef 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -1195,13 +1195,23 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 		 */
 		if (log_lock_waits && deadlock_state != DS_NOT_YET_CHECKED)
 		{
-			StringInfoData buf;
+			StringInfoData buf,
+						lock_waiters_sbuf,
+						lock_holders_sbuf;
 			const char *modename;
 			long		secs;
 			int			usecs;
 			long		msecs;
+			SHM_QUEUE  *procLocks;
+			PROCLOCK   *proclock;
+			bool		first_holder = true,
+						first_waiter = true;
+			int			lockHoldersNum = 0;
 
 			initStringInfo(&buf);
+			initStringInfo(&lock_waiters_sbuf);
+			initStringInfo(&lock_holders_sbuf);
+
 			DescribeLockTag(&buf, &locallock->tag.lock);
 			modename = GetLockmodeName(locallock->tag.lock.locktag_lockmethodid,
 									   lockmode);
@@ -1211,10 +1221,58 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			msecs = secs * 1000 + usecs / 1000;
 			usecs = usecs % 1000;
 
+			LWLockAcquire(partitionLock, LW_SHARED);
+
+			procLocks = &(lock->procLocks);
+			proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
+											   offsetof(PROCLOCK, lockLink));
+
+			while (proclock)
+			{
+				/*
+				 * we are a waiter if myProc->waitProcLock == proclock; we are
+				 * a holder if it is NULL or something different
+				 */
+				if (proclock->tag.myProc->waitProcLock == proclock)
+				{
+					if (first_waiter)
+					{
+						appendStringInfo(&lock_waiters_sbuf, "%d",
+										 proclock->tag.myProc->pid);
+						first_waiter = false;
+					}
+					else
+						appendStringInfo(&lock_waiters_sbuf, ", %d",
+										 proclock->tag.myProc->pid);
+				}
+				else
+				{
+					if (first_holder)
+					{
+						appendStringInfo(&lock_holders_sbuf, "%d",
+										 proclock->tag.myProc->pid);
+						first_holder = false;
+					}
+					else
+						appendStringInfo(&lock_holders_sbuf, ", %d",
+										 proclock->tag.myProc->pid);
+
+					lockHoldersNum++;
+				}
+
+				proclock = (PROCLOCK *) SHMQueueNext(procLocks, &proclock->lockLink,
+											   offsetof(PROCLOCK, lockLink));
+			}
+
+			LWLockRelease(partitionLock);
+
 			if (deadlock_state == DS_SOFT_DEADLOCK)
 				ereport(LOG,
 						(errmsg("process %d avoided deadlock for %s on %s by rearranging queue order after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errcontext(ngettext("process owning lock: %s request queue: %s",
+							   "processes owning lock: %s request queue: %s",
+											  lockHoldersNum), lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			else if (deadlock_state == DS_HARD_DEADLOCK)
 			{
 				/*
@@ -1226,13 +1284,19 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				 */
 				ereport(LOG,
 						(errmsg("process %d detected deadlock while waiting for %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errcontext(ngettext("process owning lock: %s request queue: %s",
+							   "processes owning lock: %s request queue: %s",
+											  lockHoldersNum), lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			}
 
 			if (myWaitStatus == STATUS_WAITING)
 				ereport(LOG,
 						(errmsg("process %d still waiting for %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errcontext(ngettext("process owning lock: %s request queue: %s",
+							   "processes owning lock: %s request queue: %s",
+											  lockHoldersNum), lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			else if (myWaitStatus == STATUS_OK)
 				ereport(LOG,
 					(errmsg("process %d acquired %s on %s after %ld.%03d ms",
@@ -1252,7 +1316,10 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				if (deadlock_state != DS_HARD_DEADLOCK)
 					ereport(LOG,
 							(errmsg("process %d failed to acquire %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+							 (errcontext(ngettext("process owning lock: %s request queue: %s",
+							   "processes owning lock: %s request queue: %s",
+												  lockHoldersNum), lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			}
 
 			/*
@@ -1262,6 +1329,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			deadlock_state = DS_NO_DEADLOCK;
 
 			pfree(buf.data);
+			pfree(lock_holders_sbuf.data);
+			pfree(lock_waiters_sbuf.data);
 		}
 	} while (myWaitStatus == STATUS_WAITING);
 
#7 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Christian Kruse (#6)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

Christian Kruse wrote:

I think this could use some more comments -- for instance at the top of
the while loop, explain what its purpose is.

if (deadlock_state == DS_SOFT_DEADLOCK)
ereport(LOG,
(errmsg("process %d avoided deadlock for %s on %s by rearranging queue order after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errcontext(ngettext("process owning lock: %s request queue: %s",
+							   "processes owning lock: %s request queue: %s",
+											  lockHoldersNum), lock_holders_sbuf.data, lock_waiters_sbuf.data))));

This ngettext() call is repeated four times in the new code, which is a
bit annoying because it's not trivial. I think you could assign the
ngettext() result to a char * at the bottom of the loop, and then use it
in the ereport() calls:

char *errcxt = NULL;

while ( ... )
{
	...
	errcxt = ngettext("processes owning lock: ..");
}

ereport(LOG,
		(errmsg("blah blah"),
		 errcxt != NULL ? errcontext(errcxt) : 0));

That would avoid the repetition.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#7)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

This ngettext() call is repeated four times in the new code, which is a
bit annoying because it's not trivial. I think you could assign the
ngettext() result to a char * at the bottom of the loop, and then use it
in the ereport() calls:

Would that not break the compiler's ability to verify the format codes
in the string? Not to mention make it harder for people to compare
format to arguments, too?

However, the real problem here is that you shouldn't be calling ngettext
manually in an ereport context in the first place. There is
infrastructure in place for that, and this isn't using it.
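
Roughly like this (an illustrative sketch only, using errdetail_plural()
as one example of that infrastructure; whether detail is the right field
here is a separate question):

	ereport(LOG,
			(errmsg("process %d still waiting for %s on %s after %ld.%03d ms",
					MyProcPid, modename, buf.data, msecs, usecs),
			 errdetail_plural("process owning lock: %s request queue: %s",
							  "processes owning lock: %s request queue: %s",
							  lockHoldersNum,
							  lock_holders_sbuf.data, lock_waiters_sbuf.data)));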

regards, tom lane


#9 Christian Kruse
christian@2ndquadrant.com
In reply to: Tom Lane (#8)
1 attachment(s)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

Hi,

attached you will find a new version of the patch containing more
comments.

On 22/01/14 12:00, Tom Lane wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

This ngettext() call is repeated four times in the new code, which is a
bit annoying because it's not trivial. I think you could assign the
ngettext() result to a char * at the bottom of the loop, and then use it
in the ereport() calls:

Would that not break the compiler's ability to verify the format codes
in the string? Not to mention make it harder for people to compare
format to arguments, too?

I agree.

However, the real problem here is that you shouldn't be calling ngettext
manually in an ereport context in the first place. There is
infrastructure in place for that, and this isn't using it.

Fixed in the attached patch. I changed it from calling
errcontext(ngettext()) to calling errdetail_plural().

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

show_pids_in_lock_log_v3.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 122afb2..552c5a4 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -1195,13 +1195,23 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 		 */
 		if (log_lock_waits && deadlock_state != DS_NOT_YET_CHECKED)
 		{
-			StringInfoData buf;
+			StringInfoData buf,
+						lock_waiters_sbuf,
+						lock_holders_sbuf;
 			const char *modename;
 			long		secs;
 			int			usecs;
 			long		msecs;
+			SHM_QUEUE  *procLocks;
+			PROCLOCK   *proclock;
+			bool		first_holder = true,
+						first_waiter = true;
+			int			lockHoldersNum = 0;
 
 			initStringInfo(&buf);
+			initStringInfo(&lock_waiters_sbuf);
+			initStringInfo(&lock_holders_sbuf);
+
 			DescribeLockTag(&buf, &locallock->tag.lock);
 			modename = GetLockmodeName(locallock->tag.lock.locktag_lockmethodid,
 									   lockmode);
@@ -1211,10 +1221,67 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			msecs = secs * 1000 + usecs / 1000;
 			usecs = usecs % 1000;
 
+			LWLockAcquire(partitionLock, LW_SHARED);
+
+			/*
+			 * we loop over the lock's procLocks to gather a list of all
+			 * holders and waiters. Thus we will be able to provide more
+			 * detailed information for lock debugging purposes.
+			 *
+			 * lock->procLocks contains all processes which hold or wait for
+			 * this lock.
+			 */
+
+			procLocks = &(lock->procLocks);
+			proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
+											   offsetof(PROCLOCK, lockLink));
+
+			while (proclock)
+			{
+				/*
+				 * we are a waiter if myProc->waitProcLock == proclock; we are
+				 * a holder if it is NULL or something different
+				 */
+				if (proclock->tag.myProc->waitProcLock == proclock)
+				{
+					if (first_waiter)
+					{
+						appendStringInfo(&lock_waiters_sbuf, "%d",
+										 proclock->tag.myProc->pid);
+						first_waiter = false;
+					}
+					else
+						appendStringInfo(&lock_waiters_sbuf, ", %d",
+										 proclock->tag.myProc->pid);
+				}
+				else
+				{
+					if (first_holder)
+					{
+						appendStringInfo(&lock_holders_sbuf, "%d",
+										 proclock->tag.myProc->pid);
+						first_holder = false;
+					}
+					else
+						appendStringInfo(&lock_holders_sbuf, ", %d",
+										 proclock->tag.myProc->pid);
+
+					lockHoldersNum++;
+				}
+
+				proclock = (PROCLOCK *) SHMQueueNext(procLocks, &proclock->lockLink,
+											   offsetof(PROCLOCK, lockLink));
+			}
+
+			LWLockRelease(partitionLock);
+
 			if (deadlock_state == DS_SOFT_DEADLOCK)
 				ereport(LOG,
 						(errmsg("process %d avoided deadlock for %s on %s by rearranging queue order after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+				(errdetail_plural("process owning lock: %s request queue: %s",
+							   "processes owning lock: %s request queue: %s",
+								  lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			else if (deadlock_state == DS_HARD_DEADLOCK)
 			{
 				/*
@@ -1226,13 +1293,19 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				 */
 				ereport(LOG,
 						(errmsg("process %d detected deadlock while waiting for %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+				(errdetail_plural("process owning lock: %s request queue: %s",
+							   "processes owning lock: %s request queue: %s",
+								  lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			}
 
 			if (myWaitStatus == STATUS_WAITING)
 				ereport(LOG,
 						(errmsg("process %d still waiting for %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+				(errdetail_plural("process owning lock: %s request queue: %s",
+							   "processes owning lock: %s request queue: %s",
+								  lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			else if (myWaitStatus == STATUS_OK)
 				ereport(LOG,
 					(errmsg("process %d acquired %s on %s after %ld.%03d ms",
@@ -1252,7 +1325,10 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				if (deadlock_state != DS_HARD_DEADLOCK)
 					ereport(LOG,
 							(errmsg("process %d failed to acquire %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+							 (errdetail_plural("process owning lock: %s request queue: %s",
+							   "processes owning lock: %s request queue: %s",
+											   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			}
 
 			/*
@@ -1262,6 +1338,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			deadlock_state = DS_NO_DEADLOCK;
 
 			pfree(buf.data);
+			pfree(lock_holders_sbuf.data);
+			pfree(lock_waiters_sbuf.data);
 		}
 	} while (myWaitStatus == STATUS_WAITING);
 
#10 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christian Kruse (#9)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

Christian Kruse <christian@2ndquadrant.com> writes:

However, the real problem here is that you shouldn't be calling ngettext
manually in an ereport context in the first place. There is
infrastructure in place for that, and this isn't using it.

Fixed in the attached patch. I changed it from calling
errcontext(ngettext()) to calling errdetail_plural().

Well, is it context or detail? Those fields have reasonably well defined
meanings IMO.

If we need errcontext_plural, let's add it, not adopt inferior solutions
just because that isn't there for lack of previous need.

But having said that, I think this is indeed detail not context.
(I kinda wonder whether some of the stuff that's now in the primary
message shouldn't be pushed to errdetail as well. It looks like some
previous patches in this area have been lazy.)

While I'm griping, this message isn't even trying to follow the project's
message style guidelines. Detail or context messages are supposed to be
complete sentence(s), with capitalization and punctuation to match.

Lastly, is this information that we want to be shipping to clients?
Perhaps from a security standpoint that's not such a wise idea, and
errdetail_log() is what should be used.
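
There is no log-only plural variant yet either, but if one were added,
usage might look like this sketch (wording illustrative):

	errdetail_log_plural("Process holding the lock: %s. Request queue: %s.",
						 "Processes holding the lock: %s. Request queue: %s.",
						 lockHoldersNum,
						 lock_holders_sbuf.data, lock_waiters_sbuf.data)

That way the PID list appears only in the server log and is not shipped
to the client.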

regards, tom lane


#11 Christian Kruse
christian@2ndquadrant.com
In reply to: Tom Lane (#10)
1 attachment(s)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

Hi,

On 22/01/14 14:45, Tom Lane wrote:

Well, is it context or detail? Those fields have reasonably well defined
meanings IMO.

I find the distinction somewhat blurry and think both would be
appropriate. But since I wasn't sure, I changed it to detail.

If we need errcontext_plural, let's add it, not adopt inferior solutions
just because that isn't there for lack of previous need.

I would've added it if I'd been sure.

But having said that, I think this is indeed detail not context.
(I kinda wonder whether some of the stuff that's now in the primary
message shouldn't be pushed to errdetail as well. It looks like some
previous patches in this area have been lazy.)

I agree, the primary message is not very well worded. On the other
hand, finding an appropriate alternative seems hard to me.

While I'm griping, this message isn't even trying to follow the project's
message style guidelines. Detail or context messages are supposed to be
complete sentence(s), with capitalization and punctuation to match.

Hm, I hope I fixed it in this version of the patch.

Lastly, is this information that we want to be shipping to clients?
Perhaps from a security standpoint that's not such a wise idea, and
errdetail_log() is what should be used.

Fixed. I added an errdetail_log_plural() for this, too.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

show_pids_in_lock_log_v3.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 122afb2..552c5a4 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -1195,13 +1195,23 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 		 */
 		if (log_lock_waits && deadlock_state != DS_NOT_YET_CHECKED)
 		{
-			StringInfoData buf;
+			StringInfoData buf,
+						lock_waiters_sbuf,
+						lock_holders_sbuf;
 			const char *modename;
 			long		secs;
 			int			usecs;
 			long		msecs;
+			SHM_QUEUE  *procLocks;
+			PROCLOCK   *proclock;
+			bool		first_holder = true,
+						first_waiter = true;
+			int			lockHoldersNum = 0;
 
 			initStringInfo(&buf);
+			initStringInfo(&lock_waiters_sbuf);
+			initStringInfo(&lock_holders_sbuf);
+
 			DescribeLockTag(&buf, &locallock->tag.lock);
 			modename = GetLockmodeName(locallock->tag.lock.locktag_lockmethodid,
 									   lockmode);
@@ -1211,10 +1221,67 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			msecs = secs * 1000 + usecs / 1000;
 			usecs = usecs % 1000;
 
+			LWLockAcquire(partitionLock, LW_SHARED);
+
+			/*
+			 * we loop over the lock's procLocks to gather a list of all
+			 * holders and waiters. Thus we will be able to provide more
+			 * detailed information for lock debugging purposes.
+			 *
+			 * lock->procLocks contains all processes which hold or wait for
+			 * this lock.
+			 */
+
+			procLocks = &(lock->procLocks);
+			proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
+											   offsetof(PROCLOCK, lockLink));
+
+			while (proclock)
+			{
+				/*
+				 * we are a waiter if myProc->waitProcLock == proclock; we are
+				 * a holder if it is NULL or something different
+				 */
+				if (proclock->tag.myProc->waitProcLock == proclock)
+				{
+					if (first_waiter)
+					{
+						appendStringInfo(&lock_waiters_sbuf, "%d",
+										 proclock->tag.myProc->pid);
+						first_waiter = false;
+					}
+					else
+						appendStringInfo(&lock_waiters_sbuf, ", %d",
+										 proclock->tag.myProc->pid);
+				}
+				else
+				{
+					if (first_holder)
+					{
+						appendStringInfo(&lock_holders_sbuf, "%d",
+										 proclock->tag.myProc->pid);
+						first_holder = false;
+					}
+					else
+						appendStringInfo(&lock_holders_sbuf, ", %d",
+										 proclock->tag.myProc->pid);
+
+					lockHoldersNum++;
+				}
+
+				proclock = (PROCLOCK *) SHMQueueNext(procLocks, &proclock->lockLink,
+											   offsetof(PROCLOCK, lockLink));
+			}
+
+			LWLockRelease(partitionLock);
+
 			if (deadlock_state == DS_SOFT_DEADLOCK)
 				ereport(LOG,
 						(errmsg("process %d avoided deadlock for %s on %s by rearranging queue order after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+				(errdetail_plural("process owning lock: %s request queue: %s",
+							   "processes owning lock: %s request queue: %s",
+								  lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			else if (deadlock_state == DS_HARD_DEADLOCK)
 			{
 				/*
@@ -1226,13 +1293,19 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				 */
 				ereport(LOG,
 						(errmsg("process %d detected deadlock while waiting for %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+				(errdetail_plural("process owning lock: %s request queue: %s",
+							   "processes owning lock: %s request queue: %s",
+								  lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			}
 
 			if (myWaitStatus == STATUS_WAITING)
 				ereport(LOG,
 						(errmsg("process %d still waiting for %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+				(errdetail_plural("process owning lock: %s request queue: %s",
+							   "processes owning lock: %s request queue: %s",
+								  lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			else if (myWaitStatus == STATUS_OK)
 				ereport(LOG,
 					(errmsg("process %d acquired %s on %s after %ld.%03d ms",
@@ -1252,7 +1325,10 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				if (deadlock_state != DS_HARD_DEADLOCK)
 					ereport(LOG,
 							(errmsg("process %d failed to acquire %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+							 (errdetail_plural("process owning lock: %s request queue: %s",
+							   "processes owning lock: %s request queue: %s",
+											   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			}
 
 			/*
@@ -1262,6 +1338,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			deadlock_state = DS_NO_DEADLOCK;
 
 			pfree(buf.data);
+			pfree(lock_holders_sbuf.data);
+			pfree(lock_waiters_sbuf.data);
 		}
 	} while (myWaitStatus == STATUS_WAITING);
 
#12 Rajeev rastogi
rajeev.rastogi@huawei.com
In reply to: Christian Kruse (#11)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

On 23/01/14 14:45, Christian Kruse wrote:

Well, is it context or detail? Those fields have reasonably well
defined meanings IMO.

I find the distinction somewhat blurry and think both would be
appropriate. But since I wasn't sure, I changed it to detail.

If we need errcontext_plural, let's add it, not adopt inferior
solutions just because that isn't there for lack of previous need.

I would've added it if I'd been sure.

But having said that, I think this is indeed detail not context.
(I kinda wonder whether some of the stuff that's now in the primary
message shouldn't be pushed to errdetail as well. It looks like some
previous patches in this area have been lazy.)

I agree, the primary message is not very well worded. On the other hand,
finding an appropriate alternative seems hard to me.

While I'm griping, this message isn't even trying to follow the
project's message style guidelines. Detail or context messages are
supposed to be complete sentence(s), with capitalization and
punctuation to match.

Hm, I hope I fixed it in this version of the patch.

Lastly, is this information that we want to be shipping to clients?
Perhaps from a security standpoint that's not such a wise idea, and
errdetail_log() is what should be used.

Fixed. I added an errdetail_log_plural() for this, too.

I think you have attached the wrong patch.

Thanks and Regards,
Kumar Rajeev Rastogi


#13 Christian Kruse
christian@2ndquadrant.com
In reply to: Rajeev rastogi (#12)
1 attachment(s)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

Hi,

I think you have attached the wrong patch.

Hurm. You are right, I attached v3, not v4. Sorry.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

show_pids_in_lock_log_v4.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index ee6c24c..6c648cf 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -1195,13 +1195,23 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 		 */
 		if (log_lock_waits && deadlock_state != DS_NOT_YET_CHECKED)
 		{
-			StringInfoData buf;
+			StringInfoData buf,
+						lock_waiters_sbuf,
+						lock_holders_sbuf;
 			const char *modename;
 			long		secs;
 			int			usecs;
 			long		msecs;
+			SHM_QUEUE  *procLocks;
+			PROCLOCK   *proclock;
+			bool		first_holder = true,
+						first_waiter = true;
+			int			lockHoldersNum = 0;
 
 			initStringInfo(&buf);
+			initStringInfo(&lock_waiters_sbuf);
+			initStringInfo(&lock_holders_sbuf);
+
 			DescribeLockTag(&buf, &locallock->tag.lock);
 			modename = GetLockmodeName(locallock->tag.lock.locktag_lockmethodid,
 									   lockmode);
@@ -1211,10 +1221,67 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			msecs = secs * 1000 + usecs / 1000;
 			usecs = usecs % 1000;
 
+			/*
+			 * we loop over the lock's procLocks to gather a list of all
+			 * holders and waiters. Thus we will be able to provide more
+			 * detailed information for lock debugging purposes.
+			 *
+			 * lock->procLocks contains all processes which hold or wait for
+			 * this lock.
+			 */
+
+			LWLockAcquire(partitionLock, LW_SHARED);
+
+			procLocks = &(lock->procLocks);
+			proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
+											   offsetof(PROCLOCK, lockLink));
+
+			while (proclock)
+			{
+				/*
+				 * we are a waiter if myProc->waitProcLock == proclock; we are
+				 * a holder if it is NULL or something different
+				 */
+				if (proclock->tag.myProc->waitProcLock == proclock)
+				{
+					if (first_waiter)
+					{
+						appendStringInfo(&lock_waiters_sbuf, "%d",
+										 proclock->tag.myProc->pid);
+						first_waiter = false;
+					}
+					else
+						appendStringInfo(&lock_waiters_sbuf, ", %d",
+										 proclock->tag.myProc->pid);
+				}
+				else
+				{
+					if (first_holder)
+					{
+						appendStringInfo(&lock_holders_sbuf, "%d",
+										 proclock->tag.myProc->pid);
+						first_holder = false;
+					}
+					else
+						appendStringInfo(&lock_holders_sbuf, ", %d",
+										 proclock->tag.myProc->pid);
+
+					lockHoldersNum++;
+				}
+
+				proclock = (PROCLOCK *) SHMQueueNext(procLocks, &proclock->lockLink,
+											   offsetof(PROCLOCK, lockLink));
+			}
+
+			LWLockRelease(partitionLock);
+
 			if (deadlock_state == DS_SOFT_DEADLOCK)
 				ereport(LOG,
 						(errmsg("process %d avoided deadlock for %s on %s by rearranging queue order after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errdetail_log_plural("Process holding the lock: %s. Request queue: %s.",
+						"Processes holding the lock: %s. Request queue: %s.",
+											   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			else if (deadlock_state == DS_HARD_DEADLOCK)
 			{
 				/*
@@ -1226,13 +1293,19 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				 */
 				ereport(LOG,
 						(errmsg("process %d detected deadlock while waiting for %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errdetail_log_plural("Process holding the lock: %s. Request queue: %s.",
+						"Processes holding the lock: %s. Request queue: %s.",
+											   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			}
 
 			if (myWaitStatus == STATUS_WAITING)
 				ereport(LOG,
 						(errmsg("process %d still waiting for %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errdetail_log_plural("Process holding the lock: %s. Request queue: %s.",
+						"Processes holding the lock: %s. Request queue: %s.",
+											   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			else if (myWaitStatus == STATUS_OK)
 				ereport(LOG,
 					(errmsg("process %d acquired %s on %s after %ld.%03d ms",
@@ -1252,7 +1325,10 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				if (deadlock_state != DS_HARD_DEADLOCK)
 					ereport(LOG,
 							(errmsg("process %d failed to acquire %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+							 (errdetail_log_plural("Process holding the lock: %s. Request queue: %s.",
+						"Processes holding the lock: %s. Request queue: %s.",
+												   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			}
 
 			/*
@@ -1262,6 +1338,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			deadlock_state = DS_NO_DEADLOCK;
 
 			pfree(buf.data);
+			pfree(lock_holders_sbuf.data);
+			pfree(lock_waiters_sbuf.data);
 		}
 	} while (myWaitStatus == STATUS_WAITING);
 
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 8705586..b7f004e 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -937,6 +937,28 @@ errdetail_log(const char *fmt,...)
 	return 0;					/* return value does not matter */
 }
 
+/*
+ * errdetail_log_plural --- add a detail_log error message text to the current error
+ * with support for pluralization of the message text
+ */
+int
+errdetail_log_plural(const char *fmt_singular, const char *fmt_plural,
+			  unsigned long n,...)
+{
+	ErrorData  *edata = &errordata[errordata_stack_depth];
+	MemoryContext oldcontext;
+
+	recursion_depth++;
+	CHECK_STACK_DEPTH();
+	oldcontext = MemoryContextSwitchTo(edata->assoc_context);
+
+	EVALUATE_MESSAGE_PLURAL(edata->domain, detail_log, false);
+
+	MemoryContextSwitchTo(oldcontext);
+	recursion_depth--;
+	return 0;					/* return value does not matter */
+}
+
 
 /*
  * errdetail_plural --- add a detail error message text to the current error,
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index d7916c2..427d52d 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -182,6 +182,14 @@ errdetail_log(const char *fmt,...)
 __attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 2)));
 
 extern int
+errdetail_log_plural(const char *fmt_singular, const char *fmt_plural,
+					 unsigned long n,...)
+/* This extension allows gcc to check the format string for consistency with
+   the supplied arguments. */
+__attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 4)))
+__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 4)));
+
+extern int
 errdetail_plural(const char *fmt_singular, const char *fmt_plural,
 				 unsigned long n,...)
 /* This extension allows gcc to check the format string for consistency with
#14 Rajeev rastogi
rajeev.rastogi@huawei.com
In reply to: Christian Kruse (#11)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

On 23/01/14, Christian Kruse wrote:

Well, is it context or detail? Those fields have reasonably well
defined meanings IMO.

I find the distinction somewhat blurry and think both would be
appropriate. But since I wasn't sure, I changed it to detail.

If we need errcontext_plural, let's add it, not adopt inferior
solutions just because that isn't there for lack of previous need.

I would've added it if I'd been sure.

But having said that, I think this is indeed detail not context.
(I kinda wonder whether some of the stuff that's now in the primary
message shouldn't be pushed to errdetail as well. It looks like some
previous patches in this area have been lazy.)

I agree, the primary message is not very well worded. On the other hand,
finding an appropriate alternative seems hard to me.

While I'm griping, this message isn't even trying to follow the
project's message style guidelines. Detail or context messages are
supposed to be complete sentence(s), with capitalization and
punctuation to match.

Hm, I hope I fixed it in this version of the patch.

Lastly, is this information that we want to be shipping to clients?
Perhaps from a security standpoint that's not such a wise idea, and
errdetail_log() is what should be used.

Fixed. I added an errdetail_log_plural() for this, too.
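
For illustration, the difference matters because errdetail() text is shipped
to the client as part of the error report, while errdetail_log() text goes to
the server log only. A schematic sketch of the call shape (holder_pid is a
hypothetical variable, not taken from the patch):

ereport(LOG,
        (errmsg("process %d still waiting for a lock", MyProcPid),
         errdetail_log("Process holding the lock: %d.", holder_pid)));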

I have checked the revised patch. It looks fine to me except for one minor code-formatting issue:
in elog.c, two tabs are missing in the definition of the function "errdetail_log_plural".
Please run the pgindent tool to verify.

Also, I would like to highlight one behavior here: the process ID of the process trying to
acquire the lock is also listed in the "Request queue". E.g.

session 1 with process id X: BEGIN; LOCK TABLE foo IN SHARE MODE;
session 2 with process id Y: BEGIN; LOCK TABLE foo IN EXCLUSIVE MODE;

On execution of LOCK in session-2, as part of log it will display as:
DETAIL: Process holding the lock: X. Request queue: Y.

where Y is the process ID of the same process that is trying to acquire the lock.

To me, this seems to be correct behavior: the message is meant to list all
processes already holding the lock, all processes waiting in the queue, and
thereby the position of the current process in that wait list. In the above
example, it indicates that process Y is at the top of the wait list.
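
For cross-checking such a log entry, pg_locks shows the same holder/waiter
split; a minimal sketch, assuming the foo table from the example above and a
third session to run it in:

-- granted = true  -> process holds the lock (X)
-- granted = false -> process is queued waiting for it (Y)
SELECT pid, mode, granted
FROM pg_locks
WHERE locktype = 'relation'
  AND relation = 'foo'::regclass;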

Thanks and Regards,
Kumar Rajeev Rastogi


#15Christian Kruse
christian@2ndquadrant.com
In reply to: Rajeev rastogi (#14)
1 attachment(s)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

Hi,

On 27/01/14 11:44, Rajeev rastogi wrote:

I have checked the revised patch. It looks fine to me except one minor code formatting issue.
In elog.c, two tabs are missing in the definition of function "errdetail_log_plural".
Please run pgindent tool to check the same.

I did, but it reformats various other locations in the file,
too. Nevertheless, I now ran pgindent against it and removed the
unrelated hunks. Attached you will find the corrected patch version.

Also, I would like to highlight one behavior here: the process ID of the process trying to
acquire the lock is also listed in the "Request queue". E.g.

session 1 with process id X: BEGIN; LOCK TABLE foo IN SHARE MODE;
session 2 with process id Y: BEGIN; LOCK TABLE foo IN EXCLUSIVE MODE;

On execution of LOCK in session-2, as part of log it will display as:
DETAIL: Process holding the lock: X. Request queue: Y.

where Y is the process ID of the same process that is trying to acquire the lock.

This is on purpose, due to the rewording of the message. In the first
version the PID of the backend itself was missing.

Thanks for the review!

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

show_pids_in_lock_log_v5.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index ee6c24c..6c648cf 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -1195,13 +1195,23 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 		 */
 		if (log_lock_waits && deadlock_state != DS_NOT_YET_CHECKED)
 		{
-			StringInfoData buf;
+			StringInfoData buf,
+						lock_waiters_sbuf,
+						lock_holders_sbuf;
 			const char *modename;
 			long		secs;
 			int			usecs;
 			long		msecs;
+			SHM_QUEUE  *procLocks;
+			PROCLOCK   *proclock;
+			bool		first_holder = true,
+						first_waiter = true;
+			int			lockHoldersNum = 0;
 
 			initStringInfo(&buf);
+			initStringInfo(&lock_waiters_sbuf);
+			initStringInfo(&lock_holders_sbuf);
+
 			DescribeLockTag(&buf, &locallock->tag.lock);
 			modename = GetLockmodeName(locallock->tag.lock.locktag_lockmethodid,
 									   lockmode);
@@ -1211,10 +1221,67 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			msecs = secs * 1000 + usecs / 1000;
 			usecs = usecs % 1000;
 
+			/*
+			 * we loop over the lock's procLocks to gather a list of all
+			 * holders and waiters. Thus we will be able to provide more
+			 * detailed information for lock debugging purposes.
+			 *
+			 * lock->procLocks contains all processes which hold or wait for
+			 * this lock.
+			 */
+
+			LWLockAcquire(partitionLock, LW_SHARED);
+
+			procLocks = &(lock->procLocks);
+			proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
+											   offsetof(PROCLOCK, lockLink));
+
+			while (proclock)
+			{
+				/*
+				 * we are a waiter if myProc->waitProcLock == proclock; we are
+				 * a holder if it is NULL or something different
+				 */
+				if (proclock->tag.myProc->waitProcLock == proclock)
+				{
+					if (first_waiter)
+					{
+						appendStringInfo(&lock_waiters_sbuf, "%d",
+										 proclock->tag.myProc->pid);
+						first_waiter = false;
+					}
+					else
+						appendStringInfo(&lock_waiters_sbuf, ", %d",
+										 proclock->tag.myProc->pid);
+				}
+				else
+				{
+					if (first_holder)
+					{
+						appendStringInfo(&lock_holders_sbuf, "%d",
+										 proclock->tag.myProc->pid);
+						first_holder = false;
+					}
+					else
+						appendStringInfo(&lock_holders_sbuf, ", %d",
+										 proclock->tag.myProc->pid);
+
+					lockHoldersNum++;
+				}
+
+				proclock = (PROCLOCK *) SHMQueueNext(procLocks, &proclock->lockLink,
+											   offsetof(PROCLOCK, lockLink));
+			}
+
+			LWLockRelease(partitionLock);
+
 			if (deadlock_state == DS_SOFT_DEADLOCK)
 				ereport(LOG,
 						(errmsg("process %d avoided deadlock for %s on %s by rearranging queue order after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errdetail_log_plural("Process holding the lock: %s. Request queue: %s.",
+						"Processes holding the lock: %s. Request queue: %s.",
+											   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			else if (deadlock_state == DS_HARD_DEADLOCK)
 			{
 				/*
@@ -1226,13 +1293,19 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				 */
 				ereport(LOG,
 						(errmsg("process %d detected deadlock while waiting for %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errdetail_log_plural("Process holding the lock: %s. Request queue: %s.",
+							"Processes holding lock: %s. Request queue: %s.",
+											   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			}
 
 			if (myWaitStatus == STATUS_WAITING)
 				ereport(LOG,
 						(errmsg("process %d still waiting for %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errdetail_log_plural("Process holding the lock: %s. Request queue: %s.",
+						"Processes holding the lock: %s. Request queue: %s.",
+											   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			else if (myWaitStatus == STATUS_OK)
 				ereport(LOG,
 					(errmsg("process %d acquired %s on %s after %ld.%03d ms",
@@ -1252,7 +1325,10 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				if (deadlock_state != DS_HARD_DEADLOCK)
 					ereport(LOG,
 							(errmsg("process %d failed to acquire %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+							 (errdetail_log_plural("Process holding the lock: %s. Request queue: %s.",
+						"Processes holding the lock: %s. Request queue: %s.",
+												   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			}
 
 			/*
@@ -1262,6 +1338,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			deadlock_state = DS_NO_DEADLOCK;
 
 			pfree(buf.data);
+			pfree(lock_holders_sbuf.data);
+			pfree(lock_waiters_sbuf.data);
 		}
 	} while (myWaitStatus == STATUS_WAITING);
 
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 8705586..d28b749 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -937,6 +937,28 @@ errdetail_log(const char *fmt,...)
 	return 0;					/* return value does not matter */
 }
 
+/*
+ * errdetail_log_plural --- add a detail_log error message text to the current error
+ * with support for pluralization of the message text
+ */
+int
+errdetail_log_plural(const char *fmt_singular, const char *fmt_plural,
+					 unsigned long n,...)
+{
+	ErrorData  *edata = &errordata[errordata_stack_depth];
+	MemoryContext oldcontext;
+
+	recursion_depth++;
+	CHECK_STACK_DEPTH();
+	oldcontext = MemoryContextSwitchTo(edata->assoc_context);
+
+	EVALUATE_MESSAGE_PLURAL(edata->domain, detail_log, false);
+
+	MemoryContextSwitchTo(oldcontext);
+	recursion_depth--;
+	return 0;					/* return value does not matter */
+}
+
 
 /*
  * errdetail_plural --- add a detail error message text to the current error,
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index d7916c2..427d52d 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -182,6 +182,14 @@ errdetail_log(const char *fmt,...)
 __attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 2)));
 
 extern int
+errdetail_log_plural(const char *fmt_singular, const char *fmt_plural,
+					 unsigned long n,...)
+/* This extension allows gcc to check the format string for consistency with
+   the supplied arguments. */
+__attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 4)))
+__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 4)));
+
+extern int
 errdetail_plural(const char *fmt_singular, const char *fmt_plural,
 				 unsigned long n,...)
 /* This extension allows gcc to check the format string for consistency with
#16Rajeev rastogi
rajeev.rastogi@huawei.com
In reply to: Christian Kruse (#15)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

On 28/01/14, Christian Kruse wrote:

I have checked the revised patch. It looks fine to me except for one minor
code-formatting issue: in elog.c, two tabs are missing in the definition of
the function "errdetail_log_plural". Please run the pgindent tool to verify.

I did, but it reformats various other locations in the file, too.
Nevertheless, I now ran pgindent against it and removed the unrelated hunks.
Attached you will find the corrected patch version.

Also, I would like to highlight one behavior here: the process ID of the
process trying to acquire the lock is also listed in the "Request queue". E.g.

session 1 with process id X: BEGIN; LOCK TABLE foo IN SHARE MODE;
session 2 with process id Y: BEGIN; LOCK TABLE foo IN EXCLUSIVE MODE;

On execution of LOCK in session-2, as part of the log it will display:
DETAIL: Process holding the lock: X. Request queue: Y.

where Y is the process ID of the same process that is trying to acquire the lock.

This is on purpose, due to the rewording of the message. In the first
version the PID of the backend itself was missing.

Thanks for the review!

Now the patch looks fine to me. I am marking this as "Ready for Committer".

Thanks and Regards,
Kumar Rajeev Rastogi


#17Fujii Masao
masao.fujii@gmail.com
In reply to: Rajeev rastogi (#16)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

On Tue, Jan 28, 2014 at 6:39 PM, Rajeev rastogi
<rajeev.rastogi@huawei.com> wrote:

On 28/01/14, Christian Kruse wrote:

I have checked the revised patch. It looks fine to me except for one minor
code-formatting issue: in elog.c, two tabs are missing in the definition of
the function "errdetail_log_plural". Please run the pgindent tool to verify.

I did, but it reformats various other locations in the file, too.
Nevertheless, I now ran pgindent against it and removed the unrelated hunks.
Attached you will find the corrected patch version.

Also, I would like to highlight one behavior here: the process ID of the
process trying to acquire the lock is also listed in the "Request queue". E.g.

session 1 with process id X: BEGIN; LOCK TABLE foo IN SHARE MODE;
session 2 with process id Y: BEGIN; LOCK TABLE foo IN EXCLUSIVE MODE;

On execution of LOCK in session-2, as part of the log it will display:
DETAIL: Process holding the lock: X. Request queue: Y.

where Y is the process ID of the same process that is trying to acquire the lock.

This is on purpose, due to the rewording of the message. In the first
version the PID of the backend itself was missing.

Thanks for the review!

Now the patch looks fine to me. I am marking this as "Ready for Committer".

When I tested the patch, I observed some strange behavior.

1. I executed a SELECT FOR UPDATE query in session 1 and kept the
transaction open for a while.

[session 1] PID=33660
BEGIN;
SELECT * FROM t WHERE i = 1 FOR UPDATE;
(Wait)

2. I executed the SELECT FOR UPDATE query again in the session 2.

[session 2] PID=33662
BEGIN;
SELECT * FROM t WHERE i = 1 FOR UPDATE;
(Waiting for the lock)

3. I got the following log message. This is OK.

LOG: process 33662 still waiting for ShareLock on transaction
1011 after 1000.184 ms
DETAIL: Process holding the lock: 33660. Request queue: 33662.

4. I executed the UPDATE query in the session 3.

[session 3] PID=33665
UPDATE t SET j = j + 1 WHERE i = 1;
(Waiting for the lock)

5. Then, I got the following log message. This looks strange and
confusing to me.

LOG: process 33665 still waiting for ExclusiveLock on tuple (0,4)
of relation 16384 of database 12310 after 1000.134 ms
DETAIL: Process holding the lock: 33662. Request queue: 33665

This log message says that process 33662 is holding the lock, but that's
not true. The process holding the lock is 33660; process 33662 should be
in the request queue. Is this the intentional behavior?
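
As a side note for anyone reproducing this: these messages require
log_lock_waits = on, and the "after 1000.134 ms" part is governed by
deadlock_timeout, which defaults to 1s. A minimal setup sketch (assuming a
superuser session):

SET log_lock_waits = on;      -- log when a lock wait exceeds deadlock_timeout
SET deadlock_timeout = '1s';  -- default; yields the ~1000 ms threshold above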

Regards,

--
Fujii Masao


#18Christian Kruse
christian@2ndquadrant.com
In reply to: Fujii Masao (#17)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

Hi,

On 01/02/14 02:45, Fujii Masao wrote:

LOG: process 33662 still waiting for ShareLock on transaction
1011 after 1000.184 ms
DETAIL: Process holding the lock: 33660. Request queue: 33662.
[… snip …]
LOG: process 33665 still waiting for ExclusiveLock on tuple (0,4)
of relation 16384 of database 12310 after 1000.134 ms
DETAIL: Process holding the lock: 33662. Request queue: 33665

This log message says that the process 33662 is holding the lock, but
it's not true.

As the messages say: the first wait is for a lock on the transaction, the
second one for a lock on the tuple. So these are two different locks, and
thus two different holders and queues. So…

Is this the intentional behavior?

Yes, I think so.
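
To make the distinction visible, one can inspect pg_locks from a fourth
session while the sessions above are waiting; a rough sketch (column list
abbreviated, nothing here is specific to the patch):

-- 33662 waits on the transactionid lock held by 33660, while
-- 33665 waits on the tuple lock that is granted to 33662.
SELECT locktype, transactionid, relation::regclass, page, tuple, pid, granted
FROM pg_locks
WHERE locktype IN ('transactionid', 'tuple')
ORDER BY locktype, pid;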

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#19Fujii Masao
masao.fujii@gmail.com
In reply to: Christian Kruse (#18)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

On Sat, Feb 1, 2014 at 7:41 PM, Christian Kruse
<christian@2ndquadrant.com> wrote:

Hi,

On 01/02/14 02:45, Fujii Masao wrote:

LOG: process 33662 still waiting for ShareLock on transaction
1011 after 1000.184 ms
DETAIL: Process holding the lock: 33660. Request queue: 33662.
[... snip ...]
LOG: process 33665 still waiting for ExclusiveLock on tuple (0,4)
of relation 16384 of database 12310 after 1000.134 ms
DETAIL: Process holding the lock: 33662. Request queue: 33665

This log message says that process 33662 is holding the lock, but that's
not true.

As the messages say: the first wait is for a lock on the transaction, the
second one for a lock on the tuple. So these are two different locks, and
thus two different holders and queues. So...

Is this the intentional behavior?

Yes, I think so.

Oh, yes. You're right.

I have other minor comments:

Since you added errdetail_log_plural(), ISTM that you need to update
sources.sgml.

While I'm griping, this message isn't even trying to follow the project's
message style guidelines. Detail or context messages are supposed to be
complete sentence(s), with capitalization and punctuation to match.

Hm, I hope I fixed it in this version of the patch.

The current message doesn't look like a complete sentence yet... We would
need to use something like "Processes X, Y are holding the lock while Z is
waiting for it." I could not come up with a good message, though...

Regards,

--
Fujii Masao


#20Christian Kruse
christian@2ndquadrant.com
In reply to: Fujii Masao (#19)
1 attachment(s)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

Hi,

On 03/02/14 17:59, Fujii Masao wrote:

Since you added errdetail_log_plural(), ISTM that you need to update
sources.sgml.

[x] Fixed.

While I'm griping, this message isn't even trying to follow the project's
message style guidelines. Detail or context messages are supposed to be
complete sentence(s), with capitalization and punctuation to match.

Hm, I hope I fixed it in this version of the patch.

The current message doesn't look like a complete sentence yet... We would
need to use something like "Processes X, Y are holding the lock while Z is
waiting for it." I could not come up with a good message, though...

The problem is that we have two potential plural cases in this
message. That means the second part has to be worded independently of
singular/plural. I tried to improve it a little and propose this
message:

Singular:
"The following process is holding the lock: A. The request queue
consists of: B."

Plural:
"Following processes are holding the lock: A, B. The request queue
consists of: C."
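
For the record, the reason only one count can drive the wording is that the
gettext machinery underneath EVALUATE_MESSAGE_PLURAL branches on a single
number. A stripped-down, hypothetical sketch of the selection step, reusing
the parameter names from the patch:

/* The plural form is picked by n (= lockHoldersNum) alone, so the
 * wait-queue half of the sentence has to read naturally for any
 * number of waiters. */
const char *fmt = dngettext(edata->domain, fmt_singular, fmt_plural, n);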

Attached you will find an updated patch.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

show_pids_in_lock_log_v6.patchtext/x-diff; charset=us-asciiDownload
diff --git a/doc/src/sgml/sources.sgml b/doc/src/sgml/sources.sgml
index 881b0c3..aa20807 100644
--- a/doc/src/sgml/sources.sgml
+++ b/doc/src/sgml/sources.sgml
@@ -251,6 +251,15 @@ ereport(ERROR,
    </listitem>
    <listitem>
     <para>
+     <function>errdetail_log_plural(const char *fmt_singular, const char
+     *fmt_plural, unsigned long n, ...)</function> is like
+     <function>errdetail_log</>, but with support for various plural forms of
+     the message.
+     For more information see <xref linkend="nls-guidelines">.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
      <function>errhint(const char *msg, ...)</function> supplies an optional
      <quote>hint</> message; this is to be used when offering suggestions
      about how to fix the problem, as opposed to factual details about
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index fb449a8..da72c82 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -1209,13 +1209,23 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 		 */
 		if (log_lock_waits && deadlock_state != DS_NOT_YET_CHECKED)
 		{
-			StringInfoData buf;
+			StringInfoData buf,
+						lock_waiters_sbuf,
+						lock_holders_sbuf;
 			const char *modename;
 			long		secs;
 			int			usecs;
 			long		msecs;
+			SHM_QUEUE  *procLocks;
+			PROCLOCK   *proclock;
+			bool		first_holder = true,
+						first_waiter = true;
+			int			lockHoldersNum = 0;
 
 			initStringInfo(&buf);
+			initStringInfo(&lock_waiters_sbuf);
+			initStringInfo(&lock_holders_sbuf);
+
 			DescribeLockTag(&buf, &locallock->tag.lock);
 			modename = GetLockmodeName(locallock->tag.lock.locktag_lockmethodid,
 									   lockmode);
@@ -1225,10 +1235,67 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			msecs = secs * 1000 + usecs / 1000;
 			usecs = usecs % 1000;
 
+			/*
+			 * we loop over the lock's procLocks to gather a list of all
+			 * holders and waiters. Thus we will be able to provide more
+			 * detailed information for lock debugging purposes.
+			 *
+			 * lock->procLocks contains all processes which hold or wait for
+			 * this lock.
+			 */
+
+			LWLockAcquire(partitionLock, LW_SHARED);
+
+			procLocks = &(lock->procLocks);
+			proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
+											   offsetof(PROCLOCK, lockLink));
+
+			while (proclock)
+			{
+				/*
+				 * we are a waiter if myProc->waitProcLock == proclock; we are
+				 * a holder if it is NULL or something different
+				 */
+				if (proclock->tag.myProc->waitProcLock == proclock)
+				{
+					if (first_waiter)
+					{
+						appendStringInfo(&lock_waiters_sbuf, "%d",
+										 proclock->tag.myProc->pid);
+						first_waiter = false;
+					}
+					else
+						appendStringInfo(&lock_waiters_sbuf, ", %d",
+										 proclock->tag.myProc->pid);
+				}
+				else
+				{
+					if (first_holder)
+					{
+						appendStringInfo(&lock_holders_sbuf, "%d",
+										 proclock->tag.myProc->pid);
+						first_holder = false;
+					}
+					else
+						appendStringInfo(&lock_holders_sbuf, ", %d",
+										 proclock->tag.myProc->pid);
+
+					lockHoldersNum++;
+				}
+
+				proclock = (PROCLOCK *) SHMQueueNext(procLocks, &proclock->lockLink,
+											   offsetof(PROCLOCK, lockLink));
+			}
+
+			LWLockRelease(partitionLock);
+
 			if (deadlock_state == DS_SOFT_DEADLOCK)
 				ereport(LOG,
 						(errmsg("process %d avoided deadlock for %s on %s by rearranging queue order after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errdetail_log_plural("The following process is holding the lock: %s. The request queue consists of: %s.",
+											   "Following processes are holding the lock: %s. The request queue consists of: %s.",
+											   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			else if (deadlock_state == DS_HARD_DEADLOCK)
 			{
 				/*
@@ -1240,13 +1307,19 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				 */
 				ereport(LOG,
 						(errmsg("process %d detected deadlock while waiting for %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errdetail_log_plural("The following process is holding the lock: %s. The request queue consists of: %s.",
+											   "Following processes are holding the lock: %s. The request queue consists of: %s.",
+											   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			}
 
 			if (myWaitStatus == STATUS_WAITING)
 				ereport(LOG,
 						(errmsg("process %d still waiting for %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errdetail_log_plural("The following process is holding the lock: %s. The request queue consists of: %s.",
+											   "Following processes are holding the lock: %s. The request queue consists of: %s.",
+											   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			else if (myWaitStatus == STATUS_OK)
 				ereport(LOG,
 					(errmsg("process %d acquired %s on %s after %ld.%03d ms",
@@ -1266,7 +1339,10 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				if (deadlock_state != DS_HARD_DEADLOCK)
 					ereport(LOG,
 							(errmsg("process %d failed to acquire %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+							 (errdetail_log_plural("The following process is holding the lock: %s. The request queue consists of: %s.",
+												   "Following processes are holding the lock: %s. The request queue consists of: %s.",
+												   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			}
 
 			/*
@@ -1276,6 +1352,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			deadlock_state = DS_NO_DEADLOCK;
 
 			pfree(buf.data);
+			pfree(lock_holders_sbuf.data);
+			pfree(lock_waiters_sbuf.data);
 		}
 	} while (myWaitStatus == STATUS_WAITING);
 
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 8705586..d28b749 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -937,6 +937,28 @@ errdetail_log(const char *fmt,...)
 	return 0;					/* return value does not matter */
 }
 
+/*
+ * errdetail_log_plural --- add a detail_log error message text to the current error
+ * with support for pluralization of the message text
+ */
+int
+errdetail_log_plural(const char *fmt_singular, const char *fmt_plural,
+					 unsigned long n,...)
+{
+	ErrorData  *edata = &errordata[errordata_stack_depth];
+	MemoryContext oldcontext;
+
+	recursion_depth++;
+	CHECK_STACK_DEPTH();
+	oldcontext = MemoryContextSwitchTo(edata->assoc_context);
+
+	EVALUATE_MESSAGE_PLURAL(edata->domain, detail_log, false);
+
+	MemoryContextSwitchTo(oldcontext);
+	recursion_depth--;
+	return 0;					/* return value does not matter */
+}
+
 
 /*
  * errdetail_plural --- add a detail error message text to the current error,
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index d7916c2..427d52d 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -182,6 +182,14 @@ errdetail_log(const char *fmt,...)
 __attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 2)));
 
 extern int
+errdetail_log_plural(const char *fmt_singular, const char *fmt_plural,
+					 unsigned long n,...)
+/* This extension allows gcc to check the format string for consistency with
+   the supplied arguments. */
+__attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 4)))
+__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 4)));
+
+extern int
 errdetail_plural(const char *fmt_singular, const char *fmt_plural,
 				 unsigned long n,...)
 /* This extension allows gcc to check the format string for consistency with
#21Simon Riggs
simon@2ndQuadrant.com
In reply to: Christian Kruse (#20)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

On 3 February 2014 10:06, Christian Kruse <christian@2ndquadrant.com> wrote:

Hi,

On 03/02/14 17:59, Fujii Masao wrote:

Since you added errdetail_log_plural(), ISTM that you need to update
sources.sgml.

[x] Fixed.

While I'm griping, this message isn't even trying to follow the project's
message style guidelines. Detail or context messages are supposed to be
complete sentence(s), with capitalization and punctuation to match.

Hm, I hope I fixed it in this version of the patch.

The current message doesn't look like a complete sentence yet... We would
need to use something like "Processes X, Y are holding the lock while Z is
waiting for it." I could not come up with a good message, though...

The problem is that we have two potential plural cases in this
message. That means the second part has to be worded independently of
singular/plural. I tried to improve it a little and propose this
message:

Singular:
"The following process is holding the lock: A. The request queue
consists of: B."

Plural:
"Following processes are holding the lock: A, B. The request queue
consists of: C."

Seems too complex. How about this...

"Lock holder(s): A. Lock waiter(s) B"
"Lock holder(s): A, B. Lock waiter(s) C"

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#22Christian Kruse
christian@2ndquadrant.com
In reply to: Simon Riggs (#21)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

Hi Simon,

On 03/02/14 10:43, Simon Riggs wrote:

Singular:
"The following process is holding the lock: A. The request queue
consists of: B."

Plural:
"Following processes are holding the lock: A, B. The request queue
consists of: C."

Seems too complex. How about this...

"Lock holder(s): A. Lock waiter(s) B"
"Lock holder(s): A, B. Lock waiter(s) C"

This is basically the same as before; it is even shorter. The
complaint was that I don't use a whole sentence in this error
detail. Wouldn't the change draw the same complaint?

To be honest, I'd like to stick with your earlier proposal:

Singular:
Process holding the lock: A. Request queue: B

Plural:
Processes holding the lock: A, B. Request queue: C, D

This seems to be a good trade-off between project guidelines,
readability and parsability.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#23Fujii Masao
masao.fujii@gmail.com
In reply to: Christian Kruse (#22)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

On Mon, Feb 3, 2014 at 8:53 PM, Christian Kruse
<christian@2ndquadrant.com> wrote:

Hi Simon,

On 03/02/14 10:43, Simon Riggs wrote:

Singular:
"The following process is holding the lock: A. The request queue
consists of: B."

Plural:
"Following processes are holding the lock: A, B. The request queue
consists of: C."

Seems too complex. How about this...

"Lock holder(s): A. Lock waiter(s) B"
"Lock holder(s): A, B. Lock waiter(s) C"

This is basically the same as before, it is even shorter. The
complaint was that I don't use a whole sentence in this error
detail. Won't the change fulfill the same complaint?

To be honest, I'd like to stick with your earlier proposal:

Singular:
Process holding the lock: A. Request queue: B

Plural:
Processes holding the lock: A, B. Request queue: C, D

This seems to be a good trade-off between project guidelines,
readability and parsability.

ISTM that the phrase "Request queue" is not used much around the lock code.
Using the phrase "wait queue", or Simon's suggestion, sounds better, at
least to me. Thoughts?

Regards,

--
Fujii Masao


#24Christian Kruse
christian@2ndquadrant.com
In reply to: Fujii Masao (#23)
1 attachment(s)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

Hi,

On 04/02/14 12:38, Fujii Masao wrote:

ISTM that the phrase "Request queue" is not used much around the lock code.
Using the phrase "wait queue", or Simon's suggestion, sounds better, at
least to me. Thoughts?

Sounds reasonable to me. The attached patch changes the messages to the following:

Process holding the lock: A. Wait queue: B.
Processes holding the lock: A, B. Wait queue: C.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

show_pids_in_lock_log_v7.patchtext/x-diff; charset=us-asciiDownload
diff --git a/doc/src/sgml/sources.sgml b/doc/src/sgml/sources.sgml
index 881b0c3..aa20807 100644
--- a/doc/src/sgml/sources.sgml
+++ b/doc/src/sgml/sources.sgml
@@ -251,6 +251,15 @@ ereport(ERROR,
    </listitem>
    <listitem>
     <para>
+     <function>errdetail_log_plural(const char *fmt_singular, const char
+     *fmt_plural, unsigned long n, ...)</function> is like
+     <function>errdetail_log</>, but with support for various plural forms of
+     the message.
+     For more information see <xref linkend="nls-guidelines">.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
      <function>errhint(const char *msg, ...)</function> supplies an optional
      <quote>hint</> message; this is to be used when offering suggestions
      about how to fix the problem, as opposed to factual details about
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index fb449a8..9620f6e 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -1209,13 +1209,23 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 		 */
 		if (log_lock_waits && deadlock_state != DS_NOT_YET_CHECKED)
 		{
-			StringInfoData buf;
+			StringInfoData buf,
+						lock_waiters_sbuf,
+						lock_holders_sbuf;
 			const char *modename;
 			long		secs;
 			int			usecs;
 			long		msecs;
+			SHM_QUEUE  *procLocks;
+			PROCLOCK   *proclock;
+			bool		first_holder = true,
+						first_waiter = true;
+			int			lockHoldersNum = 0;
 
 			initStringInfo(&buf);
+			initStringInfo(&lock_waiters_sbuf);
+			initStringInfo(&lock_holders_sbuf);
+
 			DescribeLockTag(&buf, &locallock->tag.lock);
 			modename = GetLockmodeName(locallock->tag.lock.locktag_lockmethodid,
 									   lockmode);
@@ -1225,10 +1235,67 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			msecs = secs * 1000 + usecs / 1000;
 			usecs = usecs % 1000;
 
+			/*
+			 * we loop over the lock's procLocks to gather a list of all
+			 * holders and waiters. Thus we will be able to provide more
+			 * detailed information for lock debugging purposes.
+			 *
+			 * lock->procLocks contains all processes which hold or wait for
+			 * this lock.
+			 */
+
+			LWLockAcquire(partitionLock, LW_SHARED);
+
+			procLocks = &(lock->procLocks);
+			proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
+											   offsetof(PROCLOCK, lockLink));
+
+			while (proclock)
+			{
+				/*
+				 * we are a waiter if myProc->waitProcLock == proclock; we are
+				 * a holder if it is NULL or something different
+				 */
+				if (proclock->tag.myProc->waitProcLock == proclock)
+				{
+					if (first_waiter)
+					{
+						appendStringInfo(&lock_waiters_sbuf, "%d",
+										 proclock->tag.myProc->pid);
+						first_waiter = false;
+					}
+					else
+						appendStringInfo(&lock_waiters_sbuf, ", %d",
+										 proclock->tag.myProc->pid);
+				}
+				else
+				{
+					if (first_holder)
+					{
+						appendStringInfo(&lock_holders_sbuf, "%d",
+										 proclock->tag.myProc->pid);
+						first_holder = false;
+					}
+					else
+						appendStringInfo(&lock_holders_sbuf, ", %d",
+										 proclock->tag.myProc->pid);
+
+					lockHoldersNum++;
+				}
+
+				proclock = (PROCLOCK *) SHMQueueNext(procLocks, &proclock->lockLink,
+											   offsetof(PROCLOCK, lockLink));
+			}
+
+			LWLockRelease(partitionLock);
+
 			if (deadlock_state == DS_SOFT_DEADLOCK)
 				ereport(LOG,
 						(errmsg("process %d avoided deadlock for %s on %s by rearranging queue order after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errdetail_log_plural("Process holding the lock: %s. Wait queue: %s.",
+						   "Processes holding the lock: %s. Wait queue: %s.",
+											   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			else if (deadlock_state == DS_HARD_DEADLOCK)
 			{
 				/*
@@ -1240,13 +1307,19 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				 */
 				ereport(LOG,
 						(errmsg("process %d detected deadlock while waiting for %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errdetail_log_plural("Process holding the lock: %s. Wait queue: %s.",
+						   "Processes holding the lock: %s. Wait queue: %s.",
+											   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			}
 
 			if (myWaitStatus == STATUS_WAITING)
 				ereport(LOG,
 						(errmsg("process %d still waiting for %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errdetail_log_plural("Process holding the lock: %s. Wait queue: %s.",
+						   "Processes holding the lock: %s. Wait queue: %s.",
+											   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			else if (myWaitStatus == STATUS_OK)
 				ereport(LOG,
 					(errmsg("process %d acquired %s on %s after %ld.%03d ms",
@@ -1266,7 +1339,10 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				if (deadlock_state != DS_HARD_DEADLOCK)
 					ereport(LOG,
 							(errmsg("process %d failed to acquire %s on %s after %ld.%03d ms",
-							  MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+							 (errdetail_log_plural("Process holding the lock: %s. Wait queue: %s.",
+						   "Processes holding the lock: %s. Wait queue: %s.",
+												   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			}
 
 			/*
@@ -1276,6 +1352,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			deadlock_state = DS_NO_DEADLOCK;
 
 			pfree(buf.data);
+			pfree(lock_holders_sbuf.data);
+			pfree(lock_waiters_sbuf.data);
 		}
 	} while (myWaitStatus == STATUS_WAITING);
 
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 8705586..d28b749 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -937,6 +937,28 @@ errdetail_log(const char *fmt,...)
 	return 0;					/* return value does not matter */
 }
 
+/*
+ * errdetail_log_plural --- add a detail_log error message text to the current error
+ * with support for pluralization of the message text
+ */
+int
+errdetail_log_plural(const char *fmt_singular, const char *fmt_plural,
+					 unsigned long n,...)
+{
+	ErrorData  *edata = &errordata[errordata_stack_depth];
+	MemoryContext oldcontext;
+
+	recursion_depth++;
+	CHECK_STACK_DEPTH();
+	oldcontext = MemoryContextSwitchTo(edata->assoc_context);
+
+	EVALUATE_MESSAGE_PLURAL(edata->domain, detail_log, false);
+
+	MemoryContextSwitchTo(oldcontext);
+	recursion_depth--;
+	return 0;					/* return value does not matter */
+}
+
 
 /*
  * errdetail_plural --- add a detail error message text to the current error,
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index d7916c2..427d52d 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -182,6 +182,14 @@ errdetail_log(const char *fmt,...)
 __attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 2)));
 
 extern int
+errdetail_log_plural(const char *fmt_singular, const char *fmt_plural,
+					 unsigned long n,...)
+/* This extension allows gcc to check the format string for consistency with
+   the supplied arguments. */
+__attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 4)))
+__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 4)));
+
+extern int
 errdetail_plural(const char *fmt_singular, const char *fmt_plural,
 				 unsigned long n,...)
 /* This extension allows gcc to check the format string for consistency with
#25Rajeev rastogi
rajeev.rastogi@huawei.com
In reply to: Christian Kruse (#24)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

On 4th February 2014, Christian Kruse wrote:

On 04/02/14 12:38, Fujii Masao wrote:

ISTM that the phrase "Request queue" is not used much around the lock code.
Using the phrase "wait queue", or Simon's suggestion, sounds better, at
least to me. Thoughts?

Sounds reasonable to me. The attached patch changes the messages to the
following:

Process holding the lock: A. Wait queue: B.
Processes holding the lock: A, B. Wait queue: C.

This looks good to me also.

Thanks and Regards,
Kumar Rajeev Rastogi


#26Rajeev rastogi
rajeev.rastogi@huawei.com
In reply to: Rajeev rastogi (#14)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

On 04 February 2014 14:38, Myself wrote:

On 4th February 2014, Christian Kruse wrote:

On 04/02/14 12:38, Fujii Masao wrote:

ISTM that the phrase "Request queue" is not used much around the lock code.
Using the phrase "wait queue", or Simon's suggestion, sounds better, at
least to me. Thoughts?

Sounds reasonable to me. The attached patch changes the messages to the
following:

Process holding the lock: A. Wait queue: B.
Processes holding the lock: A, B. Wait queue: C.

This looks good to me also.

I have tested the revised patch and found it ready to be committed.

I am marking this as "Ready for Committer".

Thanks and Regards,
Kumar Rajeev Rastogi


#27Fujii Masao
masao.fujii@gmail.com
In reply to: Rajeev rastogi (#26)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

On Tue, Feb 25, 2014 at 1:03 PM, Rajeev rastogi
<rajeev.rastogi@huawei.com> wrote:

On 04 February 2014 14:38, Myself wrote:

On 4th February 2014, Christian Kruse wrote:

On 04/02/14 12:38, Fujii Masao wrote:

ISTM that the phrase "Request queue" is not used much around the lock code.
Using the phrase "wait queue", or Simon's suggestion, sounds better, at
least to me. Thoughts?

Sounds reasonable to me. The attached patch changes the messages to the
following:

Process holding the lock: A. Wait queue: B.
Processes holding the lock: A, B. Wait queue: C.

This looks good to me also.

I have tested the revised patch and found it ready to be committed.

I am marking this as "Ready for Committer".

Committed!

Regards,

--
Fujii Masao


#28Christian Kruse
christian@2ndquadrant.com
In reply to: Fujii Masao (#27)
Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

Hi,

On 13/03/14 03:27, Fujii Masao wrote:

Committed!

Thank you very much!

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services