Re: lock_timeout GUC patch - Review

Started by Marc Cousinover 15 years ago13 messages

cousinmarc@gmail.com

over 15 years ago

Hi, I've been reviewing this patch for the last few days. Here it is :

* Submission review
* Is the patch in context diff format?
Yes

* Does it apply cleanly to the current CVS HEAD?
Yes

* Does it include reasonable tests, necessary doc patches, etc?
Doc patches are there.
There are no regression tests. Should there be ?

* Usability review
* Does the patch actually implement that?
Yes

* Do we want that?
I think so. At least I'd like to have this feature :)

* Do we already have it?
No

* Does it follow SQL spec, or the community-agreed behavior?
I didn't see a clear conclusion from the -hackers thread on this (GUC
vs SQL statement extension)

* Does it include pg_dump support (if applicable)?
Not applicable. Or should pg_dump and/or pg_restore put lock_timeout to 0 ?

* Are there dangers?
As it is a guc, it could be set globally, is that a danger ?

* Have all the bases been covered?
I honestly don't know. It touches alarm signal handling.

* Apply the patch, compile it and test:
* Does the feature work as advertised?
I only tested it with Linux. The code is very OS-dependent. It works
as advertised with Linux. I found only one corner case (see below)

* Are there corner cases the author has failed to consider?
The feature almost works as advertised : it fails when lock_timeout =
deadlock_timeout. Then the lock_timeout isn't detected. I think this
case isn't considered in handle_sig_alarm().

* Are there any assertion failures or crashes?
No

* Performance review
* Does the patch slow down simple tests?
No

* If it claims to improve performance, does it?
Not applicable

* Does it slow down other things?
No. Maybe alarm signal handling and enabling will be slower, as there
is more work done there (for instance, a GetCurrentTimestamp, that
was only done when log_lock_waits was activated until now. But I
couldn't measure a slowdown.

* Read the changes to the code in detail and consider:
* Does it follow the project coding guidelines?
I think so

* Are there portability issues?
It seems to have already been adressed, from the previous discussion
in the thread.

* Will it work on Windows/BSD etc?
It should. I couldn't test it though. Infrastructure is there.

* Are the comments sufficient and accurate?
Yes

* Does it do what it says, correctly?
Yes

* Does it produce compiler warnings?
No

* Can you make it crash?
No

* Consider the changes to the code in the context of the project as a whole:
* Is everything done in a way that fits together coherently with
other features/modules?
I have a feeling that
enable_sig_alarm/enable_sig_alarm_for_lock_timeout tries to solve a
very specific problem, and it gets complicated because there is no
infrastructure in the code to handle several timeouts at the same time
with sigalarm, so each timeout has its own dedicated and intertwined
code. But I'm still discovering this part of the code.

* Are there interdependencies that can cause problems?
I don't think so.

Noname

zb@cybertec.at

over 15 years ago

In reply to: Marc Cousin (#1)

Hi,

first, thanks for the review.

Hi, I've been reviewing this patch for the last few days. Here it is :

* Submission review
* Is the patch in context diff format?
Yes

* Does it apply cleanly to the current CVS HEAD?
Yes

* Does it include reasonable tests, necessary doc patches, etc?
Doc patches are there.
There are no regression tests. Should there be ?

IIRC, there was a discussion/patch about parallel psql that can
hold more than one connections open. With that feature, a regression
test can be added. Reading the 9.0beta3 docs, it's not there
and this patch is not on the current commitfest either.
Is there anyone who knows the status of this feature?

* Usability review
* Does the patch actually implement that?
Yes

* Do we want that?
I think so. At least I'd like to have this feature :)

:-)

* Do we already have it?
No

* Does it follow SQL spec, or the community-agreed behavior?
I didn't see a clear conclusion from the -hackers thread on this (GUC
vs SQL statement extension)

* Does it include pg_dump support (if applicable)?
Not applicable. Or should pg_dump and/or pg_restore put lock_timeout to 0
?

* Are there dangers?
As it is a guc, it could be set globally, is that a danger ?

It could be set globally, but it will exhibit a new global behaviour.
Which is not unexpected. The problem may come from [auto]vacuum processes
can get stopped. Maybe others, too. A previous version contained a
checking function that refused to set it from postgresql.conf for this
reason, but it was frowned upon by Tom. :-) The proper fix would be
that every such processes should set this GUC to zero for them locally.

* Have all the bases been covered?
I honestly don't know. It touches alarm signal handling.

* Apply the patch, compile it and test:
* Does the feature work as advertised?
I only tested it with Linux. The code is very OS-dependent. It works
as advertised with Linux. I found only one corner case (see below)

The setitimer() function is implemented in backend/port/win32/timer.c,
so it's abstracted away. With that in mind, I think there's no
OS-dependent in this patch.

* Are there corner cases the author has failed to consider?
The feature almost works as advertised : it fails when lock_timeout =
deadlock_timeout. Then the lock_timeout isn't detected. I think this
case isn't considered in handle_sig_alarm().

I will look into this, thanks for spotting it.

* Are there any assertion failures or crashes?
No

* Performance review
* Does the patch slow down simple tests?
No

* If it claims to improve performance, does it?
Not applicable

* Does it slow down other things?
No. Maybe alarm signal handling and enabling will be slower, as there
is more work done there (for instance, a GetCurrentTimestamp, that
was only done when log_lock_waits was activated until now. But I
couldn't measure a slowdown.

* Read the changes to the code in detail and consider:
* Does it follow the project coding guidelines?
I think so

* Are there portability issues?
It seems to have already been adressed, from the previous discussion
in the thread.

* Will it work on Windows/BSD etc?
It should. I couldn't test it though. Infrastructure is there.

* Are the comments sufficient and accurate?
Yes

* Does it do what it says, correctly?
Yes

* Does it produce compiler warnings?
No

* Can you make it crash?
No

* Consider the changes to the code in the context of the project as a
whole:
* Is everything done in a way that fits together coherently with
other features/modules?
I have a feeling that
enable_sig_alarm/enable_sig_alarm_for_lock_timeout tries to solve a
very specific problem, and it gets complicated because there is no
infrastructure in the code to handle several timeouts at the same time
with sigalarm, so each timeout has its own dedicated and intertwined
code. But I'm still discovering this part of the code.

There is a problem with setitimer(): only one timer can be alive
at one time with the same timer id (ITIMER_REAL).

* Are there interdependencies that can cause problems?
I don't think so.

Thanks for the review, I will post a new patch sometime next week.

Best regards,
Zoltán Böszörményi

Boszormenyi Zoltan

zb@cybertec.at

over 15 years ago

In reply to: Marc Cousin (#1)

2 attachment(s)

Hi,

Marc Cousin ï¿½rta:

Hi, I've been reviewing this patch for the last few days. Here it is :

...

* Are there dangers?
As it is a guc, it could be set globally, is that a danger ?

I haven't added any new code covering supplemental (e.g. autovacuum)
processes,
the interactions are yet to be discovered. Setting it globally is not
recommended.

* Are there corner cases the author has failed to consider?
The feature almost works as advertised : it fails when lock_timeout =
deadlock_timeout. Then the lock_timeout isn't detected. I think this
case isn't considered in handle_sig_alarm().

I fixed this by adding CheckLockTimeout() function that works like
CheckStatementTimeout() and ensuring that the same start time is
used for both deadlock_timeout and lock_timeout if both are active.
The preference of errors if their timeout values are equal is:
statement_timeout > lock_timeout > deadlock_timeout

* Consider the changes to the code in the context of the project as a whole:
* Is everything done in a way that fits together coherently with
other features/modules?
I have a feeling that
enable_sig_alarm/enable_sig_alarm_for_lock_timeout tries to solve a
very specific problem, and it gets complicated because there is no
infrastructure in the code to handle several timeouts at the same time
with sigalarm, so each timeout has its own dedicated and intertwined
code. But I'm still discovering this part of the code.

I tried to create a framework that could potentially handle any number
timeouts
ordered by their fin_time but it doesn't survive "make check", it
reliably stalls
at the test in parallel_schedule below:

# ----------
# Another group of parallel tests
# ----------
test: select_into select_distinct select_distinct_on select_implicit
select_having subselect union case join aggregates transactions random
portals arrays btree_index hash_index update namespace prepared_xacts delete

This WIP patch is also attached for reference, too. I would prefer
this way, but I don't have more time to work on it and there are some
interdependencies in the signal handler when e.g. disable_sig_alarm(true);
means to disable ALL timers not just the statement_timeout.
The specifically coded lock_timeout patch goes with the flow and doesn't
change the semantics and works. If someone wants to pick up the timer
framework patch and can make it work, fine. But I am not explicitely
submitting it for the commitfest. The original patch with the fixes works
and needs only a little more review.

Best regards,
Zoltï¿½n Bï¿½szï¿½rmï¿½nyi

Attachments:

5-pg91-locktimeout-18-ctxdiff.patchtext/x-patch; name=5-pg91-locktimeout-18-ctxdiff.patchDownload

diff -dcrpN pgsql.orig/doc/src/sgml/config.sgml pgsql/doc/src/sgml/config.sgml
*** pgsql.orig/doc/src/sgml/config.sgml	2010-07-26 10:05:37.000000000 +0200
--- pgsql/doc/src/sgml/config.sgml	2010-07-29 11:58:56.000000000 +0200
*************** COPY postgres_log FROM '/full/path/to/lo
*** 4479,4484 ****
--- 4479,4508 ----
        </listitem>
       </varlistentry>
  
+      <varlistentry id="guc-lock-timeout" xreflabel="lock_timeout">
+       <term><varname>lock_timeout</varname> (<type>integer</type>)</term>
+       <indexterm>
+        <primary><varname>lock_timeout</> configuration parameter</primary>
+       </indexterm>
+       <listitem>
+        <para>
+         Abort any statement that tries to acquire a heavy-weight lock (e.g. rows,
+         pages, tables, indices or other objects) and the lock has to wait more
+         than the specified number of milliseconds, starting from the time the
+         command arrives at the server from the client.
+         If <varname>log_min_error_statement</> is set to <literal>ERROR</> or lower,
+         the statement that timed out will also be logged. A value of zero
+         (the default) turns off the limitation.
+        </para>
+ 
+        <para>
+         Setting <varname>lock_timeout</> in
+         <filename>postgresql.conf</> is not recommended because it
+         affects all sessions.
+        </para>      
+       </listitem>   
+      </varlistentry>
+ 
       <varlistentry id="guc-vacuum-freeze-table-age" xreflabel="vacuum_freeze_table_age">
        <term><varname>vacuum_freeze_table_age</varname> (<type>integer</type>)</term>
        <indexterm>
diff -dcrpN pgsql.orig/doc/src/sgml/ref/lock.sgml pgsql/doc/src/sgml/ref/lock.sgml
*** pgsql.orig/doc/src/sgml/ref/lock.sgml	2010-04-03 09:23:01.000000000 +0200
--- pgsql/doc/src/sgml/ref/lock.sgml	2010-07-29 11:58:56.000000000 +0200
*************** LOCK [ TABLE ] [ ONLY ] <replaceable cla
*** 39,46 ****
     <literal>NOWAIT</literal> is specified, <command>LOCK
     TABLE</command> does not wait to acquire the desired lock: if it
     cannot be acquired immediately, the command is aborted and an
!    error is emitted.  Once obtained, the lock is held for the
!    remainder of the current transaction.  (There is no <command>UNLOCK
     TABLE</command> command; locks are always released at transaction
     end.)
    </para>
--- 39,49 ----
     <literal>NOWAIT</literal> is specified, <command>LOCK
     TABLE</command> does not wait to acquire the desired lock: if it
     cannot be acquired immediately, the command is aborted and an
!    error is emitted. If <varname>lock_timeout</varname> is set to a value
!    higher than 0, and the lock cannot be acquired under the specified
!    timeout value in milliseconds, the command is aborted and an error
!    is emitted. Once obtained, the lock is held for the remainder of  
!    the current transaction.  (There is no <command>UNLOCK
     TABLE</command> command; locks are always released at transaction
     end.)
    </para>
diff -dcrpN pgsql.orig/doc/src/sgml/ref/select.sgml pgsql/doc/src/sgml/ref/select.sgml
*** pgsql.orig/doc/src/sgml/ref/select.sgml	2010-06-20 13:59:13.000000000 +0200
--- pgsql/doc/src/sgml/ref/select.sgml	2010-07-29 11:58:56.000000000 +0200
*************** FOR SHARE [ OF <replaceable class="param
*** 1160,1165 ****
--- 1160,1173 ----
     </para>
  
     <para>
+     If <literal>NOWAIT</> option is not specified and <varname>lock_timeout</varname>
+     is set to a value higher than 0, and the lock needs to wait more than
+     the specified value in milliseconds, the command reports an error after
+     timing out, rather than waiting indefinitely. The note in the previous
+     paragraph applies to the <varname>lock_timeout</varname>, too.
+    </para>
+ 
+    <para>
      If specific tables are named in <literal>FOR UPDATE</literal>
      or <literal>FOR SHARE</literal>,
      then only rows coming from those tables are locked; any other
diff -dcrpN pgsql.orig/src/backend/port/posix_sema.c pgsql/src/backend/port/posix_sema.c
*** pgsql.orig/src/backend/port/posix_sema.c	2010-01-02 17:57:50.000000000 +0100
--- pgsql/src/backend/port/posix_sema.c	2010-07-29 11:58:56.000000000 +0200
***************
*** 24,29 ****
--- 24,30 ----
  #include "miscadmin.h"
  #include "storage/ipc.h"
  #include "storage/pg_sema.h"
+ #include "storage/proc.h"
  
  
  #ifdef USE_NAMED_POSIX_SEMAPHORES
*************** PGSemaphoreTryLock(PGSemaphore sema)
*** 313,315 ****
--- 314,341 ----
  
  	return true;
  }
+ 
+ /*
+  * PGSemaphoreTimedLock
+  *
+  * Lock a semaphore (decrement count), blocking if count would be < 0
+  * Return if lock_timeout expired
+  */
+ void
+ PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK)
+ {
+ 	int			errStatus;
+ 
+ 	do
+ 	{
+ 		ImmediateInterruptOK = interruptOK;
+ 		CHECK_FOR_INTERRUPTS();
+ 		errStatus = sem_wait(PG_SEM_REF(sema));
+ 		ImmediateInterruptOK = false;
+ 	} while (errStatus < 0 && errno == EINTR && !lock_timeout_detected);
+ 
+ 	if (lock_timeout_detected)
+ 		return;
+ 	if (errStatus < 0)
+ 		elog(FATAL, "sem_wait failed: %m");
+ }
diff -dcrpN pgsql.orig/src/backend/port/sysv_sema.c pgsql/src/backend/port/sysv_sema.c
*** pgsql.orig/src/backend/port/sysv_sema.c	2010-01-02 17:57:50.000000000 +0100
--- pgsql/src/backend/port/sysv_sema.c	2010-07-29 11:58:56.000000000 +0200
***************
*** 30,35 ****
--- 30,36 ----
  #include "miscadmin.h"
  #include "storage/ipc.h"
  #include "storage/pg_sema.h"
+ #include "storage/proc.h"
  
  
  #ifndef HAVE_UNION_SEMUN
*************** PGSemaphoreTryLock(PGSemaphore sema)
*** 497,499 ****
--- 498,530 ----
  
  	return true;
  }
+ 
+ /*
+  * PGSemaphoreTimedLock
+  *
+  * Lock a semaphore (decrement count), blocking if count would be < 0
+  * Return if lock_timeout expired
+  */
+ void
+ PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK)
+ {
+ 	int			errStatus;
+ 	struct sembuf sops;
+ 
+ 	sops.sem_op = -1;			/* decrement */
+ 	sops.sem_flg = 0;
+ 	sops.sem_num = sema->semNum;
+ 
+ 	do
+ 	{
+ 		ImmediateInterruptOK = interruptOK;
+ 		CHECK_FOR_INTERRUPTS();
+ 		errStatus = semop(sema->semId, &sops, 1);
+ 		ImmediateInterruptOK = false;
+ 	} while (errStatus < 0 && errno == EINTR && !lock_timeout_detected);
+ 
+ 	if (lock_timeout_detected)
+ 		return;
+ 	if (errStatus < 0)
+ 		elog(FATAL, "semop(id=%d) failed: %m", sema->semId);
+ }
diff -dcrpN pgsql.orig/src/backend/port/win32_sema.c pgsql/src/backend/port/win32_sema.c
*** pgsql.orig/src/backend/port/win32_sema.c	2010-01-02 17:57:50.000000000 +0100
--- pgsql/src/backend/port/win32_sema.c	2010-07-29 11:58:56.000000000 +0200
***************
*** 16,21 ****
--- 16,22 ----
  #include "miscadmin.h"
  #include "storage/ipc.h"
  #include "storage/pg_sema.h"
+ #include "storage/proc.h"
  
  static HANDLE *mySemSet;		/* IDs of sema sets acquired so far */
  static int	numSems;			/* number of sema sets acquired so far */
*************** PGSemaphoreTryLock(PGSemaphore sema)
*** 205,207 ****
--- 206,263 ----
  	/* keep compiler quiet */
  	return false;
  }
+ 
+ /*
+  * PGSemaphoreTimedLock
+  *
+  * Lock a semaphore (decrement count), blocking if count would be < 0.
+  * Serve the interrupt if interruptOK is true.
+  * Return if lock_timeout expired.
+  */
+ void
+ PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK)
+ {
+ 	DWORD		ret;
+ 	HANDLE		wh[2];
+ 
+ 	wh[0] = *sema;
+ 	wh[1] = pgwin32_signal_event;
+ 
+ 	/*
+ 	 * As in other implementations of PGSemaphoreLock, we need to check for
+ 	 * cancel/die interrupts each time through the loop.  But here, there is
+ 	 * no hidden magic about whether the syscall will internally service a
+ 	 * signal --- we do that ourselves.
+ 	 */
+ 	do
+ 	{
+ 		ImmediateInterruptOK = interruptOK;
+ 		CHECK_FOR_INTERRUPTS();
+ 
+ 		errno = 0;
+ 		ret = WaitForMultipleObjectsEx(2, wh, FALSE, INFINITE, TRUE);
+ 
+ 		if (ret == WAIT_OBJECT_0)
+ 		{
+ 			/* We got it! */
+ 			return;
+ 		}
+ 		else if (ret == WAIT_OBJECT_0 + 1)
+ 		{
+ 			/* Signal event is set - we have a signal to deliver */
+ 			pgwin32_dispatch_queued_signals();
+ 			errno = EINTR;
+ 		}
+ 		else
+ 			/* Otherwise we are in trouble */
+ 			errno = EIDRM;
+ 
+ 		ImmediateInterruptOK = false;
+ 	} while (errno == EINTR && !lock_timeout_detected);
+ 
+ 	if (lock_timeout_detected)
+ 		return;
+ 	if (errno != 0)
+ 		ereport(FATAL,
+ 				(errmsg("could not lock semaphore: error code %d", (int) GetLastError())));
+ }
diff -dcrpN pgsql.orig/src/backend/storage/lmgr/lmgr.c pgsql/src/backend/storage/lmgr/lmgr.c
*** pgsql.orig/src/backend/storage/lmgr/lmgr.c	2010-01-02 17:57:52.000000000 +0100
--- pgsql/src/backend/storage/lmgr/lmgr.c	2010-07-29 11:58:56.000000000 +0200
***************
*** 19,26 ****
--- 19,29 ----
  #include "access/transam.h"
  #include "access/xact.h"
  #include "catalog/catalog.h"
+ #include "catalog/pg_database.h"
  #include "miscadmin.h"
  #include "storage/lmgr.h"
+ #include "utils/lsyscache.h"
+ #include "storage/proc.h"
  #include "storage/procarray.h"
  #include "utils/inval.h"
  
*************** LockRelationOid(Oid relid, LOCKMODE lock
*** 78,83 ****
--- 81,101 ----
  
  	res = LockAcquire(&tag, lockmode, false, false);
  
+ 	if (res == LOCKACQUIRE_NOT_AVAIL)
+ 	{
+ 		char	   *relname = get_rel_name(relid);
+ 		if (relname)
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
+ 						errmsg("could not obtain lock on relation \"%s\"",
+ 						relname)));
+ 		else
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
+ 						errmsg("could not obtain lock on relation with OID %u",
+ 						relid)));
+ 	}
+ 
  	/*
  	 * Now that we have the lock, check for invalidation messages, so that we
  	 * will update or flush any stale relcache entry before we try to use it.
*************** LockRelation(Relation relation, LOCKMODE
*** 173,178 ****
--- 191,202 ----
  
  	res = LockAcquire(&tag, lockmode, false, false);
  
+ 	if (res == LOCKACQUIRE_NOT_AVAIL)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
+ 					errmsg("could not obtain lock on relation \"%s\"",
+ 				RelationGetRelationName(relation))));
+ 
  	/*
  	 * Now that we have the lock, check for invalidation messages; see notes
  	 * in LockRelationOid.
*************** LockRelationIdForSession(LockRelId *reli
*** 250,256 ****
  
  	SET_LOCKTAG_RELATION(tag, relid->dbId, relid->relId);
  
! 	(void) LockAcquire(&tag, lockmode, true, false);
  }
  
  /*
--- 274,293 ----
  
  	SET_LOCKTAG_RELATION(tag, relid->dbId, relid->relId);
  
! 	if (LockAcquire(&tag, lockmode, true, false) == LOCKACQUIRE_NOT_AVAIL)
! 	{
! 		char	   *relname = get_rel_name(relid->relId);
! 		if (relname)
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						errmsg("could not obtain lock on relation \"%s\"",
! 						relname)));
! 		else
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						errmsg("could not obtain lock on relation with OID %u",
! 						relid->relId)));
! 	}
  }
  
  /*
*************** LockRelationForExtension(Relation relati
*** 285,291 ****
  								relation->rd_lockInfo.lockRelId.dbId,
  								relation->rd_lockInfo.lockRelId.relId);
  
! 	(void) LockAcquire(&tag, lockmode, false, false);
  }
  
  /*
--- 322,332 ----
  								relation->rd_lockInfo.lockRelId.dbId,
  								relation->rd_lockInfo.lockRelId.relId);
  
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on index \"%s\"",
! 				RelationGetRelationName(relation))));
  }
  
  /*
*************** LockPage(Relation relation, BlockNumber 
*** 319,325 ****
  					 relation->rd_lockInfo.lockRelId.relId,
  					 blkno);
  
! 	(void) LockAcquire(&tag, lockmode, false, false);
  }
  
  /*
--- 360,370 ----
  					 relation->rd_lockInfo.lockRelId.relId,
  					 blkno);
  
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on page %u of relation \"%s\"",
! 				blkno, RelationGetRelationName(relation))));
  }
  
  /*
*************** LockTuple(Relation relation, ItemPointer
*** 375,381 ****
  					  ItemPointerGetBlockNumber(tid),
  					  ItemPointerGetOffsetNumber(tid));
  
! 	(void) LockAcquire(&tag, lockmode, false, false);
  }
  
  /*
--- 420,430 ----
  					  ItemPointerGetBlockNumber(tid),
  					  ItemPointerGetOffsetNumber(tid));
  
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on row in relation \"%s\"",
! 				RelationGetRelationName(relation))));
  }
  
  /*
*************** XactLockTableInsert(TransactionId xid)
*** 429,435 ****
  
  	SET_LOCKTAG_TRANSACTION(tag, xid);
  
! 	(void) LockAcquire(&tag, ExclusiveLock, false, false);
  }
  
  /*
--- 478,487 ----
  
  	SET_LOCKTAG_TRANSACTION(tag, xid);
  
! 	if (LockAcquire(&tag, ExclusiveLock, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on transaction with ID %u", xid)));
  }
  
  /*
*************** XactLockTableWait(TransactionId xid)
*** 473,479 ****
  
  		SET_LOCKTAG_TRANSACTION(tag, xid);
  
! 		(void) LockAcquire(&tag, ShareLock, false, false);
  
  		LockRelease(&tag, ShareLock, false);
  
--- 525,534 ----
  
  		SET_LOCKTAG_TRANSACTION(tag, xid);
  
! 		if (LockAcquire(&tag, ShareLock, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						errmsg("could not obtain lock on transaction with ID %u", xid)));
  
  		LockRelease(&tag, ShareLock, false);
  
*************** VirtualXactLockTableInsert(VirtualTransa
*** 531,537 ****
  
  	SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);
  
! 	(void) LockAcquire(&tag, ExclusiveLock, false, false);
  }
  
  /*
--- 586,596 ----
  
  	SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);
  
! 	if (LockAcquire(&tag, ExclusiveLock, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on virtual transaction with ID %u",
! 				vxid.localTransactionId)));
  }
  
  /*
*************** VirtualXactLockTableWait(VirtualTransact
*** 549,555 ****
  
  	SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);
  
! 	(void) LockAcquire(&tag, ShareLock, false, false);
  
  	LockRelease(&tag, ShareLock, false);
  }
--- 608,618 ----
  
  	SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);
  
! 	if (LockAcquire(&tag, ShareLock, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on virtual transaction with ID %u",
! 				vxid.localTransactionId)));
  
  	LockRelease(&tag, ShareLock, false);
  }
*************** LockDatabaseObject(Oid classid, Oid obji
*** 598,604 ****
  					   objid,
  					   objsubid);
  
! 	(void) LockAcquire(&tag, lockmode, false, false);
  }
  
  /*
--- 661,671 ----
  					   objid,
  					   objsubid);
  
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on class:object: %u:%u",
! 				classid, objid)));
  }
  
  /*
*************** LockSharedObject(Oid classid, Oid objid,
*** 636,642 ****
  					   objid,
  					   objsubid);
  
! 	(void) LockAcquire(&tag, lockmode, false, false);
  
  	/* Make sure syscaches are up-to-date with any changes we waited for */
  	AcceptInvalidationMessages();
--- 703,713 ----
  					   objid,
  					   objsubid);
  
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on class:object: %u:%u",
! 				classid, objid)));
  
  	/* Make sure syscaches are up-to-date with any changes we waited for */
  	AcceptInvalidationMessages();
*************** LockSharedObjectForSession(Oid classid, 
*** 678,684 ****
  					   objid,
  					   objsubid);
  
! 	(void) LockAcquire(&tag, lockmode, true, false);
  }
  
  /*
--- 749,770 ----
  					   objid,
  					   objsubid);
  
! 	if (LockAcquire(&tag, lockmode, true, false) == LOCKACQUIRE_NOT_AVAIL)
! 		switch(classid)
! 		{
! 		case DatabaseRelationId:
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						errmsg("could not obtain lock on database with ID %u",
! 					objid)));
! 			break;
! 		default:
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						errmsg("could not obtain lock on class:object: %u:%u",
! 					classid, objid)));
! 			break;
! 		}
  }
  
  /*
diff -dcrpN pgsql.orig/src/backend/storage/lmgr/lock.c pgsql/src/backend/storage/lmgr/lock.c
*** pgsql.orig/src/backend/storage/lmgr/lock.c	2010-04-29 12:09:03.000000000 +0200
--- pgsql/src/backend/storage/lmgr/lock.c	2010-07-29 11:58:56.000000000 +0200
*************** PROCLOCK_PRINT(const char *where, const 
*** 255,261 ****
  static uint32 proclock_hash(const void *key, Size keysize);
  static void RemoveLocalLock(LOCALLOCK *locallock);
  static void GrantLockLocal(LOCALLOCK *locallock, ResourceOwner owner);
! static void WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner);
  static bool UnGrantLock(LOCK *lock, LOCKMODE lockmode,
  			PROCLOCK *proclock, LockMethod lockMethodTable);
  static void CleanUpLock(LOCK *lock, PROCLOCK *proclock,
--- 255,261 ----
  static uint32 proclock_hash(const void *key, Size keysize);
  static void RemoveLocalLock(LOCALLOCK *locallock);
  static void GrantLockLocal(LOCALLOCK *locallock, ResourceOwner owner);
! static int WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner);
  static bool UnGrantLock(LOCK *lock, LOCKMODE lockmode,
  			PROCLOCK *proclock, LockMethod lockMethodTable);
  static void CleanUpLock(LOCK *lock, PROCLOCK *proclock,
*************** ProcLockHashCode(const PROCLOCKTAG *proc
*** 447,453 ****
   *	dontWait: if true, don't wait to acquire lock
   *
   * Returns one of:
!  *		LOCKACQUIRE_NOT_AVAIL		lock not available, and dontWait=true
   *		LOCKACQUIRE_OK				lock successfully acquired
   *		LOCKACQUIRE_ALREADY_HELD	incremented count for lock already held
   *
--- 447,453 ----
   *	dontWait: if true, don't wait to acquire lock
   *
   * Returns one of:
!  *		LOCKACQUIRE_NOT_AVAIL		lock not available, either dontWait=true or timeout
   *		LOCKACQUIRE_OK				lock successfully acquired
   *		LOCKACQUIRE_ALREADY_HELD	incremented count for lock already held
   *
*************** LockAcquireExtended(const LOCKTAG *lockt
*** 833,839 ****
  										 locktag->locktag_type,
  										 lockmode);
  
! 		WaitOnLock(locallock, owner);
  
  		TRACE_POSTGRESQL_LOCK_WAIT_DONE(locktag->locktag_field1,
  										locktag->locktag_field2,
--- 833,839 ----
  										 locktag->locktag_type,
  										 lockmode);
  
! 		status = WaitOnLock(locallock, owner);
  
  		TRACE_POSTGRESQL_LOCK_WAIT_DONE(locktag->locktag_field1,
  										locktag->locktag_field2,
*************** LockAcquireExtended(const LOCKTAG *lockt
*** 848,867 ****
  		 * done when the lock was granted to us --- see notes in WaitOnLock.
  		 */
  
! 		/*
! 		 * Check the proclock entry status, in case something in the ipc
! 		 * communication doesn't work correctly.
! 		 */
! 		if (!(proclock->holdMask & LOCKBIT_ON(lockmode)))
  		{
! 			PROCLOCK_PRINT("LockAcquire: INCONSISTENT", proclock);
! 			LOCK_PRINT("LockAcquire: INCONSISTENT", lock, lockmode);
! 			/* Should we retry ? */
! 			LWLockRelease(partitionLock);
! 			elog(ERROR, "LockAcquire failed");
  		}
- 		PROCLOCK_PRINT("LockAcquire: granted", proclock);
- 		LOCK_PRINT("LockAcquire: granted", lock, lockmode);
  	}
  
  	LWLockRelease(partitionLock);
--- 848,879 ----
  		 * done when the lock was granted to us --- see notes in WaitOnLock.
  		 */
  
! 		switch (status)
  		{
! 		case STATUS_OK:
! 			/*
! 			 * Check the proclock entry status, in case something in the ipc
! 			 * communication doesn't work correctly.
! 			 */
! 			if (!(proclock->holdMask & LOCKBIT_ON(lockmode)))
! 			{
! 				PROCLOCK_PRINT("LockAcquire: INCONSISTENT", proclock);
! 				LOCK_PRINT("LockAcquire: INCONSISTENT", lock, lockmode);
! 				/* Should we retry ? */
! 				LWLockRelease(partitionLock);
! 				elog(ERROR, "LockAcquire failed");
! 			}
! 			PROCLOCK_PRINT("LockAcquire: granted", proclock);
! 			LOCK_PRINT("LockAcquire: granted", lock, lockmode);
! 			break;
! 		case STATUS_WAITING:
! 			PROCLOCK_PRINT("LockAcquire: timed out", proclock);
! 			LOCK_PRINT("LockAcquire: timed out", lock, lockmode);
! 			break;
! 		default:
! 			elog(ERROR, "LockAcquire invalid status");
! 			break;
  		}
  	}
  
  	LWLockRelease(partitionLock);
*************** LockAcquireExtended(const LOCKTAG *lockt
*** 887,893 ****
  							   locktag->locktag_field2);
  	}
  
! 	return LOCKACQUIRE_OK;
  }
  
  /*
--- 899,905 ----
  							   locktag->locktag_field2);
  	}
  
! 	return (status == STATUS_OK ? LOCKACQUIRE_OK : LOCKACQUIRE_NOT_AVAIL);
  }
  
  /*
*************** GrantAwaitedLock(void)
*** 1165,1178 ****
   * Caller must have set MyProc->heldLocks to reflect locks already held
   * on the lockable object by this process.
   *
   * The appropriate partition lock must be held at entry.
   */
! static void
  WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner)
  {
  	LOCKMETHODID lockmethodid = LOCALLOCK_LOCKMETHOD(*locallock);
  	LockMethod	lockMethodTable = LockMethods[lockmethodid];
  	char	   *volatile new_status = NULL;
  
  	LOCK_PRINT("WaitOnLock: sleeping on lock",
  			   locallock->lock, locallock->tag.mode);
--- 1177,1196 ----
   * Caller must have set MyProc->heldLocks to reflect locks already held
   * on the lockable object by this process.
   *
+  * Result: returns value of ProcSleep()
+  *	STATUS_OK if we acquired the lock
+  *	STATUS_ERROR if not (deadlock)
+  *	STATUS_WAITING if not (timeout)
+  *
   * The appropriate partition lock must be held at entry.
   */
! static int
  WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner)
  {
  	LOCKMETHODID lockmethodid = LOCALLOCK_LOCKMETHOD(*locallock);
  	LockMethod	lockMethodTable = LockMethods[lockmethodid];
  	char	   *volatile new_status = NULL;
+ 	int		wait_status;
  
  	LOCK_PRINT("WaitOnLock: sleeping on lock",
  			   locallock->lock, locallock->tag.mode);
*************** WaitOnLock(LOCALLOCK *locallock, Resourc
*** 1214,1221 ****
  	 */
  	PG_TRY();
  	{
! 		if (ProcSleep(locallock, lockMethodTable) != STATUS_OK)
  		{
  			/*
  			 * We failed as a result of a deadlock, see CheckDeadLock(). Quit
  			 * now.
--- 1232,1244 ----
  	 */
  	PG_TRY();
  	{
! 		wait_status = ProcSleep(locallock, lockMethodTable);
! 		switch (wait_status)
  		{
+ 		case STATUS_OK:
+ 		case STATUS_WAITING:
+ 			break;
+ 		default:
  			/*
  			 * We failed as a result of a deadlock, see CheckDeadLock(). Quit
  			 * now.
*************** WaitOnLock(LOCALLOCK *locallock, Resourc
*** 1260,1267 ****
  		pfree(new_status);
  	}
  
! 	LOCK_PRINT("WaitOnLock: wakeup on lock",
  			   locallock->lock, locallock->tag.mode);
  }
  
  /*
--- 1283,1296 ----
  		pfree(new_status);
  	}
  
! 	if (wait_status == STATUS_OK)
! 		LOCK_PRINT("WaitOnLock: wakeup on lock",
! 			   locallock->lock, locallock->tag.mode);
! 	else if (wait_status == STATUS_WAITING)
! 		LOCK_PRINT("WaitOnLock: timeout on lock",
  			   locallock->lock, locallock->tag.mode);
+ 
+ 	return wait_status;
  }
  
  /*
diff -dcrpN pgsql.orig/src/backend/storage/lmgr/proc.c pgsql/src/backend/storage/lmgr/proc.c
*** pgsql.orig/src/backend/storage/lmgr/proc.c	2010-07-11 11:14:54.000000000 +0200
--- pgsql/src/backend/storage/lmgr/proc.c	2010-07-29 13:08:19.000000000 +0200
***************
*** 52,57 ****
--- 52,59 ----
  /* GUC variables */
  int			DeadlockTimeout = 1000;
  int			StatementTimeout = 0;
+ int			LockTimeout = 0;
+ static int		CurrentLockTimeout = 0;
  bool		log_lock_waits = false;
  
  /* Pointer to this process's PGPROC struct, if any */
*************** static volatile bool statement_timeout_a
*** 79,98 ****
--- 81,107 ----
  static volatile bool deadlock_timeout_active = false;
  static volatile DeadLockState deadlock_state = DS_NOT_YET_CHECKED;
  volatile bool cancel_from_timeout = false;
+ static volatile bool lock_timeout_active = false;
+ volatile bool lock_timeout_detected = false;
  
  /* timeout_start_time is set when log_lock_waits is true */
  static TimestampTz timeout_start_time;
+ static TimestampTz timeout_fin_time;
  
  /* statement_fin_time is valid only if statement_timeout_active is true */
  static TimestampTz statement_fin_time;
  static TimestampTz statement_fin_time2; /* valid only in recovery */
  
+ /* lock_timeout_fin_time is valid only if lock_timeout_active is true */
+ static TimestampTz lock_timeout_fin_time;
  
  static void RemoveProcFromArray(int code, Datum arg);
  static void ProcKill(int code, Datum arg);
  static void AuxiliaryProcKill(int code, Datum arg);
  static bool CheckStatementTimeout(void);
+ static bool CheckLockTimeout(void);
  static bool CheckStandbyTimeout(void);
+ static bool enable_sig_alarm_for_lock_timeout(int delayms);
  
  
  /*
*************** ProcQueueInit(PROC_QUEUE *queue)
*** 797,803 ****
   * The lock table's partition lock must be held at entry, and will be held
   * at exit.
   *
!  * Result: STATUS_OK if we acquired the lock, STATUS_ERROR if not (deadlock).
   *
   * ASSUME: that no one will fiddle with the queue until after
   *		we release the partition lock.
--- 806,815 ----
   * The lock table's partition lock must be held at entry, and will be held
   * at exit.
   *
!  * Result:
!  *     STATUS_OK if we acquired the lock
!  *     STATUS_ERROR if not (deadlock)
!  *     STATUS_WAITING if not (timeout)
   *
   * ASSUME: that no one will fiddle with the queue until after
   *		we release the partition lock.
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 951,957 ****
  		elog(FATAL, "could not set timer for process wakeup");
  
  	/*
! 	 * If someone wakes us between LWLockRelease and PGSemaphoreLock,
  	 * PGSemaphoreLock will not block.	The wakeup is "saved" by the semaphore
  	 * implementation.	While this is normally good, there are cases where a
  	 * saved wakeup might be leftover from a previous operation (for example,
--- 963,978 ----
  		elog(FATAL, "could not set timer for process wakeup");
  
  	/*
! 	 * Reset timer so we are awaken in case of lock timeout.
! 	 * This doesn't modify the timer for deadlock check in case
! 	 * the deadlock check happens earlier.
! 	 */
! 	CurrentLockTimeout = LockTimeout;
! 	if (!enable_sig_alarm_for_lock_timeout(CurrentLockTimeout))
! 		elog(FATAL, "could not set timer for process wakeup");
! 
! 	/*
! 	 * If someone wakes us between LWLockRelease and PGSemaphoreTimedLock,
  	 * PGSemaphoreLock will not block.	The wakeup is "saved" by the semaphore
  	 * implementation.	While this is normally good, there are cases where a
  	 * saved wakeup might be leftover from a previous operation (for example,
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 969,975 ****
  	 */
  	do
  	{
! 		PGSemaphoreLock(&MyProc->sem, true);
  
  		/*
  		 * waitStatus could change from STATUS_WAITING to something else
--- 990,999 ----
  	 */
  	do
  	{
! 		PGSemaphoreTimedLock(&MyProc->sem, true);
! 
! 		if (lock_timeout_detected == true)
! 			break;
  
  		/*
  		 * waitStatus could change from STATUS_WAITING to something else
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 1109,1114 ****
--- 1133,1146 ----
  	LWLockAcquire(partitionLock, LW_EXCLUSIVE);
  
  	/*
+ 	 * If we're in timeout, so we're not waiting anymore and
+ 	 * we're not the one that the lock will be granted to.
+ 	 * So remove ourselves from the wait queue.
+ 	 */
+ 	if (lock_timeout_detected)
+ 		RemoveFromWaitQueue(MyProc, hashcode);
+ 
+ 	/*
  	 * We no longer want LockWaitCancel to do anything.
  	 */
  	lockAwaited = NULL;
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 1122,1129 ****
  	/*
  	 * We don't have to do anything else, because the awaker did all the
  	 * necessary update of the lock table and MyProc.
  	 */
! 	return MyProc->waitStatus;
  }
  
  
--- 1154,1163 ----
  	/*
  	 * We don't have to do anything else, because the awaker did all the
  	 * necessary update of the lock table and MyProc.
+ 	 * RemoveFromWaitQueue() have set MyProc->waitStatus = STATUS_ERROR,
+ 	 * we need to distinguish this case.
  	 */
! 	return (lock_timeout_detected ? STATUS_WAITING : MyProc->waitStatus);
  }
  
  
*************** enable_sig_alarm(int delayms, bool is_st
*** 1462,1479 ****
  		 * than normal, but that does no harm.
  		 */
  		timeout_start_time = GetCurrentTimestamp();
! 		fin_time = TimestampTzPlusMilliseconds(timeout_start_time, delayms);
  		deadlock_timeout_active = true;
! 		if (fin_time >= statement_fin_time)
  			return true;
  	}
  	else
  	{
  		/* Begin deadlock timeout with no statement-level timeout */
  		deadlock_timeout_active = true;
! 		/* GetCurrentTimestamp can be expensive, so only do it if we must */
! 		if (log_lock_waits)
! 			timeout_start_time = GetCurrentTimestamp();
  	}
  
  	/* If we reach here, okay to set the timer interrupt */
--- 1496,1564 ----
  		 * than normal, but that does no harm.
  		 */
  		timeout_start_time = GetCurrentTimestamp();
! 		timeout_fin_time = TimestampTzPlusMilliseconds(timeout_start_time, delayms);
  		deadlock_timeout_active = true;
! 		if (timeout_fin_time >= statement_fin_time)
  			return true;
  	}
  	else
  	{
  		/* Begin deadlock timeout with no statement-level timeout */
  		deadlock_timeout_active = true;
! 		/*
! 		 * Computing the timeout_fin_time is needed because
! 		 * the lock timeout logic checks for it.
! 		 */
! 		timeout_start_time = GetCurrentTimestamp();
! 		timeout_fin_time = TimestampTzPlusMilliseconds(timeout_start_time, delayms);
! 	}
! 
! 	/* If we reach here, okay to set the timer interrupt */
! 	MemSet(&timeval, 0, sizeof(struct itimerval));
! 	timeval.it_value.tv_sec = delayms / 1000;
! 	timeval.it_value.tv_usec = (delayms % 1000) * 1000;
! 	if (setitimer(ITIMER_REAL, &timeval, NULL))
! 		return false;
! 	return true;
! }
! 
! /*
!  * Enable the SIGALRM interrupt to fire after the specified delay
!  * in case LockTimeout is set.
!  *
!  * This code properly handles nesting of lock_timeout timeout alarm
!  * within deadlock timeout and statement timeout alarms.
!  *
!  * Returns TRUE if okay, FALSE on failure.
!  */
! static bool
! enable_sig_alarm_for_lock_timeout(int delayms)
! {
! 	struct itimerval	timeval;
! 	TimestampTz		fin_time;
! 
! 	lock_timeout_detected = false;
! 	if (LockTimeout == 0)
! 		return true;
! 
! 	if (deadlock_timeout_active)
! 		/*
! 		 * ensure the same starting time for deadlock_timeout and lock_timeout
! 		 */
! 		fin_time = timeout_start_time;
! 	else
! 		fin_time = GetCurrentTimestamp();
! 	fin_time = TimestampTzPlusMilliseconds(fin_time, delayms);
! 
! 	if (statement_timeout_active)
! 	{
! 		if (fin_time > statement_fin_time)
! 			return true;
! 	}
! 	else if (deadlock_timeout_active)
! 	{
! 		if (fin_time > timeout_fin_time)
! 			return true;
  	}
  
  	/* If we reach here, okay to set the timer interrupt */
*************** enable_sig_alarm(int delayms, bool is_st
*** 1482,1487 ****
--- 1567,1575 ----
  	timeval.it_value.tv_usec = (delayms % 1000) * 1000;
  	if (setitimer(ITIMER_REAL, &timeval, NULL))
  		return false;
+ 
+ 	lock_timeout_fin_time = fin_time;
+ 	lock_timeout_active = true;
  	return true;
  }
  
*************** disable_sig_alarm(bool is_statement_time
*** 1502,1508 ****
  	 *
  	 * We will re-enable the interrupt if necessary in CheckStatementTimeout.
  	 */
! 	if (statement_timeout_active || deadlock_timeout_active)
  	{
  		struct itimerval timeval;
  
--- 1590,1596 ----
  	 *
  	 * We will re-enable the interrupt if necessary in CheckStatementTimeout.
  	 */
! 	if (statement_timeout_active || deadlock_timeout_active || lock_timeout_active)
  	{
  		struct itimerval timeval;
  
*************** disable_sig_alarm(bool is_statement_time
*** 1512,1517 ****
--- 1600,1607 ----
  			statement_timeout_active = false;
  			cancel_from_timeout = false;
  			deadlock_timeout_active = false;
+ 			lock_timeout_active = false;
+ 			lock_timeout_detected = false;
  			return false;
  		}
  	}
*************** disable_sig_alarm(bool is_statement_time
*** 1525,1534 ****
  		statement_timeout_active = false;
  		cancel_from_timeout = false;
  	}
! 	else if (statement_timeout_active)
  	{
! 		if (!CheckStatementTimeout())
! 			return false;
  	}
  	return true;
  }
--- 1615,1633 ----
  		statement_timeout_active = false;
  		cancel_from_timeout = false;
  	}
! 	else
  	{
! 		if (statement_timeout_active)
! 		{
! 			if (!CheckStatementTimeout())
! 				return false;
! 		}
! 
! 		if (lock_timeout_active)
! 		{
! 			if (!CheckLockTimeout())
! 				return false;
! 		}
  	}
  	return true;
  }
*************** CheckStatementTimeout(void)
*** 1590,1595 ****
--- 1689,1744 ----
  
  
  /*
+  * Check for lock timeout.  If the timeout time has come,
+  * indicate it; if not, reschedule the SIGALRM interrupt to occur
+  * at the right time.
+  *
+  * Returns true if okay, false if failed to set the interrupt.
+  */
+ static bool
+ CheckLockTimeout(void)
+ {
+ 	TimestampTz now;
+ 
+ 	if (!lock_timeout_active)
+ 		return true;			/* do nothing if not active */
+ 
+ 	now = GetCurrentTimestamp();
+ 
+ 	if (now >= lock_timeout_fin_time)
+ 	{
+ 		/* Time to die */
+ 		lock_timeout_active = false;
+ 		lock_timeout_detected = true;
+ 	}
+ 	else
+ 	{
+ 		/* Not time yet, so (re)schedule the interrupt */
+ 		long		secs;
+ 		int			usecs;
+ 		struct itimerval timeval;
+ 
+ 		TimestampDifference(now, statement_fin_time,
+ 							&secs, &usecs);
+ 
+ 		/*
+ 		 * It's possible that the difference is less than a microsecond;
+ 		 * ensure we don't cancel, rather than set, the interrupt.
+ 		 */
+ 		if (secs == 0 && usecs == 0)
+ 			usecs = 1;
+ 		MemSet(&timeval, 0, sizeof(struct itimerval));
+ 		timeval.it_value.tv_sec = secs;
+ 		timeval.it_value.tv_usec = usecs;
+ 		if (setitimer(ITIMER_REAL, &timeval, NULL))
+ 			return false;
+ 	}
+ 
+ 	return true;
+ }
+ 
+ 
+ /*
   * Signal handler for SIGALRM for normal user backends
   *
   * Process deadlock check and/or statement timeout check, as needed.
*************** handle_sig_alarm(SIGNAL_ARGS)
*** 1608,1613 ****
--- 1757,1765 ----
  		CheckDeadLock();
  	}
  
+ 	if (lock_timeout_active)
+ 		(void) CheckLockTimeout();
+ 
  	if (statement_timeout_active)
  		(void) CheckStatementTimeout();
  
diff -dcrpN pgsql.orig/src/backend/utils/misc/guc.c pgsql/src/backend/utils/misc/guc.c
*** pgsql.orig/src/backend/utils/misc/guc.c	2010-07-26 10:05:55.000000000 +0200
--- pgsql/src/backend/utils/misc/guc.c	2010-07-29 11:58:56.000000000 +0200
*************** static struct config_int ConfigureNamesI
*** 1648,1653 ****
--- 1648,1663 ----
  	},
  
  	{
+ 		{"lock_timeout", PGC_USERSET, CLIENT_CONN_STATEMENT,
+ 			gettext_noop("Sets the maximum allowed timeout for any lock taken by a statement."),
+ 			gettext_noop("A value of 0 turns off the timeout."),
+ 			GUC_UNIT_MS
+ 		},
+ 		&LockTimeout,
+ 		0, 0, INT_MAX, NULL, NULL
+ 	},
+ 
+ 	{
  		{"vacuum_freeze_min_age", PGC_USERSET, CLIENT_CONN_STATEMENT,
  			gettext_noop("Minimum age at which VACUUM should freeze a table row."),
  			NULL
diff -dcrpN pgsql.orig/src/backend/utils/misc/postgresql.conf.sample pgsql/src/backend/utils/misc/postgresql.conf.sample
*** pgsql.orig/src/backend/utils/misc/postgresql.conf.sample	2010-07-26 10:05:55.000000000 +0200
--- pgsql/src/backend/utils/misc/postgresql.conf.sample	2010-07-29 11:58:56.000000000 +0200
***************
*** 492,497 ****
--- 492,500 ----
  #------------------------------------------------------------------------------
  
  #deadlock_timeout = 1s
+ #lock_timeout = 0			# timeout value for heavy-weight locks
+ 					# taken by statements. 0 disables timeout
+ 					# unit in milliseconds, default is 0
  #max_locks_per_transaction = 64		# min 10
  					# (change requires restart)
  # Note:  Each lock table slot uses ~270 bytes of shared memory, and there are
diff -dcrpN pgsql.orig/src/include/storage/pg_sema.h pgsql/src/include/storage/pg_sema.h
*** pgsql.orig/src/include/storage/pg_sema.h	2010-01-02 17:58:08.000000000 +0100
--- pgsql/src/include/storage/pg_sema.h	2010-07-29 11:58:56.000000000 +0200
*************** extern void PGSemaphoreUnlock(PGSemaphor
*** 80,83 ****
--- 80,86 ----
  /* Lock a semaphore only if able to do so without blocking */
  extern bool PGSemaphoreTryLock(PGSemaphore sema);
  
+ /* Lock a semaphore (decrement count), blocking if count would be < 0 */
+ extern void PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK);
+ 
  #endif   /* PG_SEMA_H */
diff -dcrpN pgsql.orig/src/include/storage/proc.h pgsql/src/include/storage/proc.h
*** pgsql.orig/src/include/storage/proc.h	2010-07-11 11:15:00.000000000 +0200
--- pgsql/src/include/storage/proc.h	2010-07-29 11:58:56.000000000 +0200
*************** typedef struct PROC_HDR
*** 163,170 ****
--- 163,172 ----
  /* configurable options */
  extern int	DeadlockTimeout;
  extern int	StatementTimeout;
+ extern int	LockTimeout;
  extern bool log_lock_waits;
  
+ extern volatile bool lock_timeout_detected;
  extern volatile bool cancel_from_timeout;

pg91-experimental-lock-framework.patchtext/x-patch; name=pg91-experimental-lock-framework.patchDownload

diff -dcrpN pgsql.orig/doc/src/sgml/config.sgml pgsql.lck-experimental-framework/doc/src/sgml/config.sgml
*** pgsql.orig/doc/src/sgml/config.sgml	2010-07-26 10:05:37.000000000 +0200
--- pgsql.lck-experimental-framework/doc/src/sgml/config.sgml	2010-07-26 10:53:11.000000000 +0200
*************** COPY postgres_log FROM '/full/path/to/lo
*** 4479,4484 ****
--- 4479,4508 ----
        </listitem>
       </varlistentry>
  
+      <varlistentry id="guc-lock-timeout" xreflabel="lock_timeout">
+       <term><varname>lock_timeout</varname> (<type>integer</type>)</term>
+       <indexterm>
+        <primary><varname>lock_timeout</> configuration parameter</primary>
+       </indexterm>
+       <listitem>
+        <para>
+         Abort any statement that tries to acquire a heavy-weight lock (e.g. rows,
+         pages, tables, indices or other objects) and the lock has to wait more
+         than the specified number of milliseconds, starting from the time the
+         command arrives at the server from the client.
+         If <varname>log_min_error_statement</> is set to <literal>ERROR</> or lower,
+         the statement that timed out will also be logged. A value of zero
+         (the default) turns off the limitation.
+        </para>
+ 
+        <para>
+         Setting <varname>lock_timeout</> in
+         <filename>postgresql.conf</> is not recommended because it
+         affects all sessions.
+        </para>      
+       </listitem>   
+      </varlistentry>
+ 
       <varlistentry id="guc-vacuum-freeze-table-age" xreflabel="vacuum_freeze_table_age">
        <term><varname>vacuum_freeze_table_age</varname> (<type>integer</type>)</term>
        <indexterm>
diff -dcrpN pgsql.orig/doc/src/sgml/ref/lock.sgml pgsql.lck-experimental-framework/doc/src/sgml/ref/lock.sgml
*** pgsql.orig/doc/src/sgml/ref/lock.sgml	2010-04-03 09:23:01.000000000 +0200
--- pgsql.lck-experimental-framework/doc/src/sgml/ref/lock.sgml	2010-07-26 10:53:11.000000000 +0200
*************** LOCK [ TABLE ] [ ONLY ] <replaceable cla
*** 39,46 ****
     <literal>NOWAIT</literal> is specified, <command>LOCK
     TABLE</command> does not wait to acquire the desired lock: if it
     cannot be acquired immediately, the command is aborted and an
!    error is emitted.  Once obtained, the lock is held for the
!    remainder of the current transaction.  (There is no <command>UNLOCK
     TABLE</command> command; locks are always released at transaction
     end.)
    </para>
--- 39,49 ----
     <literal>NOWAIT</literal> is specified, <command>LOCK
     TABLE</command> does not wait to acquire the desired lock: if it
     cannot be acquired immediately, the command is aborted and an
!    error is emitted. If <varname>lock_timeout</varname> is set to a value
!    higher than 0, and the lock cannot be acquired under the specified
!    timeout value in milliseconds, the command is aborted and an error
!    is emitted. Once obtained, the lock is held for the remainder of  
!    the current transaction.  (There is no <command>UNLOCK
     TABLE</command> command; locks are always released at transaction
     end.)
    </para>
diff -dcrpN pgsql.orig/doc/src/sgml/ref/select.sgml pgsql.lck-experimental-framework/doc/src/sgml/ref/select.sgml
*** pgsql.orig/doc/src/sgml/ref/select.sgml	2010-06-20 13:59:13.000000000 +0200
--- pgsql.lck-experimental-framework/doc/src/sgml/ref/select.sgml	2010-07-26 10:53:11.000000000 +0200
*************** FOR SHARE [ OF <replaceable class="param
*** 1160,1165 ****
--- 1160,1173 ----
     </para>
  
     <para>
+     If <literal>NOWAIT</> option is not specified and <varname>lock_timeout</varname>
+     is set to a value higher than 0, and the lock needs to wait more than
+     the specified value in milliseconds, the command reports an error after
+     timing out, rather than waiting indefinitely. The note in the previous
+     paragraph applies to the <varname>lock_timeout</varname>, too.
+    </para>
+ 
+    <para>
      If specific tables are named in <literal>FOR UPDATE</literal>
      or <literal>FOR SHARE</literal>,
      then only rows coming from those tables are locked; any other
diff -dcrpN pgsql.orig/src/backend/port/posix_sema.c pgsql.lck-experimental-framework/src/backend/port/posix_sema.c
*** pgsql.orig/src/backend/port/posix_sema.c	2010-01-02 17:57:50.000000000 +0100
--- pgsql.lck-experimental-framework/src/backend/port/posix_sema.c	2010-07-26 15:20:14.000000000 +0200
***************
*** 24,29 ****
--- 24,30 ----
  #include "miscadmin.h"
  #include "storage/ipc.h"
  #include "storage/pg_sema.h"
+ #include "storage/proc.h"
  
  
  #ifdef USE_NAMED_POSIX_SEMAPHORES
*************** PGSemaphoreTryLock(PGSemaphore sema)
*** 313,315 ****
--- 314,341 ----
  
  	return true;
  }
+ 
+ /*
+  * PGSemaphoreTimedLock
+  *
+  * Lock a semaphore (decrement count), blocking if count would be < 0
+  * Return if lock_timeout expired
+  */
+ void
+ PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK)
+ {
+ 	int			errStatus;
+ 
+ 	do
+ 	{
+ 		ImmediateInterruptOK = interruptOK;
+ 		CHECK_FOR_INTERRUPTS();
+ 		errStatus = sem_wait(PG_SEM_REF(sema));
+ 		ImmediateInterruptOK = false;
+ 	} while (errStatus < 0 && errno == EINTR && !lock_timeout_detected);
+ 
+ 	if (lock_timeout_detected)
+ 		return;
+ 	if (errStatus < 0)
+ 		elog(FATAL, "sem_wait failed: %m");
+ }
diff -dcrpN pgsql.orig/src/backend/port/sysv_sema.c pgsql.lck-experimental-framework/src/backend/port/sysv_sema.c
*** pgsql.orig/src/backend/port/sysv_sema.c	2010-01-02 17:57:50.000000000 +0100
--- pgsql.lck-experimental-framework/src/backend/port/sysv_sema.c	2010-07-26 15:20:19.000000000 +0200
***************
*** 30,35 ****
--- 30,36 ----
  #include "miscadmin.h"
  #include "storage/ipc.h"
  #include "storage/pg_sema.h"
+ #include "storage/proc.h"
  
  
  #ifndef HAVE_UNION_SEMUN
*************** PGSemaphoreTryLock(PGSemaphore sema)
*** 497,499 ****
--- 498,530 ----
  
  	return true;
  }
+ 
+ /*
+  * PGSemaphoreTimedLock
+  *
+  * Lock a semaphore (decrement count), blocking if count would be < 0
+  * Return if lock_timeout expired
+  */
+ void
+ PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK)
+ {
+ 	int			errStatus;
+ 	struct sembuf sops;
+ 
+ 	sops.sem_op = -1;			/* decrement */
+ 	sops.sem_flg = 0;
+ 	sops.sem_num = sema->semNum;
+ 
+ 	do
+ 	{
+ 		ImmediateInterruptOK = interruptOK;
+ 		CHECK_FOR_INTERRUPTS();
+ 		errStatus = semop(sema->semId, &sops, 1);
+ 		ImmediateInterruptOK = false;
+ 	} while (errStatus < 0 && errno == EINTR && !lock_timeout_detected);
+ 
+ 	if (lock_timeout_detected)
+ 		return;
+ 	if (errStatus < 0)
+ 		elog(FATAL, "semop(id=%d) failed: %m", sema->semId);
+ }
diff -dcrpN pgsql.orig/src/backend/port/win32_sema.c pgsql.lck-experimental-framework/src/backend/port/win32_sema.c
*** pgsql.orig/src/backend/port/win32_sema.c	2010-01-02 17:57:50.000000000 +0100
--- pgsql.lck-experimental-framework/src/backend/port/win32_sema.c	2010-07-26 15:20:27.000000000 +0200
***************
*** 16,21 ****
--- 16,22 ----
  #include "miscadmin.h"
  #include "storage/ipc.h"
  #include "storage/pg_sema.h"
+ #include "storage/proc.h"
  
  static HANDLE *mySemSet;		/* IDs of sema sets acquired so far */
  static int	numSems;			/* number of sema sets acquired so far */
*************** PGSemaphoreTryLock(PGSemaphore sema)
*** 205,207 ****
--- 206,263 ----
  	/* keep compiler quiet */
  	return false;
  }
+ 
+ /*
+  * PGSemaphoreTimedLock
+  *
+  * Lock a semaphore (decrement count), blocking if count would be < 0.
+  * Serve the interrupt if interruptOK is true.
+  * Return if lock_timeout expired.
+  */
+ void
+ PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK)
+ {
+ 	DWORD		ret;
+ 	HANDLE		wh[2];
+ 
+ 	wh[0] = *sema;
+ 	wh[1] = pgwin32_signal_event;
+ 
+ 	/*
+ 	 * As in other implementations of PGSemaphoreLock, we need to check for
+ 	 * cancel/die interrupts each time through the loop.  But here, there is
+ 	 * no hidden magic about whether the syscall will internally service a
+ 	 * signal --- we do that ourselves.
+ 	 */
+ 	do
+ 	{
+ 		ImmediateInterruptOK = interruptOK;
+ 		CHECK_FOR_INTERRUPTS();
+ 
+ 		errno = 0;
+ 		ret = WaitForMultipleObjectsEx(2, wh, FALSE, INFINITE, TRUE);
+ 
+ 		if (ret == WAIT_OBJECT_0)
+ 		{
+ 			/* We got it! */
+ 			return;
+ 		}
+ 		else if (ret == WAIT_OBJECT_0 + 1)
+ 		{
+ 			/* Signal event is set - we have a signal to deliver */
+ 			pgwin32_dispatch_queued_signals();
+ 			errno = EINTR;
+ 		}
+ 		else
+ 			/* Otherwise we are in trouble */
+ 			errno = EIDRM;
+ 
+ 		ImmediateInterruptOK = false;
+ 	} while (errno == EINTR && !lock_timeout_detected);
+ 
+ 	if (lock_timeout_detected)
+ 		return;
+ 	if (errno != 0)
+ 		ereport(FATAL,
+ 				(errmsg("could not lock semaphore: error code %d", (int) GetLastError())));
+ }
diff -dcrpN pgsql.orig/src/backend/postmaster/autovacuum.c pgsql.lck-experimental-framework/src/backend/postmaster/autovacuum.c
*** pgsql.orig/src/backend/postmaster/autovacuum.c	2010-04-29 12:09:03.000000000 +0200
--- pgsql.lck-experimental-framework/src/backend/postmaster/autovacuum.c	2010-07-28 18:45:56.000000000 +0200
*************** AutoVacLauncherMain(int argc, char *argv
*** 481,487 ****
  
  		/* Forget any pending QueryCancel request */
  		QueryCancelPending = false;
! 		disable_sig_alarm(true);
  		QueryCancelPending = false;		/* again in case timeout occurred */
  
  		/* Report the error to the server log */
--- 481,487 ----
  
  		/* Forget any pending QueryCancel request */
  		QueryCancelPending = false;
! 		disable_all_sig_alarm(); /* was disable_sig_alarm(true) a.k.a. disable_statement_sig_alarm() */
  		QueryCancelPending = false;		/* again in case timeout occurred */
  
  		/* Report the error to the server log */
diff -dcrpN pgsql.orig/src/backend/postmaster/postmaster.c pgsql.lck-experimental-framework/src/backend/postmaster/postmaster.c
*** pgsql.orig/src/backend/postmaster/postmaster.c	2010-07-26 10:05:47.000000000 +0200
--- pgsql.lck-experimental-framework/src/backend/postmaster/postmaster.c	2010-07-28 14:59:37.000000000 +0200
*************** BackendInitialize(Port *port)
*** 3414,3420 ****
  	 * indefinitely.  PreAuthDelay and any DNS interactions above don't count
  	 * against the time limit.
  	 */
! 	if (!enable_sig_alarm(AuthenticationTimeout * 1000, false))
  		elog(FATAL, "could not set timer for startup packet timeout");
  
  	/*
--- 3414,3420 ----
  	 * indefinitely.  PreAuthDelay and any DNS interactions above don't count
  	 * against the time limit.
  	 */
! 	if (!enable_sig_alarm(AuthenticationTimeout * 1000))
  		elog(FATAL, "could not set timer for startup packet timeout");
  
  	/*
*************** BackendInitialize(Port *port)
*** 3452,3458 ****
  	/*
  	 * Disable the timeout, and prevent SIGTERM/SIGQUIT again.
  	 */
! 	if (!disable_sig_alarm(false))
  		elog(FATAL, "could not disable timer for startup packet timeout");
  	PG_SETMASK(&BlockSig);
  }
--- 3452,3458 ----
  	/*
  	 * Disable the timeout, and prevent SIGTERM/SIGQUIT again.
  	 */
! 	if (!disable_sig_alarm())
  		elog(FATAL, "could not disable timer for startup packet timeout");
  	PG_SETMASK(&BlockSig);
  }
diff -dcrpN pgsql.orig/src/backend/storage/lmgr/lmgr.c pgsql.lck-experimental-framework/src/backend/storage/lmgr/lmgr.c
*** pgsql.orig/src/backend/storage/lmgr/lmgr.c	2010-01-02 17:57:52.000000000 +0100
--- pgsql.lck-experimental-framework/src/backend/storage/lmgr/lmgr.c	2010-07-26 10:53:11.000000000 +0200
***************
*** 19,26 ****
--- 19,29 ----
  #include "access/transam.h"
  #include "access/xact.h"
  #include "catalog/catalog.h"
+ #include "catalog/pg_database.h"
  #include "miscadmin.h"
  #include "storage/lmgr.h"
+ #include "utils/lsyscache.h"
+ #include "storage/proc.h"
  #include "storage/procarray.h"
  #include "utils/inval.h"
  
*************** LockRelationOid(Oid relid, LOCKMODE lock
*** 78,83 ****
--- 81,101 ----
  
  	res = LockAcquire(&tag, lockmode, false, false);
  
+ 	if (res == LOCKACQUIRE_NOT_AVAIL)
+ 	{
+ 		char	   *relname = get_rel_name(relid);
+ 		if (relname)
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
+ 						errmsg("could not obtain lock on relation \"%s\"",
+ 						relname)));
+ 		else
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
+ 						errmsg("could not obtain lock on relation with OID %u",
+ 						relid)));
+ 	}
+ 
  	/*
  	 * Now that we have the lock, check for invalidation messages, so that we
  	 * will update or flush any stale relcache entry before we try to use it.
*************** LockRelation(Relation relation, LOCKMODE
*** 173,178 ****
--- 191,202 ----
  
  	res = LockAcquire(&tag, lockmode, false, false);
  
+ 	if (res == LOCKACQUIRE_NOT_AVAIL)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
+ 					errmsg("could not obtain lock on relation \"%s\"",
+ 				RelationGetRelationName(relation))));
+ 
  	/*
  	 * Now that we have the lock, check for invalidation messages; see notes
  	 * in LockRelationOid.
*************** LockRelationIdForSession(LockRelId *reli
*** 250,256 ****
  
  	SET_LOCKTAG_RELATION(tag, relid->dbId, relid->relId);
  
! 	(void) LockAcquire(&tag, lockmode, true, false);
  }
  
  /*
--- 274,293 ----
  
  	SET_LOCKTAG_RELATION(tag, relid->dbId, relid->relId);
  
! 	if (LockAcquire(&tag, lockmode, true, false) == LOCKACQUIRE_NOT_AVAIL)
! 	{
! 		char	   *relname = get_rel_name(relid->relId);
! 		if (relname)
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						errmsg("could not obtain lock on relation \"%s\"",
! 						relname)));
! 		else
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						errmsg("could not obtain lock on relation with OID %u",
! 						relid->relId)));
! 	}
  }
  
  /*
*************** LockRelationForExtension(Relation relati
*** 285,291 ****
  								relation->rd_lockInfo.lockRelId.dbId,
  								relation->rd_lockInfo.lockRelId.relId);
  
! 	(void) LockAcquire(&tag, lockmode, false, false);
  }
  
  /*
--- 322,332 ----
  								relation->rd_lockInfo.lockRelId.dbId,
  								relation->rd_lockInfo.lockRelId.relId);
  
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on index \"%s\"",
! 				RelationGetRelationName(relation))));
  }
  
  /*
*************** LockPage(Relation relation, BlockNumber 
*** 319,325 ****
  					 relation->rd_lockInfo.lockRelId.relId,
  					 blkno);
  
! 	(void) LockAcquire(&tag, lockmode, false, false);
  }
  
  /*
--- 360,370 ----
  					 relation->rd_lockInfo.lockRelId.relId,
  					 blkno);
  
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on page %u of relation \"%s\"",
! 				blkno, RelationGetRelationName(relation))));
  }
  
  /*
*************** LockTuple(Relation relation, ItemPointer
*** 375,381 ****
  					  ItemPointerGetBlockNumber(tid),
  					  ItemPointerGetOffsetNumber(tid));
  
! 	(void) LockAcquire(&tag, lockmode, false, false);
  }
  
  /*
--- 420,430 ----
  					  ItemPointerGetBlockNumber(tid),
  					  ItemPointerGetOffsetNumber(tid));
  
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on row in relation \"%s\"",
! 				RelationGetRelationName(relation))));
  }
  
  /*
*************** XactLockTableInsert(TransactionId xid)
*** 429,435 ****
  
  	SET_LOCKTAG_TRANSACTION(tag, xid);
  
! 	(void) LockAcquire(&tag, ExclusiveLock, false, false);
  }
  
  /*
--- 478,487 ----
  
  	SET_LOCKTAG_TRANSACTION(tag, xid);
  
! 	if (LockAcquire(&tag, ExclusiveLock, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on transaction with ID %u", xid)));
  }
  
  /*
*************** XactLockTableWait(TransactionId xid)
*** 473,479 ****
  
  		SET_LOCKTAG_TRANSACTION(tag, xid);
  
! 		(void) LockAcquire(&tag, ShareLock, false, false);
  
  		LockRelease(&tag, ShareLock, false);
  
--- 525,534 ----
  
  		SET_LOCKTAG_TRANSACTION(tag, xid);
  
! 		if (LockAcquire(&tag, ShareLock, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						errmsg("could not obtain lock on transaction with ID %u", xid)));
  
  		LockRelease(&tag, ShareLock, false);
  
*************** VirtualXactLockTableInsert(VirtualTransa
*** 531,537 ****
  
  	SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);
  
! 	(void) LockAcquire(&tag, ExclusiveLock, false, false);
  }
  
  /*
--- 586,596 ----
  
  	SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);
  
! 	if (LockAcquire(&tag, ExclusiveLock, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on virtual transaction with ID %u",
! 				vxid.localTransactionId)));
  }
  
  /*
*************** VirtualXactLockTableWait(VirtualTransact
*** 549,555 ****
  
  	SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);
  
! 	(void) LockAcquire(&tag, ShareLock, false, false);
  
  	LockRelease(&tag, ShareLock, false);
  }
--- 608,618 ----
  
  	SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);
  
! 	if (LockAcquire(&tag, ShareLock, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on virtual transaction with ID %u",
! 				vxid.localTransactionId)));
  
  	LockRelease(&tag, ShareLock, false);
  }
*************** LockDatabaseObject(Oid classid, Oid obji
*** 598,604 ****
  					   objid,
  					   objsubid);
  
! 	(void) LockAcquire(&tag, lockmode, false, false);
  }
  
  /*
--- 661,671 ----
  					   objid,
  					   objsubid);
  
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on class:object: %u:%u",
! 				classid, objid)));
  }
  
  /*
*************** LockSharedObject(Oid classid, Oid objid,
*** 636,642 ****
  					   objid,
  					   objsubid);
  
! 	(void) LockAcquire(&tag, lockmode, false, false);
  
  	/* Make sure syscaches are up-to-date with any changes we waited for */
  	AcceptInvalidationMessages();
--- 703,713 ----
  					   objid,
  					   objsubid);
  
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on class:object: %u:%u",
! 				classid, objid)));
  
  	/* Make sure syscaches are up-to-date with any changes we waited for */
  	AcceptInvalidationMessages();
*************** LockSharedObjectForSession(Oid classid, 
*** 678,684 ****
  					   objid,
  					   objsubid);
  
! 	(void) LockAcquire(&tag, lockmode, true, false);
  }
  
  /*
--- 749,770 ----
  					   objid,
  					   objsubid);
  
! 	if (LockAcquire(&tag, lockmode, true, false) == LOCKACQUIRE_NOT_AVAIL)
! 		switch(classid)
! 		{
! 		case DatabaseRelationId:
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						errmsg("could not obtain lock on database with ID %u",
! 					objid)));
! 			break;
! 		default:
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						errmsg("could not obtain lock on class:object: %u:%u",
! 					classid, objid)));
! 			break;
! 		}
  }
  
  /*
diff -dcrpN pgsql.orig/src/backend/storage/lmgr/lock.c pgsql.lck-experimental-framework/src/backend/storage/lmgr/lock.c
*** pgsql.orig/src/backend/storage/lmgr/lock.c	2010-04-29 12:09:03.000000000 +0200
--- pgsql.lck-experimental-framework/src/backend/storage/lmgr/lock.c	2010-07-26 10:53:11.000000000 +0200
*************** PROCLOCK_PRINT(const char *where, const 
*** 255,261 ****
  static uint32 proclock_hash(const void *key, Size keysize);
  static void RemoveLocalLock(LOCALLOCK *locallock);
  static void GrantLockLocal(LOCALLOCK *locallock, ResourceOwner owner);
! static void WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner);
  static bool UnGrantLock(LOCK *lock, LOCKMODE lockmode,
  			PROCLOCK *proclock, LockMethod lockMethodTable);
  static void CleanUpLock(LOCK *lock, PROCLOCK *proclock,
--- 255,261 ----
  static uint32 proclock_hash(const void *key, Size keysize);
  static void RemoveLocalLock(LOCALLOCK *locallock);
  static void GrantLockLocal(LOCALLOCK *locallock, ResourceOwner owner);
! static int WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner);
  static bool UnGrantLock(LOCK *lock, LOCKMODE lockmode,
  			PROCLOCK *proclock, LockMethod lockMethodTable);
  static void CleanUpLock(LOCK *lock, PROCLOCK *proclock,
*************** ProcLockHashCode(const PROCLOCKTAG *proc
*** 447,453 ****
   *	dontWait: if true, don't wait to acquire lock
   *
   * Returns one of:
!  *		LOCKACQUIRE_NOT_AVAIL		lock not available, and dontWait=true
   *		LOCKACQUIRE_OK				lock successfully acquired
   *		LOCKACQUIRE_ALREADY_HELD	incremented count for lock already held
   *
--- 447,453 ----
   *	dontWait: if true, don't wait to acquire lock
   *
   * Returns one of:
!  *		LOCKACQUIRE_NOT_AVAIL		lock not available, either dontWait=true or timeout
   *		LOCKACQUIRE_OK				lock successfully acquired
   *		LOCKACQUIRE_ALREADY_HELD	incremented count for lock already held
   *
*************** LockAcquireExtended(const LOCKTAG *lockt
*** 833,839 ****
  										 locktag->locktag_type,
  										 lockmode);
  
! 		WaitOnLock(locallock, owner);
  
  		TRACE_POSTGRESQL_LOCK_WAIT_DONE(locktag->locktag_field1,
  										locktag->locktag_field2,
--- 833,839 ----
  										 locktag->locktag_type,
  										 lockmode);
  
! 		status = WaitOnLock(locallock, owner);
  
  		TRACE_POSTGRESQL_LOCK_WAIT_DONE(locktag->locktag_field1,
  										locktag->locktag_field2,
*************** LockAcquireExtended(const LOCKTAG *lockt
*** 848,867 ****
  		 * done when the lock was granted to us --- see notes in WaitOnLock.
  		 */
  
! 		/*
! 		 * Check the proclock entry status, in case something in the ipc
! 		 * communication doesn't work correctly.
! 		 */
! 		if (!(proclock->holdMask & LOCKBIT_ON(lockmode)))
  		{
! 			PROCLOCK_PRINT("LockAcquire: INCONSISTENT", proclock);
! 			LOCK_PRINT("LockAcquire: INCONSISTENT", lock, lockmode);
! 			/* Should we retry ? */
! 			LWLockRelease(partitionLock);
! 			elog(ERROR, "LockAcquire failed");
  		}
- 		PROCLOCK_PRINT("LockAcquire: granted", proclock);
- 		LOCK_PRINT("LockAcquire: granted", lock, lockmode);
  	}
  
  	LWLockRelease(partitionLock);
--- 848,879 ----
  		 * done when the lock was granted to us --- see notes in WaitOnLock.
  		 */
  
! 		switch (status)
  		{
! 		case STATUS_OK:
! 			/*
! 			 * Check the proclock entry status, in case something in the ipc
! 			 * communication doesn't work correctly.
! 			 */
! 			if (!(proclock->holdMask & LOCKBIT_ON(lockmode)))
! 			{
! 				PROCLOCK_PRINT("LockAcquire: INCONSISTENT", proclock);
! 				LOCK_PRINT("LockAcquire: INCONSISTENT", lock, lockmode);
! 				/* Should we retry ? */
! 				LWLockRelease(partitionLock);
! 				elog(ERROR, "LockAcquire failed");
! 			}
! 			PROCLOCK_PRINT("LockAcquire: granted", proclock);
! 			LOCK_PRINT("LockAcquire: granted", lock, lockmode);
! 			break;
! 		case STATUS_WAITING:
! 			PROCLOCK_PRINT("LockAcquire: timed out", proclock);
! 			LOCK_PRINT("LockAcquire: timed out", lock, lockmode);
! 			break;
! 		default:
! 			elog(ERROR, "LockAcquire invalid status");
! 			break;
  		}
  	}
  
  	LWLockRelease(partitionLock);
*************** LockAcquireExtended(const LOCKTAG *lockt
*** 887,893 ****
  							   locktag->locktag_field2);
  	}
  
! 	return LOCKACQUIRE_OK;
  }
  
  /*
--- 899,905 ----
  							   locktag->locktag_field2);
  	}
  
! 	return (status == STATUS_OK ? LOCKACQUIRE_OK : LOCKACQUIRE_NOT_AVAIL);
  }
  
  /*
*************** GrantAwaitedLock(void)
*** 1165,1178 ****
   * Caller must have set MyProc->heldLocks to reflect locks already held
   * on the lockable object by this process.
   *
   * The appropriate partition lock must be held at entry.
   */
! static void
  WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner)
  {
  	LOCKMETHODID lockmethodid = LOCALLOCK_LOCKMETHOD(*locallock);
  	LockMethod	lockMethodTable = LockMethods[lockmethodid];
  	char	   *volatile new_status = NULL;
  
  	LOCK_PRINT("WaitOnLock: sleeping on lock",
  			   locallock->lock, locallock->tag.mode);
--- 1177,1196 ----
   * Caller must have set MyProc->heldLocks to reflect locks already held
   * on the lockable object by this process.
   *
+  * Result: returns value of ProcSleep()
+  *	STATUS_OK if we acquired the lock
+  *	STATUS_ERROR if not (deadlock)
+  *	STATUS_WAITING if not (timeout)
+  *
   * The appropriate partition lock must be held at entry.
   */
! static int
  WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner)
  {
  	LOCKMETHODID lockmethodid = LOCALLOCK_LOCKMETHOD(*locallock);
  	LockMethod	lockMethodTable = LockMethods[lockmethodid];
  	char	   *volatile new_status = NULL;
+ 	int		wait_status;
  
  	LOCK_PRINT("WaitOnLock: sleeping on lock",
  			   locallock->lock, locallock->tag.mode);
*************** WaitOnLock(LOCALLOCK *locallock, Resourc
*** 1214,1221 ****
  	 */
  	PG_TRY();
  	{
! 		if (ProcSleep(locallock, lockMethodTable) != STATUS_OK)
  		{
  			/*
  			 * We failed as a result of a deadlock, see CheckDeadLock(). Quit
  			 * now.
--- 1232,1244 ----
  	 */
  	PG_TRY();
  	{
! 		wait_status = ProcSleep(locallock, lockMethodTable);
! 		switch (wait_status)
  		{
+ 		case STATUS_OK:
+ 		case STATUS_WAITING:
+ 			break;
+ 		default:
  			/*
  			 * We failed as a result of a deadlock, see CheckDeadLock(). Quit
  			 * now.
*************** WaitOnLock(LOCALLOCK *locallock, Resourc
*** 1260,1267 ****
  		pfree(new_status);
  	}
  
! 	LOCK_PRINT("WaitOnLock: wakeup on lock",
  			   locallock->lock, locallock->tag.mode);
  }
  
  /*
--- 1283,1296 ----
  		pfree(new_status);
  	}
  
! 	if (wait_status == STATUS_OK)
! 		LOCK_PRINT("WaitOnLock: wakeup on lock",
! 			   locallock->lock, locallock->tag.mode);
! 	else if (wait_status == STATUS_WAITING)
! 		LOCK_PRINT("WaitOnLock: timeout on lock",
  			   locallock->lock, locallock->tag.mode);
+ 
+ 	return wait_status;
  }
  
  /*
diff -dcrpN pgsql.orig/src/backend/storage/lmgr/proc.c pgsql.lck-experimental-framework/src/backend/storage/lmgr/proc.c
*** pgsql.orig/src/backend/storage/lmgr/proc.c	2010-07-11 11:14:54.000000000 +0200
--- pgsql.lck-experimental-framework/src/backend/storage/lmgr/proc.c	2010-07-28 18:43:28.000000000 +0200
***************
*** 52,57 ****
--- 52,59 ----
  /* GUC variables */
  int			DeadlockTimeout = 1000;
  int			StatementTimeout = 0;
+ int			LockTimeout = 0;
+ 
  bool		log_lock_waits = false;
  
  /* Pointer to this process's PGPROC struct, if any */
*************** static LOCALLOCK *lockAwaited = NULL;
*** 75,99 ****
  
  /* Mark these volatile because they can be changed by signal handler */
  static volatile bool standby_timeout_active = false;
- static volatile bool statement_timeout_active = false;
- static volatile bool deadlock_timeout_active = false;
  static volatile DeadLockState deadlock_state = DS_NOT_YET_CHECKED;
  volatile bool cancel_from_timeout = false;
  
  /* timeout_start_time is set when log_lock_waits is true */
  static TimestampTz timeout_start_time;
  
  /* statement_fin_time is valid only if statement_timeout_active is true */
- static TimestampTz statement_fin_time;
- static TimestampTz statement_fin_time2; /* valid only in recovery */
  
  
  static void RemoveProcFromArray(int code, Datum arg);
  static void ProcKill(int code, Datum arg);
  static void AuxiliaryProcKill(int code, Datum arg);
  static bool CheckStatementTimeout(void);
  static bool CheckStandbyTimeout(void);
  
  
  /*
   * Report shared-memory space needed by InitProcGlobal.
--- 77,134 ----
  
  /* Mark these volatile because they can be changed by signal handler */
  static volatile bool standby_timeout_active = false;
  static volatile DeadLockState deadlock_state = DS_NOT_YET_CHECKED;
  volatile bool cancel_from_timeout = false;
+ volatile bool lock_timeout_detected = false;
  
  /* timeout_start_time is set when log_lock_waits is true */
  static TimestampTz timeout_start_time;
+ static TimestampTz timeout_fin_time;
  
  /* statement_fin_time is valid only if statement_timeout_active is true */
  
  
  static void RemoveProcFromArray(int code, Datum arg);
  static void ProcKill(int code, Datum arg);
  static void AuxiliaryProcKill(int code, Datum arg);
+ static bool CheckDeadLock(void);
  static bool CheckStatementTimeout(void);
+ static bool CheckLockTimeout(void);
  static bool CheckStandbyTimeout(void);
+ static bool enable_lock_timeout_sig_alarm(int delayms);
+ static bool disable_lock_timeout_sig_alarm(void);
+ 
+ /*
+  * Infrastructure for multiple timeouts
+  */
+ typedef bool (*timeout_check)(void);
+ 
+ typedef struct {
+ 	int			index;
+ 	bool			reschedule;	/* can be rescheduled if the previous one triggered */
+ 	timeout_check		timeout_check_fn;
+ 	TimestampTz		fin_time;
+ } timeout_params;
  
+ typedef enum {
+ 	DEADLOCK_TIMEOUT,
+ 	STATEMENT_TIMEOUT,
+ 	LOCK_TIMEOUT,
+ 	TIMEOUT_MAX
+ } TimeoutName;
+ 
+ static timeout_params	base_timeouts[TIMEOUT_MAX] = {
+ 	{ DEADLOCK_TIMEOUT,	false,	CheckDeadLock,		0 },
+ 	{ STATEMENT_TIMEOUT,	true,	CheckStatementTimeout,	0 },
+ 	{ LOCK_TIMEOUT,		true,	CheckLockTimeout,	0 }
+ };
+ 
+ /*
+  * List of active timeouts ordered by their fin_time.
+  * A particular timeout is active if it's in the list.
+  */
+ static int		n_timeouts = 0;
+ static timeout_params	*timeouts[TIMEOUT_MAX];
  
  /*
   * Report shared-memory space needed by InitProcGlobal.
*************** LockWaitCancel(void)
*** 584,590 ****
  		return;
  
  	/* Turn off the deadlock timer, if it's still running (see ProcSleep) */
! 	disable_sig_alarm(false);
  
  	/* Unlink myself from the wait queue, if on it (might not be anymore!) */
  	partitionLock = LockHashPartitionLock(lockAwaited->hashcode);
--- 619,625 ----
  		return;
  
  	/* Turn off the deadlock timer, if it's still running (see ProcSleep) */
! 	disable_sig_alarm();
  
  	/* Unlink myself from the wait queue, if on it (might not be anymore!) */
  	partitionLock = LockHashPartitionLock(lockAwaited->hashcode);
*************** ProcQueueInit(PROC_QUEUE *queue)
*** 797,803 ****
   * The lock table's partition lock must be held at entry, and will be held
   * at exit.
   *
!  * Result: STATUS_OK if we acquired the lock, STATUS_ERROR if not (deadlock).
   *
   * ASSUME: that no one will fiddle with the queue until after
   *		we release the partition lock.
--- 832,841 ----
   * The lock table's partition lock must be held at entry, and will be held
   * at exit.
   *
!  * Result:
!  *     STATUS_OK if we acquired the lock
!  *     STATUS_ERROR if not (deadlock)
!  *     STATUS_WAITING if not (timeout)
   *
   * ASSUME: that no one will fiddle with the queue until after
   *		we release the partition lock.
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 947,957 ****
  	 * By delaying the check until we've waited for a bit, we can avoid
  	 * running the rather expensive deadlock-check code in most cases.
  	 */
! 	if (!enable_sig_alarm(DeadlockTimeout, false))
  		elog(FATAL, "could not set timer for process wakeup");
  
  	/*
! 	 * If someone wakes us between LWLockRelease and PGSemaphoreLock,
  	 * PGSemaphoreLock will not block.	The wakeup is "saved" by the semaphore
  	 * implementation.	While this is normally good, there are cases where a
  	 * saved wakeup might be leftover from a previous operation (for example,
--- 985,1003 ----
  	 * By delaying the check until we've waited for a bit, we can avoid
  	 * running the rather expensive deadlock-check code in most cases.
  	 */
! 	if (!enable_sig_alarm(DeadlockTimeout))
  		elog(FATAL, "could not set timer for process wakeup");
  
  	/*
! 	 * Reset timer so we are awaken in case of lock timeout.
! 	 * This doesn't modify the timer for deadlock check in case
! 	 * the deadlock check happens earlier.
! 	 */
! 	if (!enable_lock_timeout_sig_alarm(LockTimeout))
! 		elog(FATAL, "could not set timer for process wakeup");
! 
! 	/*
! 	 * If someone wakes us between LWLockRelease and PGSemaphoreTimedLock,
  	 * PGSemaphoreLock will not block.	The wakeup is "saved" by the semaphore
  	 * implementation.	While this is normally good, there are cases where a
  	 * saved wakeup might be leftover from a previous operation (for example,
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 969,975 ****
  	 */
  	do
  	{
! 		PGSemaphoreLock(&MyProc->sem, true);
  
  		/*
  		 * waitStatus could change from STATUS_WAITING to something else
--- 1015,1021 ----
  	 */
  	do
  	{
! 		PGSemaphoreTimedLock(&MyProc->sem, true);
  
  		/*
  		 * waitStatus could change from STATUS_WAITING to something else
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 1098,1104 ****
  	/*
  	 * Disable the timer, if it's still running
  	 */
! 	if (!disable_sig_alarm(false))
  		elog(FATAL, "could not disable timer for process wakeup");
  
  	/*
--- 1144,1153 ----
  	/*
  	 * Disable the timer, if it's still running
  	 */
! 	if (!disable_sig_alarm())
! 		elog(FATAL, "could not disable timer for process wakeup");
! 
! 	if (!disable_lock_timeout_sig_alarm())
  		elog(FATAL, "could not disable timer for process wakeup");
  
  	/*
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 1109,1114 ****
--- 1158,1171 ----
  	LWLockAcquire(partitionLock, LW_EXCLUSIVE);
  
  	/*
+ 	 * If we're in timeout, so we're not waiting anymore and
+ 	 * we're not the one that the lock will be granted to.
+ 	 * So remove ourselves from the wait queue.
+ 	 */
+ 	if (lock_timeout_detected)
+ 		RemoveFromWaitQueue(MyProc, hashcode);
+ 
+ 	/*
  	 * We no longer want LockWaitCancel to do anything.
  	 */
  	lockAwaited = NULL;
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 1122,1129 ****
  	/*
  	 * We don't have to do anything else, because the awaker did all the
  	 * necessary update of the lock table and MyProc.
  	 */
! 	return MyProc->waitStatus;
  }
  
  
--- 1179,1188 ----
  	/*
  	 * We don't have to do anything else, because the awaker did all the
  	 * necessary update of the lock table and MyProc.
+ 	 * RemoveFromWaitQueue() have set MyProc->waitStatus = STATUS_ERROR,
+ 	 * we need to distinguish this case.
  	 */
! 	return (lock_timeout_detected ? STATUS_WAITING : MyProc->waitStatus);
  }
  
  
*************** ProcLockWakeup(LockMethod lockMethodTabl
*** 1242,1248 ****
   * NB: this is run inside a signal handler, so be very wary about what is done
   * here or in called routines.
   */
! static void
  CheckDeadLock(void)
  {
  	int			i;
--- 1301,1307 ----
   * NB: this is run inside a signal handler, so be very wary about what is done
   * here or in called routines.
   */
! static bool
  CheckDeadLock(void)
  {
  	int			i;
*************** CheckDeadLock(void)
*** 1341,1346 ****
--- 1400,1407 ----
  check_done:
  	for (i = NUM_LOCK_PARTITIONS; --i >= 0;)
  		LWLockRelease(FirstLockMgrLock + i);
+ 
+ 	return true;
  }
  
  
*************** ProcSendSignal(int pid)
*** 1404,1538 ****
   * Maybe these should be in pqsignal.c?
   *****************************************************************************/
  
  /*
!  * Enable the SIGALRM interrupt to fire after the specified delay
!  *
!  * Delay is given in milliseconds.	Caller should be sure a SIGALRM
!  * signal handler is installed before this is called.
!  *
!  * This code properly handles nesting of deadlock timeout alarms within
!  * statement timeout alarms.
   *
!  * Returns TRUE if okay, FALSE on failure.
   */
! bool
! enable_sig_alarm(int delayms, bool is_statement_timeout)
  {
! 	TimestampTz fin_time;
! 	struct itimerval timeval;
  
! 	if (is_statement_timeout)
! 	{
! 		/*
! 		 * Begin statement-level timeout
! 		 *
! 		 * Note that we compute statement_fin_time with reference to the
! 		 * statement_timestamp, but apply the specified delay without any
! 		 * correction; that is, we ignore whatever time has elapsed since
! 		 * statement_timestamp was set.  In the normal case only a small
! 		 * interval will have elapsed and so this doesn't matter, but there
! 		 * are corner cases (involving multi-statement query strings with
! 		 * embedded COMMIT or ROLLBACK) where we might re-initialize the
! 		 * statement timeout long after initial receipt of the message. In
! 		 * such cases the enforcement of the statement timeout will be a bit
! 		 * inconsistent.  This annoyance is judged not worth the cost of
! 		 * performing an additional gettimeofday() here.
! 		 */
! 		Assert(!deadlock_timeout_active);
! 		fin_time = GetCurrentStatementStartTimestamp();
! 		fin_time = TimestampTzPlusMilliseconds(fin_time, delayms);
! 		statement_fin_time = fin_time;
! 		cancel_from_timeout = false;
! 		statement_timeout_active = true;
! 	}
! 	else if (statement_timeout_active)
  	{
! 		/*
! 		 * Begin deadlock timeout with statement-level timeout active
! 		 *
! 		 * Here, we want to interrupt at the closer of the two timeout times.
! 		 * If fin_time >= statement_fin_time then we need not touch the
! 		 * existing timer setting; else set up to interrupt at the deadlock
! 		 * timeout time.
! 		 *
! 		 * NOTE: in this case it is possible that this routine will be
! 		 * interrupted by the previously-set timer alarm.  This is okay
! 		 * because the signal handler will do only what it should do according
! 		 * to the state variables.	The deadlock checker may get run earlier
! 		 * than normal, but that does no harm.
! 		 */
! 		timeout_start_time = GetCurrentTimestamp();
! 		fin_time = TimestampTzPlusMilliseconds(timeout_start_time, delayms);
! 		deadlock_timeout_active = true;
! 		if (fin_time >= statement_fin_time)
! 			return true;
  	}
  	else
! 	{
! 		/* Begin deadlock timeout with no statement-level timeout */
! 		deadlock_timeout_active = true;
! 		/* GetCurrentTimestamp can be expensive, so only do it if we must */
! 		if (log_lock_waits)
! 			timeout_start_time = GetCurrentTimestamp();
! 	}
  
! 	/* If we reach here, okay to set the timer interrupt */
! 	MemSet(&timeval, 0, sizeof(struct itimerval));
! 	timeval.it_value.tv_sec = delayms / 1000;
! 	timeval.it_value.tv_usec = (delayms % 1000) * 1000;
! 	if (setitimer(ITIMER_REAL, &timeval, NULL))
! 		return false;
! 	return true;
  }
  
  /*
!  * Cancel the SIGALRM timer, either for a deadlock timeout or a statement
!  * timeout.  If a deadlock timeout is canceled, any active statement timeout
!  * remains in force.
   *
   * Returns TRUE if okay, FALSE on failure.
   */
  bool
! disable_sig_alarm(bool is_statement_timeout)
  {
  	/*
! 	 * Always disable the interrupt if it is active; this avoids being
! 	 * interrupted by the signal handler and thereby possibly getting
! 	 * confused.
  	 *
! 	 * We will re-enable the interrupt if necessary in CheckStatementTimeout.
  	 */
- 	if (statement_timeout_active || deadlock_timeout_active)
- 	{
- 		struct itimerval timeval;
  
! 		MemSet(&timeval, 0, sizeof(struct itimerval));
! 		if (setitimer(ITIMER_REAL, &timeval, NULL))
! 		{
! 			statement_timeout_active = false;
! 			cancel_from_timeout = false;
! 			deadlock_timeout_active = false;
! 			return false;
! 		}
! 	}
  
! 	/* Always cancel deadlock timeout, in case this is error cleanup */
! 	deadlock_timeout_active = false;
  
- 	/* Cancel or reschedule statement timeout */
- 	if (is_statement_timeout)
- 	{
- 		statement_timeout_active = false;
- 		cancel_from_timeout = false;
- 	}
- 	else if (statement_timeout_active)
- 	{
- 		if (!CheckStatementTimeout())
- 			return false;
- 	}
  	return true;
  }
  
  
  /*
   * Check for statement timeout.  If the timeout time has come,
--- 1465,1764 ----
   * Maybe these should be in pqsignal.c?
   *****************************************************************************/
  
+ static bool
+ schedule_sigalarm(void)
+ {
+ 	TimestampTz	now;
+ 	long		secs;
+ 	int			usecs;
+ 	struct itimerval timeval;
+ 
+ 	if (n_timeouts == 0)
+ 		return true;
+ 
+ 	now = GetCurrentTimestamp();
+ 
+ 	TimestampDifference(now, timeouts[0]->fin_time, &secs, &usecs);
+ 	if (secs == 0 && usecs == 0)
+ 		usecs = 1;
+ 	MemSet(&timeval, 0, sizeof(struct itimerval));
+ 	timeval.it_value.tv_sec = secs;
+ 	timeval.it_value.tv_usec = usecs;
+ 	if (setitimer(ITIMER_REAL, &timeval, NULL))
+ 		return false;
+ 
+ 	return true;
+ }
+ 
+ static bool
+ disable_sigalarm(void)
+ {
+ 	struct itimerval timeval;
+ 
+ 	MemSet(&timeval, 0, sizeof(struct itimerval));
+ 	if (setitimer(ITIMER_REAL, &timeval, NULL))
+ 		return false;
+ 	return true;
+ }
+ 
+ 
+ 
+ static int
+ timeout_active(TimeoutName timeout)
+ {
+ 	int	i;
+ 
+ 	for (i = 0; i < n_timeouts; i++)
+ 		if (timeouts[i]->index == timeout)
+ 			return i;
+ 	return -1;
+ }
+ 
  /*
!  * Add a new timeout
   *
!  * Flag for reschedule is necessary is returned
   */
! static void
! add_timeout(TimeoutName timeout, TimestampTz fin_time, bool *reschedule)
  {
! 	int		i, j;
! 	TimestampTz	tmp;
! 	TimeoutName	old = TIMEOUT_MAX;
  
! 	if (reschedule && n_timeouts > 0)
! 		old = timeouts[0]->index;
! 
! 	/* Add the new timeout */
! 	i = n_timeouts;
! 	timeouts[i] = &base_timeouts[timeout];
! 	timeouts[i]->fin_time = fin_time;
! 	n_timeouts++;
! 
! 	/*
! 	 * Sort them according to their fin_time.
! 	 * Bubble sort is not a problem for an array with at most 3 elements.
! 	 */
! 	for (i = 0; i < n_timeouts - 1; i++)
! 		for (j = i + 1; j < n_timeouts; j++)
! 		{
! 			if (timeouts[i]->fin_time > timeouts[j]->fin_time)
! 			{
! 				tmp = timeouts[i]->fin_time;
! 				timeouts[i]->fin_time = timeouts[j]->fin_time;
! 				timeouts[j]->fin_time = tmp;
! 			}
! 		}
! 
! 	/* if the first timeout is different then reschedule is necessary */
! 	if (reschedule)
! 		*reschedule = (old != timeouts[0]->index);
! }
! 
! /*
!  * Delete the specified timeout and reschedule SIGALRM if necessary.
!  */
! static bool
! del_timeout(TimeoutName timeout)
! {
! 	int	i;
! 	bool	reschedule = false;
! 
! 	for (i = 0; i < n_timeouts; i++)
! 		if (timeouts[i]->index == timeout)
! 			break;
! 
! 	if (i == n_timeouts)
! 		return true;
! 
! 	if (i == 0)
! 		reschedule = true;
! 
! 	for (; i < n_timeouts - 1; i++)
! 		timeouts[i] = timeouts[i + 1];
! 
! 	n_timeouts--;
! 
! 	if (n_timeouts > 0)
  	{
! 		if (reschedule)
! 			return schedule_sigalarm();
! 		return true;
  	}
  	else
! 		return disable_sigalarm();
! }
  
! /*
!  * Delete the timeout that will expire first 
!  */
! static void
! pop_timeout(void)
! {
! 	int	i;
! 
! 	for (i = 0; i < n_timeouts - 1; i++)
! 		timeouts[i] = timeouts[i + 1];
! 
! 	n_timeouts--;
  }
  
  /*
!  * Enable SIGALRM interrupt to fire after a specified delay
!  * for statement timeout
!  *
!  * Delay is given in milliseconds.	Caller should be sure a SIGALRM
!  * signal handler is installed before this is called.
   *
   * Returns TRUE if okay, FALSE on failure.
   */
  bool
! enable_statement_sig_alarm(int delayms)
  {
+ 	TimestampTz	fin_time;
+ 	bool		reschedule;
+ 
  	/*
! 	 * Begin statement-level timeout
  	 *
! 	 * Note that we compute statement_fin_time with reference to the
! 	 * statement_timestamp, but apply the specified delay without any
! 	 * correction; that is, we ignore whatever time has elapsed since
! 	 * statement_timestamp was set.  In the normal case only a small
! 	 * interval will have elapsed and so this doesn't matter, but there
! 	 * are corner cases (involving multi-statement query strings with
! 	 * embedded COMMIT or ROLLBACK) where we might re-initialize the
! 	 * statement timeout long after initial receipt of the message. In
! 	 * such cases the enforcement of the statement timeout will be a bit
! 	 * inconsistent.  This annoyance is judged not worth the cost of
! 	 * performing an additional gettimeofday() here.
  	 */
  
! 	Assert(n_timeouts == 0); /* There should be no other timeouts active yet. */
  
! 	if (delayms == 0)
! 		return true;
! 
! 	fin_time = GetCurrentStatementStartTimestamp();
! 	fin_time = TimestampTzPlusMilliseconds(fin_time, delayms);
! 
! 	add_timeout(STATEMENT_TIMEOUT, fin_time, &reschedule);
! 	Assert(reschedule);
! 
! 	cancel_from_timeout = false;
! 
! 	return schedule_sigalarm();
! }
! 
! /*
!  * Enable the SIGALRM interrupt to fire after the specified delay
!  * for deadlock_timeout. The deadlock check is also used for
!  * authentication timeouts.
!  *
!  * Delay is given in milliseconds.	Caller should be sure a SIGALRM
!  * signal handler is installed before this is called.
!  *
!  * Returns TRUE if okay, FALSE on failure.
!  */
! bool
! enable_sig_alarm(int delayms)
! {
! 	bool	reschedule;
! 
! 	if (delayms == 0)
! 		return true;
! 
! 	timeout_start_time = GetCurrentTimestamp();
! 	timeout_fin_time = TimestampTzPlusMilliseconds(timeout_start_time, delayms);
! 
! 	add_timeout(DEADLOCK_TIMEOUT, timeout_fin_time, &reschedule);
! 
! 	if (reschedule)
! 		return schedule_sigalarm();
  
  	return true;
  }
  
+ /*
+  * Enable the SIGALRM interrupt to fire after the specified delay
+  * in case LockTimeout is set.
+  *
+  * This code properly handles nesting of lock_timeout timeout alarm
+  * within deadlock timeout and statement timeout alarms.
+  *
+  * Returns TRUE if okay, FALSE on failure.
+  */
+ static bool
+ enable_lock_timeout_sig_alarm(int delayms)
+ {
+ 	TimestampTz		fin_time;
+ 	bool			reschedule;
+ 
+ 	if (delayms == 0)
+ 		return true;
+ 
+ 	fin_time = GetCurrentTimestamp();
+ 	fin_time = TimestampTzPlusMilliseconds(fin_time, delayms);
+ 
+ 	add_timeout(LOCK_TIMEOUT, fin_time, &reschedule);
+ 
+ 	if (reschedule)
+ 		return schedule_sigalarm();
+ 
+ 	return true;
+ }
+ 
+ /*
+  * Cancel the SIGALRM timer for statement timeout
+  * Any active statement timeout remains in force.
+  *
+  * Returns TRUE if okay, FALSE on failure.
+  */
+ bool
+ disable_statement_sig_alarm(void) 
+ {
+ 	cancel_from_timeout = false;
+ 	return del_timeout(STATEMENT_TIMEOUT);
+ }
+  
+ /*
+  * Cancel the SIGALRM timer for deadlock timeout
+  * Any active statement timeout remains in force.
+  *
+  * Returns TRUE if okay, FALSE on failure.
+  */
+ bool
+ disable_sig_alarm(void)
+ {
+ 	return del_timeout(DEADLOCK_TIMEOUT);
+ }
+ 
+ /*
+  * Cancel SIGALRM for all timeouts at once
+  */
+ bool
+ disable_all_sig_alarm(void)
+ {
+ 	bool	ret;
+ 
+ 	ret = disable_sigalarm();
+ 	n_timeouts = 0;
+ 
+ 	return ret;
+ }
+ 
+ /*
+  * Cancel the SIGALRM timer for the lock timeout
+  * Any active statement timeout remains in force.
+  *
+  * Returns TRUE if okay, FALSE on failure.
+  */
+ static bool
+ disable_lock_timeout_sig_alarm(void)
+ {
+ 	return del_timeout(LOCK_TIMEOUT);
+ }
+ 
  
  /*
   * Check for statement timeout.  If the timeout time has come,
*************** CheckStatementTimeout(void)
*** 1546,1560 ****
  {
  	TimestampTz now;
  
! 	if (!statement_timeout_active)
  		return true;			/* do nothing if not active */
  
  	now = GetCurrentTimestamp();
  
! 	if (now >= statement_fin_time)
  	{
  		/* Time to die */
! 		statement_timeout_active = false;
  		cancel_from_timeout = true;
  #ifdef HAVE_SETSID
  		/* try to signal whole process group */
--- 1772,1790 ----
  {
  	TimestampTz now;
  
! 	if (!timeout_active(STATEMENT_TIMEOUT))
  		return true;			/* do nothing if not active */
  
+ 	/* also do nothing if the statement timeout is not the first to trigger */
+ 	if (timeouts[0]->index != STATEMENT_TIMEOUT)
+ 		return true;
+ 
  	now = GetCurrentTimestamp();
  
! 	if (now >= timeouts[0]->fin_time)
  	{
  		/* Time to die */
! 		pop_timeout();
  		cancel_from_timeout = true;
  #ifdef HAVE_SETSID
  		/* try to signal whole process group */
*************** CheckStatementTimeout(void)
*** 1563,1593 ****
  		kill(MyProcPid, SIGINT);
  	}
  	else
- 	{
  		/* Not time yet, so (re)schedule the interrupt */
! 		long		secs;
! 		int			usecs;
! 		struct itimerval timeval;
  
! 		TimestampDifference(now, statement_fin_time,
! 							&secs, &usecs);
  
! 		/*
! 		 * It's possible that the difference is less than a microsecond;
! 		 * ensure we don't cancel, rather than set, the interrupt.
! 		 */
! 		if (secs == 0 && usecs == 0)
! 			usecs = 1;
! 		MemSet(&timeval, 0, sizeof(struct itimerval));
! 		timeval.it_value.tv_sec = secs;
! 		timeval.it_value.tv_usec = usecs;
! 		if (setitimer(ITIMER_REAL, &timeval, NULL))
! 			return false;
  	}
  
  	return true;
- }
  
  
  /*
   * Signal handler for SIGALRM for normal user backends
--- 1793,1832 ----
  		kill(MyProcPid, SIGINT);
  	}
  	else
  		/* Not time yet, so (re)schedule the interrupt */
! 		return schedule_sigalarm();
  
! 	return true;
! }
  
! static bool
! CheckLockTimeout(void)
! {
! 	TimestampTz now;
! 
! 	if (!timeout_active(LOCK_TIMEOUT))
! 		return true;			/* do nothing if not active */
! 
! 	/* also do nothing if the lock timeout is not the first to trigger */
! 	if (timeouts[0]->index != LOCK_TIMEOUT)
! 		return true;
! 
! 	now = GetCurrentTimestamp();
! 
! 	if (now >= timeouts[0]->fin_time)
! 	{
! 		/* Time to die */
! 		pop_timeout();
! 
! 		lock_timeout_detected = true;
  	}
+ 	else
+ 		/* Not time yet, so (re)schedule the interrupt */
+ 		return schedule_sigalarm();
  
  	return true;
  
+ }
  
  /*
   * Signal handler for SIGALRM for normal user backends
*************** handle_sig_alarm(SIGNAL_ARGS)
*** 1602,1615 ****
  {
  	int			save_errno = errno;
  
! 	if (deadlock_timeout_active)
! 	{
! 		deadlock_timeout_active = false;
! 		CheckDeadLock();
! 	}
  
! 	if (statement_timeout_active)
! 		(void) CheckStatementTimeout();
  
  	errno = save_errno;
  }
--- 1841,1851 ----
  {
  	int			save_errno = errno;
  
! 	Assert(n_timeouts > 0);
! 	Assert(timeouts[0] != NULL);
! 	Assert(timeouts[0]->timeout_check_fn != NULL);
  
! 	(void) timeouts[0]->timeout_check_fn();
  
  	errno = save_errno;
  }
*************** enable_standby_sig_alarm(TimestampTz now
*** 1635,1681 ****
  		/*
  		 * Wake up at deadlock_time only, then wait forever
  		 */
! 		statement_fin_time = deadlock_time;
! 		deadlock_timeout_active = true;
! 		statement_timeout_active = false;
  	}
  	else if (fin_time > deadlock_time)
  	{
  		/*
  		 * Wake up at deadlock_time, then again at fin_time
  		 */
! 		statement_fin_time = deadlock_time;
! 		statement_fin_time2 = fin_time;
! 		deadlock_timeout_active = true;
! 		statement_timeout_active = true;
  	}
  	else
  	{
  		/*
  		 * Wake only at fin_time because its fairly soon
  		 */
! 		statement_fin_time = fin_time;
! 		deadlock_timeout_active = false;
! 		statement_timeout_active = true;
  	}
  
! 	if (deadlock_timeout_active || statement_timeout_active)
! 	{
! 		long		secs;
! 		int			usecs;
! 		struct itimerval timeval;
  
! 		TimestampDifference(now, statement_fin_time,
! 							&secs, &usecs);
! 		if (secs == 0 && usecs == 0)
! 			usecs = 1;
! 		MemSet(&timeval, 0, sizeof(struct itimerval));
! 		timeval.it_value.tv_sec = secs;
! 		timeval.it_value.tv_usec = usecs;
! 		if (setitimer(ITIMER_REAL, &timeval, NULL))
! 			return false;
! 		standby_timeout_active = true;
! 	}
  
  	return true;
  }
--- 1871,1898 ----
  		/*
  		 * Wake up at deadlock_time only, then wait forever
  		 */
! 		add_timeout(DEADLOCK_TIMEOUT, DeadlockTimeout, NULL);
  	}
  	else if (fin_time > deadlock_time)
  	{
  		/*
  		 * Wake up at deadlock_time, then again at fin_time
  		 */
! 		add_timeout(DEADLOCK_TIMEOUT, deadlock_time, NULL);
! 		add_timeout(STATEMENT_TIMEOUT, fin_time, NULL);
  	}
  	else
  	{
  		/*
  		 * Wake only at fin_time because its fairly soon
  		 */
! 		add_timeout(STATEMENT_TIMEOUT, fin_time, NULL);
  	}
  
! 	if (!schedule_sigalarm())
! 		return false;
  
! 	standby_timeout_active = true;
  
  	return true;
  }
*************** disable_standby_sig_alarm(void)
*** 1692,1705 ****
  	 */
  	if (standby_timeout_active)
  	{
! 		struct itimerval timeval;
! 
! 		MemSet(&timeval, 0, sizeof(struct itimerval));
! 		if (setitimer(ITIMER_REAL, &timeval, NULL))
! 		{
! 			standby_timeout_active = false;
! 			return false;
! 		}
  	}
  
  	standby_timeout_active = false;
--- 1909,1916 ----
  	 */
  	if (standby_timeout_active)
  	{
! 		standby_timeout_active = false;
! 		return disable_sigalarm();
  	}
  
  	standby_timeout_active = false;
*************** static bool
*** 1716,1722 ****
  CheckStandbyTimeout(void)
  {
  	TimestampTz now;
- 	bool		reschedule = false;
  
  	standby_timeout_active = false;
  
--- 1927,1932 ----
*************** CheckStandbyTimeout(void)
*** 1726,1734 ****
  	 * Reschedule the timer if its not time to wake yet, or if we have both
  	 * timers set and the first one has just been reached.
  	 */
! 	if (now >= statement_fin_time)
  	{
! 		if (deadlock_timeout_active)
  		{
  			/*
  			 * We're still waiting when we reach deadlock timeout, so send out
--- 1936,1944 ----
  	 * Reschedule the timer if its not time to wake yet, or if we have both
  	 * timers set and the first one has just been reached.
  	 */
! 	if (now >= timeouts[STATEMENT_TIMEOUT]->fin_time)
  	{
! 		if (timeout_active(DEADLOCK_TIMEOUT))
  		{
  			/*
  			 * We're still waiting when we reach deadlock timeout, so send out
*************** CheckStandbyTimeout(void)
*** 1736,1751 ****
  			 * Then continue waiting until statement_fin_time, if that's set.
  			 */
  			SendRecoveryConflictWithBufferPin(PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK);
- 			deadlock_timeout_active = false;
  
  			/*
! 			 * Begin second waiting period if required.
  			 */
! 			if (statement_timeout_active)
! 			{
! 				reschedule = true;
! 				statement_fin_time = statement_fin_time2;
! 			}
  		}
  		else
  		{
--- 1946,1957 ----
  			 * Then continue waiting until statement_fin_time, if that's set.
  			 */
  			SendRecoveryConflictWithBufferPin(PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK);
  
  			/*
! 			 * Delete the deadlock timer. It reschedules the second waiting period
! 			 * automatically, if statement_timeout was also added.
  			 */
! 			del_timeout(DEADLOCK_TIMEOUT);
  		}
  		else
  		{
*************** CheckStandbyTimeout(void)
*** 1757,1778 ****
  		}
  	}
  	else
- 		reschedule = true;
- 
- 	if (reschedule)
  	{
! 		long		secs;
! 		int			usecs;
! 		struct itimerval timeval;
! 
! 		TimestampDifference(now, statement_fin_time,
! 							&secs, &usecs);
! 		if (secs == 0 && usecs == 0)
! 			usecs = 1;
! 		MemSet(&timeval, 0, sizeof(struct itimerval));
! 		timeval.it_value.tv_sec = secs;
! 		timeval.it_value.tv_usec = usecs;
! 		if (setitimer(ITIMER_REAL, &timeval, NULL))
  			return false;
  		standby_timeout_active = true;
  	}
--- 1963,1970 ----
  		}
  	}
  	else
  	{
! 		if (!schedule_sigalarm())
  			return false;
  		standby_timeout_active = true;
  	}
diff -dcrpN pgsql.orig/src/backend/tcop/postgres.c pgsql.lck-experimental-framework/src/backend/tcop/postgres.c
*** pgsql.orig/src/backend/tcop/postgres.c	2010-07-11 11:14:55.000000000 +0200
--- pgsql.lck-experimental-framework/src/backend/tcop/postgres.c	2010-07-28 18:45:10.000000000 +0200
*************** start_xact_command(void)
*** 2461,2467 ****
  		/* Set statement timeout running, if any */
  		/* NB: this mustn't be enabled until we are within an xact */
  		if (StatementTimeout > 0)
! 			enable_sig_alarm(StatementTimeout, true);
  		else
  			cancel_from_timeout = false;
  
--- 2461,2467 ----
  		/* Set statement timeout running, if any */
  		/* NB: this mustn't be enabled until we are within an xact */
  		if (StatementTimeout > 0)
! 			enable_statement_sig_alarm(StatementTimeout);
  		else
  			cancel_from_timeout = false;
  
*************** finish_xact_command(void)
*** 2475,2481 ****
  	if (xact_started)
  	{
  		/* Cancel any active statement timeout before committing */
! 		disable_sig_alarm(true);
  
  		/* Now commit the command */
  		ereport(DEBUG3,
--- 2475,2481 ----
  	if (xact_started)
  	{
  		/* Cancel any active statement timeout before committing */
! 		disable_statement_sig_alarm();
  
  		/* Now commit the command */
  		ereport(DEBUG3,
*************** PostgresMain(int argc, char *argv[], con
*** 3703,3709 ****
  		 * the idle loop anyway, and cancel the statement timer if running.
  		 */
  		QueryCancelPending = false;
! 		disable_sig_alarm(true);
  		QueryCancelPending = false;		/* again in case timeout occurred */
  
  		/*
--- 3703,3709 ----
  		 * the idle loop anyway, and cancel the statement timer if running.
  		 */
  		QueryCancelPending = false;
! 		disable_all_sig_alarm(); /* was disable_sig_alarm(true) a.k.a. disable_statement_sig_alarm() */
  		QueryCancelPending = false;		/* again in case timeout occurred */
  
  		/*
diff -dcrpN pgsql.orig/src/backend/utils/init/postinit.c pgsql.lck-experimental-framework/src/backend/utils/init/postinit.c
*** pgsql.orig/src/backend/utils/init/postinit.c	2010-07-11 11:14:55.000000000 +0200
--- pgsql.lck-experimental-framework/src/backend/utils/init/postinit.c	2010-07-28 14:57:09.000000000 +0200
*************** PerformAuthentication(Port *port)
*** 203,209 ****
  	 * during authentication.  Since we're inside a transaction and might do
  	 * database access, we have to use the statement_timeout infrastructure.
  	 */
! 	if (!enable_sig_alarm(AuthenticationTimeout * 1000, true))
  		elog(FATAL, "could not set timer for authorization timeout");
  
  	/*
--- 203,209 ----
  	 * during authentication.  Since we're inside a transaction and might do
  	 * database access, we have to use the statement_timeout infrastructure.
  	 */
! 	if (!enable_statement_sig_alarm(AuthenticationTimeout * 1000))
  		elog(FATAL, "could not set timer for authorization timeout");
  
  	/*
*************** PerformAuthentication(Port *port)
*** 214,220 ****
  	/*
  	 * Done with authentication.  Disable the timeout, and log if needed.
  	 */
! 	if (!disable_sig_alarm(true))
  		elog(FATAL, "could not disable timer for authorization timeout");
  
  	/*
--- 214,220 ----
  	/*
  	 * Done with authentication.  Disable the timeout, and log if needed.
  	 */
! 	if (!disable_statement_sig_alarm())
  		elog(FATAL, "could not disable timer for authorization timeout");
  
  	/*
diff -dcrpN pgsql.orig/src/backend/utils/misc/guc.c pgsql.lck-experimental-framework/src/backend/utils/misc/guc.c
*** pgsql.orig/src/backend/utils/misc/guc.c	2010-07-26 10:05:55.000000000 +0200
--- pgsql.lck-experimental-framework/src/backend/utils/misc/guc.c	2010-07-26 10:53:11.000000000 +0200
*************** static struct config_int ConfigureNamesI
*** 1648,1653 ****
--- 1648,1663 ----
  	},
  
  	{
+ 		{"lock_timeout", PGC_USERSET, CLIENT_CONN_STATEMENT,
+ 			gettext_noop("Sets the maximum allowed timeout for any lock taken by a statement."),
+ 			gettext_noop("A value of 0 turns off the timeout."),
+ 			GUC_UNIT_MS
+ 		},
+ 		&LockTimeout,
+ 		0, 0, INT_MAX, NULL, NULL
+ 	},
+ 
+ 	{
  		{"vacuum_freeze_min_age", PGC_USERSET, CLIENT_CONN_STATEMENT,
  			gettext_noop("Minimum age at which VACUUM should freeze a table row."),
  			NULL
diff -dcrpN pgsql.orig/src/backend/utils/misc/postgresql.conf.sample pgsql.lck-experimental-framework/src/backend/utils/misc/postgresql.conf.sample
*** pgsql.orig/src/backend/utils/misc/postgresql.conf.sample	2010-07-26 10:05:55.000000000 +0200
--- pgsql.lck-experimental-framework/src/backend/utils/misc/postgresql.conf.sample	2010-07-26 10:53:11.000000000 +0200
***************
*** 492,497 ****
--- 492,500 ----
  #------------------------------------------------------------------------------
  
  #deadlock_timeout = 1s
+ #lock_timeout = 0			# timeout value for heavy-weight locks
+ 					# taken by statements. 0 disables timeout
+ 					# unit in milliseconds, default is 0
  #max_locks_per_transaction = 64		# min 10
  					# (change requires restart)
  # Note:  Each lock table slot uses ~270 bytes of shared memory, and there are
diff -dcrpN pgsql.orig/src/include/storage/pg_sema.h pgsql.lck-experimental-framework/src/include/storage/pg_sema.h
*** pgsql.orig/src/include/storage/pg_sema.h	2010-01-02 17:58:08.000000000 +0100
--- pgsql.lck-experimental-framework/src/include/storage/pg_sema.h	2010-07-26 10:53:11.000000000 +0200
*************** extern void PGSemaphoreUnlock(PGSemaphor
*** 80,83 ****
--- 80,86 ----
  /* Lock a semaphore only if able to do so without blocking */
  extern bool PGSemaphoreTryLock(PGSemaphore sema);
  
+ /* Lock a semaphore (decrement count), blocking if count would be < 0 */
+ extern void PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK);
+ 
  #endif   /* PG_SEMA_H */
diff -dcrpN pgsql.orig/src/include/storage/proc.h pgsql.lck-experimental-framework/src/include/storage/proc.h
*** pgsql.orig/src/include/storage/proc.h	2010-07-11 11:15:00.000000000 +0200
--- pgsql.lck-experimental-framework/src/include/storage/proc.h	2010-07-28 18:24:16.000000000 +0200
*************** typedef struct PROC_HDR
*** 163,170 ****
--- 163,172 ----
  /* configurable options */
  extern int	DeadlockTimeout;
  extern int	StatementTimeout;
+ extern int	LockTimeout;
  extern bool log_lock_waits;
  
+ extern volatile bool lock_timeout_detected;
  extern volatile bool cancel_from_timeout;
  
  
*************** extern void LockWaitCancel(void);
*** 195,202 ****
  extern void ProcWaitForSignal(void);
  extern void ProcSendSignal(int pid);
  
! extern bool enable_sig_alarm(int delayms, bool is_statement_timeout);
! extern bool disable_sig_alarm(bool is_statement_timeout);
  extern void handle_sig_alarm(SIGNAL_ARGS);
  
  extern bool enable_standby_sig_alarm(TimestampTz now,
--- 197,207 ----
  extern void ProcWaitForSignal(void);
  extern void ProcSendSignal(int pid);
  
! extern bool enable_statement_sig_alarm(int delayms);
! extern bool disable_statement_sig_alarm(void);
! extern bool enable_sig_alarm(int delayms);
! extern bool disable_sig_alarm(void);
! extern bool disable_all_sig_alarm(void);
  extern void handle_sig_alarm(SIGNAL_ARGS);
  
  extern bool enable_standby_sig_alarm(TimestampTz now,

Marc Cousin

cousinmarc@gmail.com

over 15 years ago

In reply to: Boszormenyi Zoltan (#3)

The Thursday 29 July 2010 13:55:38, Boszormenyi Zoltan wrote :

I fixed this by adding CheckLockTimeout() function that works like
CheckStatementTimeout() and ensuring that the same start time is
used for both deadlock_timeout and lock_timeout if both are active.
The preference of errors if their timeout values are equal is:
statement_timeout > lock_timeout > deadlock_timeout

As soon as lock_timeout is bigger than deadlock_timeout, it doesn't
work, with this new version.

Keeping the deadlock_timeout to 1s, when lock_timeout >= 1001,
lock_timeout doesn't trigger anymore.

* Consider the changes to the code in the context of the project as a whole:
* Is everything done in a way that fits together coherently with
other features/modules?
I have a feeling that
enable_sig_alarm/enable_sig_alarm_for_lock_timeout tries to solve a
very specific problem, and it gets complicated because there is no
infrastructure in the code to handle several timeouts at the same time
with sigalarm, so each timeout has its own dedicated and intertwined
code. But I'm still discovering this part of the code.

This WIP patch is also attached for reference, too. I would prefer
this way, but I don't have more time to work on it and there are some
interdependencies in the signal handler when e.g. disable_sig_alarm(true);
means to disable ALL timers not just the statement_timeout.
The specifically coded lock_timeout patch goes with the flow and doesn't
change the semantics and works. If someone wants to pick up the timer
framework patch and can make it work, fine. But I am not explicitely
submitting it for the commitfest. The original patch with the fixes works
and needs only a little more review.

Ok, understood. But I like the principle of this framework much more (the rest
of the code seems simpler to me as a consequence of this framework).

But it goes far beyond the initial intent of the patch.

Alvaro Herrera

alvherre@commandprompt.com

over 15 years ago

In reply to: Boszormenyi Zoltan (#3)

Excerpts from Boszormenyi Zoltan's message of jue jul 29 07:55:38 -0400 2010:

Hi,

Marc Cousin írta:

Hi, I've been reviewing this patch for the last few days. Here it is :

...

* Are there dangers?
As it is a guc, it could be set globally, is that a danger ?

I haven't added any new code covering supplemental (e.g. autovacuum)
processes,
the interactions are yet to be discovered. Setting it globally is not
recommended.

FWIW there is some code in autovacuum and other auxiliary processes that
forcibly sets statement_timeout to 0. I think this patch should do
likewise.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Boszormenyi Zoltan

zb@cybertec.at

over 15 years ago

In reply to: Marc Cousin (#4)

1 attachment(s)

Marc Cousin írta:

The Thursday 29 July 2010 13:55:38, Boszormenyi Zoltan wrote :

I fixed this by adding CheckLockTimeout() function that works like
CheckStatementTimeout() and ensuring that the same start time is
used for both deadlock_timeout and lock_timeout if both are active.
The preference of errors if their timeout values are equal is:
statement_timeout > lock_timeout > deadlock_timeout

As soon as lock_timeout is bigger than deadlock_timeout, it doesn't
work, with this new version.

Keeping the deadlock_timeout to 1s, when lock_timeout >= 1001,
lock_timeout doesn't trigger anymore.

I missed one case when the lock_timeout_active should have been set
but the timer must not have been re-set, this caused the problem.
I blame the hot weather and having no air conditioning. The second is
now fixed. :-)

I also added one line in autovacuum.c to disable lock_timeout,
in case it's globally set in postgresq.conf as per Alvaro's comment.

Also, I made sure that only one or two timeout causes (one of
deadlock_timeout
and lock_timeout in the first case or statement_timeout plus one of the
other two)
can be active at a time. Previously I was able to trigger a segfault
with the default
1sec deadlock_timeout and lock_timeout = 999 or 1001. Effectively, the
system's
clock resolution makes the lock_timeout and deadlock_timeout equal and
RemoveFromWaitQueue() was called twice. This way it's a lot more robust.

Best regards,
Zoltán Böszörményi

Attachments:

5-pg91-locktimeout-19-ctxdiff.patchtext/x-patch; name=5-pg91-locktimeout-19-ctxdiff.patchDownload

diff -dcrpN pgsql.orig/doc/src/sgml/config.sgml pgsql/doc/src/sgml/config.sgml
*** pgsql.orig/doc/src/sgml/config.sgml	2010-07-26 10:05:37.000000000 +0200
--- pgsql/doc/src/sgml/config.sgml	2010-07-29 11:58:56.000000000 +0200
*************** COPY postgres_log FROM '/full/path/to/lo
*** 4479,4484 ****
--- 4479,4508 ----
        </listitem>
       </varlistentry>
  
+      <varlistentry id="guc-lock-timeout" xreflabel="lock_timeout">
+       <term><varname>lock_timeout</varname> (<type>integer</type>)</term>
+       <indexterm>
+        <primary><varname>lock_timeout</> configuration parameter</primary>
+       </indexterm>
+       <listitem>
+        <para>
+         Abort any statement that tries to acquire a heavy-weight lock (e.g. rows,
+         pages, tables, indices or other objects) and the lock has to wait more
+         than the specified number of milliseconds, starting from the time the
+         command arrives at the server from the client.
+         If <varname>log_min_error_statement</> is set to <literal>ERROR</> or lower,
+         the statement that timed out will also be logged. A value of zero
+         (the default) turns off the limitation.
+        </para>
+ 
+        <para>
+         Setting <varname>lock_timeout</> in
+         <filename>postgresql.conf</> is not recommended because it
+         affects all sessions.
+        </para>      
+       </listitem>   
+      </varlistentry>
+ 
       <varlistentry id="guc-vacuum-freeze-table-age" xreflabel="vacuum_freeze_table_age">
        <term><varname>vacuum_freeze_table_age</varname> (<type>integer</type>)</term>
        <indexterm>
diff -dcrpN pgsql.orig/doc/src/sgml/ref/lock.sgml pgsql/doc/src/sgml/ref/lock.sgml
*** pgsql.orig/doc/src/sgml/ref/lock.sgml	2010-04-03 09:23:01.000000000 +0200
--- pgsql/doc/src/sgml/ref/lock.sgml	2010-07-29 11:58:56.000000000 +0200
*************** LOCK [ TABLE ] [ ONLY ] <replaceable cla
*** 39,46 ****
     <literal>NOWAIT</literal> is specified, <command>LOCK
     TABLE</command> does not wait to acquire the desired lock: if it
     cannot be acquired immediately, the command is aborted and an
!    error is emitted.  Once obtained, the lock is held for the
!    remainder of the current transaction.  (There is no <command>UNLOCK
     TABLE</command> command; locks are always released at transaction
     end.)
    </para>
--- 39,49 ----
     <literal>NOWAIT</literal> is specified, <command>LOCK
     TABLE</command> does not wait to acquire the desired lock: if it
     cannot be acquired immediately, the command is aborted and an
!    error is emitted. If <varname>lock_timeout</varname> is set to a value
!    higher than 0, and the lock cannot be acquired under the specified
!    timeout value in milliseconds, the command is aborted and an error
!    is emitted. Once obtained, the lock is held for the remainder of  
!    the current transaction.  (There is no <command>UNLOCK
     TABLE</command> command; locks are always released at transaction
     end.)
    </para>
diff -dcrpN pgsql.orig/doc/src/sgml/ref/select.sgml pgsql/doc/src/sgml/ref/select.sgml
*** pgsql.orig/doc/src/sgml/ref/select.sgml	2010-06-20 13:59:13.000000000 +0200
--- pgsql/doc/src/sgml/ref/select.sgml	2010-07-29 11:58:56.000000000 +0200
*************** FOR SHARE [ OF <replaceable class="param
*** 1160,1165 ****
--- 1160,1173 ----
     </para>
  
     <para>
+     If <literal>NOWAIT</> option is not specified and <varname>lock_timeout</varname>
+     is set to a value higher than 0, and the lock needs to wait more than
+     the specified value in milliseconds, the command reports an error after
+     timing out, rather than waiting indefinitely. The note in the previous
+     paragraph applies to the <varname>lock_timeout</varname>, too.
+    </para>
+ 
+    <para>
      If specific tables are named in <literal>FOR UPDATE</literal>
      or <literal>FOR SHARE</literal>,
      then only rows coming from those tables are locked; any other
diff -dcrpN pgsql.orig/src/backend/port/posix_sema.c pgsql/src/backend/port/posix_sema.c
*** pgsql.orig/src/backend/port/posix_sema.c	2010-01-02 17:57:50.000000000 +0100
--- pgsql/src/backend/port/posix_sema.c	2010-07-29 11:58:56.000000000 +0200
***************
*** 24,29 ****
--- 24,30 ----
  #include "miscadmin.h"
  #include "storage/ipc.h"
  #include "storage/pg_sema.h"
+ #include "storage/proc.h"
  
  
  #ifdef USE_NAMED_POSIX_SEMAPHORES
*************** PGSemaphoreTryLock(PGSemaphore sema)
*** 313,315 ****
--- 314,341 ----
  
  	return true;
  }
+ 
+ /*
+  * PGSemaphoreTimedLock
+  *
+  * Lock a semaphore (decrement count), blocking if count would be < 0
+  * Return if lock_timeout expired
+  */
+ void
+ PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK)
+ {
+ 	int			errStatus;
+ 
+ 	do
+ 	{
+ 		ImmediateInterruptOK = interruptOK;
+ 		CHECK_FOR_INTERRUPTS();
+ 		errStatus = sem_wait(PG_SEM_REF(sema));
+ 		ImmediateInterruptOK = false;
+ 	} while (errStatus < 0 && errno == EINTR && !lock_timeout_detected);
+ 
+ 	if (lock_timeout_detected)
+ 		return;
+ 	if (errStatus < 0)
+ 		elog(FATAL, "sem_wait failed: %m");
+ }
diff -dcrpN pgsql.orig/src/backend/port/sysv_sema.c pgsql/src/backend/port/sysv_sema.c
*** pgsql.orig/src/backend/port/sysv_sema.c	2010-01-02 17:57:50.000000000 +0100
--- pgsql/src/backend/port/sysv_sema.c	2010-07-29 11:58:56.000000000 +0200
***************
*** 30,35 ****
--- 30,36 ----
  #include "miscadmin.h"
  #include "storage/ipc.h"
  #include "storage/pg_sema.h"
+ #include "storage/proc.h"
  
  
  #ifndef HAVE_UNION_SEMUN
*************** PGSemaphoreTryLock(PGSemaphore sema)
*** 497,499 ****
--- 498,530 ----
  
  	return true;
  }
+ 
+ /*
+  * PGSemaphoreTimedLock
+  *
+  * Lock a semaphore (decrement count), blocking if count would be < 0
+  * Return if lock_timeout expired
+  */
+ void
+ PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK)
+ {
+ 	int			errStatus;
+ 	struct sembuf sops;
+ 
+ 	sops.sem_op = -1;			/* decrement */
+ 	sops.sem_flg = 0;
+ 	sops.sem_num = sema->semNum;
+ 
+ 	do
+ 	{
+ 		ImmediateInterruptOK = interruptOK;
+ 		CHECK_FOR_INTERRUPTS();
+ 		errStatus = semop(sema->semId, &sops, 1);
+ 		ImmediateInterruptOK = false;
+ 	} while (errStatus < 0 && errno == EINTR && !lock_timeout_detected);
+ 
+ 	if (lock_timeout_detected)
+ 		return;
+ 	if (errStatus < 0)
+ 		elog(FATAL, "semop(id=%d) failed: %m", sema->semId);
+ }
diff -dcrpN pgsql.orig/src/backend/port/win32_sema.c pgsql/src/backend/port/win32_sema.c
*** pgsql.orig/src/backend/port/win32_sema.c	2010-01-02 17:57:50.000000000 +0100
--- pgsql/src/backend/port/win32_sema.c	2010-07-29 11:58:56.000000000 +0200
***************
*** 16,21 ****
--- 16,22 ----
  #include "miscadmin.h"
  #include "storage/ipc.h"
  #include "storage/pg_sema.h"
+ #include "storage/proc.h"
  
  static HANDLE *mySemSet;		/* IDs of sema sets acquired so far */
  static int	numSems;			/* number of sema sets acquired so far */
*************** PGSemaphoreTryLock(PGSemaphore sema)
*** 205,207 ****
--- 206,263 ----
  	/* keep compiler quiet */
  	return false;
  }
+ 
+ /*
+  * PGSemaphoreTimedLock
+  *
+  * Lock a semaphore (decrement count), blocking if count would be < 0.
+  * Serve the interrupt if interruptOK is true.
+  * Return if lock_timeout expired.
+  */
+ void
+ PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK)
+ {
+ 	DWORD		ret;
+ 	HANDLE		wh[2];
+ 
+ 	wh[0] = *sema;
+ 	wh[1] = pgwin32_signal_event;
+ 
+ 	/*
+ 	 * As in other implementations of PGSemaphoreLock, we need to check for
+ 	 * cancel/die interrupts each time through the loop.  But here, there is
+ 	 * no hidden magic about whether the syscall will internally service a
+ 	 * signal --- we do that ourselves.
+ 	 */
+ 	do
+ 	{
+ 		ImmediateInterruptOK = interruptOK;
+ 		CHECK_FOR_INTERRUPTS();
+ 
+ 		errno = 0;
+ 		ret = WaitForMultipleObjectsEx(2, wh, FALSE, INFINITE, TRUE);
+ 
+ 		if (ret == WAIT_OBJECT_0)
+ 		{
+ 			/* We got it! */
+ 			return;
+ 		}
+ 		else if (ret == WAIT_OBJECT_0 + 1)
+ 		{
+ 			/* Signal event is set - we have a signal to deliver */
+ 			pgwin32_dispatch_queued_signals();
+ 			errno = EINTR;
+ 		}
+ 		else
+ 			/* Otherwise we are in trouble */
+ 			errno = EIDRM;
+ 
+ 		ImmediateInterruptOK = false;
+ 	} while (errno == EINTR && !lock_timeout_detected);
+ 
+ 	if (lock_timeout_detected)
+ 		return;
+ 	if (errno != 0)
+ 		ereport(FATAL,
+ 				(errmsg("could not lock semaphore: error code %d", (int) GetLastError())));
+ }
diff -dcrpN pgsql.orig/src/backend/postmaster/autovacuum.c pgsql/src/backend/postmaster/autovacuum.c
*** pgsql.orig/src/backend/postmaster/autovacuum.c	2010-04-29 12:09:03.000000000 +0200
--- pgsql/src/backend/postmaster/autovacuum.c	2010-08-02 09:36:21.000000000 +0200
*************** AutoVacWorkerMain(int argc, char *argv[]
*** 1521,1530 ****
  	SetConfigOption("zero_damaged_pages", "false", PGC_SUSET, PGC_S_OVERRIDE);
  
  	/*
! 	 * Force statement_timeout to zero to avoid a timeout setting from
! 	 * preventing regular maintenance from being executed.
  	 */
  	SetConfigOption("statement_timeout", "0", PGC_SUSET, PGC_S_OVERRIDE);
  
  	/*
  	 * Get the info about the database we're going to work on.
--- 1521,1531 ----
  	SetConfigOption("zero_damaged_pages", "false", PGC_SUSET, PGC_S_OVERRIDE);
  
  	/*
! 	 * Force statement_timeout and lock_timeout to zero to avoid a timeout setting
! 	 * from preventing regular maintenance from being executed.
  	 */
  	SetConfigOption("statement_timeout", "0", PGC_SUSET, PGC_S_OVERRIDE);
+ 	SetConfigOption("lock_timeout", "0", PGC_SUSET, PGC_S_OVERRIDE);
  
  	/*
  	 * Get the info about the database we're going to work on.
diff -dcrpN pgsql.orig/src/backend/storage/lmgr/lmgr.c pgsql/src/backend/storage/lmgr/lmgr.c
*** pgsql.orig/src/backend/storage/lmgr/lmgr.c	2010-01-02 17:57:52.000000000 +0100
--- pgsql/src/backend/storage/lmgr/lmgr.c	2010-07-29 11:58:56.000000000 +0200
***************
*** 19,26 ****
--- 19,29 ----
  #include "access/transam.h"
  #include "access/xact.h"
  #include "catalog/catalog.h"
+ #include "catalog/pg_database.h"
  #include "miscadmin.h"
  #include "storage/lmgr.h"
+ #include "utils/lsyscache.h"
+ #include "storage/proc.h"
  #include "storage/procarray.h"
  #include "utils/inval.h"
  
*************** LockRelationOid(Oid relid, LOCKMODE lock
*** 78,83 ****
--- 81,101 ----
  
  	res = LockAcquire(&tag, lockmode, false, false);
  
+ 	if (res == LOCKACQUIRE_NOT_AVAIL)
+ 	{
+ 		char	   *relname = get_rel_name(relid);
+ 		if (relname)
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
+ 						errmsg("could not obtain lock on relation \"%s\"",
+ 						relname)));
+ 		else
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
+ 						errmsg("could not obtain lock on relation with OID %u",
+ 						relid)));
+ 	}
+ 
  	/*
  	 * Now that we have the lock, check for invalidation messages, so that we
  	 * will update or flush any stale relcache entry before we try to use it.
*************** LockRelation(Relation relation, LOCKMODE
*** 173,178 ****
--- 191,202 ----
  
  	res = LockAcquire(&tag, lockmode, false, false);
  
+ 	if (res == LOCKACQUIRE_NOT_AVAIL)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
+ 					errmsg("could not obtain lock on relation \"%s\"",
+ 				RelationGetRelationName(relation))));
+ 
  	/*
  	 * Now that we have the lock, check for invalidation messages; see notes
  	 * in LockRelationOid.
*************** LockRelationIdForSession(LockRelId *reli
*** 250,256 ****
  
  	SET_LOCKTAG_RELATION(tag, relid->dbId, relid->relId);
  
! 	(void) LockAcquire(&tag, lockmode, true, false);
  }
  
  /*
--- 274,293 ----
  
  	SET_LOCKTAG_RELATION(tag, relid->dbId, relid->relId);
  
! 	if (LockAcquire(&tag, lockmode, true, false) == LOCKACQUIRE_NOT_AVAIL)
! 	{
! 		char	   *relname = get_rel_name(relid->relId);
! 		if (relname)
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						errmsg("could not obtain lock on relation \"%s\"",
! 						relname)));
! 		else
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						errmsg("could not obtain lock on relation with OID %u",
! 						relid->relId)));
! 	}
  }
  
  /*
*************** LockRelationForExtension(Relation relati
*** 285,291 ****
  								relation->rd_lockInfo.lockRelId.dbId,
  								relation->rd_lockInfo.lockRelId.relId);
  
! 	(void) LockAcquire(&tag, lockmode, false, false);
  }
  
  /*
--- 322,332 ----
  								relation->rd_lockInfo.lockRelId.dbId,
  								relation->rd_lockInfo.lockRelId.relId);
  
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on index \"%s\"",
! 				RelationGetRelationName(relation))));
  }
  
  /*
*************** LockPage(Relation relation, BlockNumber 
*** 319,325 ****
  					 relation->rd_lockInfo.lockRelId.relId,
  					 blkno);
  
! 	(void) LockAcquire(&tag, lockmode, false, false);
  }
  
  /*
--- 360,370 ----
  					 relation->rd_lockInfo.lockRelId.relId,
  					 blkno);
  
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on page %u of relation \"%s\"",
! 				blkno, RelationGetRelationName(relation))));
  }
  
  /*
*************** LockTuple(Relation relation, ItemPointer
*** 375,381 ****
  					  ItemPointerGetBlockNumber(tid),
  					  ItemPointerGetOffsetNumber(tid));
  
! 	(void) LockAcquire(&tag, lockmode, false, false);
  }
  
  /*
--- 420,430 ----
  					  ItemPointerGetBlockNumber(tid),
  					  ItemPointerGetOffsetNumber(tid));
  
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on row in relation \"%s\"",
! 				RelationGetRelationName(relation))));
  }
  
  /*
*************** XactLockTableInsert(TransactionId xid)
*** 429,435 ****
  
  	SET_LOCKTAG_TRANSACTION(tag, xid);
  
! 	(void) LockAcquire(&tag, ExclusiveLock, false, false);
  }
  
  /*
--- 478,487 ----
  
  	SET_LOCKTAG_TRANSACTION(tag, xid);
  
! 	if (LockAcquire(&tag, ExclusiveLock, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on transaction with ID %u", xid)));
  }
  
  /*
*************** XactLockTableWait(TransactionId xid)
*** 473,479 ****
  
  		SET_LOCKTAG_TRANSACTION(tag, xid);
  
! 		(void) LockAcquire(&tag, ShareLock, false, false);
  
  		LockRelease(&tag, ShareLock, false);
  
--- 525,534 ----
  
  		SET_LOCKTAG_TRANSACTION(tag, xid);
  
! 		if (LockAcquire(&tag, ShareLock, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						errmsg("could not obtain lock on transaction with ID %u", xid)));
  
  		LockRelease(&tag, ShareLock, false);
  
*************** VirtualXactLockTableInsert(VirtualTransa
*** 531,537 ****
  
  	SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);
  
! 	(void) LockAcquire(&tag, ExclusiveLock, false, false);
  }
  
  /*
--- 586,596 ----
  
  	SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);
  
! 	if (LockAcquire(&tag, ExclusiveLock, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on virtual transaction with ID %u",
! 				vxid.localTransactionId)));
  }
  
  /*
*************** VirtualXactLockTableWait(VirtualTransact
*** 549,555 ****
  
  	SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);
  
! 	(void) LockAcquire(&tag, ShareLock, false, false);
  
  	LockRelease(&tag, ShareLock, false);
  }
--- 608,618 ----
  
  	SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);
  
! 	if (LockAcquire(&tag, ShareLock, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on virtual transaction with ID %u",
! 				vxid.localTransactionId)));
  
  	LockRelease(&tag, ShareLock, false);
  }
*************** LockDatabaseObject(Oid classid, Oid obji
*** 598,604 ****
  					   objid,
  					   objsubid);
  
! 	(void) LockAcquire(&tag, lockmode, false, false);
  }
  
  /*
--- 661,671 ----
  					   objid,
  					   objsubid);
  
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on class:object: %u:%u",
! 				classid, objid)));
  }
  
  /*
*************** LockSharedObject(Oid classid, Oid objid,
*** 636,642 ****
  					   objid,
  					   objsubid);
  
! 	(void) LockAcquire(&tag, lockmode, false, false);
  
  	/* Make sure syscaches are up-to-date with any changes we waited for */
  	AcceptInvalidationMessages();
--- 703,713 ----
  					   objid,
  					   objsubid);
  
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					errmsg("could not obtain lock on class:object: %u:%u",
! 				classid, objid)));
  
  	/* Make sure syscaches are up-to-date with any changes we waited for */
  	AcceptInvalidationMessages();
*************** LockSharedObjectForSession(Oid classid, 
*** 678,684 ****
  					   objid,
  					   objsubid);
  
! 	(void) LockAcquire(&tag, lockmode, true, false);
  }
  
  /*
--- 749,770 ----
  					   objid,
  					   objsubid);
  
! 	if (LockAcquire(&tag, lockmode, true, false) == LOCKACQUIRE_NOT_AVAIL)
! 		switch(classid)
! 		{
! 		case DatabaseRelationId:
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						errmsg("could not obtain lock on database with ID %u",
! 					objid)));
! 			break;
! 		default:
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						errmsg("could not obtain lock on class:object: %u:%u",
! 					classid, objid)));
! 			break;
! 		}
  }
  
  /*
diff -dcrpN pgsql.orig/src/backend/storage/lmgr/lock.c pgsql/src/backend/storage/lmgr/lock.c
*** pgsql.orig/src/backend/storage/lmgr/lock.c	2010-04-29 12:09:03.000000000 +0200
--- pgsql/src/backend/storage/lmgr/lock.c	2010-07-29 11:58:56.000000000 +0200
*************** PROCLOCK_PRINT(const char *where, const 
*** 255,261 ****
  static uint32 proclock_hash(const void *key, Size keysize);
  static void RemoveLocalLock(LOCALLOCK *locallock);
  static void GrantLockLocal(LOCALLOCK *locallock, ResourceOwner owner);
! static void WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner);
  static bool UnGrantLock(LOCK *lock, LOCKMODE lockmode,
  			PROCLOCK *proclock, LockMethod lockMethodTable);
  static void CleanUpLock(LOCK *lock, PROCLOCK *proclock,
--- 255,261 ----
  static uint32 proclock_hash(const void *key, Size keysize);
  static void RemoveLocalLock(LOCALLOCK *locallock);
  static void GrantLockLocal(LOCALLOCK *locallock, ResourceOwner owner);
! static int WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner);
  static bool UnGrantLock(LOCK *lock, LOCKMODE lockmode,
  			PROCLOCK *proclock, LockMethod lockMethodTable);
  static void CleanUpLock(LOCK *lock, PROCLOCK *proclock,
*************** ProcLockHashCode(const PROCLOCKTAG *proc
*** 447,453 ****
   *	dontWait: if true, don't wait to acquire lock
   *
   * Returns one of:
!  *		LOCKACQUIRE_NOT_AVAIL		lock not available, and dontWait=true
   *		LOCKACQUIRE_OK				lock successfully acquired
   *		LOCKACQUIRE_ALREADY_HELD	incremented count for lock already held
   *
--- 447,453 ----
   *	dontWait: if true, don't wait to acquire lock
   *
   * Returns one of:
!  *		LOCKACQUIRE_NOT_AVAIL		lock not available, either dontWait=true or timeout
   *		LOCKACQUIRE_OK				lock successfully acquired
   *		LOCKACQUIRE_ALREADY_HELD	incremented count for lock already held
   *
*************** LockAcquireExtended(const LOCKTAG *lockt
*** 833,839 ****
  										 locktag->locktag_type,
  										 lockmode);
  
! 		WaitOnLock(locallock, owner);
  
  		TRACE_POSTGRESQL_LOCK_WAIT_DONE(locktag->locktag_field1,
  										locktag->locktag_field2,
--- 833,839 ----
  										 locktag->locktag_type,
  										 lockmode);
  
! 		status = WaitOnLock(locallock, owner);
  
  		TRACE_POSTGRESQL_LOCK_WAIT_DONE(locktag->locktag_field1,
  										locktag->locktag_field2,
*************** LockAcquireExtended(const LOCKTAG *lockt
*** 848,867 ****
  		 * done when the lock was granted to us --- see notes in WaitOnLock.
  		 */
  
! 		/*
! 		 * Check the proclock entry status, in case something in the ipc
! 		 * communication doesn't work correctly.
! 		 */
! 		if (!(proclock->holdMask & LOCKBIT_ON(lockmode)))
  		{
! 			PROCLOCK_PRINT("LockAcquire: INCONSISTENT", proclock);
! 			LOCK_PRINT("LockAcquire: INCONSISTENT", lock, lockmode);
! 			/* Should we retry ? */
! 			LWLockRelease(partitionLock);
! 			elog(ERROR, "LockAcquire failed");
  		}
- 		PROCLOCK_PRINT("LockAcquire: granted", proclock);
- 		LOCK_PRINT("LockAcquire: granted", lock, lockmode);
  	}
  
  	LWLockRelease(partitionLock);
--- 848,879 ----
  		 * done when the lock was granted to us --- see notes in WaitOnLock.
  		 */
  
! 		switch (status)
  		{
! 		case STATUS_OK:
! 			/*
! 			 * Check the proclock entry status, in case something in the ipc
! 			 * communication doesn't work correctly.
! 			 */
! 			if (!(proclock->holdMask & LOCKBIT_ON(lockmode)))
! 			{
! 				PROCLOCK_PRINT("LockAcquire: INCONSISTENT", proclock);
! 				LOCK_PRINT("LockAcquire: INCONSISTENT", lock, lockmode);
! 				/* Should we retry ? */
! 				LWLockRelease(partitionLock);
! 				elog(ERROR, "LockAcquire failed");
! 			}
! 			PROCLOCK_PRINT("LockAcquire: granted", proclock);
! 			LOCK_PRINT("LockAcquire: granted", lock, lockmode);
! 			break;
! 		case STATUS_WAITING:
! 			PROCLOCK_PRINT("LockAcquire: timed out", proclock);
! 			LOCK_PRINT("LockAcquire: timed out", lock, lockmode);
! 			break;
! 		default:
! 			elog(ERROR, "LockAcquire invalid status");
! 			break;
  		}
  	}
  
  	LWLockRelease(partitionLock);
*************** LockAcquireExtended(const LOCKTAG *lockt
*** 887,893 ****
  							   locktag->locktag_field2);
  	}
  
! 	return LOCKACQUIRE_OK;
  }
  
  /*
--- 899,905 ----
  							   locktag->locktag_field2);
  	}
  
! 	return (status == STATUS_OK ? LOCKACQUIRE_OK : LOCKACQUIRE_NOT_AVAIL);
  }
  
  /*
*************** GrantAwaitedLock(void)
*** 1165,1178 ****
   * Caller must have set MyProc->heldLocks to reflect locks already held
   * on the lockable object by this process.
   *
   * The appropriate partition lock must be held at entry.
   */
! static void
  WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner)
  {
  	LOCKMETHODID lockmethodid = LOCALLOCK_LOCKMETHOD(*locallock);
  	LockMethod	lockMethodTable = LockMethods[lockmethodid];
  	char	   *volatile new_status = NULL;
  
  	LOCK_PRINT("WaitOnLock: sleeping on lock",
  			   locallock->lock, locallock->tag.mode);
--- 1177,1196 ----
   * Caller must have set MyProc->heldLocks to reflect locks already held
   * on the lockable object by this process.
   *
+  * Result: returns value of ProcSleep()
+  *	STATUS_OK if we acquired the lock
+  *	STATUS_ERROR if not (deadlock)
+  *	STATUS_WAITING if not (timeout)
+  *
   * The appropriate partition lock must be held at entry.
   */
! static int
  WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner)
  {
  	LOCKMETHODID lockmethodid = LOCALLOCK_LOCKMETHOD(*locallock);
  	LockMethod	lockMethodTable = LockMethods[lockmethodid];
  	char	   *volatile new_status = NULL;
+ 	int		wait_status;
  
  	LOCK_PRINT("WaitOnLock: sleeping on lock",
  			   locallock->lock, locallock->tag.mode);
*************** WaitOnLock(LOCALLOCK *locallock, Resourc
*** 1214,1221 ****
  	 */
  	PG_TRY();
  	{
! 		if (ProcSleep(locallock, lockMethodTable) != STATUS_OK)
  		{
  			/*
  			 * We failed as a result of a deadlock, see CheckDeadLock(). Quit
  			 * now.
--- 1232,1244 ----
  	 */
  	PG_TRY();
  	{
! 		wait_status = ProcSleep(locallock, lockMethodTable);
! 		switch (wait_status)
  		{
+ 		case STATUS_OK:
+ 		case STATUS_WAITING:
+ 			break;
+ 		default:
  			/*
  			 * We failed as a result of a deadlock, see CheckDeadLock(). Quit
  			 * now.
*************** WaitOnLock(LOCALLOCK *locallock, Resourc
*** 1260,1267 ****
  		pfree(new_status);
  	}
  
! 	LOCK_PRINT("WaitOnLock: wakeup on lock",
  			   locallock->lock, locallock->tag.mode);
  }
  
  /*
--- 1283,1296 ----
  		pfree(new_status);
  	}
  
! 	if (wait_status == STATUS_OK)
! 		LOCK_PRINT("WaitOnLock: wakeup on lock",
! 			   locallock->lock, locallock->tag.mode);
! 	else if (wait_status == STATUS_WAITING)
! 		LOCK_PRINT("WaitOnLock: timeout on lock",
  			   locallock->lock, locallock->tag.mode);
+ 
+ 	return wait_status;
  }
  
  /*
diff -dcrpN pgsql.orig/src/backend/storage/lmgr/proc.c pgsql/src/backend/storage/lmgr/proc.c
*** pgsql.orig/src/backend/storage/lmgr/proc.c	2010-07-11 11:14:54.000000000 +0200
--- pgsql/src/backend/storage/lmgr/proc.c	2010-08-02 12:41:13.000000000 +0200
***************
*** 52,57 ****
--- 52,58 ----
  /* GUC variables */
  int			DeadlockTimeout = 1000;
  int			StatementTimeout = 0;
+ int			LockTimeout = 0;
  bool		log_lock_waits = false;
  
  /* Pointer to this process's PGPROC struct, if any */
*************** static volatile bool statement_timeout_a
*** 79,98 ****
  static volatile bool deadlock_timeout_active = false;
  static volatile DeadLockState deadlock_state = DS_NOT_YET_CHECKED;
  volatile bool cancel_from_timeout = false;
  
! /* timeout_start_time is set when log_lock_waits is true */
  static TimestampTz timeout_start_time;
  
  /* statement_fin_time is valid only if statement_timeout_active is true */
  static TimestampTz statement_fin_time;
  static TimestampTz statement_fin_time2; /* valid only in recovery */
  
  
  static void RemoveProcFromArray(int code, Datum arg);
  static void ProcKill(int code, Datum arg);
  static void AuxiliaryProcKill(int code, Datum arg);
  static bool CheckStatementTimeout(void);
  static bool CheckStandbyTimeout(void);
  
  
  /*
--- 80,106 ----
  static volatile bool deadlock_timeout_active = false;
  static volatile DeadLockState deadlock_state = DS_NOT_YET_CHECKED;
  volatile bool cancel_from_timeout = false;
+ static volatile bool lock_timeout_active = false;
+ volatile bool lock_timeout_detected = false;
  
! /* timeout_start_time and timeout_fin_time are valid when deadlock_timeout_active is true */
  static TimestampTz timeout_start_time;
+ static TimestampTz timeout_fin_time;
  
  /* statement_fin_time is valid only if statement_timeout_active is true */
  static TimestampTz statement_fin_time;
  static TimestampTz statement_fin_time2; /* valid only in recovery */
  
+ /* lock_timeout_fin_time is valid only if lock_timeout_active is true */
+ static TimestampTz lock_timeout_fin_time;
  
  static void RemoveProcFromArray(int code, Datum arg);
  static void ProcKill(int code, Datum arg);
  static void AuxiliaryProcKill(int code, Datum arg);
  static bool CheckStatementTimeout(void);
+ static bool CheckLockTimeout(void);
  static bool CheckStandbyTimeout(void);
+ static bool enable_sig_alarm_for_lock_timeout(int delayms);
  
  
  /*
*************** ProcQueueInit(PROC_QUEUE *queue)
*** 797,803 ****
   * The lock table's partition lock must be held at entry, and will be held
   * at exit.
   *
!  * Result: STATUS_OK if we acquired the lock, STATUS_ERROR if not (deadlock).
   *
   * ASSUME: that no one will fiddle with the queue until after
   *		we release the partition lock.
--- 805,814 ----
   * The lock table's partition lock must be held at entry, and will be held
   * at exit.
   *
!  * Result:
!  *     STATUS_OK if we acquired the lock
!  *     STATUS_ERROR if not (deadlock)
!  *     STATUS_WAITING if not (timeout)
   *
   * ASSUME: that no one will fiddle with the queue until after
   *		we release the partition lock.
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 951,957 ****
  		elog(FATAL, "could not set timer for process wakeup");
  
  	/*
! 	 * If someone wakes us between LWLockRelease and PGSemaphoreLock,
  	 * PGSemaphoreLock will not block.	The wakeup is "saved" by the semaphore
  	 * implementation.	While this is normally good, there are cases where a
  	 * saved wakeup might be leftover from a previous operation (for example,
--- 962,976 ----
  		elog(FATAL, "could not set timer for process wakeup");
  
  	/*
! 	 * Reset timer so we are awaken in case of lock timeout.
! 	 * This doesn't modify the timer for deadlock check in case
! 	 * the deadlock check happens earlier.
! 	 */
! 	if (!enable_sig_alarm_for_lock_timeout(LockTimeout))
! 		elog(FATAL, "could not set timer for process wakeup");
! 
! 	/*
! 	 * If someone wakes us between LWLockRelease and PGSemaphoreTimedLock,
  	 * PGSemaphoreLock will not block.	The wakeup is "saved" by the semaphore
  	 * implementation.	While this is normally good, there are cases where a
  	 * saved wakeup might be leftover from a previous operation (for example,
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 969,975 ****
  	 */
  	do
  	{
! 		PGSemaphoreLock(&MyProc->sem, true);
  
  		/*
  		 * waitStatus could change from STATUS_WAITING to something else
--- 988,994 ----
  	 */
  	do
  	{
! 		PGSemaphoreTimedLock(&MyProc->sem, true);
  
  		/*
  		 * waitStatus could change from STATUS_WAITING to something else
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 1093,1099 ****
  
  			pfree(buf.data);
  		}
! 	} while (myWaitStatus == STATUS_WAITING);
  
  	/*
  	 * Disable the timer, if it's still running
--- 1112,1118 ----
  
  			pfree(buf.data);
  		}
! 	} while (myWaitStatus == STATUS_WAITING && !lock_timeout_detected);
  
  	/*
  	 * Disable the timer, if it's still running
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 1109,1114 ****
--- 1128,1141 ----
  	LWLockAcquire(partitionLock, LW_EXCLUSIVE);
  
  	/*
+ 	 * If we're in timeout, so we're not waiting anymore and
+ 	 * we're not the one that the lock will be granted to.
+ 	 * So remove ourselves from the wait queue.
+ 	 */
+ 	if (lock_timeout_detected)
+ 		RemoveFromWaitQueue(MyProc, hashcode);
+ 
+ 	/*
  	 * We no longer want LockWaitCancel to do anything.
  	 */
  	lockAwaited = NULL;
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 1122,1129 ****
  	/*
  	 * We don't have to do anything else, because the awaker did all the
  	 * necessary update of the lock table and MyProc.
  	 */
! 	return MyProc->waitStatus;
  }
  
  
--- 1149,1158 ----
  	/*
  	 * We don't have to do anything else, because the awaker did all the
  	 * necessary update of the lock table and MyProc.
+ 	 * RemoveFromWaitQueue() have set MyProc->waitStatus = STATUS_ERROR,
+ 	 * we need to distinguish this case.
  	 */
! 	return (lock_timeout_detected ? STATUS_WAITING : MyProc->waitStatus);
  }
  
  
*************** CheckDeadLock(void)
*** 1301,1306 ****
--- 1330,1352 ----
  		RemoveFromWaitQueue(MyProc, LockTagHashCode(&(MyProc->waitLock->tag)));
  
  		/*
+ 		 * We found a deadlock, we already removed ourselves from
+ 		 * the wait queue above. Disable the lock_timeout check,
+ 		 * so RemoveFromWaitQueue() is not called again. This can happen
+ 		 * in the case when deadlock_timeout and lock_timeout are so close
+ 		 * that the system's clock resolution effectively makes them equal,
+ 		 * so the checks below are both true in the same signal handler:
+ 		 *
+ 		 *	TimestampTz now = GetCurrentTimestamp()
+ 		 *
+ 		 *	if (timeout_fin_time <= now) ...
+ 		 *
+ 		 *	if (lock_timeout_fin_time <= now) ...
+ 		 *
+ 		 */
+ 		lock_timeout_active = false;
+ 
+ 		/*
  		 * Unlock my semaphore so that the interrupted ProcSleep() call can
  		 * finish.
  		 */
*************** enable_sig_alarm(int delayms, bool is_st
*** 1462,1479 ****
  		 * than normal, but that does no harm.
  		 */
  		timeout_start_time = GetCurrentTimestamp();
! 		fin_time = TimestampTzPlusMilliseconds(timeout_start_time, delayms);
! 		deadlock_timeout_active = true;
! 		if (fin_time >= statement_fin_time)
  			return true;
  	}
  	else
  	{
  		/* Begin deadlock timeout with no statement-level timeout */
  		deadlock_timeout_active = true;
! 		/* GetCurrentTimestamp can be expensive, so only do it if we must */
! 		if (log_lock_waits)
! 			timeout_start_time = GetCurrentTimestamp();
  	}
  
  	/* If we reach here, okay to set the timer interrupt */
--- 1508,1533 ----
  		 * than normal, but that does no harm.
  		 */
  		timeout_start_time = GetCurrentTimestamp();
! 		timeout_fin_time = TimestampTzPlusMilliseconds(timeout_start_time, delayms);
! 
! 		/*
! 		 * Activate deadlock_timeout only if it should happen earlier
! 		 * than statement_timeout.
! 		 */
! 		if (timeout_fin_time >= statement_fin_time)
  			return true;
+ 		deadlock_timeout_active = true;
  	}
  	else
  	{
  		/* Begin deadlock timeout with no statement-level timeout */
  		deadlock_timeout_active = true;
! 		/*
! 		 * Computing the timeout_fin_time is needed because
! 		 * the lock timeout logic checks for it.
! 		 */
! 		timeout_start_time = GetCurrentTimestamp();
! 		timeout_fin_time = TimestampTzPlusMilliseconds(timeout_start_time, delayms);
  	}
  
  	/* If we reach here, okay to set the timer interrupt */
*************** enable_sig_alarm(int delayms, bool is_st
*** 1486,1491 ****
--- 1540,1618 ----
  }
  
  /*
+  * Enable the SIGALRM interrupt to fire after the specified delay
+  * in case LockTimeout is set.
+  *
+  * This code properly handles nesting of lock_timeout timeout alarm
+  * within deadlock timeout and statement timeout alarms.
+  *
+  * Returns TRUE if okay, FALSE on failure.
+  */
+ static bool
+ enable_sig_alarm_for_lock_timeout(int delayms)
+ {
+ 	struct itimerval	timeval;
+ 	TimestampTz		fin_time;
+ 
+ 	lock_timeout_detected = false;
+ 	if (LockTimeout == 0)
+ 		return true;
+ 
+ 	if (deadlock_timeout_active)
+ 		/*
+ 		 * ensure the same starting time for deadlock_timeout and lock_timeout
+ 		 */
+ 		fin_time = timeout_start_time;
+ 	else
+ 		fin_time = GetCurrentTimestamp();
+ 	fin_time = TimestampTzPlusMilliseconds(fin_time, delayms);
+ 
+ 	if (statement_timeout_active)
+ 	{
+ 		/*
+ 		 * If statement_timeout is active and should happen before us
+ 		 * then don't bother setting up lock_timeout. statement_timeout
+ 		 * may span over multiple acquired locks during the same statement
+ 		 * so leave it in place.
+ 		 */
+ 		if (fin_time >= statement_fin_time)
+ 			return true;
+ 	}
+ 
+ 	if (deadlock_timeout_active)
+ 	{
+ 		/*
+ 		 * If deadlock_timeout is active but happens earlier then
+ 		 * don't modify the timer but set lock_timeout_active
+ 		 * so the timer will be re-set when deadlock_timeout triggers.
+ 		 */
+ 		if (fin_time >= timeout_fin_time)
+ 		{
+ 			lock_timeout_active = true;
+ 			return true;
+ 		}
+ 		/*
+ 		 * On the other hand, if deadlock_timeout should happens later
+ 		 * than lock_timeout, disable it. Life span of deadlock_timeout and
+ 		 * lock_timeout is the same.
+ 		 */
+ 		else
+ 			deadlock_timeout_active = false;
+ 	}
+ 
+ 	/* If we reach here, okay to set the timer interrupt */
+ 	MemSet(&timeval, 0, sizeof(struct itimerval));
+ 	timeval.it_value.tv_sec = delayms / 1000;
+ 	timeval.it_value.tv_usec = (delayms % 1000) * 1000;
+ 	if (setitimer(ITIMER_REAL, &timeval, NULL))
+ 		return false;
+ 
+ 	lock_timeout_fin_time = fin_time;
+ 	lock_timeout_active = true;
+ 	return true;
+ }
+ 
+ /*
   * Cancel the SIGALRM timer, either for a deadlock timeout or a statement
   * timeout.  If a deadlock timeout is canceled, any active statement timeout
   * remains in force.
*************** disable_sig_alarm(bool is_statement_time
*** 1502,1508 ****
  	 *
  	 * We will re-enable the interrupt if necessary in CheckStatementTimeout.
  	 */
! 	if (statement_timeout_active || deadlock_timeout_active)
  	{
  		struct itimerval timeval;
  
--- 1629,1635 ----
  	 *
  	 * We will re-enable the interrupt if necessary in CheckStatementTimeout.
  	 */
! 	if (statement_timeout_active || deadlock_timeout_active || lock_timeout_active)
  	{
  		struct itimerval timeval;
  
*************** disable_sig_alarm(bool is_statement_time
*** 1512,1517 ****
--- 1639,1646 ----
  			statement_timeout_active = false;
  			cancel_from_timeout = false;
  			deadlock_timeout_active = false;
+ 			lock_timeout_active = false;
+ 			lock_timeout_detected = false;
  			return false;
  		}
  	}
*************** disable_sig_alarm(bool is_statement_time
*** 1519,1524 ****
--- 1648,1656 ----
  	/* Always cancel deadlock timeout, in case this is error cleanup */
  	deadlock_timeout_active = false;
  
+ 	/* Ditto for lock_timeout */
+ 	lock_timeout_active = false;
+ 
  	/* Cancel or reschedule statement timeout */
  	if (is_statement_timeout)
  	{
*************** CheckStatementTimeout(void)
*** 1590,1595 ****
--- 1722,1777 ----
  
  
  /*
+  * Check for lock timeout.  If the timeout time has come,
+  * indicate it; if not, reschedule the SIGALRM interrupt to occur
+  * at the right time.
+  *
+  * Returns true if okay, false if failed to set the interrupt.
+  */
+ static bool
+ CheckLockTimeout(void)
+ {
+ 	TimestampTz now;
+ 
+ 	if (!lock_timeout_active)
+ 		return true;			/* do nothing if not active */
+ 
+ 	now = GetCurrentTimestamp();
+ 
+ 	if (now >= lock_timeout_fin_time)
+ 	{
+ 		/* Time to die */
+ 		lock_timeout_active = false;
+ 		lock_timeout_detected = true;
+ 	}
+ 	else
+ 	{
+ 		/* Not time yet, so (re)schedule the interrupt */
+ 		long		secs;
+ 		int			usecs;
+ 		struct itimerval timeval;
+ 
+ 		TimestampDifference(now, statement_fin_time,
+ 							&secs, &usecs);
+ 
+ 		/*
+ 		 * It's possible that the difference is less than a microsecond;
+ 		 * ensure we don't cancel, rather than set, the interrupt.
+ 		 */
+ 		if (secs == 0 && usecs == 0)
+ 			usecs = 1;
+ 		MemSet(&timeval, 0, sizeof(struct itimerval));
+ 		timeval.it_value.tv_sec = secs;
+ 		timeval.it_value.tv_usec = usecs;
+ 		if (setitimer(ITIMER_REAL, &timeval, NULL))
+ 			return false;
+ 	}
+ 
+ 	return true;
+ }
+ 
+ 
+ /*
   * Signal handler for SIGALRM for normal user backends
   *
   * Process deadlock check and/or statement timeout check, as needed.
*************** handle_sig_alarm(SIGNAL_ARGS)
*** 1608,1613 ****
--- 1790,1798 ----
  		CheckDeadLock();
  	}
  
+ 	if (lock_timeout_active)
+ 		(void) CheckLockTimeout();
+ 
  	if (statement_timeout_active)
  		(void) CheckStatementTimeout();
  
diff -dcrpN pgsql.orig/src/backend/utils/misc/guc.c pgsql/src/backend/utils/misc/guc.c
*** pgsql.orig/src/backend/utils/misc/guc.c	2010-07-26 10:05:55.000000000 +0200
--- pgsql/src/backend/utils/misc/guc.c	2010-07-29 11:58:56.000000000 +0200
*************** static struct config_int ConfigureNamesI
*** 1648,1653 ****
--- 1648,1663 ----
  	},
  
  	{
+ 		{"lock_timeout", PGC_USERSET, CLIENT_CONN_STATEMENT,
+ 			gettext_noop("Sets the maximum allowed timeout for any lock taken by a statement."),
+ 			gettext_noop("A value of 0 turns off the timeout."),
+ 			GUC_UNIT_MS
+ 		},
+ 		&LockTimeout,
+ 		0, 0, INT_MAX, NULL, NULL
+ 	},
+ 
+ 	{
  		{"vacuum_freeze_min_age", PGC_USERSET, CLIENT_CONN_STATEMENT,
  			gettext_noop("Minimum age at which VACUUM should freeze a table row."),
  			NULL
diff -dcrpN pgsql.orig/src/backend/utils/misc/postgresql.conf.sample pgsql/src/backend/utils/misc/postgresql.conf.sample
*** pgsql.orig/src/backend/utils/misc/postgresql.conf.sample	2010-07-26 10:05:55.000000000 +0200
--- pgsql/src/backend/utils/misc/postgresql.conf.sample	2010-07-29 11:58:56.000000000 +0200
***************
*** 492,497 ****
--- 492,500 ----
  #------------------------------------------------------------------------------
  
  #deadlock_timeout = 1s
+ #lock_timeout = 0			# timeout value for heavy-weight locks
+ 					# taken by statements. 0 disables timeout
+ 					# unit in milliseconds, default is 0
  #max_locks_per_transaction = 64		# min 10
  					# (change requires restart)
  # Note:  Each lock table slot uses ~270 bytes of shared memory, and there are
diff -dcrpN pgsql.orig/src/include/storage/pg_sema.h pgsql/src/include/storage/pg_sema.h
*** pgsql.orig/src/include/storage/pg_sema.h	2010-01-02 17:58:08.000000000 +0100
--- pgsql/src/include/storage/pg_sema.h	2010-07-29 11:58:56.000000000 +0200
*************** extern void PGSemaphoreUnlock(PGSemaphor
*** 80,83 ****
--- 80,86 ----
  /* Lock a semaphore only if able to do so without blocking */
  extern bool PGSemaphoreTryLock(PGSemaphore sema);
  
+ /* Lock a semaphore (decrement count), blocking if count would be < 0 */
+ extern void PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK);
+ 
  #endif   /* PG_SEMA_H */
diff -dcrpN pgsql.orig/src/include/storage/proc.h pgsql/src/include/storage/proc.h
*** pgsql.orig/src/include/storage/proc.h	2010-07-11 11:15:00.000000000 +0200
--- pgsql/src/include/storage/proc.h	2010-07-29 11:58:56.000000000 +0200
*************** typedef struct PROC_HDR
*** 163,170 ****
--- 163,172 ----
  /* configurable options */
  extern int	DeadlockTimeout;
  extern int	StatementTimeout;
+ extern int	LockTimeout;
  extern bool log_lock_waits;
  
+ extern volatile bool lock_timeout_detected;
  extern volatile bool cancel_from_timeout;

Boszormenyi Zoltan

zb@cybertec.at

over 15 years ago

In reply to: Boszormenyi Zoltan (#6)

Boszormenyi Zoltan írta:

Marc Cousin írta:

The Thursday 29 July 2010 13:55:38, Boszormenyi Zoltan wrote :

I fixed this by adding CheckLockTimeout() function that works like
CheckStatementTimeout() and ensuring that the same start time is
used for both deadlock_timeout and lock_timeout if both are active.
The preference of errors if their timeout values are equal is:
statement_timeout > lock_timeout > deadlock_timeout

As soon as lock_timeout is bigger than deadlock_timeout, it doesn't
work, with this new version.

Keeping the deadlock_timeout to 1s, when lock_timeout >= 1001,
lock_timeout doesn't trigger anymore.

I missed one case when the lock_timeout_active should have been set
but the timer must not have been re-set, this caused the problem.
I blame the hot weather and having no air conditioning. The second is
now fixed. :-)

I also added one line in autovacuum.c to disable lock_timeout,
in case it's globally set in postgresq.conf as per Alvaro's comment.

Also, I made sure that only one or two timeout causes (one of
deadlock_timeout
and lock_timeout in the first case or statement_timeout plus one of the
other two)
can be active at a time.

A little clarification is needed. The above statement is not entirely true.
There can be a case when all three timeout causes can be active, but only
when deadlock_timeout is the smallest of the three. If the fin_time value
for the another timeout cause is smaller than for deadlock_timeout then
there's no point to make deadlock_timeout active. And in case
deadlock_timeout
triggers and CheckDeadLock() finds that there really is a deadlock then the
possibly active lock_timeout gets disabled to avoid calling
RemoveFromWaitQueue() twice. I hope the comments in the code explain it
well.

Previously I was able to trigger a segfault
with the default
1sec deadlock_timeout and lock_timeout = 999 or 1001. Effectively, the
system's
clock resolution makes the lock_timeout and deadlock_timeout equal and
RemoveFromWaitQueue() was called twice. This way it's a lot more robust.

Best regards,
Zoltán Böszörményi

Marc Cousin

cousinmarc@gmail.com

over 15 years ago

In reply to: Boszormenyi Zoltan (#7)

The Monday 02 August 2010 13:59:59, Boszormenyi Zoltan wrote :

Also, I made sure that only one or two timeout causes (one of
deadlock_timeout
and lock_timeout in the first case or statement_timeout plus one of the
other two)
can be active at a time.

A little clarification is needed. The above statement is not entirely true.
There can be a case when all three timeout causes can be active, but only
when deadlock_timeout is the smallest of the three. If the fin_time value
for the another timeout cause is smaller than for deadlock_timeout then
there's no point to make deadlock_timeout active. And in case
deadlock_timeout
triggers and CheckDeadLock() finds that there really is a deadlock then the
possibly active lock_timeout gets disabled to avoid calling
RemoveFromWaitQueue() twice. I hope the comments in the code explain it
well.

Previously I was able to trigger a segfault
with the default
1sec deadlock_timeout and lock_timeout = 999 or 1001. Effectively, the
system's
clock resolution makes the lock_timeout and deadlock_timeout equal and
RemoveFromWaitQueue() was called twice. This way it's a lot more robust.

Ok, I've tested this new version:

This time, it's this case that doesn't work :

Session 1 : lock a table

Session 2 : set lock_timeout to a large value, just to make it obvious (10 seconds).
It times out after 1 s (the deadlock timeout), returning 'could not obtain lock'.
Of course, it should wait 10 seconds.

I really feel that the timeout framework is the way to go here. I know what
you said about it before, but I think that the current code is getting
too complicated, with too many special cases all over.

As this is only my second review and I'm sure I am missing things here, could
someone with more experience have a look and give advice ?

Cheers

Marc

Kevin Grittner

Kevin.Grittner@wicourts.gov

over 15 years ago

In reply to: Marc Cousin (#8)

Marc Cousin <cousinmarc@gmail.com> wrote:

This time, it's this case that doesn't work :

I really feel that the timeout framework is the way to go here.

Since Zoltï¿œn also seems to feel this way:

http://archives.postgresql.org/message-id/4C516C3A.6090102@cybertec.at

I wonder whether this patch shouldn't be rejected with a request
that the timeout framework be submitted to the next CF. Does anyone
feel this approach (without the framework) should be pursued
further?

-Kevin

#10

Boszormenyi Zoltan

zb@cybertec.at

over 15 years ago

In reply to: Kevin Grittner (#9)

Hi,

Kevin Grittner ï¿œrta:

Marc Cousin <cousinmarc@gmail.com> wrote:

This time, it's this case that doesn't work :

I really feel that the timeout framework is the way to go here.

Since Zoltï¿œn also seems to feel this way:

http://archives.postgresql.org/message-id/4C516C3A.6090102@cybertec.at

I wonder whether this patch shouldn't be rejected with a request
that the timeout framework be submitted to the next CF. Does anyone
feel this approach (without the framework) should be pursued
further?

I certainly think so, the current scheme seems to be very fragile
and doesn't want to be extended.

#11

Kevin Grittner

Kevin.Grittner@wicourts.gov

over 15 years ago

In reply to: Boszormenyi Zoltan (#10)

Boszormenyi Zoltan <zb@cybertec.at> wrote:

Kevin Grittner ï¿œrta:

I wonder whether this patch shouldn't be rejected with a request
that the timeout framework be submitted to the next CF. Does
anyone feel this approach (without the framework) should be
pursued further?

I certainly think so, the current scheme seems to be very fragile
and doesn't want to be extended.

Sorry, I phrased that question in a rather confusing way; I'm not
sure, but it sounds like you're in favor of dropping this approach
and pursuing the timeout framework in the next CF -- is that right?

-Kevin

#12

Robert Haas

robertmhaas@gmail.com

over 15 years ago

In reply to: Kevin Grittner (#9)

On Mon, Aug 2, 2010 at 3:09 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:

Marc Cousin <cousinmarc@gmail.com> wrote:

This time, it's this case that doesn't work :

I really feel that the timeout framework is the way to go here.

Since Zoltán also seems to feel this way:

http://archives.postgresql.org/message-id/4C516C3A.6090102@cybertec.at

I wonder whether this patch shouldn't be rejected with a request
that the timeout framework be submitted to the next CF. Does anyone
feel this approach (without the framework) should be pursued
further?

I think "Returned with Feedback" would be more appropriate than
"Rejected", since we're asking for a rework, rather than saying - we
just don't want this. But otherwise, +1.

Generally, I'm of the opinion that patches needing significant rework
should be set to "Returned with Feedback" and resubmitted for the next
CF; otherwise, it just gets unmanageable. Our goal for a CF should be
to review everything thoroughly, not to get everything committed.
Otherwise, we end up with a never-ending train of what are effectively
new patches popping up, and it becomes impossible to close out the
CommitFest on time. Reviewers and committers end up getting slammed,
and it's not very much fun for patch authors (who are trying to
frantically do last-minute rewrites) either; nor is it good for the
overall quality of code the ends up in our tree. Better to take a
breather and then start fresh.

(If you don't believe in committer fatigue, look at the review I gave
Itagaki Takahiro on the partitioning patch in January vs. the review I
gave in July. One of those reviews is a whole lot more specific,
detailed, and accurate than the other one...)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#13

Kevin Grittner

Kevin.Grittner@wicourts.gov

over 15 years ago

In reply to: Robert Haas (#12)

Robert Haas <robertmhaas@gmail.com> wrote:

I wonder whether this patch shouldn't be rejected with a request
that the timeout framework be submitted to the next CF.

I think "Returned with Feedback" would be more appropriate than
"Rejected", since we're asking for a rework, rather than saying -
we just don't want this. But otherwise, +1.

Done.

-Kevin