Overhauling our interrupt handling (was Escaping from blocked send() reprised.)

Started by Andres Freundalmost 11 years ago14 messages
#1Andres Freund
andres@2ndquadrant.com
10 attachment(s)

Hi,

This mail is a answer to
http://archives.postgresql.org/message-id/20150110022542.GF12509%40alap3.anarazel.de
but I decided that it's a good go move it into a new thread since the
patchseries has outgrown its original target. It's invasive and touches
many arcane areas of the code, so that I think more eyes than a long
deep thread is likely to have, are a good thing.

As a short description of the goal of the patchseries:
This started out as steps on the way to be able to safely handle
terminate_backend() et al when we're reading/writing from the
client. E.g. because the client is dead and we haven't noticed yet and
are stuck writing, or because we want to implement a idle_in_transaction
timeout.

Making these changes allowed me to see that not doing significant work
in signal handlers can make code much simpler and more robust. The
number of bugs postgres had in the past that involved doing too much in
signal handlers is significant. Thus I've since extended the
patchseries to get rid of nearly all nontrivial work in signal
handlers.

All the patches in the series up to 0008 hav ecommit messages providing
more detail. A short description of each patch follows:

0001: Replace walsender's latch with the general shared latch.

New patch that removes ImmediateInteruptOK behaviour from walsender. I
think that's a rather good idea, because walsender currently seems to
assume WaitLatchOrSocket is reentrant - which I don't think is really
guaranteed.
Hasn't been reviewed yet, but I think it's not far from being
committable.

0002: Use a nonblocking socket for FE/BE communication and block using
latches.

Has previously been reviewed by Heikki. I think Noah also had a
look, although I'm not sure how close that was.

I think this can be committed soon.

0003: Introduce and use infrastructure for interrupt processing during client reads.

From here on ImmediateInterruptOK isn't set during client
communication. Normal interrupts and sinval/async interrupts are
processed outside of signal handlers. Especially the sinval/async
greatly simplify the respective code.

Has been reviewed by Heikki in an earlier version - I think I
addressed all the review comments.

I'd like somebody else to look at the sinval/async.c changes
before committing. I really like the simplification this change
allowed.

I think this patch might not be safe without 0004 because I can't
convince myself that it's safe to interrupt latch waits with work
that actually might also use the same latch (sinval
interrupts). But it's easier to review this way as 0004 is quite
big.

0004: Process 'die' interrupts while reading/writing from the client socket.

This is the reason Horiguchi-san started this thread.

I think the important debate here is whether we think it's
acceptable that there are situations (a full socket buffer, but a
alive connection) where the client previously got an error, but
not anymore afterwards. I think that's much better than having
unkillable connections, but I can see others being of a different
opinion.

0005: Move deadlock and other interrupt handling in proc.c out of signal handlers.

I'm surprised how comparatively simple this turned out to be. For
robustness and understanding I think this is a great improvement.

New and not reviewed at all. Needs significant review. But works
in my simple testcases.

0006: Don't allow immediate interupts during authentication anymore.

So far we've set ImmediateInterruptOK to true during large parts
of the client authentication - that's not all that pretty,
interrupts might arrive while we're in some system routines.

Due to patches 0003/0004 we now are able to safely serve
interrupts during client communication which is the major are
where we want to adhere to authentication_timeout.

I additionally placed some CHECK_FOR_INTERRUPTS()s in some
somewhat randomly chosen places in auth.c. Those don't completely
get back the previous 'appruptness' (?) of reacting to
interrupts, but that's partially for the better, because we don't
interrupt foreign code anymore.

0007: Remove the option to service interrupts during PGSemaphoreLock().

Due to patch 0005 there's no user of PGSemaphoreLock(lock, interruptOK = true)

anymore, so remove the code to handle that. I find it rather odd
that the code did a CHECK_FOR_INTERRUPTS before the semwait even
when interruptOK was set to to false - that only happened to work
because lwlock.c did a HOLD_INTERRUPTS() around the semaphore
code. It's now removed entirely.

This is a API break because the signature of PGSemaphoreLock()
changed. But I have a hard time seing that as a problem. For one I
don't think there's many users of PGSemaphoreLock, for another
they should change their interrupt handling.

New and not reviewed.

0008: Remove remnants of ImmediateInterruptOK handling.

Now that ImmediateInterruptOK is never set to true anymore, we can
remove related code and comments.

New and not reviewed.

0009: WIP: lwlock: use latches

Optional patch that removes the use of semaphores from the lwlock
code. There's no benefit for lwlock.c itself due to this. But it
does get rid of the last user of semaphores in a normal postgres
build. I primarily wrote this to test the performance of latches.

I guess we want to do this anyway.

0010: WIP: unix_latch: WIP efficiency hackery

Nothing ready, although I think using eventfds on linux is worth
the cost.

Questions I have about the series right now:

1) Do others agree that getting rid of ImmediateInterruptOK is
worthwile. I personally think that we really should pursue that
aggressively as a goal.

2) Does anybody see a significant problem with the reduction of cases
where we immediately react to authentication_timeout? Since we still
react immediately when we're waiting for the client (except maybe in
sspi/kerberos?) I think that's ok. The waiting cases are slow
ident/radius/kerberos/ldap/... servers. But we really shouldn't jump
out of the relevant code anyway.

3) If we do everything (in corrected), upto 0009, there's no remaining
user of semaphores in postgres, except the spinlock emulation.

Does anybody see a need for PGPROC->sem to remain? Removing it would
have the significant benefit that semaphores are the last thing users
frequently need to tune to get postgres to start up when using a
higher max_connection or multiple clusters.

If we remove PGPROC->sem does anybody see a way to get rid of the
semaphore code alltogether? I personally still think we should just
gank --disable-spinlocks. Now that it's only a couple lines (in an
preexisting patch) to rely on compiler intrinsics for spinlocks that
doesn't sound like a big deal. Even if not, we could decide to get
rid of win32_sem.c...

4) There remains one user of something like ImmediateInterruptOK -
walreceiver. It has WalRcvImmediateInterruptOK. We definitely should
get rid of that as well, but that's a independent patch that can be
written in the future.

5) In a future patch I think we could also handle catchup interrupts
during other client reads, not just ReadCommand(). Does anybody see
that as a worthwile goal? I can't remember many problems with not
enough catchup happening, but I think someone mentioned that as a
problem in the past.

6) Review ;)

Comments?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

0001-Replace-walsender-s-latch-with-the-general-shared-la.patchtext/x-patch; charset=us-asciiDownload
>From bdc09b352892fb67cd8731a3a3aa794193a048e2 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sat, 10 Jan 2015 15:55:14 +0100
Subject: [PATCH 01/10] Replace walsender's latch with the general shared
 latch.

Relying on the normal shared latch simplifies interrupt/signal
handling because we can rely on all signal handlers setting the proc
latch. That in turn allows us to avoid the use of
ImmediateInterruptOK, which arguably isn't correct because
WaitLatchOrSocket isn't declared to be immediately interruptible.

Also change sections that wait on the walsender's latch to notice
interrupts quicker/more reliably and make them more consistent with
each other.
---
 src/backend/replication/basebackup.c        | 10 +++-
 src/backend/replication/walsender.c         | 86 +++++++++++++++--------------
 src/include/replication/walsender_private.h |  6 +-
 3 files changed, 55 insertions(+), 47 deletions(-)

diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 07030a2..3058ce9 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1294,15 +1294,21 @@ throttle(size_t increment)
 	/* Only sleep if the transfer is faster than it should be. */
 	if (sleep > 0)
 	{
-		ResetLatch(&MyWalSnd->latch);
+		ResetLatch(MyLatch);
+
+		/* We're eating a potentially set latch, so check for interrupts */
+		CHECK_FOR_INTERRUPTS();
 
 		/*
 		 * (TAR_SEND_SIZE / throttling_sample * elapsed_min_unit) should be
 		 * the maximum time to sleep. Thus the cast to long is safe.
 		 */
-		wait_result = WaitLatch(&MyWalSnd->latch,
+		wait_result = WaitLatch(MyLatch,
 							 WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
 								(long) (sleep / 1000));
+
+		if (wait_result & WL_LATCH_SET)
+			CHECK_FOR_INTERRUPTS();
 	}
 	else
 	{
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 86c36bf..05d2339 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1081,6 +1081,11 @@ WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid,
 		if (!PostmasterIsAlive())
 			exit(1);
 
+		/* Clear any already-pending wakeups */
+		ResetLatch(MyLatch);
+
+		CHECK_FOR_INTERRUPTS();
+
 		/* Process any requests or signals received recently */
 		if (got_SIGHUP)
 		{
@@ -1092,9 +1097,6 @@ WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid,
 		/* Check for input from the client */
 		ProcessRepliesIfAny();
 
-		/* Clear any already-pending wakeups */
-		ResetLatch(&MyWalSnd->latch);
-
 		/* Try to flush pending output to the client */
 		if (pq_flush_if_writable() != 0)
 			WalSndShutdown();
@@ -1117,15 +1119,12 @@ WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid,
 			WL_SOCKET_WRITEABLE | WL_SOCKET_READABLE | WL_TIMEOUT;
 
 		/* Sleep until something happens or we time out */
-		ImmediateInterruptOK = true;
-		CHECK_FOR_INTERRUPTS();
-		WaitLatchOrSocket(&MyWalSnd->latch, wakeEvents,
+		WaitLatchOrSocket(MyLatch, wakeEvents,
 						  MyProcPort->sock, sleeptime);
-		ImmediateInterruptOK = false;
 	}
 
 	/* reactivate latch so WalSndLoop knows to continue */
-	SetLatch(&MyWalSnd->latch);
+	SetLatch(MyLatch);
 }
 
 /*
@@ -1165,6 +1164,11 @@ WalSndWaitForWal(XLogRecPtr loc)
 		if (!PostmasterIsAlive())
 			exit(1);
 
+		/* Clear any already-pending wakeups */
+		ResetLatch(MyLatch);
+
+		CHECK_FOR_INTERRUPTS();
+
 		/* Process any requests or signals received recently */
 		if (got_SIGHUP)
 		{
@@ -1176,9 +1180,6 @@ WalSndWaitForWal(XLogRecPtr loc)
 		/* Check for input from the client */
 		ProcessRepliesIfAny();
 
-		/* Clear any already-pending wakeups */
-		ResetLatch(&MyWalSnd->latch);
-
 		/* Update our idea of the currently flushed position. */
 		if (!RecoveryInProgress())
 			RecentFlushPtr = GetFlushRecPtr();
@@ -1244,15 +1245,12 @@ WalSndWaitForWal(XLogRecPtr loc)
 			wakeEvents |= WL_SOCKET_WRITEABLE;
 
 		/* Sleep until something happens or we time out */
-		ImmediateInterruptOK = true;
-		CHECK_FOR_INTERRUPTS();
-		WaitLatchOrSocket(&MyWalSnd->latch, wakeEvents,
+		WaitLatchOrSocket(MyLatch, wakeEvents,
 						  MyProcPort->sock, sleeptime);
-		ImmediateInterruptOK = false;
 	}
 
 	/* reactivate latch so WalSndLoop knows to continue */
-	SetLatch(&MyWalSnd->latch);
+	SetLatch(MyLatch);
 	return RecentFlushPtr;
 }
 
@@ -1813,6 +1811,11 @@ WalSndLoop(WalSndSendDataCallback send_data)
 		if (!PostmasterIsAlive())
 			exit(1);
 
+		/* Clear any already-pending wakeups */
+		ResetLatch(MyLatch);
+
+		CHECK_FOR_INTERRUPTS();
+
 		/* Process any requests or signals received recently */
 		if (got_SIGHUP)
 		{
@@ -1821,14 +1824,9 @@ WalSndLoop(WalSndSendDataCallback send_data)
 			SyncRepInitConfig();
 		}
 
-		CHECK_FOR_INTERRUPTS();
-
 		/* Check for input from the client */
 		ProcessRepliesIfAny();
 
-		/* Clear any already-pending wakeups */
-		ResetLatch(&MyWalSnd->latch);
-
 		/*
 		 * If we have received CopyDone from the client, sent CopyDone
 		 * ourselves, and the output buffer is empty, it's time to exit
@@ -1912,11 +1910,8 @@ WalSndLoop(WalSndSendDataCallback send_data)
 				wakeEvents |= WL_SOCKET_WRITEABLE;
 
 			/* Sleep until something happens or we time out */
-			ImmediateInterruptOK = true;
-			CHECK_FOR_INTERRUPTS();
-			WaitLatchOrSocket(&MyWalSnd->latch, wakeEvents,
+			WaitLatchOrSocket(MyLatch, wakeEvents,
 							  MyProcPort->sock, sleeptime);
-			ImmediateInterruptOK = false;
 		}
 	}
 	return;
@@ -1959,9 +1954,9 @@ InitWalSenderSlot(void)
 			walsnd->pid = MyProcPid;
 			walsnd->sentPtr = InvalidXLogRecPtr;
 			walsnd->state = WALSNDSTATE_STARTUP;
+			walsnd->latch = &MyProc->procLatch;
 			SpinLockRelease(&walsnd->mutex);
 			/* don't need the lock anymore */
-			OwnLatch((Latch *) &walsnd->latch);
 			MyWalSnd = (WalSnd *) walsnd;
 
 			break;
@@ -1986,19 +1981,14 @@ WalSndKill(int code, Datum arg)
 
 	Assert(walsnd != NULL);
 
-	/*
-	 * Clear MyWalSnd first; then disown the latch.  This is so that signal
-	 * handlers won't try to touch the latch after it's no longer ours.
-	 */
 	MyWalSnd = NULL;
 
-	DisownLatch(&walsnd->latch);
-
-	/*
-	 * Mark WalSnd struct no longer in use. Assume that no lock is required
-	 * for this.
-	 */
+	SpinLockAcquire(&walsnd->mutex);
+	/* clear latch while holding the spinlock, so it can safely be read */
+	walsnd->latch = NULL;
+	/* Mark WalSnd struct as no longer being in use. */
 	walsnd->pid = 0;
+	SpinLockRelease(&walsnd->mutex);
 }
 
 /*
@@ -2570,8 +2560,8 @@ WalSndSigHupHandler(SIGNAL_ARGS)
 	int			save_errno = errno;
 
 	got_SIGHUP = true;
-	if (MyWalSnd)
-		SetLatch(&MyWalSnd->latch);
+
+	SetLatch(MyLatch);
 
 	errno = save_errno;
 }
@@ -2603,8 +2593,7 @@ WalSndLastCycleHandler(SIGNAL_ARGS)
 		kill(MyProcPid, SIGTERM);
 
 	walsender_ready_to_stop = true;
-	if (MyWalSnd)
-		SetLatch(&MyWalSnd->latch);
+	SetLatch(MyLatch);
 
 	errno = save_errno;
 }
@@ -2668,7 +2657,6 @@ WalSndShmemInit(void)
 			WalSnd	   *walsnd = &WalSndCtl->walsnds[i];
 
 			SpinLockInit(&walsnd->mutex);
-			InitSharedLatch(&walsnd->latch);
 		}
 	}
 }
@@ -2685,7 +2673,21 @@ WalSndWakeup(void)
 	int			i;
 
 	for (i = 0; i < max_wal_senders; i++)
-		SetLatch(&WalSndCtl->walsnds[i].latch);
+	{
+		Latch *latch;
+		WalSnd *walsnd = &WalSndCtl->walsnds[i];
+
+		/*
+		 * Get latch pointer with spinlock held, for the unlikely case that
+		 * pointer reads aren't atomic (as they're 8 bytes).
+		 */
+		SpinLockAcquire(&walsnd->mutex);
+		latch = walsnd->latch;
+		SpinLockRelease(&walsnd->mutex);
+
+		if (latch != NULL)
+			SetLatch(latch);
+	}
 }
 
 /* Set state for current walsender (only called in walsender) */
diff --git a/src/include/replication/walsender_private.h b/src/include/replication/walsender_private.h
index cc351d6..8867750 100644
--- a/src/include/replication/walsender_private.h
+++ b/src/include/replication/walsender_private.h
@@ -51,10 +51,10 @@ typedef struct WalSnd
 	slock_t		mutex;
 
 	/*
-	 * Latch used by backends to wake up this walsender when it has work to
-	 * do.
+	 * Pointer to the walsender's latch. Used by backends to wake up this
+	 * walsender when it has work to do. NULL if the walsender isn't active.
 	 */
-	Latch		latch;
+	Latch		*latch;
 
 	/*
 	 * The priority order of the standby managed by this WALSender, as listed
-- 
2.0.0.rc2.4.g1dc51c6.dirty

0002-Use-a-nonblocking-socket-for-FE-BE-communication-and.patchtext/x-patch; charset=us-asciiDownload
>From cd9b902fa0dfd046fa32f3101b17e692757596a6 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 8 Jan 2015 15:50:10 +0100
Subject: [PATCH 02/10] Use a nonblocking socket for FE/BE communication and
 block using latches.

This allows to introduce more elaborate handling of interrupts while
reading from a socket.  Currently some interrupt handlers have to do
significant work from inside signal handlers, and it's very hard to
correctly write code to do so.  Generic signal handler limitations,
combined with the fact that we can't safely jump out of a signal
handler while reading from the client have prohibited implemenation of
features like timeouts for idle-in-transaction.

This commit probably can't actually safely be used without the next
patch in the series, as WaitLatchOrSocket might not be fully signal
safe. Reviewing them separately is easier though, so I'll just fold
them before the final commit.

Author: Andres Freund
Reviewed-By: Heikki Linnakangas
---
 src/backend/libpq/be-secure-openssl.c | 32 ++++-----------
 src/backend/libpq/be-secure.c         | 76 ++++++++++++++++++++++++++++++++++-
 src/backend/libpq/pqcomm.c            | 41 ++++++++-----------
 3 files changed, 99 insertions(+), 50 deletions(-)

diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 1dd7770..729b746 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -371,12 +371,8 @@ aloop:
 		{
 			case SSL_ERROR_WANT_READ:
 			case SSL_ERROR_WANT_WRITE:
-#ifdef WIN32
-				pgwin32_waitforsinglesocket(SSL_get_fd(port->ssl),
-											(err == SSL_ERROR_WANT_READ) ?
-						FD_READ | FD_CLOSE | FD_ACCEPT : FD_WRITE | FD_CLOSE,
-											INFINITE);
-#endif
+				/* not allowed during connection establishment */
+				Assert(!port->noblock);
 				goto aloop;
 			case SSL_ERROR_SYSCALL:
 				if (r < 0)
@@ -516,18 +512,9 @@ rloop:
 			break;
 		case SSL_ERROR_WANT_READ:
 		case SSL_ERROR_WANT_WRITE:
+			/* Don't retry if the socket is in nonblocking mode. */
 			if (port->noblock)
-			{
-				errno = EWOULDBLOCK;
-				n = -1;
 				break;
-			}
-#ifdef WIN32
-			pgwin32_waitforsinglesocket(SSL_get_fd(port->ssl),
-										(err == SSL_ERROR_WANT_READ) ?
-									FD_READ | FD_CLOSE : FD_WRITE | FD_CLOSE,
-										INFINITE);
-#endif
 			goto rloop;
 		case SSL_ERROR_SYSCALL:
 			/* leave it to caller to ereport the value of errno */
@@ -630,12 +617,9 @@ wloop:
 			break;
 		case SSL_ERROR_WANT_READ:
 		case SSL_ERROR_WANT_WRITE:
-#ifdef WIN32
-			pgwin32_waitforsinglesocket(SSL_get_fd(port->ssl),
-										(err == SSL_ERROR_WANT_READ) ?
-									FD_READ | FD_CLOSE : FD_WRITE | FD_CLOSE,
-										INFINITE);
-#endif
+			/* Don't retry if the socket is in nonblocking mode. */
+			if (port->noblock)
+				break;
 			goto wloop;
 		case SSL_ERROR_SYSCALL:
 			/* leave it to caller to ereport the value of errno */
@@ -722,7 +706,7 @@ my_sock_read(BIO *h, char *buf, int size)
 		if (res <= 0)
 		{
 			/* If we were interrupted, tell caller to retry */
-			if (errno == EINTR)
+			if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN)
 			{
 				BIO_set_retry_read(h);
 			}
@@ -741,7 +725,7 @@ my_sock_write(BIO *h, const char *buf, int size)
 	BIO_clear_retry_flags(h);
 	if (res <= 0)
 	{
-		if (errno == EINTR)
+		if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN)
 		{
 			BIO_set_retry_write(h);
 		}
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index c592f85..709131f 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -18,6 +18,8 @@
 
 #include "postgres.h"
 
+#include "miscadmin.h"
+
 #include <sys/stat.h>
 #include <signal.h>
 #include <fcntl.h>
@@ -34,6 +36,7 @@
 #include "libpq/libpq.h"
 #include "tcop/tcopprot.h"
 #include "utils/memutils.h"
+#include "storage/proc.h"
 
 
 char	   *ssl_cert_file;
@@ -147,7 +150,37 @@ secure_raw_read(Port *port, void *ptr, size_t len)
 
 	prepare_for_client_read();
 
+	/*
+	 * Try to read from the socket without blocking. If it suceeds we're
+	 * done, otherwise we'll wait for the socket using the latch mechanism.
+	 */
+rloop:
+#ifdef WIN32
+	pgwin32_noblock = true;
+#endif
 	n = recv(port->sock, ptr, len, 0);
+#ifdef WIN32
+	pgwin32_noblock = false;
+#endif
+
+	if (!port->noblock && n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN))
+	{
+		int		w;
+		int		save_errno = errno;
+
+		w = WaitLatchOrSocket(MyLatch,
+							  WL_SOCKET_READABLE,
+							  port->sock, 0);
+
+		if (w & WL_SOCKET_READABLE)
+			goto rloop;
+
+		/*
+		 * Restore errno, clobbered by WaitLatchOrSocket, so the caller can
+		 * react properly.
+		 */
+		errno = save_errno;
+	}
 
 	client_read_ended();
 
@@ -170,7 +203,9 @@ secure_write(Port *port, void *ptr, size_t len)
 	}
 	else
 #endif
+	{
 		n = secure_raw_write(port, ptr, len);
+	}
 
 	return n;
 }
@@ -178,5 +213,44 @@ secure_write(Port *port, void *ptr, size_t len)
 ssize_t
 secure_raw_write(Port *port, const void *ptr, size_t len)
 {
-	return send(port->sock, ptr, len, 0);
+	ssize_t		n;
+
+wloop:
+
+#ifdef WIN32
+	pgwin32_noblock = true;
+#endif
+	n = send(port->sock, ptr, len, 0);
+#ifdef WIN32
+	pgwin32_noblock = false;
+#endif
+
+	if (!port->noblock && n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN))
+	{
+		int		w;
+		int		save_errno = errno;
+
+		/*
+		 * We probably want to check for latches being set at some point
+		 * here. That'd allow us to handle interrupts while blocked on
+		 * writes. If set we'd not retry directly, but return. That way we
+		 * don't do anything while (possibly) inside a ssl library.
+		 */
+		w = WaitLatchOrSocket(MyLatch,
+							  WL_SOCKET_WRITEABLE,
+							  port->sock, 0);
+
+		if (w & WL_SOCKET_WRITEABLE)
+		{
+			goto wloop;
+		}
+
+		/*
+		 * Restore errno, clobbered by WaitLatchOrSocket, so the caller can
+		 * react properly.
+		 */
+		errno = save_errno;
+	}
+
+	return n;
 }
diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index e3efac3..6f35508 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -179,6 +179,22 @@ pq_init(void)
 	PqCommBusy = false;
 	DoingCopyOut = false;
 	on_proc_exit(socket_close, 0);
+
+	/*
+	 * In individual backends we operate the underlying socket in nonblocking
+	 * mode and use latches to implement blocking semantics if needed. That
+	 * allows us to provide safely interruptible reads.
+	 *
+	 * Use COMMERROR on failure, because ERROR would try to send the error to
+	 * the client, which might require changing the mode again, leading to
+	 * infinite recursion.
+	 */
+#ifndef WIN32
+	if (!pg_set_noblock(MyProcPort->sock))
+		ereport(COMMERROR,
+				(errmsg("could not set socket to nonblocking mode: %m")));
+#endif
+
 }
 
 /* --------------------------------
@@ -818,31 +834,6 @@ socket_set_nonblocking(bool nonblocking)
 				(errcode(ERRCODE_CONNECTION_DOES_NOT_EXIST),
 				 errmsg("there is no client connection")));
 
-	if (MyProcPort->noblock == nonblocking)
-		return;
-
-#ifdef WIN32
-	pgwin32_noblock = nonblocking ? 1 : 0;
-#else
-
-	/*
-	 * Use COMMERROR on failure, because ERROR would try to send the error to
-	 * the client, which might require changing the mode again, leading to
-	 * infinite recursion.
-	 */
-	if (nonblocking)
-	{
-		if (!pg_set_noblock(MyProcPort->sock))
-			ereport(COMMERROR,
-					(errmsg("could not set socket to nonblocking mode: %m")));
-	}
-	else
-	{
-		if (!pg_set_block(MyProcPort->sock))
-			ereport(COMMERROR,
-					(errmsg("could not set socket to blocking mode: %m")));
-	}
-#endif
 	MyProcPort->noblock = nonblocking;
 }
 
-- 
2.0.0.rc2.4.g1dc51c6.dirty

0003-Introduce-and-use-infrastructure-for-interrupt-proce.patchtext/x-patch; charset=us-asciiDownload
>From 7fd6fd457d102f141e1b336e4d411b1f2d197cab Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sat, 27 Sep 2014 23:49:30 +0200
Subject: [PATCH 03/10] Introduce and use infrastructure for interrupt
 processing during client reads.

Up to now large swathes of backend code ran inside signal handlers
while reading commands from the client, to allow for speedy reaction to
asynchronous events. Most prominently shared invalidation and NOTIFY
handling. That means that complex code like the starting/stopping of
transactions is run in signal handlers...  The required code was
fragile and verbose, and is likely to contain bugs.

That approach also severily limited what could be done while
communicating with the client. As the read might be from within
openssl it wasn't safely possible to trigger an error, e.g. to cancel
a backend in idle-in-transaction state.

Now that FE/BE communication in the backend employs nonblocking
sockets and latches to block we can quite simply interrupt reads from
signal handlers by setting a signal. That allows us to signal an
interrupted read wich is supposed to be retried after returning from
within the ssl libray.

As signal handlers now only need to set the latch to guarantee timely
interrupt processing, remove a fair amount of complicated & fragile
code from async.c and sinval.c.

This work will hopefully allow to handle cases like being blocked by
sending data, interrupting idle transactions and similar to be
implemented without too much effort.

Author: Andres Freund
Reviewed-By: Heikki Linnakangas
---
 src/backend/commands/async.c          | 199 ++++++------------------------
 src/backend/libpq/be-secure-openssl.c |  18 +++
 src/backend/libpq/be-secure.c         |  51 ++++++--
 src/backend/postmaster/autovacuum.c   |   6 +-
 src/backend/storage/ipc/sinval.c      | 223 +++++++---------------------------
 src/backend/tcop/postgres.c           | 110 ++++++-----------
 src/include/commands/async.h          |  12 +-
 src/include/storage/sinval.h          |   7 +-
 src/include/tcop/tcopprot.h           |   4 +-
 9 files changed, 188 insertions(+), 442 deletions(-)

diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index b945d9d..81632e6 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -79,10 +79,12 @@
  *	  either, but just process the queue directly.
  *
  * 5. Upon receipt of a PROCSIG_NOTIFY_INTERRUPT signal, the signal handler
- *	  can call inbound-notify processing immediately if this backend is idle
- *	  (ie, it is waiting for a frontend command and is not within a transaction
- *	  block).  Otherwise the handler may only set a flag, which will cause the
- *	  processing to occur just before we next go idle.
+ *	  sets the process's latch which triggers the event to be processed
+ *	  immediately if this backend is idle (ie, it is waiting for a frontend
+ *	  command and is not within a transaction block. C.f.
+ *	  ProcessClientReadInterrupt()).  Otherwise the handler may only set a
+ *	  flag, which will cause the processing to occur just before we next go
+ *	  idle.
  *
  *	  Inbound-notify processing consists of reading all of the notifications
  *	  that have arrived since scanning last time. We read every notification
@@ -126,6 +128,7 @@
 #include "miscadmin.h"
 #include "storage/ipc.h"
 #include "storage/lmgr.h"
+#include "storage/proc.h"
 #include "storage/procarray.h"
 #include "storage/procsignal.h"
 #include "storage/sinval.h"
@@ -334,17 +337,13 @@ static List *pendingNotifies = NIL;		/* list of Notifications */
 static List *upperPendingNotifies = NIL;		/* list of upper-xact lists */
 
 /*
- * State for inbound notifications consists of two flags: one saying whether
- * the signal handler is currently allowed to call ProcessIncomingNotify
- * directly, and one saying whether the signal has occurred but the handler
- * was not allowed to call ProcessIncomingNotify at the time.
- *
- * NB: the "volatile" on these declarations is critical!  If your compiler
- * does not grok "volatile", you'd be best advised to compile this file
- * with all optimization turned off.
+ * Inbound notifications are initially processed by HandleNotifyInterrupt(),
+ * called from inside a signal handler. That just sets the
+ * notifyInterruptPending flag and sets the process
+ * latch. ProcessNotifyInterrupt() will then be called whenever it's safe to
+ * actually deal with the interrupt.
  */
-static volatile sig_atomic_t notifyInterruptEnabled = 0;
-static volatile sig_atomic_t notifyInterruptOccurred = 0;
+volatile sig_atomic_t notifyInterruptPending = false;
 
 /* True if we've registered an on_shmem_exit cleanup */
 static bool unlistenExitRegistered = false;
@@ -1625,164 +1624,45 @@ AtSubAbort_Notify(void)
 /*
  * HandleNotifyInterrupt
  *
- *		This is called when PROCSIG_NOTIFY_INTERRUPT is received.
- *
- *		If we are idle (notifyInterruptEnabled is set), we can safely invoke
- *		ProcessIncomingNotify directly.  Otherwise, just set a flag
- *		to do it later.
+ *		Signal handler portion of interrupt handling. Let the backend know
+ *		that there's a pending notify interrupt. If we're currently reading
+ *		from the client, this will interrupt the read and
+ *		ProcessClientReadInterrupt() will call ProcessNotifyInterrupt().
  */
 void
 HandleNotifyInterrupt(void)
 {
 	/*
 	 * Note: this is called by a SIGNAL HANDLER. You must be very wary what
-	 * you do here. Some helpful soul had this routine sprinkled with
-	 * TPRINTFs, which would likely lead to corruption of stdio buffers if
-	 * they were ever turned on.
+	 * you do here.
 	 */
 
-	/* Don't joggle the elbow of proc_exit */
-	if (proc_exit_inprogress)
-		return;
-
-	if (notifyInterruptEnabled)
-	{
-		bool		save_ImmediateInterruptOK = ImmediateInterruptOK;
-
-		/*
-		 * We may be called while ImmediateInterruptOK is true; turn it off
-		 * while messing with the NOTIFY state.  This prevents problems if
-		 * SIGINT or similar arrives while we're working.  Just to be real
-		 * sure, bump the interrupt holdoff counter as well.  That way, even
-		 * if something inside ProcessIncomingNotify() transiently sets
-		 * ImmediateInterruptOK (eg while waiting on a lock), we won't get
-		 * interrupted until we're done with the notify interrupt.
-		 */
-		ImmediateInterruptOK = false;
-		HOLD_INTERRUPTS();
-
-		/*
-		 * I'm not sure whether some flavors of Unix might allow another
-		 * SIGUSR1 occurrence to recursively interrupt this routine. To cope
-		 * with the possibility, we do the same sort of dance that
-		 * EnableNotifyInterrupt must do --- see that routine for comments.
-		 */
-		notifyInterruptEnabled = 0;		/* disable any recursive signal */
-		notifyInterruptOccurred = 1;	/* do at least one iteration */
-		for (;;)
-		{
-			notifyInterruptEnabled = 1;
-			if (!notifyInterruptOccurred)
-				break;
-			notifyInterruptEnabled = 0;
-			if (notifyInterruptOccurred)
-			{
-				/* Here, it is finally safe to do stuff. */
-				if (Trace_notify)
-					elog(DEBUG1, "HandleNotifyInterrupt: perform async notify");
-
-				ProcessIncomingNotify();
-
-				if (Trace_notify)
-					elog(DEBUG1, "HandleNotifyInterrupt: done");
-			}
-		}
+	/* signal that work needs to be done */
+	notifyInterruptPending = true;
 
-		/*
-		 * Restore the holdoff level and ImmediateInterruptOK, and check for
-		 * interrupts if needed.
-		 */
-		RESUME_INTERRUPTS();
-		ImmediateInterruptOK = save_ImmediateInterruptOK;
-		if (save_ImmediateInterruptOK)
-			CHECK_FOR_INTERRUPTS();
-	}
-	else
-	{
-		/*
-		 * In this path it is NOT SAFE to do much of anything, except this:
-		 */
-		notifyInterruptOccurred = 1;
-	}
+	/* make sure the event is processed in due course */
+	SetLatch(MyLatch);
 }
 
 /*
- * EnableNotifyInterrupt
- *
- *		This is called by the PostgresMain main loop just before waiting
- *		for a frontend command.  If we are truly idle (ie, *not* inside
- *		a transaction block), then process any pending inbound notifies,
- *		and enable the signal handler to process future notifies directly.
+ * ProcessNotifyInterrupt
  *
- *		NOTE: the signal handler starts out disabled, and stays so until
- *		PostgresMain calls this the first time.
+ *		This is called just after waiting for a frontend command.  If a
+ *		interrupt arrives (via HandleNotifyInterrupt()) while reading, the
+ *		read will be interrupted via the process's latch, and this routine
+ *		will get called.  If we are truly idle (ie, *not* inside a transaction
+ *		block), process the incoming notifies.
  */
 void
-EnableNotifyInterrupt(void)
+ProcessNotifyInterrupt(void)
 {
 	if (IsTransactionOrTransactionBlock())
 		return;					/* not really idle */
 
-	/*
-	 * This code is tricky because we are communicating with a signal handler
-	 * that could interrupt us at any point.  If we just checked
-	 * notifyInterruptOccurred and then set notifyInterruptEnabled, we could
-	 * fail to respond promptly to a signal that happens in between those two
-	 * steps.  (A very small time window, perhaps, but Murphy's Law says you
-	 * can hit it...)  Instead, we first set the enable flag, then test the
-	 * occurred flag.  If we see an unserviced interrupt has occurred, we
-	 * re-clear the enable flag before going off to do the service work. (That
-	 * prevents re-entrant invocation of ProcessIncomingNotify() if another
-	 * interrupt occurs.) If an interrupt comes in between the setting and
-	 * clearing of notifyInterruptEnabled, then it will have done the service
-	 * work and left notifyInterruptOccurred zero, so we have to check again
-	 * after clearing enable.  The whole thing has to be in a loop in case
-	 * another interrupt occurs while we're servicing the first. Once we get
-	 * out of the loop, enable is set and we know there is no unserviced
-	 * interrupt.
-	 *
-	 * NB: an overenthusiastic optimizing compiler could easily break this
-	 * code. Hopefully, they all understand what "volatile" means these days.
-	 */
-	for (;;)
-	{
-		notifyInterruptEnabled = 1;
-		if (!notifyInterruptOccurred)
-			break;
-		notifyInterruptEnabled = 0;
-		if (notifyInterruptOccurred)
-		{
-			if (Trace_notify)
-				elog(DEBUG1, "EnableNotifyInterrupt: perform async notify");
-
-			ProcessIncomingNotify();
-
-			if (Trace_notify)
-				elog(DEBUG1, "EnableNotifyInterrupt: done");
-		}
-	}
+	while (notifyInterruptPending)
+		ProcessIncomingNotify();
 }
 
-/*
- * DisableNotifyInterrupt
- *
- *		This is called by the PostgresMain main loop just after receiving
- *		a frontend command.  Signal handler execution of inbound notifies
- *		is disabled until the next EnableNotifyInterrupt call.
- *
- *		The PROCSIG_CATCHUP_INTERRUPT signal handler also needs to call this,
- *		so as to prevent conflicts if one signal interrupts the other.  So we
- *		must return the previous state of the flag.
- */
-bool
-DisableNotifyInterrupt(void)
-{
-	bool		result = (notifyInterruptEnabled != 0);
-
-	notifyInterruptEnabled = 0;
-
-	return result;
-}
 
 /*
  * Read all pending notifications from the queue, and deliver appropriate
@@ -2076,9 +1956,10 @@ asyncQueueAdvanceTail(void)
 /*
  * ProcessIncomingNotify
  *
- *		Deal with arriving NOTIFYs from other backends.
- *		This is called either directly from the PROCSIG_NOTIFY_INTERRUPT
- *		signal handler, or the next time control reaches the outer idle loop.
+ *		Deal with arriving NOTIFYs from other backends as soon as it's safe to
+ *		do so. This used to be called from the PROCSIG_NOTIFY_INTERRUPT
+ *		signal handler, but isn't anymore.
+ *
  *		Scan the queue for arriving notifications and report them to my front
  *		end.
  *
@@ -2087,18 +1968,13 @@ asyncQueueAdvanceTail(void)
 static void
 ProcessIncomingNotify(void)
 {
-	bool		catchup_enabled;
-
 	/* We *must* reset the flag */
-	notifyInterruptOccurred = 0;
+	notifyInterruptPending = false;
 
 	/* Do nothing else if we aren't actively listening */
 	if (listenChannels == NIL)
 		return;
 
-	/* Must prevent catchup interrupt while I am running */
-	catchup_enabled = DisableCatchupInterrupt();
-
 	if (Trace_notify)
 		elog(DEBUG1, "ProcessIncomingNotify");
 
@@ -2123,9 +1999,6 @@ ProcessIncomingNotify(void)
 
 	if (Trace_notify)
 		elog(DEBUG1, "ProcessIncomingNotify: done");
-
-	if (catchup_enabled)
-		EnableCatchupInterrupt();
 }
 
 /*
diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 729b746..3a70f43 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -515,6 +515,20 @@ rloop:
 			/* Don't retry if the socket is in nonblocking mode. */
 			if (port->noblock)
 				break;
+
+			/*
+			 * We'll, among other situations, get here if the low level
+			 * routine doing the actual recv() via the socket got interrupted
+			 * by a signal. That's so we can handle interrupts once outside
+			 * openssl so we don't jump out from underneath its covers. We can
+			 * check this both, when reading and writing, because even when
+			 * writing that's just openssl's doing, not a 'proper' write
+			 * initiated by postgres.
+			 *
+			 * Only process interrupts here if we're blocking inside the
+			 * function. In the other cases secure_read() will do so.
+			 */
+			ProcessClientReadInterrupt();  /* preserves errno */
 			goto rloop;
 		case SSL_ERROR_SYSCALL:
 			/* leave it to caller to ereport the value of errno */
@@ -620,6 +634,10 @@ wloop:
 			/* Don't retry if the socket is in nonblocking mode. */
 			if (port->noblock)
 				break;
+			/*
+			 * XXX: We'll, at some later point, likely want to add interrupt
+			 * processing here.
+			 */
 			goto wloop;
 		case SSL_ERROR_SYSCALL:
 			/* leave it to caller to ereport the value of errno */
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 709131f..f96b8f3 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -129,6 +129,7 @@ secure_read(Port *port, void *ptr, size_t len)
 {
 	ssize_t		n;
 
+retry:
 #ifdef USE_SSL
 	if (port->ssl_in_use)
 	{
@@ -140,6 +141,14 @@ secure_read(Port *port, void *ptr, size_t len)
 		n = secure_raw_read(port, ptr, len);
 	}
 
+	/* Process interrupts that happened while (or before) receiving. */
+	ProcessClientReadInterrupt(); /* preserves errno */
+
+	/* retry after processing interrupts */
+	if (n < 0 && errno == EINTR)
+	{
+		goto retry;
+	}
 	return n;
 }
 
@@ -148,8 +157,6 @@ secure_raw_read(Port *port, void *ptr, size_t len)
 {
 	ssize_t		n;
 
-	prepare_for_client_read();
-
 	/*
 	 * Try to read from the socket without blocking. If it suceeds we're
 	 * done, otherwise we'll wait for the socket using the latch mechanism.
@@ -168,11 +175,22 @@ rloop:
 		int		w;
 		int		save_errno = errno;
 
+
 		w = WaitLatchOrSocket(MyLatch,
-							  WL_SOCKET_READABLE,
+							  WL_LATCH_SET | WL_SOCKET_READABLE,
 							  port->sock, 0);
 
-		if (w & WL_SOCKET_READABLE)
+		if (w & WL_LATCH_SET)
+		{
+			ResetLatch(MyLatch);
+			/*
+			 * Force a return, so interrupts can be processed when not
+			 * (possibly) underneath a ssl library.
+			 */
+			errno = EINTR;
+			return -1;
+		}
+		else if (w & WL_SOCKET_READABLE)
 			goto rloop;
 
 		/*
@@ -182,8 +200,6 @@ rloop:
 		errno = save_errno;
 	}
 
-	client_read_ended();
-
 	return n;
 }
 
@@ -196,6 +212,7 @@ secure_write(Port *port, void *ptr, size_t len)
 {
 	ssize_t		n;
 
+retry:
 #ifdef USE_SSL
 	if (port->ssl_in_use)
 	{
@@ -207,6 +224,21 @@ secure_write(Port *port, void *ptr, size_t len)
 		n = secure_raw_write(port, ptr, len);
 	}
 
+	/*
+	 * XXX: We'll, at some later point, likely want to add interrupt
+	 * processing here.
+	 */
+
+	/*
+	 * Retry after processing interrupts. This can be triggered even though we
+	 * don't check for latch set's during writing yet, because SSL
+	 * renegotiations might have required reading from the socket.
+	 */
+	if (n < 0 && errno == EINTR)
+	{
+		goto retry;
+	}
+
 	return n;
 }
 
@@ -231,10 +263,9 @@ wloop:
 		int		save_errno = errno;
 
 		/*
-		 * We probably want to check for latches being set at some point
-		 * here. That'd allow us to handle interrupts while blocked on
-		 * writes. If set we'd not retry directly, but return. That way we
-		 * don't do anything while (possibly) inside a ssl library.
+		 * XXX: We'll, at some later point, likely want to add interrupt
+		 * processing here.  If set we'd not retry directly, but return. That
+		 * way we don't do anything while (possibly) inside a ssl library.
 		 */
 		w = WaitLatchOrSocket(MyLatch,
 							  WL_SOCKET_WRITEABLE,
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 02f871c..7a4e382 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -582,9 +582,6 @@ AutoVacLauncherMain(int argc, char *argv[])
 		launcher_determine_sleep(!dlist_is_empty(&AutoVacuumShmem->av_freeWorkers),
 								 false, &nap);
 
-		/* Allow sinval catchup interrupts while sleeping */
-		EnableCatchupInterrupt();
-
 		/*
 		 * Wait until naptime expires or we get some type of signal (all the
 		 * signal handlers will wake us by calling SetLatch).
@@ -595,7 +592,8 @@ AutoVacLauncherMain(int argc, char *argv[])
 
 		ResetLatch(MyLatch);
 
-		DisableCatchupInterrupt();
+		/* Process sinval catchup interrupts that happened while sleeping */
+		ProcessCatchupInterrupt();
 
 		/*
 		 * Emergency bailout if postmaster has died.  This is to avoid the
diff --git a/src/backend/storage/ipc/sinval.c b/src/backend/storage/ipc/sinval.c
index e6b0d49..67ec515 100644
--- a/src/backend/storage/ipc/sinval.c
+++ b/src/backend/storage/ipc/sinval.c
@@ -18,6 +18,7 @@
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "storage/ipc.h"
+#include "storage/proc.h"
 #include "storage/sinvaladt.h"
 #include "utils/inval.h"
 
@@ -32,19 +33,12 @@ uint64		SharedInvalidMessageCounter;
  * through a cache reset exercise.  This is done by sending
  * PROCSIG_CATCHUP_INTERRUPT to any backend that gets too far behind.
  *
- * State for catchup events consists of two flags: one saying whether
- * the signal handler is currently allowed to call ProcessCatchupEvent
- * directly, and one saying whether the signal has occurred but the handler
- * was not allowed to call ProcessCatchupEvent at the time.
- *
- * NB: the "volatile" on these declarations is critical!  If your compiler
- * does not grok "volatile", you'd be best advised to compile this file
- * with all optimization turned off.
+ * The signal handler will set a interrupt pending flag and will set the
+ * processes latch. Whenever starting to read from the client, or when
+ * interrupted while doing so, ProcessClientReadInterrupt() will call
+ * ProcessCatchupEvent().
  */
-static volatile int catchupInterruptEnabled = 0;
-static volatile int catchupInterruptOccurred = 0;
-
-static void ProcessCatchupEvent(void);
+volatile sig_atomic_t catchupInterruptPending = false;
 
 
 /*
@@ -141,9 +135,9 @@ ReceiveSharedInvalidMessages(
 	 * catchup signal this way avoids creating spikes in system load for what
 	 * should be just a background maintenance activity.
 	 */
-	if (catchupInterruptOccurred)
+	if (catchupInterruptPending)
 	{
-		catchupInterruptOccurred = 0;
+		catchupInterruptPending = false;
 		elog(DEBUG4, "sinval catchup complete, cleaning queue");
 		SICleanupQueue(false, 0);
 	}
@@ -155,12 +149,9 @@ ReceiveSharedInvalidMessages(
  *
  * This is called when PROCSIG_CATCHUP_INTERRUPT is received.
  *
- * If we are idle (catchupInterruptEnabled is set), we can safely
- * invoke ProcessCatchupEvent directly.  Otherwise, just set a flag
- * to do it later.  (Note that it's quite possible for normal processing
- * of the current transaction to cause ReceiveSharedInvalidMessages()
- * to be run later on; in that case the flag will get cleared again,
- * since there's no longer any reason to do anything.)
+ * We used to directly call ProcessCatchupEvent directly when idle. These days
+ * we just set a flag to do it later and notify the process of that fact by
+ * setting the process's latch.
  */
 void
 HandleCatchupInterrupt(void)
@@ -170,174 +161,46 @@ HandleCatchupInterrupt(void)
 	 * you do here.
 	 */
 
-	/* Don't joggle the elbow of proc_exit */
-	if (proc_exit_inprogress)
-		return;
-
-	if (catchupInterruptEnabled)
-	{
-		bool		save_ImmediateInterruptOK = ImmediateInterruptOK;
-
-		/*
-		 * We may be called while ImmediateInterruptOK is true; turn it off
-		 * while messing with the catchup state.  This prevents problems if
-		 * SIGINT or similar arrives while we're working.  Just to be real
-		 * sure, bump the interrupt holdoff counter as well.  That way, even
-		 * if something inside ProcessCatchupEvent() transiently sets
-		 * ImmediateInterruptOK (eg while waiting on a lock), we won't get
-		 * interrupted until we're done with the catchup interrupt.
-		 */
-		ImmediateInterruptOK = false;
-		HOLD_INTERRUPTS();
-
-		/*
-		 * I'm not sure whether some flavors of Unix might allow another
-		 * SIGUSR1 occurrence to recursively interrupt this routine. To cope
-		 * with the possibility, we do the same sort of dance that
-		 * EnableCatchupInterrupt must do --- see that routine for comments.
-		 */
-		catchupInterruptEnabled = 0;	/* disable any recursive signal */
-		catchupInterruptOccurred = 1;	/* do at least one iteration */
-		for (;;)
-		{
-			catchupInterruptEnabled = 1;
-			if (!catchupInterruptOccurred)
-				break;
-			catchupInterruptEnabled = 0;
-			if (catchupInterruptOccurred)
-			{
-				/* Here, it is finally safe to do stuff. */
-				ProcessCatchupEvent();
-			}
-		}
+	catchupInterruptPending = true;
 
-		/*
-		 * Restore the holdoff level and ImmediateInterruptOK, and check for
-		 * interrupts if needed.
-		 */
-		RESUME_INTERRUPTS();
-		ImmediateInterruptOK = save_ImmediateInterruptOK;
-		if (save_ImmediateInterruptOK)
-			CHECK_FOR_INTERRUPTS();
-	}
-	else
-	{
-		/*
-		 * In this path it is NOT SAFE to do much of anything, except this:
-		 */
-		catchupInterruptOccurred = 1;
-	}
+	/* make sure the event is processed in due course */
+	SetLatch(MyLatch);
 }
 
 /*
- * EnableCatchupInterrupt
- *
- * This is called by the PostgresMain main loop just before waiting
- * for a frontend command.  We process any pending catchup events,
- * and enable the signal handler to process future events directly.
+ * ProcessCatchupInterrupt
  *
- * NOTE: the signal handler starts out disabled, and stays so until
- * PostgresMain calls this the first time.
+ * The portion of catchup interrupt handling that runs outside of the signal
+ * handler, which allows it to actually process pending invalidations.
  */
 void
-EnableCatchupInterrupt(void)
+ProcessCatchupInterrupt(void)
 {
-	/*
-	 * This code is tricky because we are communicating with a signal handler
-	 * that could interrupt us at any point.  If we just checked
-	 * catchupInterruptOccurred and then set catchupInterruptEnabled, we could
-	 * fail to respond promptly to a signal that happens in between those two
-	 * steps.  (A very small time window, perhaps, but Murphy's Law says you
-	 * can hit it...)  Instead, we first set the enable flag, then test the
-	 * occurred flag.  If we see an unserviced interrupt has occurred, we
-	 * re-clear the enable flag before going off to do the service work. (That
-	 * prevents re-entrant invocation of ProcessCatchupEvent() if another
-	 * interrupt occurs.) If an interrupt comes in between the setting and
-	 * clearing of catchupInterruptEnabled, then it will have done the service
-	 * work and left catchupInterruptOccurred zero, so we have to check again
-	 * after clearing enable.  The whole thing has to be in a loop in case
-	 * another interrupt occurs while we're servicing the first. Once we get
-	 * out of the loop, enable is set and we know there is no unserviced
-	 * interrupt.
-	 *
-	 * NB: an overenthusiastic optimizing compiler could easily break this
-	 * code. Hopefully, they all understand what "volatile" means these days.
-	 */
-	for (;;)
+	while (catchupInterruptPending)
 	{
-		catchupInterruptEnabled = 1;
-		if (!catchupInterruptOccurred)
-			break;
-		catchupInterruptEnabled = 0;
-		if (catchupInterruptOccurred)
-			ProcessCatchupEvent();
-	}
-}
-
-/*
- * DisableCatchupInterrupt
- *
- * This is called by the PostgresMain main loop just after receiving
- * a frontend command.  Signal handler execution of catchup events
- * is disabled until the next EnableCatchupInterrupt call.
- *
- * The PROCSIG_NOTIFY_INTERRUPT signal handler also needs to call this,
- * so as to prevent conflicts if one signal interrupts the other.  So we
- * must return the previous state of the flag.
- */
-bool
-DisableCatchupInterrupt(void)
-{
-	bool		result = (catchupInterruptEnabled != 0);
-
-	catchupInterruptEnabled = 0;
-
-	return result;
-}
-
-/*
- * ProcessCatchupEvent
- *
- * Respond to a catchup event (PROCSIG_CATCHUP_INTERRUPT) from another
- * backend.
- *
- * This is called either directly from the PROCSIG_CATCHUP_INTERRUPT
- * signal handler, or the next time control reaches the outer idle loop
- * (assuming there's still anything to do by then).
- */
-static void
-ProcessCatchupEvent(void)
-{
-	bool		notify_enabled;
-
-	/* Must prevent notify interrupt while I am running */
-	notify_enabled = DisableNotifyInterrupt();
-
-	/*
-	 * What we need to do here is cause ReceiveSharedInvalidMessages() to run,
-	 * which will do the necessary work and also reset the
-	 * catchupInterruptOccurred flag.  If we are inside a transaction we can
-	 * just call AcceptInvalidationMessages() to do this.  If we aren't, we
-	 * start and immediately end a transaction; the call to
-	 * AcceptInvalidationMessages() happens down inside transaction start.
-	 *
-	 * It is awfully tempting to just call AcceptInvalidationMessages()
-	 * without the rest of the xact start/stop overhead, and I think that
-	 * would actually work in the normal case; but I am not sure that things
-	 * would clean up nicely if we got an error partway through.
-	 */
-	if (IsTransactionOrTransactionBlock())
-	{
-		elog(DEBUG4, "ProcessCatchupEvent inside transaction");
-		AcceptInvalidationMessages();
-	}
-	else
-	{
-		elog(DEBUG4, "ProcessCatchupEvent outside transaction");
-		StartTransactionCommand();
-		CommitTransactionCommand();
+		/*
+		 * What we need to do here is cause ReceiveSharedInvalidMessages() to
+		 * run, which will do the necessary work and also reset the
+		 * catchupInterruptPending flag.  If we are inside a transaction we
+		 * can just call AcceptInvalidationMessages() to do this.  If we
+		 * aren't, we start and immediately end a transaction; the call to
+		 * AcceptInvalidationMessages() happens down inside transaction start.
+		 *
+		 * It is awfully tempting to just call AcceptInvalidationMessages()
+		 * without the rest of the xact start/stop overhead, and I think that
+		 * would actually work in the normal case; but I am not sure that things
+		 * would clean up nicely if we got an error partway through.
+		 */
+		if (IsTransactionOrTransactionBlock())
+		{
+			elog(DEBUG4, "ProcessCatchupEvent inside transaction");
+			AcceptInvalidationMessages();
+		}
+		else
+		{
+			elog(DEBUG4, "ProcessCatchupEvent outside transaction");
+			StartTransactionCommand();
+			CommitTransactionCommand();
+		}
 	}
-
-	if (notify_enabled)
-		EnableNotifyInterrupt();
 }
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8f74353..fa1aeb5 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -301,17 +301,24 @@ InteractiveBackend(StringInfo inBuf)
  * interactive_getc -- collect one character from stdin
  *
  * Even though we are not reading from a "client" process, we still want to
- * respond to signals, particularly SIGTERM/SIGQUIT.  Hence we must use
- * prepare_for_client_read and client_read_ended.
+ * respond to signals, particularly SIGTERM/SIGQUIT. FIXME.
  */
 static int
 interactive_getc(void)
 {
 	int			c;
 
-	prepare_for_client_read();
+	/*
+	 * FIXME: this will not process catchup interrupts or notifications while
+	 * reading. But those can't really be relevant for a standalone backend
+	 * anyway?
+	 */
+	ProcessClientReadInterrupt();
+
 	c = getc(stdin);
-	client_read_ended();
+
+	ProcessClientReadInterrupt();
+
 	return c;
 }
 
@@ -486,53 +493,35 @@ ReadCommand(StringInfo inBuf)
 }
 
 /*
- * prepare_for_client_read -- set up to possibly block on client input
+ * ProcessClientReadInterrupt() - Process interrupts specific to client reads
+ *
+ * This is called just after low-level reads. That might be after the read
+ * finished successfully, or it was interrupted via interrupt.
  *
- * This must be called immediately before any low-level read from the
- * client connection.  It is necessary to do it at a sufficiently low level
- * that there won't be any other operations except the read kernel call
- * itself between this call and the subsequent client_read_ended() call.
- * In particular there mustn't be use of malloc() or other potentially
- * non-reentrant libc functions.  This restriction makes it safe for us
- * to allow interrupt service routines to execute nontrivial code while
- * we are waiting for input.
+ * Must preserve errno!
  */
 void
-prepare_for_client_read(void)
+ProcessClientReadInterrupt(void)
 {
-	if (DoingCommandRead)
-	{
-		/* Enable immediate processing of asynchronous signals */
-		EnableNotifyInterrupt();
-		EnableCatchupInterrupt();
+	int			save_errno = errno;
 
-		/* Allow cancel/die interrupts to be processed while waiting */
-		ImmediateInterruptOK = true;
+	Assert(InterruptHoldoffCount == 0 && CritSectionCount == 0);
 
-		/* And don't forget to detect one that already arrived */
-		CHECK_FOR_INTERRUPTS();
-	}
-}
-
-/*
- * client_read_ended -- get out of the client-input state
- *
- * This is called just after low-level reads.  It must preserve errno!
- */
-void
-client_read_ended(void)
-{
 	if (DoingCommandRead)
 	{
-		int			save_errno = errno;
-
-		ImmediateInterruptOK = false;
+		/* Check for general interrupts that arrived while reading */
+		CHECK_FOR_INTERRUPTS();
 
-		DisableNotifyInterrupt();
-		DisableCatchupInterrupt();
+		/* Process sinval catchup interrupts that happened while reading */
+		if (catchupInterruptPending)
+			ProcessCatchupInterrupt();
 
-		errno = save_errno;
+		/* Process sinval catchup interrupts that happened while reading */
+		if (notifyInterruptPending)
+			ProcessNotifyInterrupt();
 	}
+
+	errno = save_errno;
 }
 
 
@@ -2589,8 +2578,8 @@ die(SIGNAL_ARGS)
 		ProcDiePending = true;
 
 		/*
-		 * If it's safe to interrupt, and we're waiting for input or a lock,
-		 * service the interrupt immediately
+		 * If it's safe to interrupt, and we're waiting for a lock, service
+		 * the interrupt immediately
 		 */
 		if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
 			CritSectionCount == 0)
@@ -2599,8 +2588,6 @@ die(SIGNAL_ARGS)
 			/* until we are done getting ready for it */
 			InterruptHoldoffCount++;
 			LockErrorCleanup(); /* prevent CheckDeadLock from running */
-			DisableNotifyInterrupt();
-			DisableCatchupInterrupt();
 			InterruptHoldoffCount--;
 			ProcessInterrupts();
 		}
@@ -2630,8 +2617,8 @@ StatementCancelHandler(SIGNAL_ARGS)
 		QueryCancelPending = true;
 
 		/*
-		 * If it's safe to interrupt, and we're waiting for input or a lock,
-		 * service the interrupt immediately
+		 * If it's safe to interrupt, and we're waiting for a lock, service
+		 * the interrupt immediately
 		 */
 		if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
 			CritSectionCount == 0)
@@ -2640,8 +2627,6 @@ StatementCancelHandler(SIGNAL_ARGS)
 			/* until we are done getting ready for it */
 			InterruptHoldoffCount++;
 			LockErrorCleanup(); /* prevent CheckDeadLock from running */
-			DisableNotifyInterrupt();
-			DisableCatchupInterrupt();
 			InterruptHoldoffCount--;
 			ProcessInterrupts();
 		}
@@ -2787,8 +2772,8 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
 			RecoveryConflictRetryable = false;
 
 		/*
-		 * If it's safe to interrupt, and we're waiting for input or a lock,
-		 * service the interrupt immediately
+		 * If it's safe to interrupt, and we're waiting for a lock, service
+		 * the interrupt immediately
 		 */
 		if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
 			CritSectionCount == 0)
@@ -2797,8 +2782,6 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
 			/* until we are done getting ready for it */
 			InterruptHoldoffCount++;
 			LockErrorCleanup(); /* prevent CheckDeadLock from running */
-			DisableNotifyInterrupt();
-			DisableCatchupInterrupt();
 			InterruptHoldoffCount--;
 			ProcessInterrupts();
 		}
@@ -2835,8 +2818,6 @@ ProcessInterrupts(void)
 		ProcDiePending = false;
 		QueryCancelPending = false;		/* ProcDie trumps QueryCancel */
 		ImmediateInterruptOK = false;	/* not idle anymore */
-		DisableNotifyInterrupt();
-		DisableCatchupInterrupt();
 		/* As in quickdie, don't risk sending to client during auth */
 		if (ClientAuthInProgress && whereToSendOutput == DestRemote)
 			whereToSendOutput = DestNone;
@@ -2871,8 +2852,6 @@ ProcessInterrupts(void)
 	{
 		QueryCancelPending = false;		/* lost connection trumps QueryCancel */
 		ImmediateInterruptOK = false;	/* not idle anymore */
-		DisableNotifyInterrupt();
-		DisableCatchupInterrupt();
 		/* don't send to client, we already know the connection to be dead. */
 		whereToSendOutput = DestNone;
 		ereport(FATAL,
@@ -2885,8 +2864,6 @@ ProcessInterrupts(void)
 		if (ClientAuthInProgress)
 		{
 			ImmediateInterruptOK = false;		/* not idle anymore */
-			DisableNotifyInterrupt();
-			DisableCatchupInterrupt();
 			/* As in quickdie, don't risk sending to client during auth */
 			if (whereToSendOutput == DestRemote)
 				whereToSendOutput = DestNone;
@@ -2903,8 +2880,6 @@ ProcessInterrupts(void)
 		{
 			ImmediateInterruptOK = false;		/* not idle anymore */
 			(void) get_timeout_indicator(STATEMENT_TIMEOUT, true);
-			DisableNotifyInterrupt();
-			DisableCatchupInterrupt();
 			ereport(ERROR,
 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
 					 errmsg("canceling statement due to lock timeout")));
@@ -2912,8 +2887,6 @@ ProcessInterrupts(void)
 		if (get_timeout_indicator(STATEMENT_TIMEOUT, true))
 		{
 			ImmediateInterruptOK = false;		/* not idle anymore */
-			DisableNotifyInterrupt();
-			DisableCatchupInterrupt();
 			ereport(ERROR,
 					(errcode(ERRCODE_QUERY_CANCELED),
 					 errmsg("canceling statement due to statement timeout")));
@@ -2921,8 +2894,6 @@ ProcessInterrupts(void)
 		if (IsAutoVacuumWorkerProcess())
 		{
 			ImmediateInterruptOK = false;		/* not idle anymore */
-			DisableNotifyInterrupt();
-			DisableCatchupInterrupt();
 			ereport(ERROR,
 					(errcode(ERRCODE_QUERY_CANCELED),
 					 errmsg("canceling autovacuum task")));
@@ -2931,8 +2902,6 @@ ProcessInterrupts(void)
 		{
 			ImmediateInterruptOK = false;		/* not idle anymore */
 			RecoveryConflictPending = false;
-			DisableNotifyInterrupt();
-			DisableCatchupInterrupt();
 			pgstat_report_recovery_conflict(RecoveryConflictReason);
 			if (DoingCommandRead)
 				ereport(FATAL,
@@ -2956,13 +2925,12 @@ ProcessInterrupts(void)
 		if (!DoingCommandRead)
 		{
 			ImmediateInterruptOK = false;		/* not idle anymore */
-			DisableNotifyInterrupt();
-			DisableCatchupInterrupt();
 			ereport(ERROR,
 					(errcode(ERRCODE_QUERY_CANCELED),
 					 errmsg("canceling statement due to user request")));
 		}
 	}
+
 	/* If we get here, do nothing (probably, QueryCancelPending was reset) */
 }
 
@@ -3806,13 +3774,9 @@ PostgresMain(int argc, char *argv[],
 		QueryCancelPending = false;		/* second to avoid race condition */
 
 		/*
-		 * Turn off these interrupts too.  This is only needed here and not in
-		 * other exception-catching places since these interrupts are only
-		 * enabled while we wait for client input.
+		 * Not reading from the client anymore.
 		 */
 		DoingCommandRead = false;
-		DisableNotifyInterrupt();
-		DisableCatchupInterrupt();
 
 		/* Make sure libpq is in a good state */
 		pq_comm_reset();
diff --git a/src/include/commands/async.h b/src/include/commands/async.h
index 87c3abb..8491f47 100644
--- a/src/include/commands/async.h
+++ b/src/include/commands/async.h
@@ -13,6 +13,8 @@
 #ifndef ASYNC_H
 #define ASYNC_H
 
+#include <signal.h>
+
 #include "fmgr.h"
 
 /*
@@ -21,6 +23,7 @@
 #define NUM_ASYNC_BUFFERS	8
 
 extern bool Trace_notify;
+extern volatile sig_atomic_t notifyInterruptPending;
 
 extern Size AsyncShmemSize(void);
 extern void AsyncShmemInit(void);
@@ -48,12 +51,7 @@ extern void ProcessCompletedNotifies(void);
 /* signal handler for inbound notifies (PROCSIG_NOTIFY_INTERRUPT) */
 extern void HandleNotifyInterrupt(void);
 
-/*
- * enable/disable processing of inbound notifies directly from signal handler.
- * The enable routine first performs processing of any inbound notifies that
- * have occurred since the last disable.
- */
-extern void EnableNotifyInterrupt(void);
-extern bool DisableNotifyInterrupt(void);
+/* process interrupts */
+extern void ProcessNotifyInterrupt(void);
 
 #endif   /* ASYNC_H */
diff --git a/src/include/storage/sinval.h b/src/include/storage/sinval.h
index 1a6f2df..d9ffd72 100644
--- a/src/include/storage/sinval.h
+++ b/src/include/storage/sinval.h
@@ -14,8 +14,9 @@
 #ifndef SINVAL_H
 #define SINVAL_H
 
-#include "storage/relfilenode.h"
+#include <signal.h>
 
+#include "storage/relfilenode.h"
 
 /*
  * We support several types of shared-invalidation messages:
@@ -123,6 +124,7 @@ typedef union
 /* Counter of messages processed; don't worry about overflow. */
 extern uint64 SharedInvalidMessageCounter;
 
+extern volatile sig_atomic_t catchupInterruptPending;
 
 extern void SendSharedInvalidMessages(const SharedInvalidationMessage *msgs,
 						  int n);
@@ -138,8 +140,7 @@ extern void HandleCatchupInterrupt(void);
  * The enable routine first performs processing of any catchup events that
  * have occurred since the last disable.
  */
-extern void EnableCatchupInterrupt(void);
-extern bool DisableCatchupInterrupt(void);
+extern void ProcessCatchupInterrupt(void);
 
 extern int xactGetCommittedInvalidationMessages(SharedInvalidationMessage **msgs,
 									 bool *RelcacheInitFileInval);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 0a350fd..fe8c725 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -67,8 +67,8 @@ extern void StatementCancelHandler(SIGNAL_ARGS);
 extern void FloatExceptionHandler(SIGNAL_ARGS) __attribute__((noreturn));
 extern void RecoveryConflictInterrupt(ProcSignalReason reason); /* called from SIGUSR1
 																 * handler */
-extern void prepare_for_client_read(void);
-extern void client_read_ended(void);
+extern void ProcessClientReadInterrupt(void);
+
 extern void process_postgres_switches(int argc, char *argv[],
 						  GucContext ctx, const char **dbname);
 extern void PostgresMain(int argc, char *argv[],
-- 
2.0.0.rc2.4.g1dc51c6.dirty

0004-Process-die-interrupts-while-reading-writing-from-th.patchtext/x-patch; charset=us-asciiDownload
>From 3e9bc8fdd7a68f8e16e88860a8f305c72baf66e9 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 28 Sep 2014 00:22:39 +0200
Subject: [PATCH 04/10] Process 'die' interrupts while reading/writing from the
 client socket.

Up to now it was impossible to terminate a backend that was trying to
send data to the client when the socket's buffer was already
full. While the send() call itself might have gotten interrupted by
signals on some platforms, we just immediately retried.

Change things so that that we process 'die' events immediately when
the appropriate signal arrives during both recv/send. We can't react
to query cancels at that point, because we might loose sync with the
client as we could be in the middle of writing a message.

We don't interrupt writes if the write buffer isn't full as that would
lead to far fewer error messages reaching clients.

Per discussion with Kyotaro HORIGUCHI and Heikki Linnakangas

Disccussion: 20140927191243.GD5423@alap3.anarazel.de
---
 src/backend/libpq/be-secure-openssl.c |  7 ++++--
 src/backend/libpq/be-secure.c         | 37 +++++++++++++++-------------
 src/backend/tcop/postgres.c           | 46 +++++++++++++++++++++++++++++++++++
 src/include/tcop/tcopprot.h           |  1 +
 4 files changed, 72 insertions(+), 19 deletions(-)

diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 3a70f43..3f7680f 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -634,10 +634,13 @@ wloop:
 			/* Don't retry if the socket is in nonblocking mode. */
 			if (port->noblock)
 				break;
+
 			/*
-			 * XXX: We'll, at some later point, likely want to add interrupt
-			 * processing here.
+			 * Check for interrupts here, in addition to secure_write(),
+			 * because a interrupted write in secure_raw_write() can possibly
+			 * only return to here, not to secure_write().
 			 */
+			ProcessClientWriteInterrupt(true);
 			goto wloop;
 		case SSL_ERROR_SYSCALL:
 			/* leave it to caller to ereport the value of errno */
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index f96b8f3..cf0709a 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -224,18 +224,17 @@ retry:
 		n = secure_raw_write(port, ptr, len);
 	}
 
-	/*
-	 * XXX: We'll, at some later point, likely want to add interrupt
-	 * processing here.
-	 */
-
-	/*
-	 * Retry after processing interrupts. This can be triggered even though we
-	 * don't check for latch set's during writing yet, because SSL
-	 * renegotiations might have required reading from the socket.
-	 */
+	/* retry after processing interrupts */
 	if (n < 0 && errno == EINTR)
 	{
+		/*
+		 * Process interrupts that happened while (or before) writing. We only
+		 * process interrupts if we got interrupted while writing. In other
+		 * cases it's better to allow the interrupts to be handled at higher
+		 * layers.
+		 */
+		ProcessClientWriteInterrupt(!port->noblock);
+
 		goto retry;
 	}
 
@@ -262,16 +261,20 @@ wloop:
 		int		w;
 		int		save_errno = errno;
 
-		/*
-		 * XXX: We'll, at some later point, likely want to add interrupt
-		 * processing here.  If set we'd not retry directly, but return. That
-		 * way we don't do anything while (possibly) inside a ssl library.
-		 */
 		w = WaitLatchOrSocket(MyLatch,
-							  WL_SOCKET_WRITEABLE,
+							  WL_LATCH_SET | WL_SOCKET_WRITEABLE,
 							  port->sock, 0);
 
-		if (w & WL_SOCKET_WRITEABLE)
+		if (w & WL_LATCH_SET)
+		{
+			ResetLatch(MyLatch);
+			/*
+			 * Force a return, so interrupts can be processed when not
+			 * (possibly) underneath a ssl library.
+			 */
+			errno = EINTR;
+		}
+		else if (w & WL_SOCKET_WRITEABLE)
 		{
 			goto wloop;
 		}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index fa1aeb5..75ec53e 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -520,10 +520,56 @@ ProcessClientReadInterrupt(void)
 		if (notifyInterruptPending)
 			ProcessNotifyInterrupt();
 	}
+	else if (ProcDiePending)
+	{
+		/*
+		 * We're dying. It's safe (and sane) to handle that now.
+		 */
+		CHECK_FOR_INTERRUPTS();
+	}
 
 	errno = save_errno;
 }
 
+/*
+ * ProcessClientWriteInterrupt() - Process interrupts specific to client writes
+ *
+ * This is called just after low-level writes. That might be after the read
+ * finished successfully, or it was interrupted via interrupt. 'blocked' tells
+ * us whether the
+ *
+ * Must preserve errno!
+ */
+void
+ProcessClientWriteInterrupt(bool blocked)
+{
+	int			save_errno = errno;
+
+	Assert(InterruptHoldoffCount == 0 && CritSectionCount == 0);
+
+	/*
+	 * We only want to process the interrupt here if socket writes are
+	 * blocking to increase the chance to get an error message to the
+	 * client. If we're not blocked there'll soon be a
+	 * CHECK_FOR_INTERRUPTS(). But if we're blocked we'll never get out of
+	 * that situation if the client has died.
+	 */
+	if (ProcDiePending && blocked)
+	{
+		/*
+		 * We're dying. It's safe (and sane) to handle that now. But we don't
+		 * want to send the client the error message as that a) would possibly
+		 * block again b) would possibly lead to sending an error message to
+		 * the client, while we already started to send something else.
+		 */
+		if (whereToSendOutput == DestRemote)
+			whereToSendOutput = DestNone;
+
+		CHECK_FOR_INTERRUPTS();
+	}
+
+	errno = save_errno;
+}
 
 /*
  * Do raw parsing (only).
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index fe8c725..fc04a3e 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -68,6 +68,7 @@ extern void FloatExceptionHandler(SIGNAL_ARGS) __attribute__((noreturn));
 extern void RecoveryConflictInterrupt(ProcSignalReason reason); /* called from SIGUSR1
 																 * handler */
 extern void ProcessClientReadInterrupt(void);
+extern void ProcessClientWriteInterrupt(bool blocked);
 
 extern void process_postgres_switches(int argc, char *argv[],
 						  GucContext ctx, const char **dbname);
-- 
2.0.0.rc2.4.g1dc51c6.dirty

0005-Move-deadlock-and-other-interrupt-handling-in-proc.c.patchtext/x-patch; charset=us-asciiDownload
>From 265e268270c4fc5d3037b91536070f158512ace8 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 13 Jan 2015 11:58:27 +0100
Subject: [PATCH 05/10] Move deadlock and other interrupt handling in proc.c
 out of signal handlers.

Deadlock checking was performed inside signal handlers up to
now. While it's a remarkable feat to have made this work reliably,
it's quite complex to understand why that is the case. Partially it
worked due to the assumption that semaphores are signal safe - which
is not actually documented to be the case for sysv semaphores.

The reason we had to rely on performing this work inside signal
handlers is that semaphores aren't guaranteed to be interruptable by
signals on all platforms. But now that latches provide a somewhat
similar API, which actually has the guarantee of being interruptible,
we can avoid doing so.

Signalling between ProcSleep, ProcWakeup, ProcWaitForSignal and
ProcSendSignal is now done using latches. This increases the
likelihood of spurious wakeups. As spurious wakeup already were
possible and aren't likely to be frequent enough to be an actual
problem, this seems acceptable.

This change would allow for further simplification of the deadlock
checking, now that it doesn't have to run in a signal handler. But
even if I were motivated to do so right now, it would still be better
to do that separately. Such a cleanup shouldn't have to be reviewed a
the same time as the more fundamental changes in this commit.

There is one possible usability regression due to this commit. Namely
it is more likely than before that log_lock_waits messages are output
more than once.
---
 src/backend/storage/lmgr/proc.c   | 134 ++++++++++++++++++--------------------
 src/backend/utils/init/postinit.c |   2 +-
 src/include/storage/proc.h        |   2 +-
 3 files changed, 65 insertions(+), 73 deletions(-)

diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 65e8afe..d97c244 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -80,13 +80,15 @@ PGPROC	   *PreparedXactProcs = NULL;
 /* If we are waiting for a lock, this points to the associated LOCALLOCK */
 static LOCALLOCK *lockAwaited = NULL;
 
-/* Mark this volatile because it can be changed by signal handler */
-static volatile DeadLockState deadlock_state = DS_NOT_YET_CHECKED;
+static DeadLockState deadlock_state = DS_NOT_YET_CHECKED;
 
+/* Is a deadlock check pending? */
+static volatile sig_atomic_t got_deadlock_timeout;
 
 static void RemoveProcFromArray(int code, Datum arg);
 static void ProcKill(int code, Datum arg);
 static void AuxiliaryProcKill(int code, Datum arg);
+static void CheckDeadLock(void);
 
 
 /*
@@ -699,16 +701,6 @@ LockErrorCleanup(void)
 	lockAwaited = NULL;
 
 	LWLockRelease(partitionLock);
-
-	/*
-	 * We used to do PGSemaphoreReset() here to ensure that our proc's wait
-	 * semaphore is reset to zero.  This prevented a leftover wakeup signal
-	 * from remaining in the semaphore if someone else had granted us the lock
-	 * we wanted before we were able to remove ourselves from the wait-list.
-	 * However, now that ProcSleep loops until waitStatus changes, a leftover
-	 * wakeup signal isn't harmful, and it seems not worth expending cycles to
-	 * get rid of a signal that most likely isn't there.
-	 */
 }
 
 
@@ -799,6 +791,7 @@ ProcKill(int code, Datum arg)
 	proc = MyProc;
 	MyProc = NULL;
 	DisownLatch(&proc->procLatch);
+	SetLatch(MyLatch);
 
 	SpinLockAcquire(ProcStructLock);
 
@@ -868,6 +861,7 @@ AuxiliaryProcKill(int code, Datum arg)
 	proc = MyProc;
 	MyProc = NULL;
 	DisownLatch(&proc->procLatch);
+	SetLatch(MyLatch);
 
 	SpinLockAcquire(ProcStructLock);
 
@@ -935,9 +929,6 @@ ProcQueueInit(PROC_QUEUE *queue)
  *		we release the partition lock.
  *
  * NOTES: The process queue is now a priority queue for locking.
- *
- * P() on the semaphore should put us to sleep.  The process
- * semaphore is normally zero, so when we try to acquire it, we sleep.
  */
 int
 ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
@@ -1077,6 +1068,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 
 	/* Reset deadlock_state before enabling the timeout handler */
 	deadlock_state = DS_NOT_YET_CHECKED;
+	got_deadlock_timeout = false;
 
 	/*
 	 * Set timer so we can wake up after awhile and check for a deadlock. If a
@@ -1106,32 +1098,37 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 		enable_timeout_after(DEADLOCK_TIMEOUT, DeadlockTimeout);
 
 	/*
-	 * If someone wakes us between LWLockRelease and PGSemaphoreLock,
-	 * PGSemaphoreLock will not block.  The wakeup is "saved" by the semaphore
-	 * implementation.  While this is normally good, there are cases where a
-	 * saved wakeup might be leftover from a previous operation (for example,
-	 * we aborted ProcWaitForSignal just before someone did ProcSendSignal).
-	 * So, loop to wait again if the waitStatus shows we haven't been granted
-	 * nor denied the lock yet.
+	 * If somebody wakes us between LWLockRelease and the WaitLatch the latch
+	 * will not wait. But a set latch does not necessarily mean that the lock
+	 * is free now, as there are many other sources for latch sets than
+	 * somebody releasing the lock.
 	 *
-	 * We pass interruptOK = true, which eliminates a window in which
-	 * cancel/die interrupts would be held off undesirably.  This is a promise
-	 * that we don't mind losing control to a cancel/die interrupt here.  We
-	 * don't, because we have no shared-state-change work to do after being
-	 * granted the lock (the grantor did it all).  We do have to worry about
-	 * canceling the deadlock timeout and updating the locallock table, but if
-	 * we lose control to an error, LockErrorCleanup will fix that up.
+	 * We process interrupts whenever the latch has been set, so cancel/die
+	 * interrupts are processed quickly. This means we must not mind losing
+	 * control to a cancel/die interrupt here.  We don't, because we have no
+	 * shared-state-change work to do after being granted the lock (the
+	 * grantor did it all).  We do have to worry about canceling the deadlock
+	 * timeout and updating the locallock table, but if we lose control to an
+	 * error, LockErrorCleanup will fix that up.
 	 */
 	do
 	{
-		PGSemaphoreLock(&MyProc->sem, true);
+		WaitLatch(MyLatch, WL_LATCH_SET, 0);
+		ResetLatch(MyLatch);
+		/* check for deadlocks first, as that's probably log-worthy */
+		if (got_deadlock_timeout)
+		{
+			CheckDeadLock();
+			got_deadlock_timeout = false;
+		}
+		CHECK_FOR_INTERRUPTS();
 
 		/*
 		 * waitStatus could change from STATUS_WAITING to something else
 		 * asynchronously.  Read it just once per loop to prevent surprising
 		 * behavior (such as missing log messages).
 		 */
-		myWaitStatus = MyProc->waitStatus;
+		myWaitStatus = *((volatile int *) &MyProc->waitStatus);
 
 		/*
 		 * If we are not deadlocked, but are waiting on an autovacuum-induced
@@ -1428,8 +1425,7 @@ ProcWakeup(PGPROC *proc, int waitStatus)
 	proc->waitStatus = waitStatus;
 
 	/* And awaken it */
-	PGSemaphoreUnlock(&proc->sem);
-
+	SetLatch(&proc->procLatch);
 	return retProc;
 }
 
@@ -1495,17 +1491,13 @@ ProcLockWakeup(LockMethod lockMethodTable, LOCK *lock)
 /*
  * CheckDeadLock
  *
- * We only get to this routine if the DEADLOCK_TIMEOUT fired
- * while waiting for a lock to be released by some other process.  Look
- * to see if there's a deadlock; if not, just return and continue waiting.
- * (But signal ProcSleep to log a message, if log_lock_waits is true.)
- * If we have a real deadlock, remove ourselves from the lock's wait queue
- * and signal an error to ProcSleep.
- *
- * NB: this is run inside a signal handler, so be very wary about what is done
- * here or in called routines.
+ * We only get to this routine if the DEADLOCK_TIMEOUT fired while waiting for
+ * a lock to be released by some other process.  Look to see if there's a
+ * deadlock; if not, just return.  (But signal ProcSleep to log a message, if
+ * log_lock_waits is true.)  If we have a real deadlock, remove ourselves from
+ * the lock's wait queue and signal an error to ProcSleep.
  */
-void
+static void
 CheckDeadLock(void)
 {
 	int			i;
@@ -1564,12 +1556,6 @@ CheckDeadLock(void)
 		RemoveFromWaitQueue(MyProc, LockTagHashCode(&(MyProc->waitLock->tag)));
 
 		/*
-		 * Unlock my semaphore so that the interrupted ProcSleep() call can
-		 * finish.
-		 */
-		PGSemaphoreUnlock(&MyProc->sem);
-
-		/*
 		 * We're done here.  Transaction abort caused by the error that
 		 * ProcSleep will raise will cause any other locks we hold to be
 		 * released, thus allowing other processes to wake up; we don't need
@@ -1580,19 +1566,6 @@ CheckDeadLock(void)
 		 * RemoveFromWaitQueue took care of waking up any such processes.
 		 */
 	}
-	else if (log_lock_waits || deadlock_state == DS_BLOCKED_BY_AUTOVACUUM)
-	{
-		/*
-		 * Unlock my semaphore so that the interrupted ProcSleep() call can
-		 * print the log message (we daren't do it here because we are inside
-		 * a signal handler).  It will then sleep again until someone releases
-		 * the lock.
-		 *
-		 * If blocked by autovacuum, this wakeup will enable ProcSleep to send
-		 * the canceling signal to the autovacuum worker.
-		 */
-		PGSemaphoreUnlock(&MyProc->sem);
-	}
 
 	/*
 	 * And release locks.  We do this in reverse order for two reasons: (1)
@@ -1606,22 +1579,39 @@ check_done:
 		LWLockRelease(LockHashPartitionLockByIndex(i));
 }
 
+/*
+ * CheckDeadLockAlert - Handle the expiry of deadlock_timeout.
+ *
+ * NB: Runs inside a signal handler, be careful.
+ */
+void
+CheckDeadLockAlert(void)
+{
+	int			save_errno = errno;
+
+	got_deadlock_timeout = true;
+	/*
+	 * Have to set the latch again, even if handle_sig_alarm already did. Back
+	 * then got_deadlock_timeout wasn't yet set... It's unlikely that this
+	 * ever would be a problem, but setting a set latch again is cheap.
+	 */
+	SetLatch(MyLatch);
+	errno = save_errno;
+}
 
 /*
  * ProcWaitForSignal - wait for a signal from another backend.
  *
- * This can share the semaphore normally used for waiting for locks,
- * since a backend could never be waiting for a lock and a signal at
- * the same time.  As with locks, it's OK if the signal arrives just
- * before we actually reach the waiting state.  Also as with locks,
- * it's necessary that the caller be robust against bogus wakeups:
- * always check that the desired state has occurred, and wait again
- * if not.  This copes with possible "leftover" wakeups.
+ * As this uses the generic process latch the caller has to be robust against
+ * unrelated wakeups: Always check that the desired state has occurred, and
+ * wait again if not.
  */
 void
 ProcWaitForSignal(void)
 {
-	PGSemaphoreLock(&MyProc->sem, true);
+	WaitLatch(MyLatch, WL_LATCH_SET, 0);
+	ResetLatch(MyLatch);
+	CHECK_FOR_INTERRUPTS();
 }
 
 /*
@@ -1657,5 +1647,7 @@ ProcSendSignal(int pid)
 		proc = BackendPidGetProc(pid);
 
 	if (proc != NULL)
-		PGSemaphoreUnlock(&proc->sem);
+	{
+		SetLatch(&proc->procLatch);
+	}
 }
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 1f5cf06..600fb04 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -575,7 +575,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
 	 */
 	if (!bootstrap)
 	{
-		RegisterTimeout(DEADLOCK_TIMEOUT, CheckDeadLock);
+		RegisterTimeout(DEADLOCK_TIMEOUT, CheckDeadLockAlert);
 		RegisterTimeout(STATEMENT_TIMEOUT, StatementTimeoutHandler);
 		RegisterTimeout(LOCK_TIMEOUT, LockTimeoutHandler);
 	}
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index d194f38..e807a2e 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -251,7 +251,7 @@ extern void ProcQueueInit(PROC_QUEUE *queue);
 extern int	ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable);
 extern PGPROC *ProcWakeup(PGPROC *proc, int waitStatus);
 extern void ProcLockWakeup(LockMethod lockMethodTable, LOCK *lock);
-extern void CheckDeadLock(void);
+extern void CheckDeadLockAlert(void);
 extern bool IsWaitingForLock(void);
 extern void LockErrorCleanup(void);
 
-- 
2.0.0.rc2.4.g1dc51c6.dirty

0006-Don-t-allow-immediate-interupts-during-authenticatio.patchtext/x-patch; charset=us-asciiDownload
>From a036d8e2245138c2f9efff2844dc8aad5efed803 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 15 Jan 2015 00:13:37 +0100
Subject: [PATCH 06/10] Don't allow immediate interupts during authentication
 anymore.

We used to handle authentication_timeout by setting
ImmediateInterruptOK to true during large parts of the authentication
phase of a new connection.  While that happens to work acceptably in
practice, it's not particularly nice and has ugly corner cases.

Previous commits converted the FE/BE communication to use latches and
implemented support for interrupt handling during both
send/recv. Building on top of that work we can get rid of
ImmediateInterruptOK during authentication, by immediately treating
timeouts during authentication as a reason to die. As die interrupts
are handled immediately during client communication that provides a
sensibly quick reaction time to authentication timeout.

Additionally add a few CHECK_FOR_INTERRUPTS() to some more complex
authentication methods. More could be added, but this already should
provides a reasonable coverage.

While it this overall reduces the maximum time till a timeout is
reacted to, it greatly reduces complexity and increases
reliability. That seems like a overall win.
---
 src/backend/libpq/auth.c          | 29 +++++++++++++++++++----------
 src/backend/libpq/crypt.c         | 10 ----------
 src/backend/tcop/postgres.c       | 19 ++++++++-----------
 src/backend/utils/init/postinit.c | 12 +++++++-----
 4 files changed, 34 insertions(+), 36 deletions(-)

diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index e9c8e56..15eaed6 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -306,13 +306,6 @@ ClientAuthentication(Port *port)
 	 */
 	hba_getauthmethod(port);
 
-	/*
-	 * Enable immediate response to SIGTERM/SIGINT/timeout interrupts. (We
-	 * don't want this during hba_getauthmethod() because it might have to do
-	 * database access, eg for role membership checks.)
-	 */
-	ImmediateInterruptOK = true;
-	/* And don't forget to detect one that already arrived */
 	CHECK_FOR_INTERRUPTS();
 
 	/*
@@ -566,9 +559,6 @@ ClientAuthentication(Port *port)
 		sendAuthRequest(port, AUTH_REQ_OK);
 	else
 		auth_failed(port, status, logdetail);
-
-	/* Done with authentication, so we should turn off immediate interrupts */
-	ImmediateInterruptOK = false;
 }
 
 
@@ -580,6 +570,8 @@ sendAuthRequest(Port *port, AuthRequest areq)
 {
 	StringInfoData buf;
 
+	CHECK_FOR_INTERRUPTS();
+
 	pq_beginmessage(&buf, 'R');
 	pq_sendint(&buf, (int32) areq, sizeof(int32));
 
@@ -613,6 +605,8 @@ sendAuthRequest(Port *port, AuthRequest areq)
 	 */
 	if (areq != AUTH_REQ_OK)
 		pq_flush();
+
+	CHECK_FOR_INTERRUPTS();
 }
 
 /*
@@ -849,6 +843,8 @@ pg_GSS_recvauth(Port *port)
 	 */
 	do
 	{
+		CHECK_FOR_INTERRUPTS();
+
 		mtype = pq_getbyte();
 		if (mtype != 'p')
 		{
@@ -898,6 +894,8 @@ pg_GSS_recvauth(Port *port)
 			 maj_stat, min_stat,
 			 (unsigned int) port->gss->outbuf.length, gflags);
 
+		CHECK_FOR_INTERRUPTS();
+
 		if (port->gss->outbuf.length != 0)
 		{
 			/*
@@ -1393,6 +1391,9 @@ interpret_ident_response(const char *ident_response,
  *	IP addresses and port numbers are in network byte order.
  *
  *	But iff we're unable to get the information from ident, return false.
+ *
+ *	XXX: Using WaitLatchOrSocket() and doing a CHECK_FOR_INTERRUPTS() if the
+ *	latch was set would improve the responsiveness to timeouts/cancellations.
  */
 static int
 ident_inet(hbaPort *port)
@@ -1507,6 +1508,8 @@ ident_inet(hbaPort *port)
 	/* loop in case send is interrupted */
 	do
 	{
+		CHECK_FOR_INTERRUPTS();
+
 		rc = send(sock_fd, ident_query, strlen(ident_query), 0);
 	} while (rc < 0 && errno == EINTR);
 
@@ -1522,6 +1525,8 @@ ident_inet(hbaPort *port)
 
 	do
 	{
+		CHECK_FOR_INTERRUPTS();
+
 		rc = recv(sock_fd, ident_response, sizeof(ident_response) - 1, 0);
 	} while (rc < 0 && errno == EINTR);
 
@@ -2410,6 +2415,10 @@ CheckRADIUSAuth(Port *port)
 	 * call to select() with a timeout, since somebody can be sending invalid
 	 * packets to our port thus causing us to retry in a loop and never time
 	 * out.
+	 *
+	 * XXX: Using WaitLatchOrSocket() and doing a CHECK_FOR_INTERRUPTS() if
+	 * the latch was set would improve the responsiveness to
+	 * timeouts/cancellations.
 	 */
 	gettimeofday(&endtime, NULL);
 	endtime.tv_sec += RADIUS_TIMEOUT;
diff --git a/src/backend/libpq/crypt.c b/src/backend/libpq/crypt.c
index 599b63a..97be944 100644
--- a/src/backend/libpq/crypt.c
+++ b/src/backend/libpq/crypt.c
@@ -47,13 +47,6 @@ md5_crypt_verify(const Port *port, const char *role, char *client_pass,
 	Datum		datum;
 	bool		isnull;
 
-	/*
-	 * Disable immediate interrupts while doing database access.  (Note we
-	 * don't bother to turn this back on if we hit one of the failure
-	 * conditions, since we can expect we'll just exit right away anyway.)
-	 */
-	ImmediateInterruptOK = false;
-
 	/* Get role info from pg_authid */
 	roleTup = SearchSysCache1(AUTHNAME, PointerGetDatum(role));
 	if (!HeapTupleIsValid(roleTup))
@@ -80,9 +73,6 @@ md5_crypt_verify(const Port *port, const char *role, char *client_pass,
 	if (*shadow_pass == '\0')
 		return STATUS_ERROR;	/* empty password */
 
-	/* Re-enable immediate response to SIGTERM/SIGINT/timeout interrupts */
-	ImmediateInterruptOK = true;
-	/* And don't forget to detect one that already arrived */
 	CHECK_FOR_INTERRUPTS();
 
 	/*
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 75ec53e..0723c81 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2867,7 +2867,14 @@ ProcessInterrupts(void)
 		/* As in quickdie, don't risk sending to client during auth */
 		if (ClientAuthInProgress && whereToSendOutput == DestRemote)
 			whereToSendOutput = DestNone;
-		if (IsAutoVacuumWorkerProcess())
+
+		if (ClientAuthInProgress)
+		{
+			ereport(FATAL,
+					(errcode(ERRCODE_QUERY_CANCELED),
+					 errmsg("canceling authentication due to timeout")));
+		}
+		else if (IsAutoVacuumWorkerProcess())
 			ereport(FATAL,
 					(errcode(ERRCODE_ADMIN_SHUTDOWN),
 					 errmsg("terminating autovacuum process due to administrator command")));
@@ -2907,16 +2914,6 @@ ProcessInterrupts(void)
 	if (QueryCancelPending)
 	{
 		QueryCancelPending = false;
-		if (ClientAuthInProgress)
-		{
-			ImmediateInterruptOK = false;		/* not idle anymore */
-			/* As in quickdie, don't risk sending to client during auth */
-			if (whereToSendOutput == DestRemote)
-				whereToSendOutput = DestNone;
-			ereport(ERROR,
-					(errcode(ERRCODE_QUERY_CANCELED),
-					 errmsg("canceling authentication due to timeout")));
-		}
 
 		/*
 		 * If LOCK_TIMEOUT and STATEMENT_TIMEOUT indicators are both set, we
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 600fb04..bd8b3df 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -1096,18 +1096,20 @@ ShutdownPostgres(int code, Datum arg)
 static void
 StatementTimeoutHandler(void)
 {
+	int sig = SIGINT;
+
+	if (ClientAuthInProgress)
+		sig = SIGTERM;
+
 #ifdef HAVE_SETSID
 	/* try to signal whole process group */
-	kill(-MyProcPid, SIGINT);
+	kill(-MyProcPid, sig);
 #endif
-	kill(MyProcPid, SIGINT);
+	kill(MyProcPid, sig);
 }
 
 /*
  * LOCK_TIMEOUT handler: trigger a query-cancel interrupt.
- *
- * This is identical to StatementTimeoutHandler, but since it's so short,
- * we might as well keep the two functions separate for clarity.
  */
 static void
 LockTimeoutHandler(void)
-- 
2.0.0.rc2.4.g1dc51c6.dirty

0007-Remove-the-option-to-service-interrupts-during-PGSem.patchtext/x-patch; charset=us-asciiDownload
>From 3b1d5306e9bde187971f17374c8e8ac4912485bf Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 15 Jan 2015 00:54:40 +0100
Subject: [PATCH 07/10] Remove the option to service interrupts during
 PGSemaphoreLock().

The remaining caller (lwlocks) don't need that facility, and we plan
to remove ImmedidateInterruptOK. The latter means that interrupts
can't be serviced race-free and portably anyway, so there's little
reason for keeping the feature.
---
 src/backend/port/posix_sema.c     | 12 ++---------
 src/backend/port/sysv_sema.c      | 43 ++++-----------------------------------
 src/backend/port/win32_sema.c     |  6 +-----
 src/backend/storage/lmgr/lwlock.c | 12 ++++-------
 src/include/storage/pg_sema.h     |  2 +-
 5 files changed, 12 insertions(+), 63 deletions(-)

diff --git a/src/backend/port/posix_sema.c b/src/backend/port/posix_sema.c
index 633bf5a..ad9c80a 100644
--- a/src/backend/port/posix_sema.c
+++ b/src/backend/port/posix_sema.c
@@ -236,22 +236,14 @@ PGSemaphoreReset(PGSemaphore sema)
  * Lock a semaphore (decrement count), blocking if count would be < 0
  */
 void
-PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
+PGSemaphoreLock(PGSemaphore sema)
 {
 	int			errStatus;
 
-	/*
-	 * See notes in sysv_sema.c's implementation of PGSemaphoreLock. Just as
-	 * that code does for semop(), we handle both the case where sem_wait()
-	 * returns errno == EINTR after a signal, and the case where it just keeps
-	 * waiting.
-	 */
+	/* See notes in sysv_sema.c's implementation of PGSemaphoreLock. */
 	do
 	{
-		ImmediateInterruptOK = interruptOK;
-		CHECK_FOR_INTERRUPTS();
 		errStatus = sem_wait(PG_SEM_REF(sema));
-		ImmediateInterruptOK = false;
 	} while (errStatus < 0 && errno == EINTR);
 
 	if (errStatus < 0)
diff --git a/src/backend/port/sysv_sema.c b/src/backend/port/sysv_sema.c
index 5a0193f..ba46c64 100644
--- a/src/backend/port/sysv_sema.c
+++ b/src/backend/port/sysv_sema.c
@@ -364,7 +364,7 @@ PGSemaphoreReset(PGSemaphore sema)
  * Lock a semaphore (decrement count), blocking if count would be < 0
  */
 void
-PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
+PGSemaphoreLock(PGSemaphore sema)
 {
 	int			errStatus;
 	struct sembuf sops;
@@ -378,48 +378,13 @@ PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
 	 * from the operation prematurely because we were sent a signal.  So we
 	 * try and lock the semaphore again.
 	 *
-	 * Each time around the loop, we check for a cancel/die interrupt.  On
-	 * some platforms, if such an interrupt comes in while we are waiting, it
-	 * will cause the semop() call to exit with errno == EINTR, allowing us to
-	 * service the interrupt (if not in a critical section already) during the
-	 * next loop iteration.
-	 *
-	 * Once we acquire the lock, we do NOT check for an interrupt before
-	 * returning.  The caller needs to be able to record ownership of the lock
-	 * before any interrupt can be accepted.
-	 *
-	 * There is a window of a few instructions between CHECK_FOR_INTERRUPTS
-	 * and entering the semop() call.  If a cancel/die interrupt occurs in
-	 * that window, we would fail to notice it until after we acquire the lock
-	 * (or get another interrupt to escape the semop()).  We can avoid this
-	 * problem by temporarily setting ImmediateInterruptOK to true before we
-	 * do CHECK_FOR_INTERRUPTS; then, a die() interrupt in this interval will
-	 * execute directly.  However, there is a huge pitfall: there is another
-	 * window of a few instructions after the semop() before we are able to
-	 * reset ImmediateInterruptOK.  If an interrupt occurs then, we'll lose
-	 * control, which means that the lock has been acquired but our caller did
-	 * not get a chance to record the fact. Therefore, we only set
-	 * ImmediateInterruptOK if the caller tells us it's OK to do so, ie, the
-	 * caller does not need to record acquiring the lock.  (This is currently
-	 * true for lockmanager locks, since the process that granted us the lock
-	 * did all the necessary state updates. It's not true for SysV semaphores
-	 * used to implement LW locks or emulate spinlocks --- but the wait time
-	 * for such locks should not be very long, anyway.)
-	 *
-	 * On some platforms, signals marked SA_RESTART (which is most, for us)
-	 * will not interrupt the semop(); it will just keep waiting.  Therefore
-	 * it's necessary for cancel/die interrupts to be serviced directly by the
-	 * signal handler.  On these platforms the behavior is really the same
-	 * whether the signal arrives just before the semop() begins, or while it
-	 * is waiting.  The loop on EINTR is thus important only for platforms
-	 * without SA_RESTART.
+	 * We used to check interrupts here, but that required servicing
+	 * interrupts directly from signal handlers. Which is hard to do safely
+	 * and portably.
 	 */
 	do
 	{
-		ImmediateInterruptOK = interruptOK;
-		CHECK_FOR_INTERRUPTS();
 		errStatus = semop(sema->semId, &sops, 1);
-		ImmediateInterruptOK = false;
 	} while (errStatus < 0 && errno == EINTR);
 
 	if (errStatus < 0)
diff --git a/src/backend/port/win32_sema.c b/src/backend/port/win32_sema.c
index f848ff8..011e2fd 100644
--- a/src/backend/port/win32_sema.c
+++ b/src/backend/port/win32_sema.c
@@ -116,13 +116,11 @@ PGSemaphoreReset(PGSemaphore sema)
  * Serve the interrupt if interruptOK is true.
  */
 void
-PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
+PGSemaphoreLock(PGSemaphore sema)
 {
 	HANDLE		wh[2];
 	bool		done = false;
 
-	ImmediateInterruptOK = interruptOK;
-
 	/*
 	 * Note: pgwin32_signal_event should be first to ensure that it will be
 	 * reported when multiple events are set.  We want to guarantee that
@@ -173,8 +171,6 @@ PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
 				break;
 		}
 	}
-
-	ImmediateInterruptOK = false;
 }
 
 /*
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 7cb0023..7ca8dc0 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -863,8 +863,7 @@ LWLockDequeueSelf(LWLock *lock)
 		 */
 		for (;;)
 		{
-			/* "false" means cannot accept cancel/die interrupt here. */
-			PGSemaphoreLock(&MyProc->sem, false);
+			PGSemaphoreLock(&MyProc->sem);
 			if (!MyProc->lwWaiting)
 				break;
 			extraWaits++;
@@ -1034,8 +1033,7 @@ LWLockAcquireCommon(LWLock *lock, LWLockMode mode, uint64 *valptr, uint64 val)
 
 		for (;;)
 		{
-			/* "false" means cannot accept cancel/die interrupt here. */
-			PGSemaphoreLock(&proc->sem, false);
+			PGSemaphoreLock(&proc->sem);
 			if (!proc->lwWaiting)
 				break;
 			extraWaits++;
@@ -1195,8 +1193,7 @@ LWLockAcquireOrWait(LWLock *lock, LWLockMode mode)
 
 			for (;;)
 			{
-				/* "false" means cannot accept cancel/die interrupt here. */
-				PGSemaphoreLock(&proc->sem, false);
+				PGSemaphoreLock(&proc->sem);
 				if (!proc->lwWaiting)
 					break;
 				extraWaits++;
@@ -1397,8 +1394,7 @@ LWLockWaitForVar(LWLock *lock, uint64 *valptr, uint64 oldval, uint64 *newval)
 
 		for (;;)
 		{
-			/* "false" means cannot accept cancel/die interrupt here. */
-			PGSemaphoreLock(&proc->sem, false);
+			PGSemaphoreLock(&proc->sem);
 			if (!proc->lwWaiting)
 				break;
 			extraWaits++;
diff --git a/src/include/storage/pg_sema.h b/src/include/storage/pg_sema.h
index 5eaf2bf..9f8be75 100644
--- a/src/include/storage/pg_sema.h
+++ b/src/include/storage/pg_sema.h
@@ -72,7 +72,7 @@ extern void PGSemaphoreCreate(PGSemaphore sema);
 extern void PGSemaphoreReset(PGSemaphore sema);
 
 /* Lock a semaphore (decrement count), blocking if count would be < 0 */
-extern void PGSemaphoreLock(PGSemaphore sema, bool interruptOK);
+extern void PGSemaphoreLock(PGSemaphore sema);
 
 /* Unlock a semaphore (increment count) */
 extern void PGSemaphoreUnlock(PGSemaphore sema);
-- 
2.0.0.rc2.4.g1dc51c6.dirty

0008-Remove-remnants-of-ImmediateInterruptOK-handling.patchtext/x-patch; charset=us-asciiDownload
>From 6dfca6b69a5d6f8bb36b3a7df1b7e4005d166b3d Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 15 Jan 2015 00:57:12 +0100
Subject: [PATCH 08/10] Remove remnants of ImmediateInterruptOK handling.

Now that nothing sets ImmediateInterruptOK to true anymore, we can
remove all the supporting code.
---
 src/backend/storage/ipc/ipc.c         |  2 --
 src/backend/tcop/postgres.c           | 52 -----------------------------------
 src/backend/utils/error/elog.c        | 29 ++-----------------
 src/backend/utils/init/globals.c      |  1 -
 src/backend/utils/misc/timeout.c      | 38 ++-----------------------
 src/include/miscadmin.h               |  1 -
 src/test/modules/test_shm_mq/worker.c |  6 +---
 7 files changed, 7 insertions(+), 122 deletions(-)

diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
index f0f7939..6bc0b06 100644
--- a/src/backend/storage/ipc/ipc.c
+++ b/src/backend/storage/ipc/ipc.c
@@ -165,8 +165,6 @@ proc_exit_prepare(int code)
 	InterruptPending = false;
 	ProcDiePending = false;
 	QueryCancelPending = false;
-	/* And let's just make *sure* we're not interrupted ... */
-	ImmediateInterruptOK = false;
 	InterruptHoldoffCount = 1;
 	CritSectionCount = 0;
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 0723c81..9620b4d 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2622,21 +2622,6 @@ die(SIGNAL_ARGS)
 	{
 		InterruptPending = true;
 		ProcDiePending = true;
-
-		/*
-		 * If it's safe to interrupt, and we're waiting for a lock, service
-		 * the interrupt immediately
-		 */
-		if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
-			CritSectionCount == 0)
-		{
-			/* bump holdoff count to make ProcessInterrupts() a no-op */
-			/* until we are done getting ready for it */
-			InterruptHoldoffCount++;
-			LockErrorCleanup(); /* prevent CheckDeadLock from running */
-			InterruptHoldoffCount--;
-			ProcessInterrupts();
-		}
 	}
 
 	/* If we're still here, waken anything waiting on the process latch */
@@ -2661,21 +2646,6 @@ StatementCancelHandler(SIGNAL_ARGS)
 	{
 		InterruptPending = true;
 		QueryCancelPending = true;
-
-		/*
-		 * If it's safe to interrupt, and we're waiting for a lock, service
-		 * the interrupt immediately
-		 */
-		if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
-			CritSectionCount == 0)
-		{
-			/* bump holdoff count to make ProcessInterrupts() a no-op */
-			/* until we are done getting ready for it */
-			InterruptHoldoffCount++;
-			LockErrorCleanup(); /* prevent CheckDeadLock from running */
-			InterruptHoldoffCount--;
-			ProcessInterrupts();
-		}
 	}
 
 	/* If we're still here, waken anything waiting on the process latch */
@@ -2816,21 +2786,6 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
 		 */
 		if (reason == PROCSIG_RECOVERY_CONFLICT_DATABASE)
 			RecoveryConflictRetryable = false;
-
-		/*
-		 * If it's safe to interrupt, and we're waiting for a lock, service
-		 * the interrupt immediately
-		 */
-		if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
-			CritSectionCount == 0)
-		{
-			/* bump holdoff count to make ProcessInterrupts() a no-op */
-			/* until we are done getting ready for it */
-			InterruptHoldoffCount++;
-			LockErrorCleanup(); /* prevent CheckDeadLock from running */
-			InterruptHoldoffCount--;
-			ProcessInterrupts();
-		}
 	}
 
 	/*
@@ -2863,7 +2818,6 @@ ProcessInterrupts(void)
 	{
 		ProcDiePending = false;
 		QueryCancelPending = false;		/* ProcDie trumps QueryCancel */
-		ImmediateInterruptOK = false;	/* not idle anymore */
 		/* As in quickdie, don't risk sending to client during auth */
 		if (ClientAuthInProgress && whereToSendOutput == DestRemote)
 			whereToSendOutput = DestNone;
@@ -2904,7 +2858,6 @@ ProcessInterrupts(void)
 	if (ClientConnectionLost)
 	{
 		QueryCancelPending = false;		/* lost connection trumps QueryCancel */
-		ImmediateInterruptOK = false;	/* not idle anymore */
 		/* don't send to client, we already know the connection to be dead. */
 		whereToSendOutput = DestNone;
 		ereport(FATAL,
@@ -2921,7 +2874,6 @@ ProcessInterrupts(void)
 		 */
 		if (get_timeout_indicator(LOCK_TIMEOUT, true))
 		{
-			ImmediateInterruptOK = false;		/* not idle anymore */
 			(void) get_timeout_indicator(STATEMENT_TIMEOUT, true);
 			ereport(ERROR,
 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
@@ -2929,21 +2881,18 @@ ProcessInterrupts(void)
 		}
 		if (get_timeout_indicator(STATEMENT_TIMEOUT, true))
 		{
-			ImmediateInterruptOK = false;		/* not idle anymore */
 			ereport(ERROR,
 					(errcode(ERRCODE_QUERY_CANCELED),
 					 errmsg("canceling statement due to statement timeout")));
 		}
 		if (IsAutoVacuumWorkerProcess())
 		{
-			ImmediateInterruptOK = false;		/* not idle anymore */
 			ereport(ERROR,
 					(errcode(ERRCODE_QUERY_CANCELED),
 					 errmsg("canceling autovacuum task")));
 		}
 		if (RecoveryConflictPending)
 		{
-			ImmediateInterruptOK = false;		/* not idle anymore */
 			RecoveryConflictPending = false;
 			pgstat_report_recovery_conflict(RecoveryConflictReason);
 			if (DoingCommandRead)
@@ -2967,7 +2916,6 @@ ProcessInterrupts(void)
 		 */
 		if (!DoingCommandRead)
 		{
-			ImmediateInterruptOK = false;		/* not idle anymore */
 			ereport(ERROR,
 					(errcode(ERRCODE_QUERY_CANCELED),
 					 errmsg("canceling statement due to user request")));
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index bee2c92..136f731 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -412,7 +412,6 @@ errfinish(int dummy,...)
 {
 	ErrorData  *edata = &errordata[errordata_stack_depth];
 	int			elevel;
-	bool		save_ImmediateInterruptOK;
 	MemoryContext oldcontext;
 	ErrorContextCallback *econtext;
 
@@ -421,18 +420,6 @@ errfinish(int dummy,...)
 	elevel = edata->elevel;
 
 	/*
-	 * Ensure we can't get interrupted while performing error reporting.  This
-	 * is needed to prevent recursive entry to functions like syslog(), which
-	 * may not be re-entrant.
-	 *
-	 * Note: other places that save-and-clear ImmediateInterruptOK also do
-	 * HOLD_INTERRUPTS(), but that should not be necessary here since we don't
-	 * call anything that could turn on ImmediateInterruptOK.
-	 */
-	save_ImmediateInterruptOK = ImmediateInterruptOK;
-	ImmediateInterruptOK = false;
-
-	/*
 	 * Do processing in ErrorContext, which we hope has enough reserved space
 	 * to report an error.
 	 */
@@ -463,10 +450,6 @@ errfinish(int dummy,...)
 		 * itself be inside a holdoff section.  If necessary, such a handler
 		 * could save and restore InterruptHoldoffCount for itself, but this
 		 * should make life easier for most.)
-		 *
-		 * Note that we intentionally don't restore ImmediateInterruptOK here,
-		 * even if it was set at entry.  We definitely don't want that on
-		 * while doing error cleanup.
 		 */
 		InterruptHoldoffCount = 0;
 
@@ -572,15 +555,9 @@ errfinish(int dummy,...)
 	}
 
 	/*
-	 * We reach here if elevel <= WARNING.  OK to return to caller, so restore
-	 * caller's setting of ImmediateInterruptOK.
-	 */
-	ImmediateInterruptOK = save_ImmediateInterruptOK;
-
-	/*
-	 * But check for cancel/die interrupt first --- this is so that the user
-	 * can stop a query emitting tons of notice or warning messages, even if
-	 * it's in a loop that otherwise fails to check for interrupts.
+	 * Check for cancel/die interrupt first --- this is so that the user can
+	 * stop a query emitting tons of notice or warning messages, even if it's
+	 * in a loop that otherwise fails to check for interrupts.
 	 */
 	CHECK_FOR_INTERRUPTS();
 }
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index c35867b..8d3d51b 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -30,7 +30,6 @@ volatile bool InterruptPending = false;
 volatile bool QueryCancelPending = false;
 volatile bool ProcDiePending = false;
 volatile bool ClientConnectionLost = false;
-volatile bool ImmediateInterruptOK = false;
 volatile uint32 InterruptHoldoffCount = 0;
 volatile uint32 CritSectionCount = 0;
 
diff --git a/src/backend/utils/misc/timeout.c b/src/backend/utils/misc/timeout.c
index ce4bc13..a7ec665 100644
--- a/src/backend/utils/misc/timeout.c
+++ b/src/backend/utils/misc/timeout.c
@@ -259,27 +259,12 @@ static void
 handle_sig_alarm(SIGNAL_ARGS)
 {
 	int			save_errno = errno;
-	bool		save_ImmediateInterruptOK = ImmediateInterruptOK;
 
 	/*
-	 * We may be executing while ImmediateInterruptOK is true (e.g., when
-	 * mainline is waiting for a lock).  If SIGINT or similar arrives while
-	 * this code is running, we'd lose control and perhaps leave our data
-	 * structures in an inconsistent state.  Disable immediate interrupts, and
-	 * just to be real sure, bump the holdoff counter as well.  (The reason
-	 * for this belt-and-suspenders-too approach is to make sure that nothing
-	 * bad happens if a timeout handler calls code that manipulates
-	 * ImmediateInterruptOK.)
-	 *
-	 * Note: it's possible for a SIGINT to interrupt handle_sig_alarm before
-	 * we manage to do this; the net effect would be as if the SIGALRM event
-	 * had been silently lost.  Therefore error recovery must include some
-	 * action that will allow any lost interrupt to be rescheduled.  Disabling
-	 * some or all timeouts is sufficient, or if that's not appropriate,
-	 * reschedule_timeouts() can be called.  Also, the signal blocking hazard
-	 * described below applies here too.
+	 * Bump the holdoff counter, to make sure nothing we call will process
+	 * interrupts directly. No timeout handler should do that, but these
+	 * failures are hard to debug, so better be sure.
 	 */
-	ImmediateInterruptOK = false;
 	HOLD_INTERRUPTS();
 
 	/*
@@ -332,24 +317,7 @@ handle_sig_alarm(SIGNAL_ARGS)
 		}
 	}
 
-	/*
-	 * Re-allow query cancel, and then try to service any cancel request that
-	 * arrived meanwhile (this might in particular include a cancel request
-	 * fired by one of the timeout handlers).  Since we are in a signal
-	 * handler, we mustn't call ProcessInterrupts unless ImmediateInterruptOK
-	 * is set; if it isn't, the cancel will happen at the next mainline
-	 * CHECK_FOR_INTERRUPTS.
-	 *
-	 * Note: a longjmp from here is safe so far as our own data structures are
-	 * concerned; but on platforms that block a signal before calling the
-	 * handler and then un-block it on return, longjmping out of the signal
-	 * handler leaves SIGALRM still blocked.  Error cleanup is responsible for
-	 * unblocking any blocked signals.
-	 */
 	RESUME_INTERRUPTS();
-	ImmediateInterruptOK = save_ImmediateInterruptOK;
-	if (save_ImmediateInterruptOK)
-		CHECK_FOR_INTERRUPTS();
 
 	errno = save_errno;
 }
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 6e33a17..9bc6b71 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -80,7 +80,6 @@ extern PGDLLIMPORT volatile bool ProcDiePending;
 extern volatile bool ClientConnectionLost;
 
 /* these are marked volatile because they are examined by signal handlers: */
-extern PGDLLIMPORT volatile bool ImmediateInterruptOK;
 extern PGDLLIMPORT volatile uint32 InterruptHoldoffCount;
 extern PGDLLIMPORT volatile uint32 CritSectionCount;
 
diff --git a/src/test/modules/test_shm_mq/worker.c b/src/test/modules/test_shm_mq/worker.c
index a9d9e0e..691a568 100644
--- a/src/test/modules/test_shm_mq/worker.c
+++ b/src/test/modules/test_shm_mq/worker.c
@@ -60,13 +60,9 @@ test_shm_mq_main(Datum main_arg)
 	 *
 	 * We want CHECK_FOR_INTERRUPTS() to kill off this worker process just as
 	 * it would a normal user backend.  To make that happen, we establish a
-	 * signal handler that is a stripped-down version of die().  We don't have
-	 * any equivalent of the backend's command-read loop, where interrupts can
-	 * be processed immediately, so make sure ImmediateInterruptOK is turned
-	 * off.
+	 * signal handler that is a stripped-down version of die().
 	 */
 	pqsignal(SIGTERM, handle_sigterm);
-	ImmediateInterruptOK = false;
 	BackgroundWorkerUnblockSignals();
 
 	/*
-- 
2.0.0.rc2.4.g1dc51c6.dirty

0009-WIP-lwlock.c-use-latches.patchtext/x-patch; charset=us-asciiDownload
>From a7a2548c17c3881068053fc9bf040c9cc0c2b533 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 13 Jan 2015 11:58:04 +0100
Subject: [PATCH 09/10] WIP: lwlock.c: use latches

---
 src/backend/storage/lmgr/lwlock.c | 89 ++++++++++++++++++++++++++++++++-------
 1 file changed, 73 insertions(+), 16 deletions(-)

diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 7ca8dc0..c159cce 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -107,6 +107,8 @@ extern slock_t *ShmemLock;
 /* Must be greater than MAX_BACKENDS - which is 2^23-1, so we're fine. */
 #define LW_SHARED_MASK				((uint32)(1 << 23))
 
+#define LW_USE_LATCHES
+
 /*
  * This is indexed by tranche ID and stores metadata for all tranches known
  * to the current backend.
@@ -731,7 +733,11 @@ LWLockWakeup(LWLock *lock)
 		 */
 		pg_write_barrier();
 		waiter->lwWaiting = false;
+#ifdef LW_USE_LATCHES
+		SetLatch(&waiter->procLatch);
+#else
 		PGSemaphoreUnlock(&waiter->sem);
+#endif
 	}
 }
 
@@ -843,7 +849,9 @@ LWLockDequeueSelf(LWLock *lock)
 		MyProc->lwWaiting = false;
 	else
 	{
+#ifndef LW_USE_LATCHES
 		int		extraWaits = 0;
+#endif
 
 		/*
 		 * Somebody else dequeued us and has or will wake us up. Deal with the
@@ -863,17 +871,26 @@ LWLockDequeueSelf(LWLock *lock)
 		 */
 		for (;;)
 		{
+#ifdef LW_USE_LATCHES
+			if (!MyProc->lwWaiting)
+				break;
+			WaitLatch(&MyProc->procLatch, WL_LATCH_SET, 0);
+			ResetLatch(&MyProc->procLatch);
+#else
 			PGSemaphoreLock(&MyProc->sem);
 			if (!MyProc->lwWaiting)
 				break;
 			extraWaits++;
+#endif
 		}
 
+#ifndef LW_USE_LATCHES
 		/*
 		 * Fix the process wait semaphore's count for any absorbed wakeups.
 		 */
 		while (extraWaits-- > 0)
 			PGSemaphoreUnlock(&MyProc->sem);
+#endif
 	}
 
 #ifdef LOCK_DEBUG
@@ -914,9 +931,10 @@ LWLockAcquireWithVar(LWLock *l, uint64 *valptr, uint64 val)
 static inline bool
 LWLockAcquireCommon(LWLock *lock, LWLockMode mode, uint64 *valptr, uint64 val)
 {
-	PGPROC	   *proc = MyProc;
 	bool		result = true;
-	int			extraWaits = 0;
+#ifndef LW_USE_LATCHES
+	int         extraWaits = 0;
+#endif
 #ifdef LWLOCK_STATS
 	lwlock_stats *lwstats;
 
@@ -940,7 +958,7 @@ LWLockAcquireCommon(LWLock *lock, LWLockMode mode, uint64 *valptr, uint64 val)
 	 * during bootstrap or shared memory initialization.  Put an Assert here
 	 * to catch unsafe coding practices.
 	 */
-	Assert(!(proc == NULL && IsUnderPostmaster));
+	Assert(!(MyProc == NULL && IsUnderPostmaster));
 
 	/* Ensure we will have room to remember the lock */
 	if (num_held_lwlocks >= MAX_SIMUL_LWLOCKS)
@@ -1033,10 +1051,19 @@ LWLockAcquireCommon(LWLock *lock, LWLockMode mode, uint64 *valptr, uint64 val)
 
 		for (;;)
 		{
-			PGSemaphoreLock(&proc->sem);
-			if (!proc->lwWaiting)
+#ifdef LW_USE_LATCHES
+			pg_read_barrier();
+			if (!MyProc->lwWaiting)
+				break;
+
+			WaitLatch(&MyProc->procLatch, WL_LATCH_SET, 0);
+			ResetLatch(&MyProc->procLatch);
+#else
+			PGSemaphoreLock(&MyProc->sem);
+			if (!MyProc->lwWaiting)
 				break;
 			extraWaits++;
+#endif
 		}
 
 		/* Retrying, allow LWLockRelease to release waiters again. */
@@ -1068,11 +1095,13 @@ LWLockAcquireCommon(LWLock *lock, LWLockMode mode, uint64 *valptr, uint64 val)
 	held_lwlocks[num_held_lwlocks].lock = lock;
 	held_lwlocks[num_held_lwlocks++].mode = mode;
 
+#ifndef LW_USE_LATCHES
 	/*
 	 * Fix the process wait semaphore's count for any absorbed wakeups.
 	 */
 	while (extraWaits-- > 0)
-		PGSemaphoreUnlock(&proc->sem);
+		PGSemaphoreUnlock(&MyProc->sem);
+#endif
 
 	return result;
 }
@@ -1142,9 +1171,10 @@ LWLockConditionalAcquire(LWLock *lock, LWLockMode mode)
 bool
 LWLockAcquireOrWait(LWLock *lock, LWLockMode mode)
 {
-	PGPROC	   *proc = MyProc;
 	bool		mustwait;
-	int			extraWaits = 0;
+#ifndef LW_USE_LATCHES
+	int         extraWaits = 0;
+#endif
 #ifdef LWLOCK_STATS
 	lwlock_stats *lwstats;
 
@@ -1193,10 +1223,19 @@ LWLockAcquireOrWait(LWLock *lock, LWLockMode mode)
 
 			for (;;)
 			{
-				PGSemaphoreLock(&proc->sem);
-				if (!proc->lwWaiting)
+#ifdef LW_USE_LATCHES
+				pg_read_barrier();
+				if (!MyProc->lwWaiting)
+					break;
+
+				WaitLatch(&MyProc->procLatch, WL_LATCH_SET, 0);
+				ResetLatch(&MyProc->procLatch);
+#else
+				PGSemaphoreLock(&MyProc->sem);
+				if (!MyProc->lwWaiting)
 					break;
 				extraWaits++;
+#endif
 			}
 
 #ifdef LOCK_DEBUG
@@ -1224,11 +1263,13 @@ LWLockAcquireOrWait(LWLock *lock, LWLockMode mode)
 		}
 	}
 
+#ifndef LW_USE_LATCHES
 	/*
 	 * Fix the process wait semaphore's count for any absorbed wakeups.
 	 */
 	while (extraWaits-- > 0)
-		PGSemaphoreUnlock(&proc->sem);
+		PGSemaphoreUnlock(&MyProc->sem);
+#endif
 
 	if (mustwait)
 	{
@@ -1271,9 +1312,10 @@ LWLockAcquireOrWait(LWLock *lock, LWLockMode mode)
 bool
 LWLockWaitForVar(LWLock *lock, uint64 *valptr, uint64 oldval, uint64 *newval)
 {
-	PGPROC	   *proc = MyProc;
-	int			extraWaits = 0;
 	bool		result = false;
+#ifndef LW_USE_LATCHES
+	int         extraWaits = 0;
+#endif
 #ifdef LWLOCK_STATS
 	lwlock_stats *lwstats;
 
@@ -1394,10 +1436,19 @@ LWLockWaitForVar(LWLock *lock, uint64 *valptr, uint64 oldval, uint64 *newval)
 
 		for (;;)
 		{
-			PGSemaphoreLock(&proc->sem);
-			if (!proc->lwWaiting)
+#ifdef LW_USE_LATCHES
+			pg_read_barrier();
+			if (!MyProc->lwWaiting)
+				break;
+
+			WaitLatch(&MyProc->procLatch, WL_LATCH_SET, 0);
+			ResetLatch(&MyProc->procLatch);
+#else
+			PGSemaphoreLock(&MyProc->sem);
+			if (!MyProc->lwWaiting)
 				break;
 			extraWaits++;
+#endif
 		}
 
 #ifdef LOCK_DEBUG
@@ -1418,11 +1469,13 @@ LWLockWaitForVar(LWLock *lock, uint64 *valptr, uint64 oldval, uint64 *newval)
 
 	TRACE_POSTGRESQL_LWLOCK_ACQUIRE(T_NAME(lock), T_ID(lock), LW_EXCLUSIVE);
 
+#ifndef LW_USE_LATCHES
 	/*
 	 * Fix the process wait semaphore's count for any absorbed wakeups.
 	 */
 	while (extraWaits-- > 0)
-		PGSemaphoreUnlock(&proc->sem);
+		PGSemaphoreUnlock(&MyProc->sem);
+#endif
 
 	/*
 	 * Now okay to allow cancel/die interrupts.
@@ -1498,7 +1551,11 @@ LWLockUpdateVar(LWLock *lock, uint64 *valptr, uint64 val)
 		/* check comment in LWLockWakeup() about this barrier */
 		pg_write_barrier();
 		waiter->lwWaiting = false;
+#ifdef LW_USE_LATCHES
+		SetLatch(&waiter->procLatch);
+#else
 		PGSemaphoreUnlock(&waiter->sem);
+#endif
 	}
 }
 
-- 
2.0.0.rc2.4.g1dc51c6.dirty

0010-WIP-unix_latch.c-efficiency-hackery.patchtext/x-patch; charset=us-asciiDownload
>From f6f3be360d1af1f9d83d4b5a522defa87bfda346 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 13 Jan 2015 11:58:58 +0100
Subject: [PATCH 10/10] WIP: unix_latch.c: efficiency hackery

Hack 1) Use eventfds on linux when available

        This provides a measurable performance adavante on linux, due
        to avoidance of the pipe buffer.

        I think we should do this, but this isn't a finished patch.

Hack 2) Store a fd in the latch, and use that instead of signals + selfbyte for wakeups

        This requires a fair amount (NumProcSignalSlots) of fds, and
        only works for slots initialized in the postmaster.

        The, not really proven, performance benefits come from
        avoiding roundtrips through the kernel as only one write to a
        fd is needed, instead of a signal + write to the selfpipe.
---
 src/backend/port/unix_latch.c   | 166 +++++++++++++++++++++++++++++++++-------
 src/backend/storage/lmgr/proc.c |   2 +-
 src/include/storage/latch.h     |   2 +
 3 files changed, 140 insertions(+), 30 deletions(-)

diff --git a/src/backend/port/unix_latch.c b/src/backend/port/unix_latch.c
index 147e22c..cf5f9fe 100644
--- a/src/backend/port/unix_latch.c
+++ b/src/backend/port/unix_latch.c
@@ -32,10 +32,15 @@
  */
 #include "postgres.h"
 
+#define HAVE_EVENTFD
+
 #include <fcntl.h>
 #include <limits.h>
 #include <signal.h>
 #include <unistd.h>
+#ifdef HAVE_EVENTFD
+#include <sys/eventfd.h>
+#endif
 #include <sys/time.h>
 #include <sys/types.h>
 #ifdef HAVE_POLL_H
@@ -62,10 +67,13 @@ static volatile sig_atomic_t waiting = false;
 /* Read and write ends of the self-pipe */
 static int	selfpipe_readfd = -1;
 static int	selfpipe_writefd = -1;
+#if defined(HAVE_EVENTFD) && defined(EFD_NONBLOCK)
+static int	selfpipe_eventfd = -1;
+#endif
 
 /* Private function prototypes */
 static void sendSelfPipeByte(void);
-static void drainSelfPipe(void);
+static void drainSelfPipe(volatile Latch *latch);
 
 
 /*
@@ -77,26 +85,45 @@ static void drainSelfPipe(void);
 void
 InitializeLatchSupport(void)
 {
-	int			pipefd[2];
-
 	Assert(selfpipe_readfd == -1);
 
-	/*
-	 * Set up the self-pipe that allows a signal handler to wake up the
-	 * select() in WaitLatch. Make the write-end non-blocking, so that
-	 * SetLatch won't block if the event has already been set many times
-	 * filling the kernel buffer. Make the read-end non-blocking too, so that
-	 * we can easily clear the pipe by reading until EAGAIN or EWOULDBLOCK.
-	 */
-	if (pipe(pipefd) < 0)
-		elog(FATAL, "pipe() failed: %m");
-	if (fcntl(pipefd[0], F_SETFL, O_NONBLOCK) < 0)
-		elog(FATAL, "fcntl() failed on read-end of self-pipe: %m");
-	if (fcntl(pipefd[1], F_SETFL, O_NONBLOCK) < 0)
-		elog(FATAL, "fcntl() failed on write-end of self-pipe: %m");
-
-	selfpipe_readfd = pipefd[0];
-	selfpipe_writefd = pipefd[1];
+#if defined(HAVE_EVENTFD) && defined(EFD_NONBLOCK)
+	do
+	{
+		selfpipe_eventfd = eventfd(0, EFD_NONBLOCK);
+		if (selfpipe_eventfd < 0)
+		{
+			elog(DEBUG1, "eventfd() failed: %m");
+			break;
+		}
+		selfpipe_readfd = selfpipe_eventfd;
+		selfpipe_writefd = selfpipe_eventfd;
+
+		return;
+	}
+	while (0);
+#endif
+	{
+		int			pipefd[2];
+
+		/*
+		 * Set up the self-pipe that allows a signal handler to wake up the
+		 * select() in WaitLatch. Make the write-end non-blocking, so that
+		 * SetLatch won't block if the event has already been set many times
+		 * filling the kernel buffer. Make the read-end non-blocking too, so
+		 * that we can easily clear the pipe by reading until EAGAIN or
+		 * EWOULDBLOCK.
+		 */
+		if (pipe(pipefd) < 0)
+			elog(FATAL, "pipe() failed: %m");
+		if (fcntl(pipefd[0], F_SETFL, O_NONBLOCK) < 0)
+			elog(FATAL, "fcntl() failed on read-end of self-pipe: %m");
+		if (fcntl(pipefd[1], F_SETFL, O_NONBLOCK) < 0)
+			elog(FATAL, "fcntl() failed on write-end of self-pipe: %m");
+
+		selfpipe_readfd = pipefd[0];
+		selfpipe_writefd = pipefd[1];
+	}
 }
 
 /*
@@ -111,6 +138,7 @@ InitLatch(volatile Latch *latch)
 	latch->is_set = false;
 	latch->owner_pid = MyProcPid;
 	latch->is_shared = false;
+	latch->owner_fd = -1;
 }
 
 /*
@@ -130,6 +158,24 @@ InitSharedLatch(volatile Latch *latch)
 	latch->is_set = false;
 	latch->owner_pid = 0;
 	latch->is_shared = true;
+	latch->owner_fd = -1;
+}
+
+void
+InitSharedLatchPostmaster(volatile Latch *latch)
+{
+	latch->is_set = false;
+	latch->owner_pid = 0;
+	latch->is_shared = true;
+	latch->owner_fd = -1;
+
+#if defined(HAVE_EVENTFD) && defined(EFD_NONBLOCK)
+	latch->owner_fd = eventfd(0, EFD_NONBLOCK);
+	if (latch->owner_fd < 0)
+	{
+		elog(DEBUG1, "eventfd() failed: %m");
+	}
+#endif
 }
 
 /*
@@ -218,6 +264,7 @@ WaitLatchOrSocket(volatile Latch *latch, int wakeEvents, pgsocket sock,
 #ifdef HAVE_POLL
 	struct pollfd pfds[3];
 	int			nfds;
+	int			selfd = -1;
 #else
 	struct timeval tv,
 			   *tvp;
@@ -277,8 +324,6 @@ WaitLatchOrSocket(volatile Latch *latch, int wakeEvents, pgsocket sock,
 		 * with weak memory ordering, so that we cannot miss seeing is_set if
 		 * the signal byte is already in the pipe when we drain it.
 		 */
-		drainSelfPipe();
-
 		if ((wakeEvents & WL_LATCH_SET) && latch->is_set)
 		{
 			result |= WL_LATCH_SET;
@@ -315,10 +360,17 @@ WaitLatchOrSocket(volatile Latch *latch, int wakeEvents, pgsocket sock,
 			nfds++;
 		}
 
-		pfds[nfds].fd = selfpipe_readfd;
-		pfds[nfds].events = POLLIN;
-		pfds[nfds].revents = 0;
-		nfds++;
+		if (wakeEvents & WL_LATCH_SET)
+		{
+			if (latch->owner_fd != -1)
+				pfds[nfds].fd = latch->owner_fd;
+			else
+				pfds[nfds].fd = selfpipe_readfd;
+			pfds[nfds].events = POLLIN;
+			pfds[nfds].revents = 0;
+			selfd = nfds;
+			nfds++;
+		}
 
 		if (wakeEvents & WL_POSTMASTER_DEATH)
 		{
@@ -395,6 +447,13 @@ WaitLatchOrSocket(volatile Latch *latch, int wakeEvents, pgsocket sock,
 				if (!PostmasterIsAlive())
 					result |= WL_POSTMASTER_DEATH;
 			}
+
+			/* clear data from selfpipe */
+			if ((wakeEvents & WL_LATCH_SET) && (pfds[selfd].revents & POLLIN))
+			{
+				Assert(selfd != -1);
+				drainSelfPipe(latch);
+			}
 		}
 #else							/* !HAVE_POLL */
 
@@ -552,11 +611,36 @@ SetLatch(volatile Latch *latch)
 	owner_pid = latch->owner_pid;
 	if (owner_pid == 0)
 		return;
+#if defined(HAVE_EVENTFD) && defined(EFD_NONBLOCK)
+	else if (latch->owner_fd != -1)
+	{
+		int			rc;
+		int64		count = 1;
+retry:
+		rc = write(latch->owner_fd, &count, sizeof(count));
+
+		if (rc < 0)
+		{
+			if (errno == EINTR)
+				goto retry;
+			else if (errno == EAGAIN && errno == EWOULDBLOCK)
+			{
+				/* counter is full, no need to do anything */
+			}
+			/* XXX: odd, should we warn instead? */
+			Assert(false);
+			return;
+		}
+	}
+#endif
 	else if (owner_pid == MyProcPid)
 	{
+		Assert(latch->owner_fd == -1);
+
 		if (waiting)
 			sendSelfPipeByte();
 	}
+
 	else
 		kill(owner_pid, SIGUSR1);
 }
@@ -604,10 +688,21 @@ static void
 sendSelfPipeByte(void)
 {
 	int			rc;
-	char		dummy = 0;
 
 retry:
-	rc = write(selfpipe_writefd, &dummy, 1);
+#if defined(HAVE_EVENTFD) && defined(EFD_NONBLOCK)
+	if (selfpipe_eventfd != -1)
+	{
+		int64		count = 1;
+		rc = write(selfpipe_writefd, &count, sizeof(count));
+	}
+	else
+#endif
+	{
+		char		dummy = 0;
+		rc = write(selfpipe_writefd, &dummy, 1);
+	}
+
 	if (rc < 0)
 	{
 		/* If interrupted by signal, just retry */
@@ -638,7 +733,7 @@ retry:
  * happen).
  */
 static void
-drainSelfPipe(void)
+drainSelfPipe(volatile Latch *latch)
 {
 	/*
 	 * There shouldn't normally be more than one byte in the pipe, or maybe a
@@ -646,10 +741,23 @@ drainSelfPipe(void)
 	 */
 	char		buf[16];
 	int			rc;
+	int			fd;
+
+	if (latch->owner_fd != -1)
+		fd = latch->owner_fd;
+	else
+		fd = selfpipe_readfd;
 
 	for (;;)
 	{
-		rc = read(selfpipe_readfd, buf, sizeof(buf));
+		/*
+		 * If eventfd is being used, this will always return 8 bytes
+		 * (containing the number of events) as being readable. That means
+		 * we'll always recognize it as the pipe being drained. Not pretty,
+		 * but beats repeating the code (perhaps).
+		 */
+		rc = read(fd, buf, sizeof(buf));
+
 		if (rc < 0)
 		{
 			if (errno == EAGAIN || errno == EWOULDBLOCK)
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index d97c244..0e7ee87 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -224,7 +224,7 @@ InitProcGlobal(void)
 		if (i < MaxBackends + NUM_AUXILIARY_PROCS)
 		{
 			PGSemaphoreCreate(&(procs[i].sem));
-			InitSharedLatch(&(procs[i].procLatch));
+			InitSharedLatchPostmaster(&(procs[i].procLatch));
 			procs[i].backendLock = LWLockAssign();
 		}
 		procs[i].pgprocno = i;
diff --git a/src/include/storage/latch.h b/src/include/storage/latch.h
index 28fc684..a4a0ba4 100644
--- a/src/include/storage/latch.h
+++ b/src/include/storage/latch.h
@@ -90,6 +90,7 @@ typedef struct Latch
 	sig_atomic_t is_set;
 	bool		is_shared;
 	int			owner_pid;
+	int			owner_fd;
 #ifdef WIN32
 	HANDLE		event;
 #endif
@@ -108,6 +109,7 @@ typedef struct Latch
 extern void InitializeLatchSupport(void);
 extern void InitLatch(volatile Latch *latch);
 extern void InitSharedLatch(volatile Latch *latch);
+extern void InitSharedLatchPostmaster(volatile Latch *latch);
 extern void OwnLatch(volatile Latch *latch);
 extern void DisownLatch(volatile Latch *latch);
 extern int	WaitLatch(volatile Latch *latch, int wakeEvents, long timeout);
-- 
2.0.0.rc2.4.g1dc51c6.dirty

#2Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Andres Freund (#1)
Re: Overhauling our interrupt handling

Hello, I'd synced up this at last.

I think I should finilize my commitfest item for this issue, with
.. "Rejected"?

This mail is a answer to
http://archives.postgresql.org/message-id/20150110022542.GF12509%40alap3.anarazel.de
but I decided that it's a good go move it into a new thread since the
patchseries has outgrown its original target. It's invasive and touches
many arcane areas of the code, so that I think more eyes than a long
deep thread is likely to have, are a good thing.

As a short description of the goal of the patchseries:
This started out as steps on the way to be able to safely handle
terminate_backend() et al when we're reading/writing from the
client. E.g. because the client is dead and we haven't noticed yet and
are stuck writing, or because we want to implement a idle_in_transaction
timeout.

Making these changes allowed me to see that not doing significant work
in signal handlers can make code much simpler and more robust. The
number of bugs postgres had in the past that involved doing too much in
signal handlers is significant. Thus I've since extended the
patchseries to get rid of nearly all nontrivial work in signal
handlers.

It sounds pretty nice.

All the patches in the series up to 0008 hav ecommit messages providing
more detail. A short description of each patch follows:

0001: Replace walsender's latch with the general shared latch.

New patch that removes ImmediateInteruptOK behaviour from walsender. I
think that's a rather good idea, because walsender currently seems to
assume WaitLatchOrSocket is reentrant - which I don't think is really
guaranteed.
Hasn't been reviewed yet, but I think it's not far from being
committable.

Deesn't this patchset containing per-socket basis non-blocking
control for win32? It should make the code (above the win32
socket layer itself) more simpler.

0002: Use a nonblocking socket for FE/BE communication and block using
latches.

Has previously been reviewed by Heikki. I think Noah also had a
look, although I'm not sure how close that was.

I think this can be committed soon.

0003: Introduce and use infrastructure for interrupt processing during client reads.

From here on ImmediateInterruptOK isn't set during client
communication. Normal interrupts and sinval/async interrupts are
processed outside of signal handlers. Especially the sinval/async
greatly simplify the respective code.

Has been reviewed by Heikki in an earlier version - I think I
addressed all the review comments.

I'd like somebody else to look at the sinval/async.c changes
before committing. I really like the simplification this change
allowed.

I think this patch might not be safe without 0004 because I can't
convince myself that it's safe to interrupt latch waits with work
that actually might also use the same latch (sinval
interrupts). But it's easier to review this way as 0004 is quite
big.

0004: Process 'die' interrupts while reading/writing from the client socket.

This is the reason Horiguchi-san started this thread.

I think the important debate here is whether we think it's
acceptable that there are situations (a full socket buffer, but a
alive connection) where the client previously got an error, but
not anymore afterwards. I think that's much better than having
unkillable connections, but I can see others being of a different
opinion.

This patch yields a code a bit confusion like following.

| secure_raw_write(Port *port, const void *ptr, size_t len)
| {
..
| w = WaitLatchOrSocket(MyLatch,
| if (w & WL_LATCH_SET)
...
| errno = EINTR;
| else if (w & WL_SOCKET_WRITEABLE)
| goto wloop;
|
| errno = save_errno;

The errno set when WL_LATCH_SET always vanishes. Specifically,
the EINTR set by SIGTERM(WL_LATCH_SET) is overwritten by
EAGAIN. As the result, pg_terminte_backend() cannot kill the
backend till the write blocking is released. errno = save_errno
should be the alternative of the line "errno = EINTR" and I
confirmed that the change leads to the desirable (as of me)
behavior.

0005: Move deadlock and other interrupt handling in proc.c out of signal handlers.

I'm surprised how comparatively simple this turned out to be. For
robustness and understanding I think this is a great improvement.

New and not reviewed at all. Needs significant review. But works
in my simple testcases.

0006: Don't allow immediate interupts during authentication anymore.

So far we've set ImmediateInterruptOK to true during large parts
of the client authentication - that's not all that pretty,
interrupts might arrive while we're in some system routines.

Due to patches 0003/0004 we now are able to safely serve
interrupts during client communication which is the major are
where we want to adhere to authentication_timeout.

I additionally placed some CHECK_FOR_INTERRUPTS()s in some
somewhat randomly chosen places in auth.c. Those don't completely
get back the previous 'appruptness' (?) of reacting to
interrupts, but that's partially for the better, because we don't
interrupt foreign code anymore.

Simplly as a comment on style, this patch introduces checks of
ClientAuthInProgress twice successively into
ProcessInterrupts(). Isn't it better to make it like following?

| /* As in quickdie, ...
| if (ClientAuthInProgress)
| {
| if (whereToSendOutput == DestRemote) whereToSendOutput = DestNone;
| ereport(FATAL,

0007: Remove the option to service interrupts during PGSemaphoreLock().

Due to patch 0005 there's no user of PGSemaphoreLock(lock, interruptOK = true)

anymore, so remove the code to handle that. I find it rather odd
that the code did a CHECK_FOR_INTERRUPTS before the semwait even
when interruptOK was set to to false - that only happened to work
because lwlock.c did a HOLD_INTERRUPTS() around the semaphore
code. It's now removed entirely.

This is a API break because the signature of PGSemaphoreLock()
changed. But I have a hard time seing that as a problem. For one I
don't think there's many users of PGSemaphoreLock, for another
they should change their interrupt handling.

New and not reviewed.

0008: Remove remnants of ImmediateInterruptOK handling.

Now that ImmediateInterruptOK is never set to true anymore, we can
remove related code and comments.

New and not reviewed.

walreceiver.c still has WalRcvImmediateInterruptOK as mentioned
below, apart from whether it should be changed or not, the
following comment remains.

| * This is very much like what regular backends do with ImmediateInterruptOK,
| * ProcessInterrupts() etc.

0009: WIP: lwlock: use latches

Optional patch that removes the use of semaphores from the lwlock
code. There's no benefit for lwlock.c itself due to this. But it
does get rid of the last user of semaphores in a normal postgres
build. I primarily wrote this to test the performance of latches.

I guess we want to do this anyway.

0010: WIP: unix_latch: WIP efficiency hackery

Nothing ready, although I think using eventfds on linux is worth
the cost.

Questions I have about the series right now:

1) Do others agree that getting rid of ImmediateInterruptOK is
worthwile. I personally think that we really should pursue that
aggressively as a goal.

Just as my two cents, I perfectly agree with you. The code after
it dissapears looks quite cleaner and the less number of states
makes it more understandable.

2) Does anybody see a significant problem with the reduction of cases
where we immediately react to authentication_timeout? Since we still
react immediately when we're waiting for the client (except maybe in
sspi/kerberos?) I think that's ok. The waiting cases are slow
ident/radius/kerberos/ldap/... servers. But we really shouldn't jump
out of the relevant code anyway.

I don't have definite answer for it, but from the point that
stucking or congesting during authentication can raise the number
of immature backend process unboundedly, it might be more
preferable if we can administratively or automatically kill them
for the case.

3) If we do everything (in corrected), upto 0009, there's no remaining
user of semaphores in postgres, except the spinlock emulation.

Does anybody see a need for PGPROC->sem to remain? Removing it would
have the significant benefit that semaphores are the last thing users
frequently need to tune to get postgres to start up when using a
higher max_connection or multiple clusters.

The less tuning point, the more usability, I suppose, as long as
no loss of function.

If we remove PGPROC->sem does anybody see a way to get rid of the
semaphore code alltogether? I personally still think we should just
gank --disable-spinlocks. Now that it's only a couple lines (in an
preexisting patch) to rely on compiler intrinsics for spinlocks that
doesn't sound like a big deal. Even if not, we could decide to get
rid of win32_sem.c...

4) There remains one user of something like ImmediateInterruptOK -
walreceiver. It has WalRcvImmediateInterruptOK. We definitely should
get rid of that as well, but that's a independent patch that can be
written in the future.

5) In a future patch I think we could also handle catchup interrupts
during other client reads, not just ReadCommand(). Does anybody see
that as a worthwile goal? I can't remember many problems with not
enough catchup happening, but I think someone mentioned that as a
problem in the past.

6) Review ;)

Comments?

Greetings,

Andres Freund

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Andres Freund
andres@2ndquadrant.com
In reply to: Kyotaro HORIGUCHI (#2)
Re: Overhauling our interrupt handling

Hi,

On 2015-01-15 15:05:08 +0900, Kyotaro HORIGUCHI wrote:

Hello, I'd synced up this at last.

I think I should finilize my commitfest item for this issue, with
.. "Rejected"?

Fine with me.

All the patches in the series up to 0008 hav ecommit messages providing
more detail. A short description of each patch follows:

0001: Replace walsender's latch with the general shared latch.

New patch that removes ImmediateInteruptOK behaviour from walsender. I
think that's a rather good idea, because walsender currently seems to
assume WaitLatchOrSocket is reentrant - which I don't think is really
guaranteed.
Hasn't been reviewed yet, but I think it's not far from being
committable.

Deesn't this patchset containing per-socket basis non-blocking
control for win32? It should make the code (above the win32
socket layer itself) more simpler.

I don't think so - we still rely on it unfortunately.

0004: Process 'die' interrupts while reading/writing from the client socket.

This is the reason Horiguchi-san started this thread.

I think the important debate here is whether we think it's
acceptable that there are situations (a full socket buffer, but a
alive connection) where the client previously got an error, but
not anymore afterwards. I think that's much better than having
unkillable connections, but I can see others being of a different
opinion.

This patch yields a code a bit confusion like following.

| secure_raw_write(Port *port, const void *ptr, size_t len)
| {
..
| w = WaitLatchOrSocket(MyLatch,
| if (w & WL_LATCH_SET)
...
| errno = EINTR;
| else if (w & WL_SOCKET_WRITEABLE)
| goto wloop;
|
| errno = save_errno;

The errno set when WL_LATCH_SET always vanishes. Specifically,
the EINTR set by SIGTERM(WL_LATCH_SET) is overwritten by
EAGAIN. As the result, pg_terminte_backend() cannot kill the
backend till the write blocking is released. errno = save_errno
should be the alternative of the line "errno = EINTR" and I
confirmed that the change leads to the desirable (as of me)
behavior.

Ugh, that's the result stupid last minute "cleanup". You're right.

0006: Don't allow immediate interupts during authentication anymore.

So far we've set ImmediateInterruptOK to true during large parts
of the client authentication - that's not all that pretty,
interrupts might arrive while we're in some system routines.

Due to patches 0003/0004 we now are able to safely serve
interrupts during client communication which is the major are
where we want to adhere to authentication_timeout.

I additionally placed some CHECK_FOR_INTERRUPTS()s in some
somewhat randomly chosen places in auth.c. Those don't completely
get back the previous 'appruptness' (?) of reacting to
interrupts, but that's partially for the better, because we don't
interrupt foreign code anymore.

Simplly as a comment on style, this patch introduces checks of
ClientAuthInProgress twice successively into
ProcessInterrupts(). Isn't it better to make it like following?

| /* As in quickdie, ...
| if (ClientAuthInProgress)
| {
| if (whereToSendOutput == DestRemote) whereToSendOutput = DestNone;
| ereport(FATAL,

Hm, yes.

0008: Remove remnants of ImmediateInterruptOK handling.

Now that ImmediateInterruptOK is never set to true anymore, we can
remove related code and comments.

New and not reviewed.

walreceiver.c still has WalRcvImmediateInterruptOK as mentioned
below, apart from whether it should be changed or not, the
following comment remains.

| * This is very much like what regular backends do with ImmediateInterruptOK,
| * ProcessInterrupts() etc.

Yep. As mentioned below, it doesn't use the same infrastructure, so I'd
rather treat this separately. This set is more than big enough.

Thanks for looking!

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Andres Freund (#3)
Re: Overhauling our interrupt handling

Hello,

I think I should finilize my commitfest item for this issue, with
.. "Rejected"?

Fine with me.

done.

0001: Replace walsender's latch with the general shared latch.

New patch that removes ImmediateInteruptOK behaviour from walsender. I
think that's a rather good idea, because walsender currently seems to
assume WaitLatchOrSocket is reentrant - which I don't think is really
guaranteed.
Hasn't been reviewed yet, but I think it's not far from being
committable.

Deesn't this patchset containing per-socket basis non-blocking
control for win32? It should make the code (above the win32
socket layer itself) more simpler.

I don't think so - we still rely on it unfortunately.

Does "it" mean win32_noblock? Or the nonblocking bare win32
socket? The win32-per-sock-blkng-cntl patch in the below message
should cover both of them.

/messages/by-id/54060AE5.5020502@vmware.com

If you are saying it should be a patch separate from this, I'll
do so.

regareds,

--
Kyotaro Horiguchi
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#1)
Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)

On Wed, Jan 14, 2015 at 9:03 PM, Andres Freund <andres@2ndquadrant.com> wrote:

0002: Use a nonblocking socket for FE/BE communication and block using
latches.

Has previously been reviewed by Heikki. I think Noah also had a
look, although I'm not sure how close that was.

I think this can be committed soon.

Doesn't this significantly increase the number of system calls? I
worry there could be a performance issue here.

0003: Introduce and use infrastructure for interrupt processing during client reads.

From here on ImmediateInterruptOK isn't set during client
communication. Normal interrupts and sinval/async interrupts are
processed outside of signal handlers. Especially the sinval/async
greatly simplify the respective code.

ProcessNotifyInterrupt() seems like it could lead to a failure to
respond to other interrupts if there is a sufficiently vigorous stream
of notify interrupts. I think there ought to be one loop that goes
through and tries to handle each kind of interrupt in turn and then
loops until no interrupts remain.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Andres Freund
andres@2ndquadrant.com
In reply to: Robert Haas (#5)
Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)

On 2015-01-30 09:29:59 -0500, Robert Haas wrote:

On Wed, Jan 14, 2015 at 9:03 PM, Andres Freund <andres@2ndquadrant.com> wrote:

0002: Use a nonblocking socket for FE/BE communication and block using
latches.

Has previously been reviewed by Heikki. I think Noah also had a
look, although I'm not sure how close that was.

I think this can be committed soon.

Doesn't this significantly increase the number of system calls? I
worry there could be a performance issue here.

I've posted benchmarks upthread and I only could start to measure any
overhead in pretty absurd cases (i.e. several hundred connections on a
few core machine, all doing SELECT 1;statements). As we try
the read before the poll/select it's not that bad - there's no
superflous work done if we're actually busy.

0003: Introduce and use infrastructure for interrupt processing during client reads.

From here on ImmediateInterruptOK isn't set during client
communication. Normal interrupts and sinval/async interrupts are
processed outside of signal handlers. Especially the sinval/async
greatly simplify the respective code.

ProcessNotifyInterrupt() seems like it could lead to a failure to
respond to other interrupts if there is a sufficiently vigorous stream
of notify interrupts.

That's nothing new though. It just used to be executed inside interrupts
directly, with looping. And we looped when enabling the notify
interrupts. Since I can't recall a report of this being problematic I'm
not that inclined to change even more than the patch already does. Given
that queuing notifies requires a lock I have a hard time seing this ever
fast enough to cause that problem.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Andres Freund (#1)
Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)

On 01/15/2015 03:03 AM, Andres Freund wrote:

0004: Process 'die' interrupts while reading/writing from the client socket.

This is the reason Horiguchi-san started this thread.

+ ProcessClientWriteInterrupt(!port->noblock);

...

+/*
+ * ProcessClientWriteInterrupt() - Process interrupts specific to client writes
+ *
+ * This is called just after low-level writes. That might be after the read
+ * finished successfully, or it was interrupted via interrupt. 'blocked' tells
+ * us whether the
+ *
+ * Must preserve errno!
+ */
+void
+ProcessClientWriteInterrupt(bool blocked)

You're passing port->noblock as argument, but I thought the argument is
supposed to mean whether the write would've blocked, i.e. if the write
buffer was full. port->noblock doesn't mean that. But perhaps I
misunderstood this - the comment on the 'blocked' argument above is a
bit incomplete ;-).

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Andres Freund (#1)
Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)

On 01/15/2015 03:03 AM, Andres Freund wrote:

0002: Use a nonblocking socket for FE/BE communication and block using
latches.

s/suceeds/succeeds/

0004: Process 'die' interrupts while reading/writing from the client socket.

+			 * Check for interrupts here, in addition to secure_write(),
+			 * because a interrupted write in secure_raw_write() can possibly
+			 * only return to here, not to secure_write().
*/
+			ProcessClientWriteInterrupt(true);

Took me a while to parse that sentence. How about:

Check for interrupts here, in addition to secure_write(), because an
interrupted write in secure_raw_write() will return here, and we cannot
return to secure_write() until we've written something.

0005: Move deadlock and other interrupt handling in proc.c out of signal handlers.

I'm surprised how comparatively simple this turned out to be. For
robustness and understanding I think this is a great improvement.

New and not reviewed at all. Needs significant review. But works
in my simple testcases.

@@ -799,6 +791,7 @@ ProcKill(int code, Datum arg)
proc = MyProc;
MyProc = NULL;
DisownLatch(&proc->procLatch);
+ SetLatch(MyLatch);

SpinLockAcquire(ProcStructLock);

@@ -868,6 +861,7 @@ AuxiliaryProcKill(int code, Datum arg)
proc = MyProc;
MyProc = NULL;
DisownLatch(&proc->procLatch);
+ SetLatch(MyLatch);

SpinLockAcquire(ProcStructLock);

These SetLatch calls are unnecessary. SwitchBackToLocalLatch() already
sets the latch. (According to the comment in SwitchToSharedLatch() even
that should not be necessary.)

0006: Don't allow immediate interupts during authentication anymore.

In commit message: s/interupts/interrupts/.

@@ -80,9 +73,6 @@ md5_crypt_verify(const Port *port, const char *role, char *client_pass,
if (*shadow_pass == '\0')
return STATUS_ERROR; /* empty password */

- /* Re-enable immediate response to SIGTERM/SIGINT/timeout interrupts */
- ImmediateInterruptOK = true;
- /* And don't forget to detect one that already arrived */
CHECK_FOR_INTERRUPTS();

/*

That lone CHECK_FOR_INTERRUPTS() that remains looks pretty randomly
placed now that the "ImmediateInterruptOK = true" is gone. I would just
remove it. Add one to the caller if it's needed, but it probably isn't.

0007: Remove the option to service interrupts during PGSemaphoreLock().

s/don't/doesn't/ in commit message.

Questions I have about the series right now:

1) Do others agree that getting rid of ImmediateInterruptOK is
worthwile. I personally think that we really should pursue that
aggressively as a goal.

Yes.

2) Does anybody see a significant problem with the reduction of cases
where we immediately react to authentication_timeout? Since we still
react immediately when we're waiting for the client (except maybe in
sspi/kerberos?) I think that's ok. The waiting cases are slow
ident/radius/kerberos/ldap/... servers. But we really shouldn't jump
out of the relevant code anyway.

Yeah, I think that's OK. Doing those things with
ImmediateInterruptOK=true was quite scary, and we need to stop doing
that. It would be nice to have a wholesale solution for those lookups
but I don't see one. So all the ident/radius/kerberos/ldap lookups will
need to be done in a non-blocking way to have them react to the timeout.
That requires some work, but we can leave that to a followup patch, if
someone wants to write it.

Overall, 1-8 look good to me, except for that one incomplete comment in
ProcessClientWriteInterrupt() I mentioned in a separate email. I haven't
reviewed patches 9 and 10 yet.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Andres Freund
andres@2ndquadrant.com
In reply to: Heikki Linnakangas (#7)
Re: Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)

On 2015-01-30 18:59:28 +0100, Heikki Linnakangas wrote:

On 01/15/2015 03:03 AM, Andres Freund wrote:

0004: Process 'die' interrupts while reading/writing from the client socket.

This is the reason Horiguchi-san started this thread.

+ ProcessClientWriteInterrupt(!port->noblock);

...

+/*
+ * ProcessClientWriteInterrupt() - Process interrupts specific to client writes
+ *
+ * This is called just after low-level writes. That might be after the read
+ * finished successfully, or it was interrupted via interrupt. 'blocked' tells
+ * us whether the
+ *
+ * Must preserve errno!
+ */
+void
+ProcessClientWriteInterrupt(bool blocked)

You're passing port->noblock as argument, but I thought the argument is
supposed to mean whether the write would've blocked, i.e. if the write
buffer was full.

Right.

port->noblock doesn't mean that. But perhaps I misunderstood this -
the comment on the 'blocked' argument above is a bit incomplete ;-).

The point here is that we tried the send() and then were interrupted
while waiting for the buffer to become writable. That pretty much
implies that the write buffer is full. The !port->noblock is because we
only are blocked on write if we're doinga blocking send, otherwise we'll
just return control to the caller...

I agree that this could use some more comments ;)

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Andres Freund
andres@2ndquadrant.com
In reply to: Heikki Linnakangas (#8)
Re: Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)

On 2015-01-31 00:52:03 +0100, Heikki Linnakangas wrote:

On 01/15/2015 03:03 AM, Andres Freund wrote:

0004: Process 'die' interrupts while reading/writing from the client socket.

+			 * Check for interrupts here, in addition to secure_write(),
+			 * because a interrupted write in secure_raw_write() can possibly
+			 * only return to here, not to secure_write().
*/
+			ProcessClientWriteInterrupt(true);

Took me a while to parse that sentence. How about:

Check for interrupts here, in addition to secure_write(), because an
interrupted write in secure_raw_write() will return here, and we cannot
return to secure_write() until we've written something.

Yep, that's better.

0005: Move deadlock and other interrupt handling in proc.c out of signal handlers.

I'm surprised how comparatively simple this turned out to be. For
robustness and understanding I think this is a great improvement.

New and not reviewed at all. Needs significant review. But works
in my simple testcases.

@@ -799,6 +791,7 @@ ProcKill(int code, Datum arg)
proc = MyProc;
MyProc = NULL;
DisownLatch(&proc->procLatch);
+ SetLatch(MyLatch);

SpinLockAcquire(ProcStructLock);

@@ -868,6 +861,7 @@ AuxiliaryProcKill(int code, Datum arg)
proc = MyProc;
MyProc = NULL;
DisownLatch(&proc->procLatch);
+ SetLatch(MyLatch);

SpinLockAcquire(ProcStructLock);

These SetLatch calls are unnecessary. SwitchBackToLocalLatch() already sets
the latch. (According to the comment in SwitchToSharedLatch() even that
should not be necessary.)

Ick, that's some debugging leftovers where I was trying to find a bug
that was fundamentally caused by the missing barriers in the latch
code...

0006: Don't allow immediate interupts during authentication anymore.

In commit message: s/interupts/interrupts/.

@@ -80,9 +73,6 @@ md5_crypt_verify(const Port *port, const char *role, char *client_pass,
if (*shadow_pass == '\0')
return STATUS_ERROR; /* empty password */

- /* Re-enable immediate response to SIGTERM/SIGINT/timeout interrupts */
- ImmediateInterruptOK = true;
- /* And don't forget to detect one that already arrived */
CHECK_FOR_INTERRUPTS();

/*

That lone CHECK_FOR_INTERRUPTS() that remains looks pretty randomly placed
now that the "ImmediateInterruptOK = true" is gone. I would just remove it.
Add one to the caller if it's needed, but it probably isn't.

There's some method to the madness here actually - we just did some
database stuff that could in theory take a while... Whether it's
worthwhile to leave it here I'm not sure.

2) Does anybody see a significant problem with the reduction of cases
where we immediately react to authentication_timeout? Since we still
react immediately when we're waiting for the client (except maybe in
sspi/kerberos?) I think that's ok. The waiting cases are slow
ident/radius/kerberos/ldap/... servers. But we really shouldn't jump
out of the relevant code anyway.

Yeah, I think that's OK. Doing those things with ImmediateInterruptOK=true
was quite scary, and we need to stop doing that. It would be nice to have a
wholesale solution for those lookups but I don't see one. So all the
ident/radius/kerberos/ldap lookups will need to be done in a non-blocking
way to have them react to the timeout. That requires some work, but we can
leave that to a followup patch, if someone wants to write it.

Ok, cool.

Overall, 1-8 look good to me, except for that one incomplete comment in
ProcessClientWriteInterrupt() I mentioned in a separate email.

Thanks for the review! I plan to push the fixed up versions sometime
next week.

I haven't reviewed patches 9 and 10 yet.

Yea, those aren't as close to being ready (and thus marked WIP).

I'd like to do the lwlock stuff for 9.5 because then any sane setup
(i.e. unless spinlocks are emulated) doesn't need any semaphores
anymore, which makes setup easier. Right now users need to adjust system
settings when changing max_connctions upwards or run multiple
clusters. But it's not really related to getting rid of
ImmediateInterruptOK.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Noah Misch
noah@leadboat.com
In reply to: Andres Freund (#1)
Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)

On Thu, Jan 15, 2015 at 03:03:35AM +0100, Andres Freund wrote:

0002: Use a nonblocking socket for FE/BE communication and block using
latches.

Has previously been reviewed by Heikki. I think Noah also had a
look, although I'm not sure how close that was.

Negligible. I did little more than scroll past it while reviewing the patch
that became commit 4bad60e.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Andres Freund
andres@2ndquadrant.com
In reply to: Andres Freund (#10)
7 attachment(s)
Re: Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)

Hi,

Here's the last, rebased (took a while...), version of this
patchset. I've fixed the things that Heikki mentioned (except the one
"stray" CFI, which imo maskes sense).

Besides a fair number of cosmetic changes there are two somewhat
important ones:

* I previously had removed the win32 waitforsinglesocket calls in the
openssl code - they're now just always replaced with latch waits. The
windows case makes actually much more sense, as we previously just
busylooped in the !win32 case.
* Previously the patchset didn't handle SIGTERM while
InteractiveBackend() was reading from the client. It did handle
ctrl-c/d, but since getc() isn't interruptible and can't be replaced
with latches... The fix for that isn't super pretty:
die():
if (DoingCommandRead && whereToSendOutput != DestRemote)
ProcessInterrupts();
but imo acceptable for single user mode.

Unless someone announces the intent do review them some more, I plan to
push the attached patches fairly soon. I'm not claiming at all they're
bug free, but I think at this stage it's better to get them in.

I plan to pursue the remaining patches (latch optimizations, lwlock
using latches, possibly removing PGPROC.sem) afterwards.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

0001-Use-a-nonblocking-socket-for-FE-BE-communication-and.patchtext/x-patch; charset=us-asciiDownload
>From debed670e17445f8577c742c4bd70a5e0c5866c5 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 3 Feb 2015 00:02:33 +0100
Subject: [PATCH 1/7] Use a nonblocking socket for FE/BE communication and
 block using latches.

This allows to introduce more elaborate handling of interrupts while
reading from a socket.  Currently some interrupt handlers have to do
significant work from inside signal handlers, and it's very hard to
correctly write code to do so.  Generic signal handler limitations,
combined with the fact that we can't safely jump out of a signal
handler while reading from the client have prohibited implementation
of features like timeouts for idle-in-transaction.

Additionally we use the latch code to wait in a couple places where we
previously only had waiting code on windows as other platforms just
busy looped.

This commit theoretically can't used without the next patch in the
series, as WaitLatchOrSocket is not defined to be fully signal
safe. As we already do that in some cases though, it seems better to
keep the commits separate, so they're easier to understand.

Author: Andres Freund
Reviewed-By: Heikki Linnakangas
---
 src/backend/libpq/be-secure-openssl.c | 52 ++++++++++++++---------
 src/backend/libpq/be-secure.c         | 77 ++++++++++++++++++++++++++++++++++-
 src/backend/libpq/pqcomm.c            | 41 ++++++++-----------
 3 files changed, 124 insertions(+), 46 deletions(-)

diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 1dd7770..20b4742 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -71,6 +71,8 @@
 #endif
 
 #include "libpq/libpq.h"
+#include "miscadmin.h"
+#include "storage/latch.h"
 #include "tcop/tcopprot.h"
 #include "utils/memutils.h"
 
@@ -338,6 +340,7 @@ be_tls_open_server(Port *port)
 {
 	int			r;
 	int			err;
+	int			waitfor;
 
 	Assert(!port->ssl);
 	Assert(!port->peer);
@@ -371,12 +374,15 @@ aloop:
 		{
 			case SSL_ERROR_WANT_READ:
 			case SSL_ERROR_WANT_WRITE:
-#ifdef WIN32
-				pgwin32_waitforsinglesocket(SSL_get_fd(port->ssl),
-											(err == SSL_ERROR_WANT_READ) ?
-						FD_READ | FD_CLOSE | FD_ACCEPT : FD_WRITE | FD_CLOSE,
-											INFINITE);
-#endif
+				/* not allowed during connection establishment */
+				Assert(!port->noblock);
+
+				if (err == SSL_ERROR_WANT_READ)
+					waitfor = WL_SOCKET_READABLE;
+				else
+					waitfor = WL_SOCKET_WRITEABLE;
+
+				WaitLatchOrSocket(MyLatch, waitfor, port->sock, 0);
 				goto aloop;
 			case SSL_ERROR_SYSCALL:
 				if (r < 0)
@@ -504,6 +510,7 @@ be_tls_read(Port *port, void *ptr, size_t len)
 {
 	ssize_t		n;
 	int			err;
+	int			waitfor;
 
 rloop:
 	errno = 0;
@@ -516,18 +523,20 @@ rloop:
 			break;
 		case SSL_ERROR_WANT_READ:
 		case SSL_ERROR_WANT_WRITE:
+			/* Don't retry if the socket is in nonblocking mode. */
 			if (port->noblock)
 			{
 				errno = EWOULDBLOCK;
 				n = -1;
 				break;
 			}
-#ifdef WIN32
-			pgwin32_waitforsinglesocket(SSL_get_fd(port->ssl),
-										(err == SSL_ERROR_WANT_READ) ?
-									FD_READ | FD_CLOSE : FD_WRITE | FD_CLOSE,
-										INFINITE);
-#endif
+
+			if (err == SSL_ERROR_WANT_READ)
+				waitfor = WL_SOCKET_READABLE;
+			else
+				waitfor = WL_SOCKET_WRITEABLE;
+
+			WaitLatchOrSocket(MyLatch, waitfor, port->sock, 0);
 			goto rloop;
 		case SSL_ERROR_SYSCALL:
 			/* leave it to caller to ereport the value of errno */
@@ -567,6 +576,7 @@ be_tls_write(Port *port, void *ptr, size_t len)
 {
 	ssize_t		n;
 	int			err;
+	int			waitfor;
 
 	/*
 	 * If SSL renegotiations are enabled and we're getting close to the
@@ -630,12 +640,13 @@ wloop:
 			break;
 		case SSL_ERROR_WANT_READ:
 		case SSL_ERROR_WANT_WRITE:
-#ifdef WIN32
-			pgwin32_waitforsinglesocket(SSL_get_fd(port->ssl),
-										(err == SSL_ERROR_WANT_READ) ?
-									FD_READ | FD_CLOSE : FD_WRITE | FD_CLOSE,
-										INFINITE);
-#endif
+
+			if (err == SSL_ERROR_WANT_READ)
+				waitfor = WL_SOCKET_READABLE;
+			else
+				waitfor = WL_SOCKET_WRITEABLE;
+
+			WaitLatchOrSocket(MyLatch, waitfor, port->sock, 0);
 			goto wloop;
 		case SSL_ERROR_SYSCALL:
 			/* leave it to caller to ereport the value of errno */
@@ -722,7 +733,7 @@ my_sock_read(BIO *h, char *buf, int size)
 		if (res <= 0)
 		{
 			/* If we were interrupted, tell caller to retry */
-			if (errno == EINTR)
+			if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN)
 			{
 				BIO_set_retry_read(h);
 			}
@@ -741,7 +752,8 @@ my_sock_write(BIO *h, const char *buf, int size)
 	BIO_clear_retry_flags(h);
 	if (res <= 0)
 	{
-		if (errno == EINTR)
+		/* If we were interrupted, tell caller to retry */
+		if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN)
 		{
 			BIO_set_retry_write(h);
 		}
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index c592f85..c9a8f6d 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -32,8 +32,10 @@
 #endif
 
 #include "libpq/libpq.h"
+#include "miscadmin.h"
 #include "tcop/tcopprot.h"
 #include "utils/memutils.h"
+#include "storage/proc.h"
 
 
 char	   *ssl_cert_file;
@@ -147,7 +149,39 @@ secure_raw_read(Port *port, void *ptr, size_t len)
 
 	prepare_for_client_read();
 
+	/*
+	 * Try to read from the socket without blocking. If it succeeds we're
+	 * done, otherwise we'll wait for the socket using the latch mechanism.
+	 */
+rloop:
+#ifdef WIN32
+	pgwin32_noblock = true;
+#endif
 	n = recv(port->sock, ptr, len, 0);
+#ifdef WIN32
+	pgwin32_noblock = false;
+#endif
+
+	if (n < 0 && !port->noblock && (errno == EWOULDBLOCK || errno == EAGAIN))
+	{
+		int		w;
+		int		save_errno = errno;
+
+		w = WaitLatchOrSocket(MyLatch,
+							  WL_SOCKET_READABLE,
+							  port->sock, 0);
+
+		if (w & WL_SOCKET_READABLE)
+		{
+			goto rloop;
+		}
+
+		/*
+		 * Restore errno, clobbered by WaitLatchOrSocket, so the caller can
+		 * react properly.
+		 */
+		errno = save_errno;
+	}
 
 	client_read_ended();
 
@@ -170,7 +204,9 @@ secure_write(Port *port, void *ptr, size_t len)
 	}
 	else
 #endif
+	{
 		n = secure_raw_write(port, ptr, len);
+	}
 
 	return n;
 }
@@ -178,5 +214,44 @@ secure_write(Port *port, void *ptr, size_t len)
 ssize_t
 secure_raw_write(Port *port, const void *ptr, size_t len)
 {
-	return send(port->sock, ptr, len, 0);
+	ssize_t		n;
+
+wloop:
+
+#ifdef WIN32
+	pgwin32_noblock = true;
+#endif
+	n = send(port->sock, ptr, len, 0);
+#ifdef WIN32
+	pgwin32_noblock = false;
+#endif
+
+	if (n < 0 && !port->noblock && (errno == EWOULDBLOCK || errno == EAGAIN))
+	{
+		int		w;
+		int		save_errno = errno;
+
+		/*
+		 * We probably want to check for latches being set at some point
+		 * here. That'd allow us to handle interrupts while blocked on
+		 * writes. If set we'd not retry directly, but return. That way we
+		 * don't do anything while (possibly) inside a ssl library.
+		 */
+		w = WaitLatchOrSocket(MyLatch,
+							  WL_SOCKET_WRITEABLE,
+							  port->sock, 0);
+
+		if (w & WL_SOCKET_WRITEABLE)
+		{
+			goto wloop;
+		}
+
+		/*
+		 * Restore errno, clobbered by WaitLatchOrSocket, so the caller can
+		 * react properly.
+		 */
+		errno = save_errno;
+	}
+
+	return n;
 }
diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 254fd82..0d97aa4 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -181,6 +181,22 @@ pq_init(void)
 	PqCommReadingMsg = false;
 	DoingCopyOut = false;
 	on_proc_exit(socket_close, 0);
+
+	/*
+	 * In backends (as soon as forked) we operate the underlying socket in
+	 * nonblocking mode and use latches to implement blocking semantics if
+	 * needed. That allows us to provide safely interruptible reads.
+	 *
+	 * Use COMMERROR on failure, because ERROR would try to send the error to
+	 * the client, which might require changing the mode again, leading to
+	 * infinite recursion.
+	 */
+#ifndef WIN32
+	if (!pg_set_noblock(MyProcPort->sock))
+		ereport(COMMERROR,
+				(errmsg("could not set socket to nonblocking mode: %m")));
+#endif
+
 }
 
 /* --------------------------------
@@ -820,31 +836,6 @@ socket_set_nonblocking(bool nonblocking)
 				(errcode(ERRCODE_CONNECTION_DOES_NOT_EXIST),
 				 errmsg("there is no client connection")));
 
-	if (MyProcPort->noblock == nonblocking)
-		return;
-
-#ifdef WIN32
-	pgwin32_noblock = nonblocking ? 1 : 0;
-#else
-
-	/*
-	 * Use COMMERROR on failure, because ERROR would try to send the error to
-	 * the client, which might require changing the mode again, leading to
-	 * infinite recursion.
-	 */
-	if (nonblocking)
-	{
-		if (!pg_set_noblock(MyProcPort->sock))
-			ereport(COMMERROR,
-					(errmsg("could not set socket to nonblocking mode: %m")));
-	}
-	else
-	{
-		if (!pg_set_block(MyProcPort->sock))
-			ereport(COMMERROR,
-					(errmsg("could not set socket to blocking mode: %m")));
-	}
-#endif
 	MyProcPort->noblock = nonblocking;
 }
 
-- 
2.0.0.rc2.4.g1dc51c6.dirty

0002-Introduce-and-use-infrastructure-for-interrupt-proce.patchtext/x-patch; charset=us-asciiDownload
>From 8c35902175336591eee1b395347c98ba8580e020 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 3 Feb 2015 00:22:18 +0100
Subject: [PATCH 2/7] Introduce and use infrastructure for interrupt processing
 during client reads.

Up to now large swathes of backend code ran inside signal handlers
while reading commands from the client, to allow for speedy reaction to
asynchronous events. Most prominently shared invalidation and NOTIFY
handling. That means that complex code like the starting/stopping of
transactions is run in signal handlers...  The required code was
fragile and verbose, and is likely to contain bugs.

That approach also severely limited what could be done while
communicating with the client. As the read might be from within
openssl it wasn't safely possible to trigger an error, e.g. to cancel
a backend in idle-in-transaction state.

Now that FE/BE communication in the backend employs non-blocking
sockets and latches to block we can quite simply interrupt reads from
signal handlers by setting a signal. That allows us to signal an
interrupted read which is supposed to be retried after returning from
within the ssl library.

As signal handlers now only need to set the latch to guarantee timely
interrupt processing, remove a fair amount of complicated & fragile
code from async.c and sinval.c.

This work will hopefully allow to handle cases like being blocked by
sending data, interrupting idle transactions and similar to be
implemented without too much effort.

Author: Andres Freund
Reviewed-By: Heikki Linnakangas
---
 src/backend/commands/async.c          | 199 ++++++------------------------
 src/backend/libpq/be-secure-openssl.c |  31 ++++-
 src/backend/libpq/be-secure.c         |  43 ++++++-
 src/backend/postmaster/autovacuum.c   |   6 +-
 src/backend/storage/ipc/sinval.c      | 223 +++++++---------------------------
 src/backend/tcop/postgres.c           | 104 ++++++----------
 src/include/commands/async.h          |  12 +-
 src/include/storage/sinval.h          |   7 +-
 src/include/tcop/tcopprot.h           |   4 +-
 9 files changed, 195 insertions(+), 434 deletions(-)

diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index 7ece527..f6f92d7 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -79,10 +79,12 @@
  *	  either, but just process the queue directly.
  *
  * 5. Upon receipt of a PROCSIG_NOTIFY_INTERRUPT signal, the signal handler
- *	  can call inbound-notify processing immediately if this backend is idle
- *	  (ie, it is waiting for a frontend command and is not within a transaction
- *	  block).  Otherwise the handler may only set a flag, which will cause the
- *	  processing to occur just before we next go idle.
+ *	  sets the process's latch which triggers the event to be processed
+ *	  immediately if this backend is idle (ie, it is waiting for a frontend
+ *	  command and is not within a transaction block. C.f.
+ *	  ProcessClientReadInterrupt()).  Otherwise the handler may only set a
+ *	  flag, which will cause the processing to occur just before we next go
+ *	  idle.
  *
  *	  Inbound-notify processing consists of reading all of the notifications
  *	  that have arrived since scanning last time. We read every notification
@@ -126,6 +128,7 @@
 #include "miscadmin.h"
 #include "storage/ipc.h"
 #include "storage/lmgr.h"
+#include "storage/proc.h"
 #include "storage/procarray.h"
 #include "storage/procsignal.h"
 #include "storage/sinval.h"
@@ -334,17 +337,13 @@ static List *pendingNotifies = NIL;		/* list of Notifications */
 static List *upperPendingNotifies = NIL;		/* list of upper-xact lists */
 
 /*
- * State for inbound notifications consists of two flags: one saying whether
- * the signal handler is currently allowed to call ProcessIncomingNotify
- * directly, and one saying whether the signal has occurred but the handler
- * was not allowed to call ProcessIncomingNotify at the time.
- *
- * NB: the "volatile" on these declarations is critical!  If your compiler
- * does not grok "volatile", you'd be best advised to compile this file
- * with all optimization turned off.
+ * Inbound notifications are initially processed by HandleNotifyInterrupt(),
+ * called from inside a signal handler. That just sets the
+ * notifyInterruptPending flag and sets the process
+ * latch. ProcessNotifyInterrupt() will then be called whenever it's safe to
+ * actually deal with the interrupt.
  */
-static volatile sig_atomic_t notifyInterruptEnabled = 0;
-static volatile sig_atomic_t notifyInterruptOccurred = 0;
+volatile sig_atomic_t notifyInterruptPending = false;
 
 /* True if we've registered an on_shmem_exit cleanup */
 static bool unlistenExitRegistered = false;
@@ -1625,164 +1624,45 @@ AtSubAbort_Notify(void)
 /*
  * HandleNotifyInterrupt
  *
- *		This is called when PROCSIG_NOTIFY_INTERRUPT is received.
- *
- *		If we are idle (notifyInterruptEnabled is set), we can safely invoke
- *		ProcessIncomingNotify directly.  Otherwise, just set a flag
- *		to do it later.
+ *		Signal handler portion of interrupt handling. Let the backend know
+ *		that there's a pending notify interrupt. If we're currently reading
+ *		from the client, this will interrupt the read and
+ *		ProcessClientReadInterrupt() will call ProcessNotifyInterrupt().
  */
 void
 HandleNotifyInterrupt(void)
 {
 	/*
 	 * Note: this is called by a SIGNAL HANDLER. You must be very wary what
-	 * you do here. Some helpful soul had this routine sprinkled with
-	 * TPRINTFs, which would likely lead to corruption of stdio buffers if
-	 * they were ever turned on.
+	 * you do here.
 	 */
 
-	/* Don't joggle the elbow of proc_exit */
-	if (proc_exit_inprogress)
-		return;
-
-	if (notifyInterruptEnabled)
-	{
-		bool		save_ImmediateInterruptOK = ImmediateInterruptOK;
-
-		/*
-		 * We may be called while ImmediateInterruptOK is true; turn it off
-		 * while messing with the NOTIFY state.  This prevents problems if
-		 * SIGINT or similar arrives while we're working.  Just to be real
-		 * sure, bump the interrupt holdoff counter as well.  That way, even
-		 * if something inside ProcessIncomingNotify() transiently sets
-		 * ImmediateInterruptOK (eg while waiting on a lock), we won't get
-		 * interrupted until we're done with the notify interrupt.
-		 */
-		ImmediateInterruptOK = false;
-		HOLD_INTERRUPTS();
-
-		/*
-		 * I'm not sure whether some flavors of Unix might allow another
-		 * SIGUSR1 occurrence to recursively interrupt this routine. To cope
-		 * with the possibility, we do the same sort of dance that
-		 * EnableNotifyInterrupt must do --- see that routine for comments.
-		 */
-		notifyInterruptEnabled = 0;		/* disable any recursive signal */
-		notifyInterruptOccurred = 1;	/* do at least one iteration */
-		for (;;)
-		{
-			notifyInterruptEnabled = 1;
-			if (!notifyInterruptOccurred)
-				break;
-			notifyInterruptEnabled = 0;
-			if (notifyInterruptOccurred)
-			{
-				/* Here, it is finally safe to do stuff. */
-				if (Trace_notify)
-					elog(DEBUG1, "HandleNotifyInterrupt: perform async notify");
-
-				ProcessIncomingNotify();
-
-				if (Trace_notify)
-					elog(DEBUG1, "HandleNotifyInterrupt: done");
-			}
-		}
+	/* signal that work needs to be done */
+	notifyInterruptPending = true;
 
-		/*
-		 * Restore the holdoff level and ImmediateInterruptOK, and check for
-		 * interrupts if needed.
-		 */
-		RESUME_INTERRUPTS();
-		ImmediateInterruptOK = save_ImmediateInterruptOK;
-		if (save_ImmediateInterruptOK)
-			CHECK_FOR_INTERRUPTS();
-	}
-	else
-	{
-		/*
-		 * In this path it is NOT SAFE to do much of anything, except this:
-		 */
-		notifyInterruptOccurred = 1;
-	}
+	/* make sure the event is processed in due course */
+	SetLatch(MyLatch);
 }
 
 /*
- * EnableNotifyInterrupt
- *
- *		This is called by the PostgresMain main loop just before waiting
- *		for a frontend command.  If we are truly idle (ie, *not* inside
- *		a transaction block), then process any pending inbound notifies,
- *		and enable the signal handler to process future notifies directly.
+ * ProcessNotifyInterrupt
  *
- *		NOTE: the signal handler starts out disabled, and stays so until
- *		PostgresMain calls this the first time.
+ *		This is called just after waiting for a frontend command.  If a
+ *		interrupt arrives (via HandleNotifyInterrupt()) while reading, the
+ *		read will be interrupted via the process's latch, and this routine
+ *		will get called.  If we are truly idle (ie, *not* inside a transaction
+ *		block), process the incoming notifies.
  */
 void
-EnableNotifyInterrupt(void)
+ProcessNotifyInterrupt(void)
 {
 	if (IsTransactionOrTransactionBlock())
 		return;					/* not really idle */
 
-	/*
-	 * This code is tricky because we are communicating with a signal handler
-	 * that could interrupt us at any point.  If we just checked
-	 * notifyInterruptOccurred and then set notifyInterruptEnabled, we could
-	 * fail to respond promptly to a signal that happens in between those two
-	 * steps.  (A very small time window, perhaps, but Murphy's Law says you
-	 * can hit it...)  Instead, we first set the enable flag, then test the
-	 * occurred flag.  If we see an unserviced interrupt has occurred, we
-	 * re-clear the enable flag before going off to do the service work. (That
-	 * prevents re-entrant invocation of ProcessIncomingNotify() if another
-	 * interrupt occurs.) If an interrupt comes in between the setting and
-	 * clearing of notifyInterruptEnabled, then it will have done the service
-	 * work and left notifyInterruptOccurred zero, so we have to check again
-	 * after clearing enable.  The whole thing has to be in a loop in case
-	 * another interrupt occurs while we're servicing the first. Once we get
-	 * out of the loop, enable is set and we know there is no unserviced
-	 * interrupt.
-	 *
-	 * NB: an overenthusiastic optimizing compiler could easily break this
-	 * code. Hopefully, they all understand what "volatile" means these days.
-	 */
-	for (;;)
-	{
-		notifyInterruptEnabled = 1;
-		if (!notifyInterruptOccurred)
-			break;
-		notifyInterruptEnabled = 0;
-		if (notifyInterruptOccurred)
-		{
-			if (Trace_notify)
-				elog(DEBUG1, "EnableNotifyInterrupt: perform async notify");
-
-			ProcessIncomingNotify();
-
-			if (Trace_notify)
-				elog(DEBUG1, "EnableNotifyInterrupt: done");
-		}
-	}
+	while (notifyInterruptPending)
+		ProcessIncomingNotify();
 }
 
-/*
- * DisableNotifyInterrupt
- *
- *		This is called by the PostgresMain main loop just after receiving
- *		a frontend command.  Signal handler execution of inbound notifies
- *		is disabled until the next EnableNotifyInterrupt call.
- *
- *		The PROCSIG_CATCHUP_INTERRUPT signal handler also needs to call this,
- *		so as to prevent conflicts if one signal interrupts the other.  So we
- *		must return the previous state of the flag.
- */
-bool
-DisableNotifyInterrupt(void)
-{
-	bool		result = (notifyInterruptEnabled != 0);
-
-	notifyInterruptEnabled = 0;
-
-	return result;
-}
 
 /*
  * Read all pending notifications from the queue, and deliver appropriate
@@ -2076,9 +1956,10 @@ asyncQueueAdvanceTail(void)
 /*
  * ProcessIncomingNotify
  *
- *		Deal with arriving NOTIFYs from other backends.
- *		This is called either directly from the PROCSIG_NOTIFY_INTERRUPT
- *		signal handler, or the next time control reaches the outer idle loop.
+ *		Deal with arriving NOTIFYs from other backends as soon as it's safe to
+ *		do so. This used to be called from the PROCSIG_NOTIFY_INTERRUPT
+ *		signal handler, but isn't anymore.
+ *
  *		Scan the queue for arriving notifications and report them to my front
  *		end.
  *
@@ -2087,18 +1968,13 @@ asyncQueueAdvanceTail(void)
 static void
 ProcessIncomingNotify(void)
 {
-	bool		catchup_enabled;
-
 	/* We *must* reset the flag */
-	notifyInterruptOccurred = 0;
+	notifyInterruptPending = false;
 
 	/* Do nothing else if we aren't actively listening */
 	if (listenChannels == NIL)
 		return;
 
-	/* Must prevent catchup interrupt while I am running */
-	catchup_enabled = DisableCatchupInterrupt();
-
 	if (Trace_notify)
 		elog(DEBUG1, "ProcessIncomingNotify");
 
@@ -2123,9 +1999,6 @@ ProcessIncomingNotify(void)
 
 	if (Trace_notify)
 		elog(DEBUG1, "ProcessIncomingNotify: done");
-
-	if (catchup_enabled)
-		EnableCatchupInterrupt();
 }
 
 /*
diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 20b4742..e4c1a25 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -511,6 +511,7 @@ be_tls_read(Port *port, void *ptr, size_t len)
 	ssize_t		n;
 	int			err;
 	int			waitfor;
+	int			latchret;
 
 rloop:
 	errno = 0;
@@ -531,12 +532,32 @@ rloop:
 				break;
 			}
 
+			waitfor = WL_LATCH_SET;
+
 			if (err == SSL_ERROR_WANT_READ)
-				waitfor = WL_SOCKET_READABLE;
+				waitfor |= WL_SOCKET_READABLE;
 			else
-				waitfor = WL_SOCKET_WRITEABLE;
+				waitfor |= WL_SOCKET_WRITEABLE;
 
-			WaitLatchOrSocket(MyLatch, waitfor, port->sock, 0);
+			latchret = WaitLatchOrSocket(MyLatch, waitfor, port->sock, 0);
+
+			/*
+			 * We'll, among other situations, get here if the low level
+			 * routine doing the actual recv() via the socket got interrupted
+			 * by a signal. That's so we can handle interrupts once outside
+			 * openssl so we don't jump out from underneath its covers. We can
+			 * check this both, when reading and writing, because even when
+			 * writing that's just openssl's doing, not a 'proper' write
+			 * initiated by postgres.
+			 *
+			 * Only process interrupts here if we're blocking inside the
+			 * function. In the other cases secure_read() will do so.
+			 */
+			if (latchret & WL_LATCH_SET)
+			{
+				ResetLatch(MyLatch);
+				ProcessClientReadInterrupt();  /* preserves errno */
+			}
 			goto rloop;
 		case SSL_ERROR_SYSCALL:
 			/* leave it to caller to ereport the value of errno */
@@ -647,6 +668,10 @@ wloop:
 				waitfor = WL_SOCKET_WRITEABLE;
 
 			WaitLatchOrSocket(MyLatch, waitfor, port->sock, 0);
+			/*
+			 * XXX: We'll, at some later point, likely want to add interrupt
+			 * processing here.
+			 */
 			goto wloop;
 		case SSL_ERROR_SYSCALL:
 			/* leave it to caller to ereport the value of errno */
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index c9a8f6d..b90ab0e 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -128,6 +128,7 @@ secure_read(Port *port, void *ptr, size_t len)
 {
 	ssize_t		n;
 
+retry:
 #ifdef USE_SSL
 	if (port->ssl_in_use)
 	{
@@ -139,6 +140,14 @@ secure_read(Port *port, void *ptr, size_t len)
 		n = secure_raw_read(port, ptr, len);
 	}
 
+	/* Process interrupts that happened while (or before) receiving. */
+	ProcessClientReadInterrupt(); /* preserves errno */
+
+	/* retry after processing interrupts */
+	if (n < 0 && errno == EINTR)
+	{
+		goto retry;
+	}
 	return n;
 }
 
@@ -147,8 +156,6 @@ secure_raw_read(Port *port, void *ptr, size_t len)
 {
 	ssize_t		n;
 
-	prepare_for_client_read();
-
 	/*
 	 * Try to read from the socket without blocking. If it succeeds we're
 	 * done, otherwise we'll wait for the socket using the latch mechanism.
@@ -168,10 +175,20 @@ rloop:
 		int		save_errno = errno;
 
 		w = WaitLatchOrSocket(MyLatch,
-							  WL_SOCKET_READABLE,
+							  WL_LATCH_SET | WL_SOCKET_READABLE,
 							  port->sock, 0);
 
-		if (w & WL_SOCKET_READABLE)
+		if (w & WL_LATCH_SET)
+		{
+			ResetLatch(MyLatch);
+			/*
+			 * Force a return, so interrupts can be processed when not
+			 * (possibly) underneath a ssl library.
+			 */
+			errno = EINTR;
+			return -1;
+		}
+		else if (w & WL_SOCKET_READABLE)
 		{
 			goto rloop;
 		}
@@ -183,8 +200,6 @@ rloop:
 		errno = save_errno;
 	}
 
-	client_read_ended();
-
 	return n;
 }
 
@@ -197,6 +212,7 @@ secure_write(Port *port, void *ptr, size_t len)
 {
 	ssize_t		n;
 
+retry:
 #ifdef USE_SSL
 	if (port->ssl_in_use)
 	{
@@ -208,6 +224,21 @@ secure_write(Port *port, void *ptr, size_t len)
 		n = secure_raw_write(port, ptr, len);
 	}
 
+	/*
+	 * XXX: We'll, at some later point, likely want to add interrupt
+	 * processing here.
+	 */
+
+	/*
+	 * Retry after processing interrupts. This can be triggered even though we
+	 * don't check for latch set's during writing yet, because SSL
+	 * renegotiations might have required reading from the socket.
+	 */
+	if (n < 0 && errno == EINTR)
+	{
+		goto retry;
+	}
+
 	return n;
 }
 
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 6492067..77158c1 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -582,9 +582,6 @@ AutoVacLauncherMain(int argc, char *argv[])
 		launcher_determine_sleep(!dlist_is_empty(&AutoVacuumShmem->av_freeWorkers),
 								 false, &nap);
 
-		/* Allow sinval catchup interrupts while sleeping */
-		EnableCatchupInterrupt();
-
 		/*
 		 * Wait until naptime expires or we get some type of signal (all the
 		 * signal handlers will wake us by calling SetLatch).
@@ -595,7 +592,8 @@ AutoVacLauncherMain(int argc, char *argv[])
 
 		ResetLatch(MyLatch);
 
-		DisableCatchupInterrupt();
+		/* Process sinval catchup interrupts that happened while sleeping */
+		ProcessCatchupInterrupt();
 
 		/*
 		 * Emergency bailout if postmaster has died.  This is to avoid the
diff --git a/src/backend/storage/ipc/sinval.c b/src/backend/storage/ipc/sinval.c
index e6b0d49..67ec515 100644
--- a/src/backend/storage/ipc/sinval.c
+++ b/src/backend/storage/ipc/sinval.c
@@ -18,6 +18,7 @@
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "storage/ipc.h"
+#include "storage/proc.h"
 #include "storage/sinvaladt.h"
 #include "utils/inval.h"
 
@@ -32,19 +33,12 @@ uint64		SharedInvalidMessageCounter;
  * through a cache reset exercise.  This is done by sending
  * PROCSIG_CATCHUP_INTERRUPT to any backend that gets too far behind.
  *
- * State for catchup events consists of two flags: one saying whether
- * the signal handler is currently allowed to call ProcessCatchupEvent
- * directly, and one saying whether the signal has occurred but the handler
- * was not allowed to call ProcessCatchupEvent at the time.
- *
- * NB: the "volatile" on these declarations is critical!  If your compiler
- * does not grok "volatile", you'd be best advised to compile this file
- * with all optimization turned off.
+ * The signal handler will set a interrupt pending flag and will set the
+ * processes latch. Whenever starting to read from the client, or when
+ * interrupted while doing so, ProcessClientReadInterrupt() will call
+ * ProcessCatchupEvent().
  */
-static volatile int catchupInterruptEnabled = 0;
-static volatile int catchupInterruptOccurred = 0;
-
-static void ProcessCatchupEvent(void);
+volatile sig_atomic_t catchupInterruptPending = false;
 
 
 /*
@@ -141,9 +135,9 @@ ReceiveSharedInvalidMessages(
 	 * catchup signal this way avoids creating spikes in system load for what
 	 * should be just a background maintenance activity.
 	 */
-	if (catchupInterruptOccurred)
+	if (catchupInterruptPending)
 	{
-		catchupInterruptOccurred = 0;
+		catchupInterruptPending = false;
 		elog(DEBUG4, "sinval catchup complete, cleaning queue");
 		SICleanupQueue(false, 0);
 	}
@@ -155,12 +149,9 @@ ReceiveSharedInvalidMessages(
  *
  * This is called when PROCSIG_CATCHUP_INTERRUPT is received.
  *
- * If we are idle (catchupInterruptEnabled is set), we can safely
- * invoke ProcessCatchupEvent directly.  Otherwise, just set a flag
- * to do it later.  (Note that it's quite possible for normal processing
- * of the current transaction to cause ReceiveSharedInvalidMessages()
- * to be run later on; in that case the flag will get cleared again,
- * since there's no longer any reason to do anything.)
+ * We used to directly call ProcessCatchupEvent directly when idle. These days
+ * we just set a flag to do it later and notify the process of that fact by
+ * setting the process's latch.
  */
 void
 HandleCatchupInterrupt(void)
@@ -170,174 +161,46 @@ HandleCatchupInterrupt(void)
 	 * you do here.
 	 */
 
-	/* Don't joggle the elbow of proc_exit */
-	if (proc_exit_inprogress)
-		return;
-
-	if (catchupInterruptEnabled)
-	{
-		bool		save_ImmediateInterruptOK = ImmediateInterruptOK;
-
-		/*
-		 * We may be called while ImmediateInterruptOK is true; turn it off
-		 * while messing with the catchup state.  This prevents problems if
-		 * SIGINT or similar arrives while we're working.  Just to be real
-		 * sure, bump the interrupt holdoff counter as well.  That way, even
-		 * if something inside ProcessCatchupEvent() transiently sets
-		 * ImmediateInterruptOK (eg while waiting on a lock), we won't get
-		 * interrupted until we're done with the catchup interrupt.
-		 */
-		ImmediateInterruptOK = false;
-		HOLD_INTERRUPTS();
-
-		/*
-		 * I'm not sure whether some flavors of Unix might allow another
-		 * SIGUSR1 occurrence to recursively interrupt this routine. To cope
-		 * with the possibility, we do the same sort of dance that
-		 * EnableCatchupInterrupt must do --- see that routine for comments.
-		 */
-		catchupInterruptEnabled = 0;	/* disable any recursive signal */
-		catchupInterruptOccurred = 1;	/* do at least one iteration */
-		for (;;)
-		{
-			catchupInterruptEnabled = 1;
-			if (!catchupInterruptOccurred)
-				break;
-			catchupInterruptEnabled = 0;
-			if (catchupInterruptOccurred)
-			{
-				/* Here, it is finally safe to do stuff. */
-				ProcessCatchupEvent();
-			}
-		}
+	catchupInterruptPending = true;
 
-		/*
-		 * Restore the holdoff level and ImmediateInterruptOK, and check for
-		 * interrupts if needed.
-		 */
-		RESUME_INTERRUPTS();
-		ImmediateInterruptOK = save_ImmediateInterruptOK;
-		if (save_ImmediateInterruptOK)
-			CHECK_FOR_INTERRUPTS();
-	}
-	else
-	{
-		/*
-		 * In this path it is NOT SAFE to do much of anything, except this:
-		 */
-		catchupInterruptOccurred = 1;
-	}
+	/* make sure the event is processed in due course */
+	SetLatch(MyLatch);
 }
 
 /*
- * EnableCatchupInterrupt
- *
- * This is called by the PostgresMain main loop just before waiting
- * for a frontend command.  We process any pending catchup events,
- * and enable the signal handler to process future events directly.
+ * ProcessCatchupInterrupt
  *
- * NOTE: the signal handler starts out disabled, and stays so until
- * PostgresMain calls this the first time.
+ * The portion of catchup interrupt handling that runs outside of the signal
+ * handler, which allows it to actually process pending invalidations.
  */
 void
-EnableCatchupInterrupt(void)
+ProcessCatchupInterrupt(void)
 {
-	/*
-	 * This code is tricky because we are communicating with a signal handler
-	 * that could interrupt us at any point.  If we just checked
-	 * catchupInterruptOccurred and then set catchupInterruptEnabled, we could
-	 * fail to respond promptly to a signal that happens in between those two
-	 * steps.  (A very small time window, perhaps, but Murphy's Law says you
-	 * can hit it...)  Instead, we first set the enable flag, then test the
-	 * occurred flag.  If we see an unserviced interrupt has occurred, we
-	 * re-clear the enable flag before going off to do the service work. (That
-	 * prevents re-entrant invocation of ProcessCatchupEvent() if another
-	 * interrupt occurs.) If an interrupt comes in between the setting and
-	 * clearing of catchupInterruptEnabled, then it will have done the service
-	 * work and left catchupInterruptOccurred zero, so we have to check again
-	 * after clearing enable.  The whole thing has to be in a loop in case
-	 * another interrupt occurs while we're servicing the first. Once we get
-	 * out of the loop, enable is set and we know there is no unserviced
-	 * interrupt.
-	 *
-	 * NB: an overenthusiastic optimizing compiler could easily break this
-	 * code. Hopefully, they all understand what "volatile" means these days.
-	 */
-	for (;;)
+	while (catchupInterruptPending)
 	{
-		catchupInterruptEnabled = 1;
-		if (!catchupInterruptOccurred)
-			break;
-		catchupInterruptEnabled = 0;
-		if (catchupInterruptOccurred)
-			ProcessCatchupEvent();
-	}
-}
-
-/*
- * DisableCatchupInterrupt
- *
- * This is called by the PostgresMain main loop just after receiving
- * a frontend command.  Signal handler execution of catchup events
- * is disabled until the next EnableCatchupInterrupt call.
- *
- * The PROCSIG_NOTIFY_INTERRUPT signal handler also needs to call this,
- * so as to prevent conflicts if one signal interrupts the other.  So we
- * must return the previous state of the flag.
- */
-bool
-DisableCatchupInterrupt(void)
-{
-	bool		result = (catchupInterruptEnabled != 0);
-
-	catchupInterruptEnabled = 0;
-
-	return result;
-}
-
-/*
- * ProcessCatchupEvent
- *
- * Respond to a catchup event (PROCSIG_CATCHUP_INTERRUPT) from another
- * backend.
- *
- * This is called either directly from the PROCSIG_CATCHUP_INTERRUPT
- * signal handler, or the next time control reaches the outer idle loop
- * (assuming there's still anything to do by then).
- */
-static void
-ProcessCatchupEvent(void)
-{
-	bool		notify_enabled;
-
-	/* Must prevent notify interrupt while I am running */
-	notify_enabled = DisableNotifyInterrupt();
-
-	/*
-	 * What we need to do here is cause ReceiveSharedInvalidMessages() to run,
-	 * which will do the necessary work and also reset the
-	 * catchupInterruptOccurred flag.  If we are inside a transaction we can
-	 * just call AcceptInvalidationMessages() to do this.  If we aren't, we
-	 * start and immediately end a transaction; the call to
-	 * AcceptInvalidationMessages() happens down inside transaction start.
-	 *
-	 * It is awfully tempting to just call AcceptInvalidationMessages()
-	 * without the rest of the xact start/stop overhead, and I think that
-	 * would actually work in the normal case; but I am not sure that things
-	 * would clean up nicely if we got an error partway through.
-	 */
-	if (IsTransactionOrTransactionBlock())
-	{
-		elog(DEBUG4, "ProcessCatchupEvent inside transaction");
-		AcceptInvalidationMessages();
-	}
-	else
-	{
-		elog(DEBUG4, "ProcessCatchupEvent outside transaction");
-		StartTransactionCommand();
-		CommitTransactionCommand();
+		/*
+		 * What we need to do here is cause ReceiveSharedInvalidMessages() to
+		 * run, which will do the necessary work and also reset the
+		 * catchupInterruptPending flag.  If we are inside a transaction we
+		 * can just call AcceptInvalidationMessages() to do this.  If we
+		 * aren't, we start and immediately end a transaction; the call to
+		 * AcceptInvalidationMessages() happens down inside transaction start.
+		 *
+		 * It is awfully tempting to just call AcceptInvalidationMessages()
+		 * without the rest of the xact start/stop overhead, and I think that
+		 * would actually work in the normal case; but I am not sure that things
+		 * would clean up nicely if we got an error partway through.
+		 */
+		if (IsTransactionOrTransactionBlock())
+		{
+			elog(DEBUG4, "ProcessCatchupEvent inside transaction");
+			AcceptInvalidationMessages();
+		}
+		else
+		{
+			elog(DEBUG4, "ProcessCatchupEvent outside transaction");
+			StartTransactionCommand();
+			CommitTransactionCommand();
+		}
 	}
-
-	if (notify_enabled)
-		EnableNotifyInterrupt();
 }
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 556e563..1f26c6c 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -301,17 +301,25 @@ InteractiveBackend(StringInfo inBuf)
  * interactive_getc -- collect one character from stdin
  *
  * Even though we are not reading from a "client" process, we still want to
- * respond to signals, particularly SIGTERM/SIGQUIT.  Hence we must use
- * prepare_for_client_read and client_read_ended.
+ * respond to signals, particularly SIGTERM/SIGQUIT.
  */
 static int
 interactive_getc(void)
 {
 	int			c;
 
-	prepare_for_client_read();
+	/*
+	 * This will not process catchup interrupts or notifications while
+	 * reading. But those can't really be relevant for a standalone backend
+	 * anyway. To properly handle SIGTERM there's a hack in die() that
+	 * directly processes interrupts at this stage...
+	 */
+	CHECK_FOR_INTERRUPTS();
+
 	c = getc(stdin);
-	client_read_ended();
+
+	ProcessClientReadInterrupt();
+
 	return c;
 }
 
@@ -513,53 +521,33 @@ ReadCommand(StringInfo inBuf)
 }
 
 /*
- * prepare_for_client_read -- set up to possibly block on client input
+ * ProcessClientReadInterrupt() - Process interrupts specific to client reads
  *
- * This must be called immediately before any low-level read from the
- * client connection.  It is necessary to do it at a sufficiently low level
- * that there won't be any other operations except the read kernel call
- * itself between this call and the subsequent client_read_ended() call.
- * In particular there mustn't be use of malloc() or other potentially
- * non-reentrant libc functions.  This restriction makes it safe for us
- * to allow interrupt service routines to execute nontrivial code while
- * we are waiting for input.
- */
-void
-prepare_for_client_read(void)
-{
-	if (DoingCommandRead)
-	{
-		/* Enable immediate processing of asynchronous signals */
-		EnableNotifyInterrupt();
-		EnableCatchupInterrupt();
-
-		/* Allow die interrupts to be processed while waiting */
-		ImmediateInterruptOK = true;
-
-		/* And don't forget to detect one that already arrived */
-		CHECK_FOR_INTERRUPTS();
-	}
-}
-
-/*
- * client_read_ended -- get out of the client-input state
+ * This is called just after low-level reads. That might be after the read
+ * finished successfully, or it was interrupted via interrupt.
  *
- * This is called just after low-level reads.  It must preserve errno!
+ * Must preserve errno!
  */
 void
-client_read_ended(void)
+ProcessClientReadInterrupt(void)
 {
+	int			save_errno = errno;
+
 	if (DoingCommandRead)
 	{
-		int			save_errno = errno;
-
-		ImmediateInterruptOK = false;
+		/* Check for general interrupts that arrived while reading */
+		CHECK_FOR_INTERRUPTS();
 
-		DisableNotifyInterrupt();
-		DisableCatchupInterrupt();
+		/* Process sinval catchup interrupts that happened while reading */
+		if (catchupInterruptPending)
+			ProcessCatchupInterrupt();
 
-		errno = save_errno;
+		/* Process sinval catchup interrupts that happened while reading */
+		if (notifyInterruptPending)
+			ProcessNotifyInterrupt();
 	}
+
+	errno = save_errno;
 }
 
 
@@ -2626,6 +2614,13 @@ die(SIGNAL_ARGS)
 	/* If we're still here, waken anything waiting on the process latch */
 	SetLatch(MyLatch);
 
+	/*
+	 * If we're in single user mode, we want to quit immediately - we can't
+	 * rely on latches as they wouldn't work when stdin/stdout is a file.
+	 */
+	if (DoingCommandRead && whereToSendOutput != DestRemote)
+		ProcessInterrupts();
+
 	errno = save_errno;
 }
 
@@ -2834,8 +2829,6 @@ ProcessInterrupts(void)
 		QueryCancelPending = false;		/* ProcDie trumps QueryCancel */
 		ImmediateInterruptOK = false;	/* not idle anymore */
 		LockErrorCleanup();
-		DisableNotifyInterrupt();
-		DisableCatchupInterrupt();
 		/* As in quickdie, don't risk sending to client during auth */
 		if (ClientAuthInProgress && whereToSendOutput == DestRemote)
 			whereToSendOutput = DestNone;
@@ -2871,8 +2864,6 @@ ProcessInterrupts(void)
 		QueryCancelPending = false;		/* lost connection trumps QueryCancel */
 		ImmediateInterruptOK = false;	/* not idle anymore */
 		LockErrorCleanup();
-		DisableNotifyInterrupt();
-		DisableCatchupInterrupt();
 		/* don't send to client, we already know the connection to be dead. */
 		whereToSendOutput = DestNone;
 		ereport(FATAL,
@@ -2892,8 +2883,6 @@ ProcessInterrupts(void)
 		ImmediateInterruptOK = false;		/* not idle anymore */
 		RecoveryConflictPending = false;
 		LockErrorCleanup();
-		DisableNotifyInterrupt();
-		DisableCatchupInterrupt();
 		pgstat_report_recovery_conflict(RecoveryConflictReason);
 		ereport(FATAL,
 				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
@@ -2926,8 +2915,6 @@ ProcessInterrupts(void)
 		{
 			ImmediateInterruptOK = false;		/* not idle anymore */
 			LockErrorCleanup();
-			DisableNotifyInterrupt();
-			DisableCatchupInterrupt();
 			/* As in quickdie, don't risk sending to client during auth */
 			if (whereToSendOutput == DestRemote)
 				whereToSendOutput = DestNone;
@@ -2945,8 +2932,6 @@ ProcessInterrupts(void)
 			ImmediateInterruptOK = false;		/* not idle anymore */
 			(void) get_timeout_indicator(STATEMENT_TIMEOUT, true);
 			LockErrorCleanup();
-			DisableNotifyInterrupt();
-			DisableCatchupInterrupt();
 			ereport(ERROR,
 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
 					 errmsg("canceling statement due to lock timeout")));
@@ -2955,8 +2940,6 @@ ProcessInterrupts(void)
 		{
 			ImmediateInterruptOK = false;		/* not idle anymore */
 			LockErrorCleanup();
-			DisableNotifyInterrupt();
-			DisableCatchupInterrupt();
 			ereport(ERROR,
 					(errcode(ERRCODE_QUERY_CANCELED),
 					 errmsg("canceling statement due to statement timeout")));
@@ -2965,8 +2948,6 @@ ProcessInterrupts(void)
 		{
 			ImmediateInterruptOK = false;		/* not idle anymore */
 			LockErrorCleanup();
-			DisableNotifyInterrupt();
-			DisableCatchupInterrupt();
 			ereport(ERROR,
 					(errcode(ERRCODE_QUERY_CANCELED),
 					 errmsg("canceling autovacuum task")));
@@ -2976,8 +2957,6 @@ ProcessInterrupts(void)
 			ImmediateInterruptOK = false;		/* not idle anymore */
 			RecoveryConflictPending = false;
 			LockErrorCleanup();
-			DisableNotifyInterrupt();
-			DisableCatchupInterrupt();
 			pgstat_report_recovery_conflict(RecoveryConflictReason);
 			ereport(ERROR,
 					(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
@@ -2994,13 +2973,12 @@ ProcessInterrupts(void)
 		{
 			ImmediateInterruptOK = false;		/* not idle anymore */
 			LockErrorCleanup();
-			DisableNotifyInterrupt();
-			DisableCatchupInterrupt();
 			ereport(ERROR,
 					(errcode(ERRCODE_QUERY_CANCELED),
 					 errmsg("canceling statement due to user request")));
 		}
 	}
+
 	/* If we get here, do nothing (probably, QueryCancelPending was reset) */
 }
 
@@ -3843,14 +3821,8 @@ PostgresMain(int argc, char *argv[],
 		disable_all_timeouts(false);
 		QueryCancelPending = false;		/* second to avoid race condition */
 
-		/*
-		 * Turn off these interrupts too.  This is only needed here and not in
-		 * other exception-catching places since these interrupts are only
-		 * enabled while we wait for client input.
-		 */
+		/* Not reading from the client anymore. */
 		DoingCommandRead = false;
-		DisableNotifyInterrupt();
-		DisableCatchupInterrupt();
 
 		/* Make sure libpq is in a good state */
 		pq_comm_reset();
diff --git a/src/include/commands/async.h b/src/include/commands/async.h
index 87c3abb..8491f47 100644
--- a/src/include/commands/async.h
+++ b/src/include/commands/async.h
@@ -13,6 +13,8 @@
 #ifndef ASYNC_H
 #define ASYNC_H
 
+#include <signal.h>
+
 #include "fmgr.h"
 
 /*
@@ -21,6 +23,7 @@
 #define NUM_ASYNC_BUFFERS	8
 
 extern bool Trace_notify;
+extern volatile sig_atomic_t notifyInterruptPending;
 
 extern Size AsyncShmemSize(void);
 extern void AsyncShmemInit(void);
@@ -48,12 +51,7 @@ extern void ProcessCompletedNotifies(void);
 /* signal handler for inbound notifies (PROCSIG_NOTIFY_INTERRUPT) */
 extern void HandleNotifyInterrupt(void);
 
-/*
- * enable/disable processing of inbound notifies directly from signal handler.
- * The enable routine first performs processing of any inbound notifies that
- * have occurred since the last disable.
- */
-extern void EnableNotifyInterrupt(void);
-extern bool DisableNotifyInterrupt(void);
+/* process interrupts */
+extern void ProcessNotifyInterrupt(void);
 
 #endif   /* ASYNC_H */
diff --git a/src/include/storage/sinval.h b/src/include/storage/sinval.h
index 1a6f2df..d9ffd72 100644
--- a/src/include/storage/sinval.h
+++ b/src/include/storage/sinval.h
@@ -14,8 +14,9 @@
 #ifndef SINVAL_H
 #define SINVAL_H
 
-#include "storage/relfilenode.h"
+#include <signal.h>
 
+#include "storage/relfilenode.h"
 
 /*
  * We support several types of shared-invalidation messages:
@@ -123,6 +124,7 @@ typedef union
 /* Counter of messages processed; don't worry about overflow. */
 extern uint64 SharedInvalidMessageCounter;
 
+extern volatile sig_atomic_t catchupInterruptPending;
 
 extern void SendSharedInvalidMessages(const SharedInvalidationMessage *msgs,
 						  int n);
@@ -138,8 +140,7 @@ extern void HandleCatchupInterrupt(void);
  * The enable routine first performs processing of any catchup events that
  * have occurred since the last disable.
  */
-extern void EnableCatchupInterrupt(void);
-extern bool DisableCatchupInterrupt(void);
+extern void ProcessCatchupInterrupt(void);
 
 extern int xactGetCommittedInvalidationMessages(SharedInvalidationMessage **msgs,
 									 bool *RelcacheInitFileInval);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 0a350fd..fe8c725 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -67,8 +67,8 @@ extern void StatementCancelHandler(SIGNAL_ARGS);
 extern void FloatExceptionHandler(SIGNAL_ARGS) __attribute__((noreturn));
 extern void RecoveryConflictInterrupt(ProcSignalReason reason); /* called from SIGUSR1
 																 * handler */
-extern void prepare_for_client_read(void);
-extern void client_read_ended(void);
+extern void ProcessClientReadInterrupt(void);
+
 extern void process_postgres_switches(int argc, char *argv[],
 						  GucContext ctx, const char **dbname);
 extern void PostgresMain(int argc, char *argv[],
-- 
2.0.0.rc2.4.g1dc51c6.dirty

0003-Process-die-interrupts-while-reading-writing-from-th.patchtext/x-patch; charset=us-asciiDownload
>From 53c1c1c6aebbae799fe06570a98143c899b7c5b5 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 28 Sep 2014 00:22:39 +0200
Subject: [PATCH 3/7] Process 'die' interrupts while reading/writing from the
 client socket.

Up to now it was impossible to terminate a backend that was trying to
send/recv data to/from the client when the socket's buffer was already
full/empty. While the send/recv calls itself might have gotten
interrupted by signals on some platforms, we just immediately retried.

That could lead to situations where a backend couldn't be terminated
after the client died without the connection being closed.

The problem was far more likely to be hit when sending data than when
reading, as when reading a command from the client and during
authentication, we already processed interrupt. That primarily left
COPY FROM STDIN as being problematic.

Change things so that that we process 'die' events immediately when
the appropriate signal arrives during both recv/send. We can't react
to query cancels at that point, because we might loose sync with the
client as we could be in the middle of writing a message.

We don't interrupt writes if the write buffer isn't full as that would
lead to far fewer error messages reaching clients.

Per discussion with Kyotaro HORIGUCHI and Heikki Linnakangas

Discussion: 20140927191243.GD5423@alap3.anarazel.de
---
 src/backend/libpq/be-secure-openssl.c | 24 +++++++++++----
 src/backend/libpq/be-secure.c         | 58 ++++++++++++++++++++++-------------
 src/backend/tcop/postgres.c           | 50 ++++++++++++++++++++++++++++--
 src/include/tcop/tcopprot.h           |  3 +-
 4 files changed, 105 insertions(+), 30 deletions(-)

diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index e4c1a25..08e3422 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -556,7 +556,7 @@ rloop:
 			if (latchret & WL_LATCH_SET)
 			{
 				ResetLatch(MyLatch);
-				ProcessClientReadInterrupt();  /* preserves errno */
+				ProcessClientReadInterrupt(true);  /* preserves errno */
 			}
 			goto rloop;
 		case SSL_ERROR_SYSCALL:
@@ -598,6 +598,7 @@ be_tls_write(Port *port, void *ptr, size_t len)
 	ssize_t		n;
 	int			err;
 	int			waitfor;
+	int			latchret;
 
 	/*
 	 * If SSL renegotiations are enabled and we're getting close to the
@@ -662,16 +663,27 @@ wloop:
 		case SSL_ERROR_WANT_READ:
 		case SSL_ERROR_WANT_WRITE:
 
+			waitfor = WL_LATCH_SET;
+
 			if (err == SSL_ERROR_WANT_READ)
-				waitfor = WL_SOCKET_READABLE;
+				waitfor |= WL_SOCKET_READABLE;
 			else
-				waitfor = WL_SOCKET_WRITEABLE;
+				waitfor |= WL_SOCKET_WRITEABLE;
+
+			latchret = WaitLatchOrSocket(MyLatch, waitfor, port->sock, 0);
 
-			WaitLatchOrSocket(MyLatch, waitfor, port->sock, 0);
 			/*
-			 * XXX: We'll, at some later point, likely want to add interrupt
-			 * processing here.
+			 * Check for interrupts here, in addition to secure_write(),
+			 * because an interrupted write in secure_raw_write() will return
+			 * here, and we cannot return to secure_write() until we've
+			 * written something.
 			 */
+			if (latchret & WL_LATCH_SET)
+			{
+				ResetLatch(MyLatch);
+				ProcessClientWriteInterrupt(true); /* preserves errno */
+			}
+
 			goto wloop;
 		case SSL_ERROR_SYSCALL:
 			/* leave it to caller to ereport the value of errno */
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index b90ab0e..c2c1842 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -140,14 +140,27 @@ retry:
 		n = secure_raw_read(port, ptr, len);
 	}
 
-	/* Process interrupts that happened while (or before) receiving. */
-	ProcessClientReadInterrupt(); /* preserves errno */
-
 	/* retry after processing interrupts */
 	if (n < 0 && errno == EINTR)
 	{
+		/*
+		 * We tried to read data, the socket was empty, and we were
+		 * interrupted while waiting for readability. We only process
+		 * interrupts if we got interrupted while reading and when in blocking
+		 * mode. In other cases it's better to allow the interrupts to be
+		 * handled at higher layers.
+		 */
+		ProcessClientReadInterrupt(!port->noblock); /* preserves errno */
 		goto retry;
 	}
+
+	/*
+	 * Process interrupts that happened while (or before) receiving. Note that
+	 * we signal that we're not blocking, which will prevent some types of
+	 * interrupts from being processed.
+	 */
+	ProcessClientReadInterrupt(false);
+
 	return n;
 }
 
@@ -224,18 +237,17 @@ retry:
 		n = secure_raw_write(port, ptr, len);
 	}
 
-	/*
-	 * XXX: We'll, at some later point, likely want to add interrupt
-	 * processing here.
-	 */
-
-	/*
-	 * Retry after processing interrupts. This can be triggered even though we
-	 * don't check for latch set's during writing yet, because SSL
-	 * renegotiations might have required reading from the socket.
-	 */
+	/* retry after processing interrupts */
 	if (n < 0 && errno == EINTR)
 	{
+		/*
+		 * We tried to send data, the socket was full, and we were interrupted
+		 * while waiting for writability. We only process interrupts if we got
+		 * interrupted while writing and when in blocking mode. In other cases
+		 * it's better to allow the interrupts to be handled at higher layers.
+		 */
+		ProcessClientWriteInterrupt(!port->noblock);
+
 		goto retry;
 	}
 
@@ -262,17 +274,21 @@ wloop:
 		int		w;
 		int		save_errno = errno;
 
-		/*
-		 * We probably want to check for latches being set at some point
-		 * here. That'd allow us to handle interrupts while blocked on
-		 * writes. If set we'd not retry directly, but return. That way we
-		 * don't do anything while (possibly) inside a ssl library.
-		 */
 		w = WaitLatchOrSocket(MyLatch,
-							  WL_SOCKET_WRITEABLE,
+							  WL_LATCH_SET | WL_SOCKET_WRITEABLE,
 							  port->sock, 0);
 
-		if (w & WL_SOCKET_WRITEABLE)
+		if (w & WL_LATCH_SET)
+		{
+			ResetLatch(MyLatch);
+			/*
+			 * Force a return, so interrupts can be processed when not
+			 * (possibly) underneath a ssl library.
+			 */
+			errno = EINTR;
+			return -1;
+		}
+		else if (w & WL_SOCKET_WRITEABLE)
 		{
 			goto wloop;
 		}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 1f26c6c..060f62a 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -318,7 +318,7 @@ interactive_getc(void)
 
 	c = getc(stdin);
 
-	ProcessClientReadInterrupt();
+	ProcessClientReadInterrupt(true);
 
 	return c;
 }
@@ -529,7 +529,7 @@ ReadCommand(StringInfo inBuf)
  * Must preserve errno!
  */
 void
-ProcessClientReadInterrupt(void)
+ProcessClientReadInterrupt(bool blocked)
 {
 	int			save_errno = errno;
 
@@ -546,10 +546,56 @@ ProcessClientReadInterrupt(void)
 		if (notifyInterruptPending)
 			ProcessNotifyInterrupt();
 	}
+	else if (ProcDiePending && blocked)
+	{
+		/*
+		 * We're dying. It's safe (and sane) to handle that now.
+		 */
+		CHECK_FOR_INTERRUPTS();
+	}
 
 	errno = save_errno;
 }
 
+/*
+ * ProcessClientWriteInterrupt() - Process interrupts specific to client writes
+ *
+ * This is called just after low-level writes. That might be after the read
+ * finished successfully, or it was interrupted via interrupt. 'blocked' tells
+ * us whether the
+ *
+ * Must preserve errno!
+ */
+void
+ProcessClientWriteInterrupt(bool blocked)
+{
+	int			save_errno = errno;
+
+	Assert(InterruptHoldoffCount == 0 && CritSectionCount == 0);
+
+	/*
+	 * We only want to process the interrupt here if socket writes are
+	 * blocking to increase the chance to get an error message to the
+	 * client. If we're not blocked there'll soon be a
+	 * CHECK_FOR_INTERRUPTS(). But if we're blocked we'll never get out of
+	 * that situation if the client has died.
+	 */
+	if (ProcDiePending && blocked)
+	{
+		/*
+		 * We're dying. It's safe (and sane) to handle that now. But we don't
+		 * want to send the client the error message as that a) would possibly
+		 * block again b) would possibly lead to sending an error message to
+		 * the client, while we already started to send something else.
+		 */
+		if (whereToSendOutput == DestRemote)
+			whereToSendOutput = DestNone;
+
+		CHECK_FOR_INTERRUPTS();
+	}
+
+	errno = save_errno;
+}
 
 /*
  * Do raw parsing (only).
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index fe8c725..3e17770 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -67,7 +67,8 @@ extern void StatementCancelHandler(SIGNAL_ARGS);
 extern void FloatExceptionHandler(SIGNAL_ARGS) __attribute__((noreturn));
 extern void RecoveryConflictInterrupt(ProcSignalReason reason); /* called from SIGUSR1
 																 * handler */
-extern void ProcessClientReadInterrupt(void);
+extern void ProcessClientReadInterrupt(bool blocked);
+extern void ProcessClientWriteInterrupt(bool blocked);
 
 extern void process_postgres_switches(int argc, char *argv[],
 						  GucContext ctx, const char **dbname);
-- 
2.0.0.rc2.4.g1dc51c6.dirty

0004-Don-t-allow-immediate-interrupts-during-authenticati.patchtext/x-patch; charset=us-asciiDownload
>From 09af7c3bdb2d030759e01fc15bfc53cfabfdb790 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 15 Jan 2015 00:13:37 +0100
Subject: [PATCH 4/7] Don't allow immediate interrupts during authentication
 anymore.

We used to handle authentication_timeout by setting
ImmediateInterruptOK to true during large parts of the authentication
phase of a new connection.  While that happens to work acceptably in
practice, it's not particularly nice and has ugly corner cases.

Previous commits converted the FE/BE communication to use latches and
implemented support for interrupt handling during both
send/recv. Building on top of that work we can get rid of
ImmediateInterruptOK during authentication, by immediately treating
timeouts during authentication as a reason to die. As die interrupts
are handled immediately during client communication that provides a
sensibly quick reaction time to authentication timeout.

Additionally add a few CHECK_FOR_INTERRUPTS() to some more complex
authentication methods. More could be added, but this already should
provides a reasonable coverage.

While it this overall increases the maximum time till a timeout is
reacted to, it greatly reduces complexity and increases
reliability. That seems like a overall win. If the increase proves to
be noticeable we can deal with those cases by moving to nonblocking
network code and add interrupt checking there.

Reviewed-By: Heikki Linnakangas
---
 src/backend/libpq/auth.c              | 30 ++++++++++++++++++++----------
 src/backend/libpq/be-secure-openssl.c |  5 +++++
 src/backend/libpq/crypt.c             | 10 ----------
 src/backend/tcop/postgres.c           | 17 +++++------------
 src/backend/utils/init/postinit.c     | 16 +++++++++++-----
 5 files changed, 41 insertions(+), 37 deletions(-)

diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 3f3cf44..2d6b1cb 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -306,13 +306,6 @@ ClientAuthentication(Port *port)
 	 */
 	hba_getauthmethod(port);
 
-	/*
-	 * Enable immediate response to SIGTERM/SIGINT/timeout interrupts. (We
-	 * don't want this during hba_getauthmethod() because it might have to do
-	 * database access, eg for role membership checks.)
-	 */
-	ImmediateInterruptOK = true;
-	/* And don't forget to detect one that already arrived */
 	CHECK_FOR_INTERRUPTS();
 
 	/*
@@ -566,9 +559,6 @@ ClientAuthentication(Port *port)
 		sendAuthRequest(port, AUTH_REQ_OK);
 	else
 		auth_failed(port, status, logdetail);
-
-	/* Done with authentication, so we should turn off immediate interrupts */
-	ImmediateInterruptOK = false;
 }
 
 
@@ -580,6 +570,8 @@ sendAuthRequest(Port *port, AuthRequest areq)
 {
 	StringInfoData buf;
 
+	CHECK_FOR_INTERRUPTS();
+
 	pq_beginmessage(&buf, 'R');
 	pq_sendint(&buf, (int32) areq, sizeof(int32));
 
@@ -613,6 +605,8 @@ sendAuthRequest(Port *port, AuthRequest areq)
 	 */
 	if (areq != AUTH_REQ_OK)
 		pq_flush();
+
+	CHECK_FOR_INTERRUPTS();
 }
 
 /*
@@ -851,6 +845,9 @@ pg_GSS_recvauth(Port *port)
 	do
 	{
 		pq_startmsgread();
+
+		CHECK_FOR_INTERRUPTS();
+
 		mtype = pq_getbyte();
 		if (mtype != 'p')
 		{
@@ -900,6 +897,8 @@ pg_GSS_recvauth(Port *port)
 			 maj_stat, min_stat,
 			 (unsigned int) port->gss->outbuf.length, gflags);
 
+		CHECK_FOR_INTERRUPTS();
+
 		if (port->gss->outbuf.length != 0)
 		{
 			/*
@@ -1396,6 +1395,9 @@ interpret_ident_response(const char *ident_response,
  *	IP addresses and port numbers are in network byte order.
  *
  *	But iff we're unable to get the information from ident, return false.
+ *
+ *	XXX: Using WaitLatchOrSocket() and doing a CHECK_FOR_INTERRUPTS() if the
+ *	latch was set would improve the responsiveness to timeouts/cancellations.
  */
 static int
 ident_inet(hbaPort *port)
@@ -1510,6 +1512,8 @@ ident_inet(hbaPort *port)
 	/* loop in case send is interrupted */
 	do
 	{
+		CHECK_FOR_INTERRUPTS();
+
 		rc = send(sock_fd, ident_query, strlen(ident_query), 0);
 	} while (rc < 0 && errno == EINTR);
 
@@ -1525,6 +1529,8 @@ ident_inet(hbaPort *port)
 
 	do
 	{
+		CHECK_FOR_INTERRUPTS();
+
 		rc = recv(sock_fd, ident_response, sizeof(ident_response) - 1, 0);
 	} while (rc < 0 && errno == EINTR);
 
@@ -2413,6 +2419,10 @@ CheckRADIUSAuth(Port *port)
 	 * call to select() with a timeout, since somebody can be sending invalid
 	 * packets to our port thus causing us to retry in a loop and never time
 	 * out.
+	 *
+	 * XXX: Using WaitLatchOrSocket() and doing a CHECK_FOR_INTERRUPTS() if
+	 * the latch was set would improve the responsiveness to
+	 * timeouts/cancellations.
 	 */
 	gettimeofday(&endtime, NULL);
 	endtime.tv_sec += RADIUS_TIMEOUT;
diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 08e3422..7b64d03 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -377,6 +377,11 @@ aloop:
 				/* not allowed during connection establishment */
 				Assert(!port->noblock);
 
+				/*
+				 * No need to care about timeouts/interrupts here. At this
+				 * point authentication_timeout still employs
+				 * StartupPacketTimeoutHandler() which directly exits.
+				 */
 				if (err == SSL_ERROR_WANT_READ)
 					waitfor = WL_SOCKET_READABLE;
 				else
diff --git a/src/backend/libpq/crypt.c b/src/backend/libpq/crypt.c
index 599b63a..97be944 100644
--- a/src/backend/libpq/crypt.c
+++ b/src/backend/libpq/crypt.c
@@ -47,13 +47,6 @@ md5_crypt_verify(const Port *port, const char *role, char *client_pass,
 	Datum		datum;
 	bool		isnull;
 
-	/*
-	 * Disable immediate interrupts while doing database access.  (Note we
-	 * don't bother to turn this back on if we hit one of the failure
-	 * conditions, since we can expect we'll just exit right away anyway.)
-	 */
-	ImmediateInterruptOK = false;
-
 	/* Get role info from pg_authid */
 	roleTup = SearchSysCache1(AUTHNAME, PointerGetDatum(role));
 	if (!HeapTupleIsValid(roleTup))
@@ -80,9 +73,6 @@ md5_crypt_verify(const Port *port, const char *role, char *client_pass,
 	if (*shadow_pass == '\0')
 		return STATUS_ERROR;	/* empty password */
 
-	/* Re-enable immediate response to SIGTERM/SIGINT/timeout interrupts */
-	ImmediateInterruptOK = true;
-	/* And don't forget to detect one that already arrived */
 	CHECK_FOR_INTERRUPTS();
 
 	/*
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 060f62a..832561e 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2878,7 +2878,11 @@ ProcessInterrupts(void)
 		/* As in quickdie, don't risk sending to client during auth */
 		if (ClientAuthInProgress && whereToSendOutput == DestRemote)
 			whereToSendOutput = DestNone;
-		if (IsAutoVacuumWorkerProcess())
+		if (ClientAuthInProgress)
+			ereport(FATAL,
+					(errcode(ERRCODE_QUERY_CANCELED),
+					 errmsg("canceling authentication due to timeout")));
+		else if (IsAutoVacuumWorkerProcess())
 			ereport(FATAL,
 					(errcode(ERRCODE_ADMIN_SHUTDOWN),
 					 errmsg("terminating autovacuum process due to administrator command")));
@@ -2957,17 +2961,6 @@ ProcessInterrupts(void)
 		}
 
 		QueryCancelPending = false;
-		if (ClientAuthInProgress)
-		{
-			ImmediateInterruptOK = false;		/* not idle anymore */
-			LockErrorCleanup();
-			/* As in quickdie, don't risk sending to client during auth */
-			if (whereToSendOutput == DestRemote)
-				whereToSendOutput = DestNone;
-			ereport(ERROR,
-					(errcode(ERRCODE_QUERY_CANCELED),
-					 errmsg("canceling authentication due to timeout")));
-		}
 
 		/*
 		 * If LOCK_TIMEOUT and STATEMENT_TIMEOUT indicators are both set, we
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 983b237..66aa7ea 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -1099,18 +1099,24 @@ ShutdownPostgres(int code, Datum arg)
 static void
 StatementTimeoutHandler(void)
 {
+	int sig = SIGINT;
+
+	/*
+	 * During authentication the timeout is used to deal with
+	 * authentication_timeout - we want to quit in response to such timeouts.
+	 */
+	if (ClientAuthInProgress)
+		sig = SIGTERM;
+
 #ifdef HAVE_SETSID
 	/* try to signal whole process group */
-	kill(-MyProcPid, SIGINT);
+	kill(-MyProcPid, sig);
 #endif
-	kill(MyProcPid, SIGINT);
+	kill(MyProcPid, sig);
 }
 
 /*
  * LOCK_TIMEOUT handler: trigger a query-cancel interrupt.
- *
- * This is identical to StatementTimeoutHandler, but since it's so short,
- * we might as well keep the two functions separate for clarity.
  */
 static void
 LockTimeoutHandler(void)
-- 
2.0.0.rc2.4.g1dc51c6.dirty

0005-Move-deadlock-and-other-interrupt-handling-in-proc.c.patchtext/x-patch; charset=us-asciiDownload
>From b352e05600222ea69731dae04a939876252a4e85 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 13 Jan 2015 11:58:27 +0100
Subject: [PATCH 5/7] Move deadlock and other interrupt handling in proc.c out
 of signal handlers.

Deadlock checking was performed inside signal handlers up to
now. While it's a remarkable feat to have made this work reliably,
it's quite complex to understand why that is the case. Partially it
worked due to the assumption that semaphores are signal safe - which
is not actually documented to be the case for sysv semaphores.

The reason we had to rely on performing this work inside signal
handlers is that semaphores aren't guaranteed to be interruptable by
signals on all platforms. But now that latches provide a somewhat
similar API, which actually has the guarantee of being interruptible,
we can avoid doing so.

Signalling between ProcSleep, ProcWakeup, ProcWaitForSignal and
ProcSendSignal is now done using latches. This increases the
likelihood of spurious wakeups. As spurious wakeup already were
possible and aren't likely to be frequent enough to be an actual
problem, this seems acceptable.

This change would allow for further simplification of the deadlock
checking, now that it doesn't have to run in a signal handler. But
even if I were motivated to do so right now, it would still be better
to do that separately. Such a cleanup shouldn't have to be reviewed a
the same time as the more fundamental changes in this commit.

There is one possible usability regression due to this commit. Namely
it is more likely than before that log_lock_waits messages are output
more than once.

Reviewed-By: Heikki Linnakangas
---
 src/backend/storage/lmgr/proc.c   | 131 ++++++++++++++++++--------------------
 src/backend/utils/init/postinit.c |   2 +-
 src/include/storage/proc.h        |   2 +-
 3 files changed, 63 insertions(+), 72 deletions(-)

diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 2463622..33b2f69 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -80,13 +80,15 @@ PGPROC	   *PreparedXactProcs = NULL;
 /* If we are waiting for a lock, this points to the associated LOCALLOCK */
 static LOCALLOCK *lockAwaited = NULL;
 
-/* Mark this volatile because it can be changed by signal handler */
-static volatile DeadLockState deadlock_state = DS_NOT_YET_CHECKED;
+static DeadLockState deadlock_state = DS_NOT_YET_CHECKED;
 
+/* Is a deadlock check pending? */
+static volatile sig_atomic_t got_deadlock_timeout;
 
 static void RemoveProcFromArray(int code, Datum arg);
 static void ProcKill(int code, Datum arg);
 static void AuxiliaryProcKill(int code, Datum arg);
+static void CheckDeadLock(void);
 
 
 /*
@@ -705,16 +707,6 @@ LockErrorCleanup(void)
 
 	LWLockRelease(partitionLock);
 
-	/*
-	 * We used to do PGSemaphoreReset() here to ensure that our proc's wait
-	 * semaphore is reset to zero.  This prevented a leftover wakeup signal
-	 * from remaining in the semaphore if someone else had granted us the lock
-	 * we wanted before we were able to remove ourselves from the wait-list.
-	 * However, now that ProcSleep loops until waitStatus changes, a leftover
-	 * wakeup signal isn't harmful, and it seems not worth expending cycles to
-	 * get rid of a signal that most likely isn't there.
-	 */
-
 	RESUME_INTERRUPTS();
 }
 
@@ -942,9 +934,6 @@ ProcQueueInit(PROC_QUEUE *queue)
  *		we release the partition lock.
  *
  * NOTES: The process queue is now a priority queue for locking.
- *
- * P() on the semaphore should put us to sleep.  The process
- * semaphore is normally zero, so when we try to acquire it, we sleep.
  */
 int
 ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
@@ -1084,6 +1073,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 
 	/* Reset deadlock_state before enabling the timeout handler */
 	deadlock_state = DS_NOT_YET_CHECKED;
+	got_deadlock_timeout = false;
 
 	/*
 	 * Set timer so we can wake up after awhile and check for a deadlock. If a
@@ -1113,32 +1103,37 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 		enable_timeout_after(DEADLOCK_TIMEOUT, DeadlockTimeout);
 
 	/*
-	 * If someone wakes us between LWLockRelease and PGSemaphoreLock,
-	 * PGSemaphoreLock will not block.  The wakeup is "saved" by the semaphore
-	 * implementation.  While this is normally good, there are cases where a
-	 * saved wakeup might be leftover from a previous operation (for example,
-	 * we aborted ProcWaitForSignal just before someone did ProcSendSignal).
-	 * So, loop to wait again if the waitStatus shows we haven't been granted
-	 * nor denied the lock yet.
+	 * If somebody wakes us between LWLockRelease and WaitLatch, the latch
+	 * will not wait. But a set latch does not necessarily mean that the lock
+	 * is free now, as there are many other sources for latch sets than
+	 * somebody releasing the lock.
 	 *
-	 * We pass interruptOK = true, which eliminates a window in which
-	 * cancel/die interrupts would be held off undesirably.  This is a promise
-	 * that we don't mind losing control to a cancel/die interrupt here.  We
-	 * don't, because we have no shared-state-change work to do after being
-	 * granted the lock (the grantor did it all).  We do have to worry about
-	 * canceling the deadlock timeout and updating the locallock table, but if
-	 * we lose control to an error, LockErrorCleanup will fix that up.
+	 * We process interrupts whenever the latch has been set, so cancel/die
+	 * interrupts are processed quickly. This means we must not mind losing
+	 * control to a cancel/die interrupt here.  We don't, because we have no
+	 * shared-state-change work to do after being granted the lock (the
+	 * grantor did it all).  We do have to worry about canceling the deadlock
+	 * timeout and updating the locallock table, but if we lose control to an
+	 * error, LockErrorCleanup will fix that up.
 	 */
 	do
 	{
-		PGSemaphoreLock(&MyProc->sem, true);
+		WaitLatch(MyLatch, WL_LATCH_SET, 0);
+		ResetLatch(MyLatch);
+		/* check for deadlocks first, as that's probably log-worthy */
+		if (got_deadlock_timeout)
+		{
+			CheckDeadLock();
+			got_deadlock_timeout = false;
+		}
+		CHECK_FOR_INTERRUPTS();
 
 		/*
 		 * waitStatus could change from STATUS_WAITING to something else
 		 * asynchronously.  Read it just once per loop to prevent surprising
 		 * behavior (such as missing log messages).
 		 */
-		myWaitStatus = MyProc->waitStatus;
+		myWaitStatus = *((volatile int *) &MyProc->waitStatus);
 
 		/*
 		 * If we are not deadlocked, but are waiting on an autovacuum-induced
@@ -1435,7 +1430,7 @@ ProcWakeup(PGPROC *proc, int waitStatus)
 	proc->waitStatus = waitStatus;
 
 	/* And awaken it */
-	PGSemaphoreUnlock(&proc->sem);
+	SetLatch(&proc->procLatch);
 
 	return retProc;
 }
@@ -1502,17 +1497,13 @@ ProcLockWakeup(LockMethod lockMethodTable, LOCK *lock)
 /*
  * CheckDeadLock
  *
- * We only get to this routine if the DEADLOCK_TIMEOUT fired
- * while waiting for a lock to be released by some other process.  Look
- * to see if there's a deadlock; if not, just return and continue waiting.
- * (But signal ProcSleep to log a message, if log_lock_waits is true.)
- * If we have a real deadlock, remove ourselves from the lock's wait queue
- * and signal an error to ProcSleep.
- *
- * NB: this is run inside a signal handler, so be very wary about what is done
- * here or in called routines.
+ * We only get to this routine, if DEADLOCK_TIMEOUT fired while waiting for a
+ * lock to be released by some other process.  Check if there's a deadlock; if
+ * not, just return.  (But signal ProcSleep to log a message, if
+ * log_lock_waits is true.)  If we have a real deadlock, remove ourselves from
+ * the lock's wait queue and signal an error to ProcSleep.
  */
-void
+static void
 CheckDeadLock(void)
 {
 	int			i;
@@ -1571,12 +1562,6 @@ CheckDeadLock(void)
 		RemoveFromWaitQueue(MyProc, LockTagHashCode(&(MyProc->waitLock->tag)));
 
 		/*
-		 * Unlock my semaphore so that the interrupted ProcSleep() call can
-		 * finish.
-		 */
-		PGSemaphoreUnlock(&MyProc->sem);
-
-		/*
 		 * We're done here.  Transaction abort caused by the error that
 		 * ProcSleep will raise will cause any other locks we hold to be
 		 * released, thus allowing other processes to wake up; we don't need
@@ -1587,19 +1572,6 @@ CheckDeadLock(void)
 		 * RemoveFromWaitQueue took care of waking up any such processes.
 		 */
 	}
-	else if (log_lock_waits || deadlock_state == DS_BLOCKED_BY_AUTOVACUUM)
-	{
-		/*
-		 * Unlock my semaphore so that the interrupted ProcSleep() call can
-		 * print the log message (we daren't do it here because we are inside
-		 * a signal handler).  It will then sleep again until someone releases
-		 * the lock.
-		 *
-		 * If blocked by autovacuum, this wakeup will enable ProcSleep to send
-		 * the canceling signal to the autovacuum worker.
-		 */
-		PGSemaphoreUnlock(&MyProc->sem);
-	}
 
 	/*
 	 * And release locks.  We do this in reverse order for two reasons: (1)
@@ -1613,22 +1585,39 @@ check_done:
 		LWLockRelease(LockHashPartitionLockByIndex(i));
 }
 
+/*
+ * CheckDeadLockAlert - Handle the expiry of deadlock_timeout.
+ *
+ * NB: Runs inside a signal handler, be careful.
+ */
+void
+CheckDeadLockAlert(void)
+{
+	int			save_errno = errno;
+
+	got_deadlock_timeout = true;
+	/*
+	 * Have to set the latch again, even if handle_sig_alarm already did. Back
+	 * then got_deadlock_timeout wasn't yet set... It's unlikely that this
+	 * ever would be a problem, but setting a set latch again is cheap.
+	 */
+	SetLatch(MyLatch);
+	errno = save_errno;
+}
 
 /*
  * ProcWaitForSignal - wait for a signal from another backend.
  *
- * This can share the semaphore normally used for waiting for locks,
- * since a backend could never be waiting for a lock and a signal at
- * the same time.  As with locks, it's OK if the signal arrives just
- * before we actually reach the waiting state.  Also as with locks,
- * it's necessary that the caller be robust against bogus wakeups:
- * always check that the desired state has occurred, and wait again
- * if not.  This copes with possible "leftover" wakeups.
+ * As this uses the generic process latch the caller has to be robust against
+ * unrelated wakeups: Always check that the desired state has occurred, and
+ * wait again if not.
  */
 void
 ProcWaitForSignal(void)
 {
-	PGSemaphoreLock(&MyProc->sem, true);
+	WaitLatch(MyLatch, WL_LATCH_SET, 0);
+	ResetLatch(MyLatch);
+	CHECK_FOR_INTERRUPTS();
 }
 
 /*
@@ -1664,5 +1653,7 @@ ProcSendSignal(int pid)
 		proc = BackendPidGetProc(pid);
 
 	if (proc != NULL)
-		PGSemaphoreUnlock(&proc->sem);
+	{
+		SetLatch(&proc->procLatch);
+	}
 }
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 66aa7ea..1e646a1 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -578,7 +578,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
 	 */
 	if (!bootstrap)
 	{
-		RegisterTimeout(DEADLOCK_TIMEOUT, CheckDeadLock);
+		RegisterTimeout(DEADLOCK_TIMEOUT, CheckDeadLockAlert);
 		RegisterTimeout(STATEMENT_TIMEOUT, StatementTimeoutHandler);
 		RegisterTimeout(LOCK_TIMEOUT, LockTimeoutHandler);
 	}
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index d194f38..e807a2e 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -251,7 +251,7 @@ extern void ProcQueueInit(PROC_QUEUE *queue);
 extern int	ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable);
 extern PGPROC *ProcWakeup(PGPROC *proc, int waitStatus);
 extern void ProcLockWakeup(LockMethod lockMethodTable, LOCK *lock);
-extern void CheckDeadLock(void);
+extern void CheckDeadLockAlert(void);
 extern bool IsWaitingForLock(void);
 extern void LockErrorCleanup(void);
 
-- 
2.0.0.rc2.4.g1dc51c6.dirty

0006-Remove-the-option-to-service-interrupts-during-PGSem.patchtext/x-patch; charset=us-asciiDownload
>From dfe30df648f7140ae9199d088fed139fccb839f3 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 15 Jan 2015 00:54:40 +0100
Subject: [PATCH 6/7] Remove the option to service interrupts during
 PGSemaphoreLock().

The remaining caller (lwlocks) doesn't need that facility, and we plan
to remove ImmedidateInterruptOK entirely. That means that interrupts
can't be serviced race-free and portably anyway, so there's little
reason for keeping the feature.
---
 src/backend/port/posix_sema.c     | 12 ++---------
 src/backend/port/sysv_sema.c      | 43 ++++-----------------------------------
 src/backend/port/win32_sema.c     |  6 +-----
 src/backend/storage/lmgr/lwlock.c | 12 ++++-------
 src/include/storage/pg_sema.h     |  2 +-
 5 files changed, 12 insertions(+), 63 deletions(-)

diff --git a/src/backend/port/posix_sema.c b/src/backend/port/posix_sema.c
index 633bf5a..ad9c80a 100644
--- a/src/backend/port/posix_sema.c
+++ b/src/backend/port/posix_sema.c
@@ -236,22 +236,14 @@ PGSemaphoreReset(PGSemaphore sema)
  * Lock a semaphore (decrement count), blocking if count would be < 0
  */
 void
-PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
+PGSemaphoreLock(PGSemaphore sema)
 {
 	int			errStatus;
 
-	/*
-	 * See notes in sysv_sema.c's implementation of PGSemaphoreLock. Just as
-	 * that code does for semop(), we handle both the case where sem_wait()
-	 * returns errno == EINTR after a signal, and the case where it just keeps
-	 * waiting.
-	 */
+	/* See notes in sysv_sema.c's implementation of PGSemaphoreLock. */
 	do
 	{
-		ImmediateInterruptOK = interruptOK;
-		CHECK_FOR_INTERRUPTS();
 		errStatus = sem_wait(PG_SEM_REF(sema));
-		ImmediateInterruptOK = false;
 	} while (errStatus < 0 && errno == EINTR);
 
 	if (errStatus < 0)
diff --git a/src/backend/port/sysv_sema.c b/src/backend/port/sysv_sema.c
index 5a0193f..ba46c64 100644
--- a/src/backend/port/sysv_sema.c
+++ b/src/backend/port/sysv_sema.c
@@ -364,7 +364,7 @@ PGSemaphoreReset(PGSemaphore sema)
  * Lock a semaphore (decrement count), blocking if count would be < 0
  */
 void
-PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
+PGSemaphoreLock(PGSemaphore sema)
 {
 	int			errStatus;
 	struct sembuf sops;
@@ -378,48 +378,13 @@ PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
 	 * from the operation prematurely because we were sent a signal.  So we
 	 * try and lock the semaphore again.
 	 *
-	 * Each time around the loop, we check for a cancel/die interrupt.  On
-	 * some platforms, if such an interrupt comes in while we are waiting, it
-	 * will cause the semop() call to exit with errno == EINTR, allowing us to
-	 * service the interrupt (if not in a critical section already) during the
-	 * next loop iteration.
-	 *
-	 * Once we acquire the lock, we do NOT check for an interrupt before
-	 * returning.  The caller needs to be able to record ownership of the lock
-	 * before any interrupt can be accepted.
-	 *
-	 * There is a window of a few instructions between CHECK_FOR_INTERRUPTS
-	 * and entering the semop() call.  If a cancel/die interrupt occurs in
-	 * that window, we would fail to notice it until after we acquire the lock
-	 * (or get another interrupt to escape the semop()).  We can avoid this
-	 * problem by temporarily setting ImmediateInterruptOK to true before we
-	 * do CHECK_FOR_INTERRUPTS; then, a die() interrupt in this interval will
-	 * execute directly.  However, there is a huge pitfall: there is another
-	 * window of a few instructions after the semop() before we are able to
-	 * reset ImmediateInterruptOK.  If an interrupt occurs then, we'll lose
-	 * control, which means that the lock has been acquired but our caller did
-	 * not get a chance to record the fact. Therefore, we only set
-	 * ImmediateInterruptOK if the caller tells us it's OK to do so, ie, the
-	 * caller does not need to record acquiring the lock.  (This is currently
-	 * true for lockmanager locks, since the process that granted us the lock
-	 * did all the necessary state updates. It's not true for SysV semaphores
-	 * used to implement LW locks or emulate spinlocks --- but the wait time
-	 * for such locks should not be very long, anyway.)
-	 *
-	 * On some platforms, signals marked SA_RESTART (which is most, for us)
-	 * will not interrupt the semop(); it will just keep waiting.  Therefore
-	 * it's necessary for cancel/die interrupts to be serviced directly by the
-	 * signal handler.  On these platforms the behavior is really the same
-	 * whether the signal arrives just before the semop() begins, or while it
-	 * is waiting.  The loop on EINTR is thus important only for platforms
-	 * without SA_RESTART.
+	 * We used to check interrupts here, but that required servicing
+	 * interrupts directly from signal handlers. Which is hard to do safely
+	 * and portably.
 	 */
 	do
 	{
-		ImmediateInterruptOK = interruptOK;
-		CHECK_FOR_INTERRUPTS();
 		errStatus = semop(sema->semId, &sops, 1);
-		ImmediateInterruptOK = false;
 	} while (errStatus < 0 && errno == EINTR);
 
 	if (errStatus < 0)
diff --git a/src/backend/port/win32_sema.c b/src/backend/port/win32_sema.c
index f848ff8..011e2fd 100644
--- a/src/backend/port/win32_sema.c
+++ b/src/backend/port/win32_sema.c
@@ -116,13 +116,11 @@ PGSemaphoreReset(PGSemaphore sema)
  * Serve the interrupt if interruptOK is true.
  */
 void
-PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
+PGSemaphoreLock(PGSemaphore sema)
 {
 	HANDLE		wh[2];
 	bool		done = false;
 
-	ImmediateInterruptOK = interruptOK;
-
 	/*
 	 * Note: pgwin32_signal_event should be first to ensure that it will be
 	 * reported when multiple events are set.  We want to guarantee that
@@ -173,8 +171,6 @@ PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
 				break;
 		}
 	}
-
-	ImmediateInterruptOK = false;
 }
 
 /*
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 7cb0023..7ca8dc0 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -863,8 +863,7 @@ LWLockDequeueSelf(LWLock *lock)
 		 */
 		for (;;)
 		{
-			/* "false" means cannot accept cancel/die interrupt here. */
-			PGSemaphoreLock(&MyProc->sem, false);
+			PGSemaphoreLock(&MyProc->sem);
 			if (!MyProc->lwWaiting)
 				break;
 			extraWaits++;
@@ -1034,8 +1033,7 @@ LWLockAcquireCommon(LWLock *lock, LWLockMode mode, uint64 *valptr, uint64 val)
 
 		for (;;)
 		{
-			/* "false" means cannot accept cancel/die interrupt here. */
-			PGSemaphoreLock(&proc->sem, false);
+			PGSemaphoreLock(&proc->sem);
 			if (!proc->lwWaiting)
 				break;
 			extraWaits++;
@@ -1195,8 +1193,7 @@ LWLockAcquireOrWait(LWLock *lock, LWLockMode mode)
 
 			for (;;)
 			{
-				/* "false" means cannot accept cancel/die interrupt here. */
-				PGSemaphoreLock(&proc->sem, false);
+				PGSemaphoreLock(&proc->sem);
 				if (!proc->lwWaiting)
 					break;
 				extraWaits++;
@@ -1397,8 +1394,7 @@ LWLockWaitForVar(LWLock *lock, uint64 *valptr, uint64 oldval, uint64 *newval)
 
 		for (;;)
 		{
-			/* "false" means cannot accept cancel/die interrupt here. */
-			PGSemaphoreLock(&proc->sem, false);
+			PGSemaphoreLock(&proc->sem);
 			if (!proc->lwWaiting)
 				break;
 			extraWaits++;
diff --git a/src/include/storage/pg_sema.h b/src/include/storage/pg_sema.h
index 5eaf2bf..9f8be75 100644
--- a/src/include/storage/pg_sema.h
+++ b/src/include/storage/pg_sema.h
@@ -72,7 +72,7 @@ extern void PGSemaphoreCreate(PGSemaphore sema);
 extern void PGSemaphoreReset(PGSemaphore sema);
 
 /* Lock a semaphore (decrement count), blocking if count would be < 0 */
-extern void PGSemaphoreLock(PGSemaphore sema, bool interruptOK);
+extern void PGSemaphoreLock(PGSemaphore sema);
 
 /* Unlock a semaphore (increment count) */
 extern void PGSemaphoreUnlock(PGSemaphore sema);
-- 
2.0.0.rc2.4.g1dc51c6.dirty

0007-Remove-remnants-of-ImmediateInterruptOK-handling.patchtext/x-patch; charset=us-asciiDownload
>From 63078f8d273b6a2dc47bfab093d42b110e2487e5 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 15 Jan 2015 00:57:12 +0100
Subject: [PATCH 7/7] Remove remnants of ImmediateInterruptOK handling.

Now that nothing sets ImmediateInterruptOK to true anymore, we can
remove all the supporting code.
---
 src/backend/storage/ipc/ipc.c         |  2 --
 src/backend/tcop/postgres.c           | 29 --------------------------
 src/backend/utils/error/elog.c        | 29 +++-----------------------
 src/backend/utils/init/globals.c      |  1 -
 src/backend/utils/misc/timeout.c      | 38 +++--------------------------------
 src/include/miscadmin.h               |  1 -
 src/test/modules/test_shm_mq/worker.c |  6 +-----
 7 files changed, 7 insertions(+), 99 deletions(-)

diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
index f0f7939..6bc0b06 100644
--- a/src/backend/storage/ipc/ipc.c
+++ b/src/backend/storage/ipc/ipc.c
@@ -165,8 +165,6 @@ proc_exit_prepare(int code)
 	InterruptPending = false;
 	ProcDiePending = false;
 	QueryCancelPending = false;
-	/* And let's just make *sure* we're not interrupted ... */
-	ImmediateInterruptOK = false;
 	InterruptHoldoffCount = 1;
 	CritSectionCount = 0;
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 832561e..bf9832a 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2648,13 +2648,6 @@ die(SIGNAL_ARGS)
 	{
 		InterruptPending = true;
 		ProcDiePending = true;
-
-		/*
-		 * If we're waiting for input or a lock so that it's safe to
-		 * interrupt, service the interrupt immediately
-		 */
-		if (ImmediateInterruptOK)
-			ProcessInterrupts();
 	}
 
 	/* If we're still here, waken anything waiting on the process latch */
@@ -2686,13 +2679,6 @@ StatementCancelHandler(SIGNAL_ARGS)
 	{
 		InterruptPending = true;
 		QueryCancelPending = true;
-
-		/*
-		 * If we're waiting for input or a lock so that it's safe to
-		 * interrupt, service the interrupt immediately
-		 */
-		if (ImmediateInterruptOK)
-			ProcessInterrupts();
 	}
 
 	/* If we're still here, waken anything waiting on the process latch */
@@ -2833,13 +2819,6 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
 		 */
 		if (reason == PROCSIG_RECOVERY_CONFLICT_DATABASE)
 			RecoveryConflictRetryable = false;
-
-		/*
-		 * If we're waiting for input or a lock so that it's safe to
-		 * interrupt, service the interrupt immediately.
-		 */
-		if (ImmediateInterruptOK)
-			ProcessInterrupts();
 	}
 
 	/*
@@ -2873,7 +2852,6 @@ ProcessInterrupts(void)
 	{
 		ProcDiePending = false;
 		QueryCancelPending = false;		/* ProcDie trumps QueryCancel */
-		ImmediateInterruptOK = false;	/* not idle anymore */
 		LockErrorCleanup();
 		/* As in quickdie, don't risk sending to client during auth */
 		if (ClientAuthInProgress && whereToSendOutput == DestRemote)
@@ -2912,7 +2890,6 @@ ProcessInterrupts(void)
 	if (ClientConnectionLost)
 	{
 		QueryCancelPending = false;		/* lost connection trumps QueryCancel */
-		ImmediateInterruptOK = false;	/* not idle anymore */
 		LockErrorCleanup();
 		/* don't send to client, we already know the connection to be dead. */
 		whereToSendOutput = DestNone;
@@ -2930,7 +2907,6 @@ ProcessInterrupts(void)
 	if (RecoveryConflictPending && DoingCommandRead)
 	{
 		QueryCancelPending = false;			/* this trumps QueryCancel */
-		ImmediateInterruptOK = false;		/* not idle anymore */
 		RecoveryConflictPending = false;
 		LockErrorCleanup();
 		pgstat_report_recovery_conflict(RecoveryConflictReason);
@@ -2968,7 +2944,6 @@ ProcessInterrupts(void)
 		 */
 		if (get_timeout_indicator(LOCK_TIMEOUT, true))
 		{
-			ImmediateInterruptOK = false;		/* not idle anymore */
 			(void) get_timeout_indicator(STATEMENT_TIMEOUT, true);
 			LockErrorCleanup();
 			ereport(ERROR,
@@ -2977,7 +2952,6 @@ ProcessInterrupts(void)
 		}
 		if (get_timeout_indicator(STATEMENT_TIMEOUT, true))
 		{
-			ImmediateInterruptOK = false;		/* not idle anymore */
 			LockErrorCleanup();
 			ereport(ERROR,
 					(errcode(ERRCODE_QUERY_CANCELED),
@@ -2985,7 +2959,6 @@ ProcessInterrupts(void)
 		}
 		if (IsAutoVacuumWorkerProcess())
 		{
-			ImmediateInterruptOK = false;		/* not idle anymore */
 			LockErrorCleanup();
 			ereport(ERROR,
 					(errcode(ERRCODE_QUERY_CANCELED),
@@ -2993,7 +2966,6 @@ ProcessInterrupts(void)
 		}
 		if (RecoveryConflictPending)
 		{
-			ImmediateInterruptOK = false;		/* not idle anymore */
 			RecoveryConflictPending = false;
 			LockErrorCleanup();
 			pgstat_report_recovery_conflict(RecoveryConflictReason);
@@ -3010,7 +2982,6 @@ ProcessInterrupts(void)
 		 */
 		if (!DoingCommandRead)
 		{
-			ImmediateInterruptOK = false;		/* not idle anymore */
 			LockErrorCleanup();
 			ereport(ERROR,
 					(errcode(ERRCODE_QUERY_CANCELED),
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 0f7aa19..4e0cc30 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -412,7 +412,6 @@ errfinish(int dummy,...)
 {
 	ErrorData  *edata = &errordata[errordata_stack_depth];
 	int			elevel;
-	bool		save_ImmediateInterruptOK;
 	MemoryContext oldcontext;
 	ErrorContextCallback *econtext;
 
@@ -421,18 +420,6 @@ errfinish(int dummy,...)
 	elevel = edata->elevel;
 
 	/*
-	 * Ensure we can't get interrupted while performing error reporting.  This
-	 * is needed to prevent recursive entry to functions like syslog(), which
-	 * may not be re-entrant.
-	 *
-	 * Note: other places that save-and-clear ImmediateInterruptOK also do
-	 * HOLD_INTERRUPTS(), but that should not be necessary here since we don't
-	 * call anything that could turn on ImmediateInterruptOK.
-	 */
-	save_ImmediateInterruptOK = ImmediateInterruptOK;
-	ImmediateInterruptOK = false;
-
-	/*
 	 * Do processing in ErrorContext, which we hope has enough reserved space
 	 * to report an error.
 	 */
@@ -463,10 +450,6 @@ errfinish(int dummy,...)
 		 * itself be inside a holdoff section.  If necessary, such a handler
 		 * could save and restore InterruptHoldoffCount for itself, but this
 		 * should make life easier for most.)
-		 *
-		 * Note that we intentionally don't restore ImmediateInterruptOK here,
-		 * even if it was set at entry.  We definitely don't want that on
-		 * while doing error cleanup.
 		 */
 		InterruptHoldoffCount = 0;
 		QueryCancelHoldoffCount = 0;
@@ -573,15 +556,9 @@ errfinish(int dummy,...)
 	}
 
 	/*
-	 * We reach here if elevel <= WARNING.  OK to return to caller, so restore
-	 * caller's setting of ImmediateInterruptOK.
-	 */
-	ImmediateInterruptOK = save_ImmediateInterruptOK;
-
-	/*
-	 * But check for cancel/die interrupt first --- this is so that the user
-	 * can stop a query emitting tons of notice or warning messages, even if
-	 * it's in a loop that otherwise fails to check for interrupts.
+	 * Check for cancel/die interrupt first --- this is so that the user can
+	 * stop a query emitting tons of notice or warning messages, even if it's
+	 * in a loop that otherwise fails to check for interrupts.
 	 */
 	CHECK_FOR_INTERRUPTS();
 }
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 8cf2ead..23e594e 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -30,7 +30,6 @@ volatile bool InterruptPending = false;
 volatile bool QueryCancelPending = false;
 volatile bool ProcDiePending = false;
 volatile bool ClientConnectionLost = false;
-volatile bool ImmediateInterruptOK = false;
 volatile uint32 InterruptHoldoffCount = 0;
 volatile uint32 QueryCancelHoldoffCount = 0;
 volatile uint32 CritSectionCount = 0;
diff --git a/src/backend/utils/misc/timeout.c b/src/backend/utils/misc/timeout.c
index ce4bc13..a7ec665 100644
--- a/src/backend/utils/misc/timeout.c
+++ b/src/backend/utils/misc/timeout.c
@@ -259,27 +259,12 @@ static void
 handle_sig_alarm(SIGNAL_ARGS)
 {
 	int			save_errno = errno;
-	bool		save_ImmediateInterruptOK = ImmediateInterruptOK;
 
 	/*
-	 * We may be executing while ImmediateInterruptOK is true (e.g., when
-	 * mainline is waiting for a lock).  If SIGINT or similar arrives while
-	 * this code is running, we'd lose control and perhaps leave our data
-	 * structures in an inconsistent state.  Disable immediate interrupts, and
-	 * just to be real sure, bump the holdoff counter as well.  (The reason
-	 * for this belt-and-suspenders-too approach is to make sure that nothing
-	 * bad happens if a timeout handler calls code that manipulates
-	 * ImmediateInterruptOK.)
-	 *
-	 * Note: it's possible for a SIGINT to interrupt handle_sig_alarm before
-	 * we manage to do this; the net effect would be as if the SIGALRM event
-	 * had been silently lost.  Therefore error recovery must include some
-	 * action that will allow any lost interrupt to be rescheduled.  Disabling
-	 * some or all timeouts is sufficient, or if that's not appropriate,
-	 * reschedule_timeouts() can be called.  Also, the signal blocking hazard
-	 * described below applies here too.
+	 * Bump the holdoff counter, to make sure nothing we call will process
+	 * interrupts directly. No timeout handler should do that, but these
+	 * failures are hard to debug, so better be sure.
 	 */
-	ImmediateInterruptOK = false;
 	HOLD_INTERRUPTS();
 
 	/*
@@ -332,24 +317,7 @@ handle_sig_alarm(SIGNAL_ARGS)
 		}
 	}
 
-	/*
-	 * Re-allow query cancel, and then try to service any cancel request that
-	 * arrived meanwhile (this might in particular include a cancel request
-	 * fired by one of the timeout handlers).  Since we are in a signal
-	 * handler, we mustn't call ProcessInterrupts unless ImmediateInterruptOK
-	 * is set; if it isn't, the cancel will happen at the next mainline
-	 * CHECK_FOR_INTERRUPTS.
-	 *
-	 * Note: a longjmp from here is safe so far as our own data structures are
-	 * concerned; but on platforms that block a signal before calling the
-	 * handler and then un-block it on return, longjmping out of the signal
-	 * handler leaves SIGALRM still blocked.  Error cleanup is responsible for
-	 * unblocking any blocked signals.
-	 */
 	RESUME_INTERRUPTS();
-	ImmediateInterruptOK = save_ImmediateInterruptOK;
-	if (save_ImmediateInterruptOK)
-		CHECK_FOR_INTERRUPTS();
 
 	errno = save_errno;
 }
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index c9a46aa..eacfccb 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -84,7 +84,6 @@ extern PGDLLIMPORT volatile bool ProcDiePending;
 extern volatile bool ClientConnectionLost;
 
 /* these are marked volatile because they are examined by signal handlers: */
-extern PGDLLIMPORT volatile bool ImmediateInterruptOK;
 extern PGDLLIMPORT volatile uint32 InterruptHoldoffCount;
 extern PGDLLIMPORT volatile uint32 QueryCancelHoldoffCount;
 extern PGDLLIMPORT volatile uint32 CritSectionCount;
diff --git a/src/test/modules/test_shm_mq/worker.c b/src/test/modules/test_shm_mq/worker.c
index a9d9e0e..691a568 100644
--- a/src/test/modules/test_shm_mq/worker.c
+++ b/src/test/modules/test_shm_mq/worker.c
@@ -60,13 +60,9 @@ test_shm_mq_main(Datum main_arg)
 	 *
 	 * We want CHECK_FOR_INTERRUPTS() to kill off this worker process just as
 	 * it would a normal user backend.  To make that happen, we establish a
-	 * signal handler that is a stripped-down version of die().  We don't have
-	 * any equivalent of the backend's command-read loop, where interrupts can
-	 * be processed immediately, so make sure ImmediateInterruptOK is turned
-	 * off.
+	 * signal handler that is a stripped-down version of die().
 	 */
 	pqsignal(SIGTERM, handle_sigterm);
-	ImmediateInterruptOK = false;
 	BackgroundWorkerUnblockSignals();
 
 	/*
-- 
2.0.0.rc2.4.g1dc51c6.dirty

#13Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Andres Freund (#12)
Re: Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)

On 02/03/2015 08:54 PM, Andres Freund wrote:

* Previously the patchset didn't handle SIGTERM while
InteractiveBackend() was reading from the client. It did handle
ctrl-c/d, but since getc() isn't interruptible and can't be replaced
with latches... The fix for that isn't super pretty:
die():
if (DoingCommandRead && whereToSendOutput != DestRemote)
ProcessInterrupts();
but imo acceptable for single user mode.

That smells an awful lot like ImmediateInterruptOK. Could you use
WaitLatchOrSocket to sleep on stdin? Probably needs to be renamed or
copied to WaitLatchOrStdin(), or a new flag to be added to wait on stdin
instead of a socket, at least on Windows, but in theory.

Might not be worth it if it gets complicated - I can live with the above
- but it's a thought.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Andres Freund
andres@2ndquadrant.com
In reply to: Heikki Linnakangas (#13)
Re: Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)

On 2015-02-03 22:17:22 +0200, Heikki Linnakangas wrote:

On 02/03/2015 08:54 PM, Andres Freund wrote:

* Previously the patchset didn't handle SIGTERM while
InteractiveBackend() was reading from the client. It did handle
ctrl-c/d, but since getc() isn't interruptible and can't be replaced
with latches... The fix for that isn't super pretty:
die():
if (DoingCommandRead && whereToSendOutput != DestRemote)
ProcessInterrupts();
but imo acceptable for single user mode.

That smells an awful lot like ImmediateInterruptOK.

It awfully does, yes. But we're only exiting, and only in single user
mode. And only during command read. We don't need all the other
complicated stuff around sinval, async, deadlock checking, etc.

Could you use WaitLatchOrSocket to sleep on stdin?

Unfortunately I don't think so. poll() etc only works properly on
network handles, pipes etc - but stdin can be a file :(. And I think
what exactly happens if it's a file fd isn't super well defined. On
linux the file is always marked as ready, which would probably actually
work...

I think this is better solved in the long run to get rid of single user
mode, akin to the patch Tom had a year or two back. Adding more
complications for it doesn't seem worth it.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers