Better error messages when lacking connection slots for autovacuum workers and bgworkers

Started by Michael Paquieralmost 7 years ago3 messages

michael@paquier.xyz

almost 7 years ago

1 attachment(s)

Hi all,

When lacking connection slots, the backend returns a simple "sorry,
too many clients", which is something that has been tweaked by recent
commit ea92368 for WAL senders. However, the same message would show
up for autovacuum workers and bgworkers. Based on the way autovacuum
workers are spawned by the launcher, and the way bgworkers are added,
it seems that this cannot actually happen. Still, in my opinion,
providing more context can be helpful for people trying to work on
features related to such code so as they can get more information than
what would normally happen for normal backends.

I am wondering as well if this could not help for issues like this
one, which has popped out today and makes me wonder if we don't have
race conditions with the way dynamic bgworkers are spawned:
/messages/by-id/CAONUJSM5X259vAnnwSpqu=VnRECfGSJ-CgRHyS4P5YyRVwkXsQ@mail.gmail.com

There are four code paths which report the original "sorry, too many
clients", and the one of the patch attached can easily track the
type of connection slot used which helps for better context messages
when a process is initialized.

Would that be helpful for at least debugging purposes?

Thanks,
--
Michael

Attachments:

connslot-error-context-v1.patchtext/x-diff; charset=us-asciiDownload

diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 0da5b19719..418363000b 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -349,15 +349,25 @@ InitProcess(void)
 		/*
 		 * If we reach here, all the PGPROCs are in use.  This is one of the
 		 * possible places to detect "too many backends", so give the standard
-		 * error message.  XXX do we need to give a different failure message
-		 * in the autovacuum case?
+		 * error message.
 		 */
 		SpinLockRelease(ProcStructLock);
-		if (am_walsender)
+		if (IsAnyAutoVacuumProcess())
+			ereport(FATAL,
+					(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
+					 errmsg("number of requested autovacuum workers exceeds autovacuum_max_workers (currently %d)",
+							autovacuum_max_workers)));
+		else if (IsBackgroundWorker)
+			ereport(FATAL,
+					(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
+					 errmsg("number of requested worker processes exceeds max_worker_processes (currently %d)",
+							max_worker_processes)));
+		else if (am_walsender)
 			ereport(FATAL,
 					(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
 					 errmsg("number of requested standby connections exceeds max_wal_senders (currently %d)",
 							max_wal_senders)));
+
 		ereport(FATAL,
 				(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
 				 errmsg("sorry, too many clients already")));

Kyotaro HORIGUCHI

horiguchi.kyotaro@lab.ntt.co.jp

almost 7 years ago

In reply to: Michael Paquier (#1)

Re: [Suspect SPAM] Better error messages when lacking connection slots for autovacuum workers and bgworkers

Hello.

At Wed, 13 Feb 2019 14:13:09 +0900, Michael Paquier <michael@paquier.xyz> wrote in <20190213051309.GF5746@paquier.xyz>

Hi all,

When lacking connection slots, the backend returns a simple "sorry,
too many clients", which is something that has been tweaked by recent
commit ea92368 for WAL senders. However, the same message would show
up for autovacuum workers and bgworkers. Based on the way autovacuum
workers are spawned by the launcher, and the way bgworkers are added,
it seems that this cannot actually happen. Still, in my opinion,
providing more context can be helpful for people trying to work on
features related to such code so as they can get more information than
what would normally happen for normal backends.

I agree to the distinctive messages, but the autovaccum and
bgworker cases are in a kind of internal error, and they are not
"connection"s. I feel that elog is more suitable for them.

I am wondering as well if this could not help for issues like this
one, which has popped out today and makes me wonder if we don't have
race conditions with the way dynamic bgworkers are spawned:
/messages/by-id/CAONUJSM5X259vAnnwSpqu=VnRECfGSJ-CgRHyS4P5YyRVwkXsQ@mail.gmail.com

There are four code paths which report the original "sorry, too many
clients", and the one of the patch attached can easily track the
type of connection slot used which helps for better context messages
when a process is initialized.
Would that be helpful for at least debugging purposes?

I agree that it is helpful. Errors with different causes ought to
show distinctive error messages.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Michael Paquier

michael@paquier.xyz

almost 7 years ago

In reply to: Kyotaro HORIGUCHI (#2)

Re: [Suspect SPAM] Better error messages when lacking connection slots for autovacuum workers and bgworkers

On Thu, Feb 14, 2019 at 09:04:37PM +0900, Kyotaro HORIGUCHI wrote:

I agree to the distinctive messages, but the autovaccum and
bgworker cases are in a kind of internal error, and they are not
"connection"s. I feel that elog is more suitable for them.

I used ereport() for consistency with the existing code, still you are
right that we ought to use elog() as this is the case of a problem
which should not happen.
--
Michael