sorry, too many standbys already vs. MaxWalSenders vs. max_wal_senders

Started by Robert Haas almost 16 years ago, 12 messages
#1 Robert Haas
robertmhaas@gmail.com

After snapshotting my master using hot backup to create a workable
slave instance, I created recovery.conf on the slave and tried to get
it to connect to the master and stream WAL.

This led to the message "sorry, too many standbys already", which did
not immediately clue me in as to what I needed to do to fix the
problem. Grepping the source code for the error message revealed that
the problem was that MaxWalSenders was zero. A few seconds further
head-scratching revealed that this was the GUC max_wal_senders, which
I duly increased from 0 to 1, after which it worked.

I think perhaps this error message needs some adjustment. It should
be reasonably possible to guess the name of the GUC that needs
increasing based on the error message, and it currently isn't. Also
I'd vote for making the variable name max_wal_senders rather than
MaxWalSenders, but maybe that's being persnickety.

...Robert

#2 Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#1)
1 attachment(s)
Re: sorry, too many standbys already vs. MaxWalSenders vs. max_wal_senders

On Wed, Mar 31, 2010 at 12:06 PM, Robert Haas <robertmhaas@gmail.com> wrote:

After snapshotting my master using hot backup to create a workable
slave instance, I created recovery.conf on the slave and tried to get
it to connect to the master and stream WAL.

This led to the message "sorry, too many standbys already", which did
not immediately clue me in as to what I needed to do to fix the
problem.  Grepping the source code for the error message revealed that
the problem was that MaxWalSenders was zero.  A few seconds further
head-scratching revealed that this was the GUC max_wal_senders, which
I duly increased from 0 to 1, after which it worked.

I think perhaps this error message needs some adjustment.  It should
be reasonably possible to guess the name of the GUC that needs
increasing based on the error message, and it currently isn't.

Agreed. How about the attached patch?
The patch treats the case where max_wal_senders is 0 differently,
and the following error message (better message?) is written only
in this case.

could not accept connection from the standby because max_wal_senders is 0

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

sr_error_message_v1.patch (application/octet-stream)
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/walsender.c
***************
*** 474,482 **** InitWalSnd(void)
  		}
  	}
  	if (MyWalSnd == NULL)
! 		ereport(FATAL,
! 				(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
! 				 errmsg("sorry, too many standbys already")));
  
  	/* Arrange to clean up at walsender exit */
  	on_shmem_exit(WalSndKill, 0);
--- 474,489 ----
  		}
  	}
  	if (MyWalSnd == NULL)
! 	{
! 		if (MaxWalSenders == 0)
! 			ereport(FATAL,
! 					(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
! 					 errmsg("could not accept connection from the standby because max_wal_senders is 0")));
! 		else
! 			ereport(FATAL,
! 					(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
! 					 errmsg("sorry, too many standbys already")));
! 	}
  
  	/* Arrange to clean up at walsender exit */
  	on_shmem_exit(WalSndKill, 0);

#3 Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#2)
Re: sorry, too many standbys already vs. MaxWalSenders vs. max_wal_senders

2010/3/31 Fujii Masao <masao.fujii@gmail.com>:

On Wed, Mar 31, 2010 at 12:06 PM, Robert Haas <robertmhaas@gmail.com> wrote:

After snapshotting my master using hot backup to create a workable
slave instance, I created recovery.conf on the slave and tried to get
it to connect to the master and stream WAL.

This led to the message "sorry, too many standbys already", which did
not immediately clue me in as to what I needed to do to fix the
problem.  Grepping the source code for the error message revealed that
the problem was that MaxWalSenders was zero.  A few seconds further
head-scratching revealed that this was the GUC max_wal_senders, which
I duly increased from 0 to 1, after which it worked.

I think perhaps this error message needs some adjustment.  It should
be reasonably possible to guess the name of the GUC that needs
increasing based on the error message, and it currently isn't.

Agreed. How about the attached patch?
The patch treats the case where max_wal_senders is 0 differently,
and the following error message (better message?) is written only
in this case.

   could not accept connection from the standby because max_wal_senders is 0

How about using errhint to tell the user which parameter to use?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#4 Robert Haas
robertmhaas@gmail.com
In reply to: Magnus Hagander (#3)
Re: sorry, too many standbys already vs. MaxWalSenders vs. max_wal_senders

On Wed, Mar 31, 2010 at 8:19 AM, Magnus Hagander <magnus@hagander.net> wrote:

How about using errhint to tell the user which parameter to use?

I thought about that. I noticed that the error message from the
master gets displayed on the slave. I didn't check if an errhint
would also propagate over.

...Robert

#5 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Robert Haas (#4)
Re: sorry, too many standbys already vs. MaxWalSenders vs. max_wal_senders

Robert Haas wrote:

On Wed, Mar 31, 2010 at 8:19 AM, Magnus Hagander <magnus@hagander.net> wrote:

How about using errhint to tell the user which parameter to use?

I thought about that. I noticed that the error message from the
master gets displayed on the slave. I didn't check if an errhint
would also propagate over.

Hmm, it would be very good if it did. Perhaps that needs fixing, if
it doesn't work already.

(Personally, I consider that this idea that hints, details and other
message fields are second-class citizens in the error report country has
got to stop. It means we can only use hints and details for nearly
useless information.)

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#6 Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#2)
Re: sorry, too many standbys already vs. MaxWalSenders vs. max_wal_senders

On Wed, Mar 31, 2010 at 12:54 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Wed, Mar 31, 2010 at 12:06 PM, Robert Haas <robertmhaas@gmail.com> wrote:

After snapshotting my master using hot backup to create a workable
slave instance, I created recovery.conf on the slave and tried to get
it to connect to the master and stream WAL.

This led to the message "sorry, too many standbys already", which did
not immediately clue me in as to what I needed to do to fix the
problem.  Grepping the source code for the error message revealed that
the problem was that MaxWalSenders was zero.  A few seconds further
head-scratching revealed that this was the GUC max_wal_senders, which
I duly increased from 0 to 1, after which it worked.

I think perhaps this error message needs some adjustment.  It should
be reasonably possible to guess the name of the GUC that needs
increasing based on the error message, and it currently isn't.

Agreed. How about the attached patch?
The patch treats the case where max_wal_senders is 0 differently,
and the following error message (better message?) is written only
in this case.

   could not accept connection from the standby because max_wal_senders is 0

Well, that might still leave someone confused if they had one standby
and were trying to bring up a second one.

...Robert

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#6)
Re: sorry, too many standbys already vs. MaxWalSenders vs. max_wal_senders

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Mar 31, 2010 at 12:54 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

   could not accept connection from the standby because max_wal_senders is 0

Well, that might still leave someone confused if they had one standby
and were trying to bring up a second one.

I'd suggest something like "number of requested standby connections
exceeds max_wal_senders (currently %d)"

regards, tom lane
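
To make the suggested wording concrete, here is a minimal standalone sketch of how that message reads once the current limit is filled in. The helper name format_standby_limit_error is hypothetical and exists only for illustration; the server itself reports the error through ereport(FATAL, ...) in walsender.c, as the attached patches show, and does not fill a buffer like this.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for the server-side report: it only formats the
 * proposed message text so the wording can be inspected in isolation.
 * max_wal_senders is the configured limit, not a loop counter.
 */
static int
format_standby_limit_error(char *buf, size_t buflen, int max_wal_senders)
{
	/* Note the trailing space before the line break in the literal. */
	return snprintf(buf, buflen,
					"number of requested standby connections "
					"exceeds max_wal_senders (currently %d)",
					max_wal_senders);
}
```

With a limit of 0 the text ends in "(currently 0)", which names the setting for the case in the original report; with a limit of 1 it ends in "(currently 1)", which covers a second standby being refused.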

#8 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#7)
Re: sorry, too many standbys already vs. MaxWalSenders vs. max_wal_senders

On Wed, Mar 31, 2010 at 10:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Mar 31, 2010 at 12:54 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

   could not accept connection from the standby because max_wal_senders is 0

Well, that might still leave someone confused if they had one standby
and were trying to bring up a second one.

I'd suggest something like "number of requested standby connections
exceeds max_wal_senders (currently %d)"

Oh, that's much better than anything I thought of. +1.

...Robert

#9 Thom Brown
thombrown@gmail.com
In reply to: Robert Haas (#8)
1 attachment(s)
Re: sorry, too many standbys already vs. MaxWalSenders vs. max_wal_senders

On 31 March 2010 15:45, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Mar 31, 2010 at 10:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Mar 31, 2010 at 12:54 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

could not accept connection from the standby because max_wal_senders is 0

Well, that might still leave someone confused if they had one standby
and were trying to bring up a second one.

I'd suggest something like "number of requested standby connections
exceeds max_wal_senders (currently %d)"

Oh, that's much better than anything I thought of. +1.

...Robert

That provides more explicit information. :)

Attachments:

sr_error_message_v2.patch (application/octet-stream)
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/walsender.c
***************
*** 476,482 ****
  	if (MyWalSnd == NULL)
  		ereport(FATAL,
  				(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
! 				 errmsg("sorry, too many standbys already")));
  
  	/* Arrange to clean up at walsender exit */
  	on_shmem_exit(WalSndKill, 0);
--- 476,483 ----
  	if (MyWalSnd == NULL)
  		ereport(FATAL,
  				(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
! 				 errmsg("number of requested standby connections"
! 					"exceeds max_wal_senders (currently %d)", i + 1)));
  
  	/* Arrange to clean up at walsender exit */
  	on_shmem_exit(WalSndKill, 0);

#10 Robert Haas
robertmhaas@gmail.com
In reply to: Thom Brown (#9)
Re: sorry, too many standbys already vs. MaxWalSenders vs. max_wal_senders

On Wed, Mar 31, 2010 at 11:24 AM, Thom Brown <thombrown@gmail.com> wrote:

[patch]

As a general rule, I really appreciate people being willing to take
the time to put proposed changes into patch form, even if they're
small, but this three-line patch contains two bugs. :-(

Thanks for your many typo corrections, though!

...Robert
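
The message does not say which two bugs are meant. One plausible reading, sketched below as standalone C, is that the two adjacent string literals in sr_error_message_v2.patch are joined without a space, and that the value printed is i + 1 rather than the configured limit. The helper names are hypothetical, and the slot-search loop is an assumption about the surrounding code, which the patch context does not show.

```c
#include <stdio.h>

/* Message split exactly as in sr_error_message_v2.patch: no space at the join. */
#define V2_MESSAGE "number of requested standby connections" \
	"exceeds max_wal_senders (currently %d)"

/*
 * Hypothetical model of the slot search. Returns the loop index at exit;
 * when every slot is busy, that index equals max_wal_senders.
 */
static int
slot_search_exit_index(const int *slot_in_use, int max_wal_senders)
{
	int			i;

	for (i = 0; i < max_wal_senders; i++)
	{
		if (!slot_in_use[i])
			break;
	}
	return i;
}

/* Formats the message the way the v2 patch would, passing i + 1. */
static int
format_v2_message(char *buf, size_t buflen,
				  const int *slot_in_use, int max_wal_senders)
{
	int			i = slot_search_exit_index(slot_in_use, max_wal_senders);

	return snprintf(buf, buflen, V2_MESSAGE, i + 1);
}
```

Under these assumptions, a server with max_wal_senders = 1 and its only slot in use would report "connectionsexceeds" as one word and "(currently 2)" rather than 1.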

#11 Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#1)
Re: sorry, too many standbys already vs. MaxWalSenders vs. max_wal_senders

Robert Haas wrote:

After snapshotting my master using hot backup to create a workable
slave instance, I created recovery.conf on the slave and tried to get
it to connect to the master and stream WAL.

This led to the message "sorry, too many standbys already", which did
not immediately clue me in as to what I needed to do to fix the
problem. Grepping the source code for the error message revealed that
the problem was that MaxWalSenders was zero. A few seconds further
head-scratching revealed that this was the GUC max_wal_senders, which
I duly increased from 0 to 1, after which it worked.

I think perhaps this error message needs some adjustment. It should
be reasonably possible to guess the name of the GUC that needs
increasing based on the error message, and it currently isn't. Also
I'd vote for making the variable name max_wal_senders rather than
MaxWalSenders, but maybe that's being persnickety.

Glad the error message has been improved. I was bitten by this exact
error message and didn't know the cause for a while, and was going to
suggest such a fix.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

#12 Thom Brown
thombrown@gmail.com
In reply to: Robert Haas (#10)
Re: sorry, too many standbys already vs. MaxWalSenders vs. max_wal_senders

On 1 April 2010 01:51, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Mar 31, 2010 at 11:24 AM, Thom Brown <thombrown@gmail.com> wrote:

[patch]

As a general rule, I really appreciate people being willing to take
the time to put proposed changes into patch form, even if they're
small, but this three-line patch contains two bugs. :-(

Thanks for your many typo corrections, though!

...Robert

Or my changes were so good, the world wasn't ready for them? Yes, I think
that was it.