wal_sender_delay is still required?

Started by Fujii Masaoabout 15 years ago10 messages
#1Fujii Masao
masao.fujii@gmail.com

Hi,

Walsender doesn't need the periodic wakeups anymore, thanks to
the latch feature. So wal_sender_delay is basically useless now.
How about dropping wal_sender_delay or increasing the default
value?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fujii Masao (#1)
Re: wal_sender_delay is still required?

Fujii Masao <masao.fujii@gmail.com> writes:

Walsender doesn't need the periodic wakeups anymore, thanks to
the latch feature. So wal_sender_delay is basically useless now.
How about dropping wal_sender_delay or increasing the default
value?

If we don't need it, we should remove it.

regards, tom lane

#3Fujii Masao
masao.fujii@gmail.com
In reply to: Tom Lane (#2)
1 attachment(s)
Re: wal_sender_delay is still required?

On Tue, Dec 7, 2010 at 12:08 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Fujii Masao <masao.fujii@gmail.com> writes:

Walsender doesn't need the periodic wakeups anymore, thanks to
the latch feature. So wal_sender_delay is basically useless now.
How about dropping wal_sender_delay or increasing the default
value?

If we don't need it, we should remove it.

The attached patch removes wal_sender_delay and uses hard-coded
10 seconds instead of wal_sender_delay as the delay between activity
rounds for walsender.

One problem with the patch is that it takes longer (at most 10s) to
detect the unexpected death of postmaster (by calling PostmasterIsAlive()).
This is OK for me. But does anyone want to specify the delay to detect
that within a short time?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

drop_wal_sender_delay_v1.patchapplication/octet-stream; name=drop_wal_sender_delay_v1.patchDownload
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 1909,1934 **** SET ENABLE_SEQSCAN TO OFF;
         </para>
         </listitem>
        </varlistentry>
-       <varlistentry id="guc-wal-sender-delay" xreflabel="wal_sender_delay">
-        <term><varname>wal_sender_delay</varname> (<type>integer</type>)</term>
-        <indexterm>
-         <primary><varname>wal_sender_delay</> configuration parameter</primary>
-        </indexterm>
-        <listitem>
-        <para>
-         Specifies the delay between activity rounds for WAL sender processes.
-         In each round the WAL sender sends any WAL accumulated since the last
-         round to the standby server. It then sleeps for
-         <varname>wal_sender_delay</> milliseconds, and repeats. The default
-         value is 200 milliseconds (<literal>200ms</>).
-         Note that on many systems, the effective resolution of sleep delays is
-         10 milliseconds; setting <varname>wal_sender_delay</> to a value that
-         is not a multiple of 10 might have the same results as setting it to
-         the next higher multiple of 10. This parameter can only be set in the
-         <filename>postgresql.conf</> file or on the server command line.
-        </para>
-        </listitem>
-       </varlistentry>
  
        <varlistentry id="guc-wal-keep-segments" xreflabel="wal_keep_segments">
         <term><varname>wal_keep_segments</varname> (<type>integer</type>)</term>
--- 1909,1914 ----
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/walsender.c
***************
*** 65,71 **** bool		am_walsender = false;		/* Am I a walsender process ? */
  
  /* User-settable parameters for walsender */
  int			max_wal_senders = 0;	/* the maximum number of concurrent walsenders */
- int			WalSndDelay = 200;	/* max sleep time between some actions */
  
  /*
   * These variables are used similarly to openLogFile/Id/Seg/Off,
--- 65,70 ----
***************
*** 77,82 **** static uint32 sendSeg = 0;
--- 76,89 ----
  static uint32 sendOff = 0;
  
  /*
+  * The delay between activity rounds for walsender. In each round
+  * walsender sends any WAL accumulated since the last round to the
+  * standby server. It then sleeps for 10000000 microseconds (10s),
+  * and repeats. 
+  */
+ #define WAL_SENDER_DELAY 10000000
+ 
+ /*
   * How far have we sent WAL already? This is also advertised in
   * MyWalSnd->sentPtr.  (Actually, this is the next WAL location to send.)
   */
***************
*** 444,458 **** WalSndLoop(void)
  				break;
  			if (caughtup && !got_SIGHUP && !ready_to_stop && !shutdown_requested)
  			{
- 				/*
- 				 * XXX: We don't really need the periodic wakeups anymore,
- 				 * WaitLatchOrSocket should reliably wake up as soon as
- 				 * something interesting happens.
- 				 */
- 
  				/* Sleep */
  				WaitLatchOrSocket(&MyWalSnd->latch, MyProcPort->sock,
! 								  WalSndDelay * 1000L);
  			}
  
  			/* Check if the connection was closed */
--- 451,459 ----
  				break;
  			if (caughtup && !got_SIGHUP && !ready_to_stop && !shutdown_requested)
  			{
  				/* Sleep */
  				WaitLatchOrSocket(&MyWalSnd->latch, MyProcPort->sock,
! 								  WAL_SENDER_DELAY);
  			}
  
  			/* Check if the connection was closed */
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***************
*** 1790,1805 **** static struct config_int ConfigureNamesInt[] =
  	},
  
  	{
- 		{"wal_sender_delay", PGC_SIGHUP, WAL_REPLICATION,
- 			gettext_noop("WAL sender sleep time between WAL replications."),
- 			NULL,
- 			GUC_UNIT_MS
- 		},
- 		&WalSndDelay,
- 		200, 1, 10000, NULL, NULL
- 	},
- 
- 	{
  		{"commit_delay", PGC_USERSET, WAL_SETTINGS,
  			gettext_noop("Sets the delay in microseconds between transaction commit and "
  						 "flushing WAL to disk."),
--- 1790,1795 ----
*** a/src/backend/utils/misc/postgresql.conf.sample
--- b/src/backend/utils/misc/postgresql.conf.sample
***************
*** 188,194 ****
  
  #max_wal_senders = 0		# max number of walsender processes
  				# (change requires restart)
- #wal_sender_delay = 200ms	# walsender cycle time, 1-10000 milliseconds
  #wal_keep_segments = 0		# in logfile segments, 16MB each; 0 disables
  #vacuum_defer_cleanup_age = 0	# number of xacts by which cleanup is delayed
  
--- 188,193 ----
*** a/src/include/replication/walsender.h
--- b/src/include/replication/walsender.h
***************
*** 45,51 **** extern WalSndCtlData *WalSndCtl;
  extern bool am_walsender;
  
  /* user-settable parameters */
- extern int	WalSndDelay;
  extern int	max_wal_senders;
  
  extern int	WalSenderMain(void);
--- 45,50 ----
#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fujii Masao (#3)
Re: wal_sender_delay is still required?

Fujii Masao <masao.fujii@gmail.com> writes:

One problem with the patch is that it takes longer (at most 10s) to
detect the unexpected death of postmaster (by calling PostmasterIsAlive()).
This is OK for me. But does anyone want to specify the delay to detect
that within a short time?

Oh. Hm. I'm hesitant to remove the setting if there's still some
behavior that it would control. Maybe we should just crank up the
default value instead.

regards, tom lane

#5Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#4)
Re: wal_sender_delay is still required?

Excerpts from Tom Lane's message of lun dic 06 23:49:52 -0300 2010:

Fujii Masao <masao.fujii@gmail.com> writes:

One problem with the patch is that it takes longer (at most 10s) to
detect the unexpected death of postmaster (by calling PostmasterIsAlive()).
This is OK for me. But does anyone want to specify the delay to detect
that within a short time?

Oh. Hm. I'm hesitant to remove the setting if there's still some
behavior that it would control. Maybe we should just crank up the
default value instead.

Maybe we should have a single tunable for processes that just sleep
waiting for events or postmaster death. For example pgstats has a
hardcoded 2 seconds, and the archiver process has a hardcoded value too
AFAICS.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#6Fujii Masao
masao.fujii@gmail.com
In reply to: Tom Lane (#4)
Re: wal_sender_delay is still required?

On Tue, Dec 7, 2010 at 11:49 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Fujii Masao <masao.fujii@gmail.com> writes:

One problem with the patch is that it takes longer (at most 10s) to
detect the unexpected death of postmaster (by calling PostmasterIsAlive()).
This is OK for me. But does anyone want to specify the delay to detect
that within a short time?

Oh.  Hm.  I'm hesitant to remove the setting if there's still some
behavior that it would control.  Maybe we should just crank up the
default value instead.

Fair enough. How about increasing the default to 10 seconds?
Since bgwriter has already using 10s as a nap time if there is no
configured activity, I think that 10s is non-nonsense default value.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#7Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#6)
Re: wal_sender_delay is still required?

On Mon, Dec 6, 2010 at 10:07 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Tue, Dec 7, 2010 at 11:49 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Fujii Masao <masao.fujii@gmail.com> writes:

One problem with the patch is that it takes longer (at most 10s) to
detect the unexpected death of postmaster (by calling PostmasterIsAlive()).
This is OK for me. But does anyone want to specify the delay to detect
that within a short time?

Oh.  Hm.  I'm hesitant to remove the setting if there's still some
behavior that it would control.  Maybe we should just crank up the
default value instead.

Fair enough. How about increasing the default to 10 seconds?
Since bgwriter has already using 10s as a nap time if there is no
configured activity, I think that 10s is non-nonsense default value.

What do we get out of making this non-configurable?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#7)
Re: wal_sender_delay is still required?

On Tue, Dec 7, 2010 at 12:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Fair enough. How about increasing the default to 10 seconds?
Since bgwriter has already using 10s as a nap time if there is no
configured activity, I think that 10s is non-nonsense default value.

What do we get out of making this non-configurable?

Which would make the setting of replication simpler, I think.
But I agree to just increase the default value of wal_sender_delay
rather than dropping it.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#5)
Re: wal_sender_delay is still required?

Alvaro Herrera <alvherre@commandprompt.com> writes:

Maybe we should have a single tunable for processes that just sleep
waiting for events or postmaster death. For example pgstats has a
hardcoded 2 seconds, and the archiver process has a hardcoded value too
AFAICS.

That would make sense once we get to the point where for all of those
processes, the sleep delay *only* affects the time to notice postmaster
death. Right now I think there are still several other behaviors mixed
in with that, and not all of them necessarily want the same response
time.

regards, tom lane

#10Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#8)
Re: wal_sender_delay is still required?

On 07.12.2010 05:51, Fujii Masao wrote:

On Tue, Dec 7, 2010 at 12:22 PM, Robert Haas<robertmhaas@gmail.com> wrote:

Fair enough. How about increasing the default to 10 seconds?
Since bgwriter has already using 10s as a nap time if there is no
configured activity, I think that 10s is non-nonsense default value.

What do we get out of making this non-configurable?

Which would make the setting of replication simpler, I think.
But I agree to just increase the default value of wal_sender_delay
rather than dropping it.

I dropped the ball on this one..

For comparison, the archiver process and autovacuum launcher wake up
once a second to check if postmaster is still alive. bgwriter, when
bgwriter_lru_maxpages and archive_timeout are set to 0 to disable it,
checks for dead postmaster every 10 seconds.

I'll bump the default for wal_sender_delay to 1 second. Maybe an even
higher value would be good, but it also seems good to kill replication
connections in a timely fashion if postmaster dies.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com