Disable page writes when fsync off, add GUC

Started by Bruce Momjian over 20 years ago, 10 messages
#1 Bruce Momjian
pgman@candle.pha.pa.us
1 attachment(s)

This patch disables page writes to WAL when fsync is off, because with
no fsync guarantee, the page write recovery isn't useful.

This also adds a full_page_writes GUC to turn off page writes to WAL.
Some people might not want full_page_writes, but still might want fsync.
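
To make the failure mode concrete, the following is a minimal, self-contained
sketch of a torn page and of why a full-page image repairs it. It is an
illustration only, not code from the patch: the 8192-byte page size and
512-byte sector size are assumptions, and torn_write, is_torn, and
restore_full_page are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES   8192   /* assumed database page size */
#define SECTOR_BYTES 512    /* assumed unit the disk writes atomically */

/*
 * Simulate a crash in the middle of a page write: only the first
 * sectors_written sectors of new_page reach the disk; the rest of
 * disk_page keeps its old contents.
 */
void torn_write(unsigned char *disk_page, const unsigned char *new_page,
                size_t sectors_written)
{
    assert(sectors_written <= PAGE_BYTES / SECTOR_BYTES);
    memcpy(disk_page, new_page, sectors_written * SECTOR_BYTES);
}

/* A page is torn when it matches neither the old nor the new version. */
bool is_torn(const unsigned char *disk_page, const unsigned char *old_page,
             const unsigned char *new_page)
{
    return memcmp(disk_page, old_page, PAGE_BYTES) != 0 &&
           memcmp(disk_page, new_page, PAGE_BYTES) != 0;
}

/*
 * Recovery with a full-page image: the whole page is overwritten, so it
 * does not matter which sectors made it to disk before the crash.
 */
void restore_full_page(unsigned char *disk_page, const unsigned char *image)
{
    memcpy(disk_page, image, PAGE_BYTES);
}
```

For example, interrupting a write of an all-'B' page over an all-'A' page
after three sectors leaves a page that is_torn reports as mixed; copying a
saved image back with restore_full_page makes it consistent again. A row-level
WAL record alone cannot do this, because it assumes the rest of the page is
already intact.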

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Attachments:

/pgpatches/fsync (text/plain)
Index: doc/src/sgml/runtime.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v
retrieving revision 1.335
diff -c -c -r1.335 runtime.sgml
*** doc/src/sgml/runtime.sgml	2 Jul 2005 19:16:36 -0000	1.335
--- doc/src/sgml/runtime.sgml	4 Jul 2005 03:58:34 -0000
***************
*** 1687,1692 ****
--- 1687,1723 ----
        </listitem>
       </varlistentry>
       
+      <varlistentry id="guc-full-page-writes" xreflabel="full_page_writes">
+       <indexterm>
+        <primary><varname>full_page_writes</> configuration parameter</primary>
+       </indexterm>
+       <term><varname>full_page_writes</varname> (<type>boolean</type>)</term>
+       <listitem>
+        <para>
+         A page write in process during an operating system crash might
+         be only partially written to disk, leading to an on-disk page
+         that contains a mix of old and new data. During recovery, the
+         row changes stored in WAL are not enough to recover from this
+         situation.
+        </para>
+ 
+        <para>
+         When this option is on, the <productname>PostgreSQL</> server
+         writes full pages when first modified after a checkpoint to WAL
+         so full recovery is possible. Turning this option off might lead
+         to a corrupt system after an operating system crash because
+         uncorrected partial pages might contain inconsistent or corrupt
+         data. The risks are less but similar to <varname>fsync</>.
+        </para>
+ 
+        <para>
+         This option can only be set at server start or in the
+         <filename>postgresql.conf</filename> file.  The default is
+         <literal>on</>.
+        </para>
+       </listitem>
+      </varlistentry>
+      
       <varlistentry id="guc-wal-buffers" xreflabel="wal_buffers">
        <term><varname>wal_buffers</varname> (<type>integer</type>)</term>
        <indexterm>
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.205
diff -c -c -r1.205 xlog.c
*** src/backend/access/transam/xlog.c	30 Jun 2005 00:00:50 -0000	1.205
--- src/backend/access/transam/xlog.c	4 Jul 2005 03:58:38 -0000
***************
*** 97,102 ****
--- 97,103 ----
  char	   *XLogArchiveCommand = NULL;
  char	   *XLOG_sync_method = NULL;
  const char	XLOG_sync_method_default[] = DEFAULT_SYNC_METHOD_STR;
+ bool		fullPageWrites = true;
  
  #ifdef WAL_DEBUG
  bool		XLOG_DEBUG = false;
***************
*** 593,599 ****
  				{
  					/* OK, put it in this slot */
  					dtbuf[i] = rdt->buffer;
! 					if (XLogCheckBuffer(rdt, &(dtbuf_lsn[i]), &(dtbuf_xlg[i])))
  					{
  						dtbuf_bkp[i] = true;
  						rdt->data = NULL;
--- 594,602 ----
  				{
  					/* OK, put it in this slot */
  					dtbuf[i] = rdt->buffer;
! 					/* If fsync is off, no need to backup pages. */
! 					if (enableFsync && fullPageWrites &&
! 						XLogCheckBuffer(rdt, &(dtbuf_lsn[i]), &(dtbuf_xlg[i])))
  					{
  						dtbuf_bkp[i] = true;
  						rdt->data = NULL;
Index: src/backend/utils/misc/guc.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/misc/guc.c,v
retrieving revision 1.271
diff -c -c -r1.271 guc.c
*** src/backend/utils/misc/guc.c	28 Jun 2005 05:09:02 -0000	1.271
--- src/backend/utils/misc/guc.c	4 Jul 2005 03:58:46 -0000
***************
*** 82,87 ****
--- 82,88 ----
  extern int	CommitDelay;
  extern int	CommitSiblings;
  extern char *default_tablespace;
+ extern bool	fullPageWrites;
  
  static const char *assign_log_destination(const char *value,
  					   bool doit, GucSource source);
***************
*** 482,487 ****
--- 483,500 ----
  		false, NULL, NULL
  	},
  	{
+ 		{"full_page_writes", PGC_SIGHUP, WAL_SETTINGS,
+ 			gettext_noop("Fully writes pages when first modified after a checkpoint."),
+ 			gettext_noop("A page write in process during an operating system crash might be "
+ 						 "only partially written to disk.  During recovery, the row changes "
+ 						 "stored in WAL are not enough to recover.  This option writes "
+ 						 "pages when first modified after a checkpoint to WAL so full recovery "
+ 						 "is possible.")
+ 		},
+ 		&fullPageWrites,
+ 		true, NULL, NULL
+ 	},
+ 	{
  		{"silent_mode", PGC_POSTMASTER, LOGGING_WHEN,
  			gettext_noop("Runs the server silently."),
  			gettext_noop("If this parameter is set, the server will automatically run in the "
Index: src/backend/utils/misc/postgresql.conf.sample
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/misc/postgresql.conf.sample,v
retrieving revision 1.151
diff -c -c -r1.151 postgresql.conf.sample
*** src/backend/utils/misc/postgresql.conf.sample	2 Jul 2005 18:46:45 -0000	1.151
--- src/backend/utils/misc/postgresql.conf.sample	4 Jul 2005 03:58:46 -0000
***************
*** 121,126 ****
--- 121,127 ----
  #wal_sync_method = fsync	# the default varies across platforms:
  				# fsync, fdatasync, fsync_writethrough,
  				# open_sync, open_datasync
+ #full_page_writes = on		# recover from partial page writes
  #wal_buffers = 8		# min 4, 8KB each
  #commit_delay = 0		# range 0-100000, in microseconds
  #commit_siblings = 5		# range 1-1000
#2 Stephen Frost
sfrost@snowman.net
In reply to: Bruce Momjian (#1)
Re: Disable page writes when fsync off, add GUC

* Bruce Momjian (pgman@candle.pha.pa.us) wrote:

This patch disables page writes to WAL when fsync is off, because with
no fsync guarantee, the page write recovery isn't useful.

This doesn't seem quite right to me. What happens with PITR? And
Postgres crashes? While many people seriously distrust running w/ fsync
off, I'm sure there are quite a few folks who do.

This also adds a full_page_writes GUC to turn off page writes to WAL.
Some people might not want full_page_writes, but still might want fsync.

Adding an option to not do page writes to WAL seems fine to me, but I
think WAL writes should be on by default, even in the fsync=off case.
If people want to turn it off, fine, for either case since we expect
they understand what it means to have it turned off, but I don't think
the two options should be coupled as is being proposed.

Thanks,

Stephen

#3 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Stephen Frost (#2)
Re: [PATCHES] Disable page writes when fsync off, add GUC

Stephen Frost wrote:

* Bruce Momjian (pgman@candle.pha.pa.us) wrote:

This patch disables page writes to WAL when fsync is off, because with
no fsync guarantee, the page write recovery isn't useful.

This doesn't seem quite right to me. What happens with PITR? And

PITR doesn't need page writes at all because it starts from a full backup
of the file system. With PITR the crashed file system isn't used at all
(it is restored from backup). In fact, there is a TODO to exclude full
page writes from the PITR backup of WAL.

Postgres crashes? While many people seriously distrust running w/ fsync
off, I'm sure there are quite a few folks who do.

This also adds a full_page_writes GUC to turn off page writes to WAL.
Some people might not want full_page_writes, but still might want fsync.

Adding an option to not do page writes to WAL seems fine to me, but I
think WAL writes should be on by default, even in the fsync=off case.
If people want to turn it off, fine, for either case since we expect
they understand what it means to have it turned off, but I don't think
the two options should be coupled as is being proposed.

That is a question I had in my mind. I added documentation that turning
off fsync also disables full_page_writes, but we could decouple them and
tell people to consider disabling full_page_writes if they turn off
fsync, basically suggesting they make the second change.

Other opinions?

#4 Peter Eisentraut
peter_e@gmx.net
In reply to: Bruce Momjian (#1)
Re: Disable page writes when fsync off, add GUC

Bruce Momjian wrote:

This patch disables page writes to WAL when fsync is off, because
with no fsync guarantee, the page write recovery isn't useful.

This also adds a full_page_writes GUC to turn off page writes to WAL.
Some people might not want full_page_writes, but still might want
fsync.

Do you have some numbers to suggest that there is a performance benefit
to be had?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#5 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Peter Eisentraut (#4)
Re: Disable page writes when fsync off, add GUC

Peter Eisentraut wrote:

Bruce Momjian wrote:

This patch disables page writes to WAL when fsync is off, because
with no fsync guarantee, the page write recovery isn't useful.

This also adds a full_page_writes GUC to turn off page writes to WAL.
Some people might not want full_page_writes, but still might want
fsync.

Do you have some numbers to suggest that there is a performance benefit
to be had?

Josh reported page writes to be a big hit (which we already knew). I
don't have any numbers with fsync off, though it seems like a no-brainer.
However, I am now thinking that decoupling the two options is best.
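
As a rough way to see why page writes can dominate WAL volume, here is a
back-of-the-envelope sketch. It is a crude model, not a measurement and not
code from the patch: it assumes an 8192-byte page, assumes the first change to
each page after a checkpoint logs a whole page instead of a row-sized record,
and ignores record headers. The function name and its parameters are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_BYTES 8192ULL   /* assumed database page size */

/*
 * Estimate WAL bytes generated between two checkpoints.
 *
 * changes          - number of row changes logged
 * distinct_pages   - number of different pages those changes touch
 * row_record_bytes - assumed size of one row-level WAL record
 * full_page_writes - whether the first touch of a page logs the full page
 */
unsigned long long estimate_wal_bytes(unsigned long long changes,
                                      unsigned long long distinct_pages,
                                      unsigned long long row_record_bytes,
                                      bool full_page_writes)
{
    /* Every touched page needs at least one change. */
    assert(distinct_pages <= changes);

    if (!full_page_writes)
        return changes * row_record_bytes;

    /* First touch of each page costs a page; later touches cost a record. */
    return distinct_pages * PAGE_BYTES +
           (changes - distinct_pages) * row_record_bytes;
}
```

With purely illustrative inputs of 1000 changes spread over 100 pages and
100-byte row records, the model gives 100,000 bytes with page writes off and
909,200 bytes with them on. Real ratios depend on the workload and the
checkpoint interval.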

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#3)
Re: [PATCHES] Disable page writes when fsync off, add GUC

Bruce Momjian <pgman@candle.pha.pa.us> writes:

That is a question I had in my mind. I added documentation that turning
off fsync also disables full_page_writes, but we could decouple them and
tell people to consider disabling full_page_writes if they turn off
fsync, basically suggesting they make the second change.

Other opinions?

I'm for treating them as independent options.

regards, tom lane

#7 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#6)
Re: [PATCHES] Disable page writes when fsync off, add GUC

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

That is a question I had in my mind. I added documentation that turning
off fsync also disables full_page_writes, but we could decouple them and
tell people to consider disabling full_page_writes if they turn off
fsync, basically suggesting they make the second change.

Other opinions?

I'm for treating them as independent options.

Agreed. I will modify and apply.

#8 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Bruce Momjian (#1)
1 attachment(s)
Re: Disable page writes when fsync off, add GUC

Bruce Momjian wrote:

This also adds a full_page_writes GUC to turn off page writes to WAL.
Some people might not want full_page_writes.

Fsync linkage removed, patch attached and applied.
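
For readers comparing the two versions of the xlog.c hunk, the difference in
the page-backup condition can be sketched as below. This is a simplified
model, not the actual source: first_touch stands in for the result of
XLogCheckBuffer(), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition in the first posted patch: fsync off also disables backups. */
bool backup_page_coupled(bool fsync_on, bool full_page_writes, bool first_touch)
{
    return fsync_on && full_page_writes && first_touch;
}

/* Condition after the fsync linkage was removed: fsync plays no part. */
bool backup_page_decoupled(bool full_page_writes, bool first_touch)
{
    return full_page_writes && first_touch;
}

/* Count the input combinations on which the two conditions disagree. */
int count_disagreements(void)
{
    int n = 0;

    for (int f = 0; f <= 1; f++)
        for (int w = 0; w <= 1; w++)
            for (int t = 0; t <= 1; t++)
                if (backup_page_coupled(f, w, t) != backup_page_decoupled(w, t))
                    n++;
    return n;
}

/* Sanity check: the only disagreement is fsync off, full_page_writes on. */
void self_check(void)
{
    assert(count_disagreements() == 1);
    assert(!backup_page_coupled(false, true, true));
    assert(backup_page_decoupled(true, true));
}
```

The two conditions differ in exactly one case: fsync off, full_page_writes on,
and a page being modified for the first time since the checkpoint. With the
linkage removed, that case still backs up the page unless full_page_writes is
turned off explicitly.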

Attachments:

/bjm/diff (text/plain)
Index: doc/src/sgml/runtime.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v
retrieving revision 1.335
diff -c -c -r1.335 runtime.sgml
*** doc/src/sgml/runtime.sgml	2 Jul 2005 19:16:36 -0000	1.335
--- doc/src/sgml/runtime.sgml	5 Jul 2005 23:15:33 -0000
***************
*** 1660,1666 ****
  
         <para>
          This option can only be set at server start or in the
!         <filename>postgresql.conf</filename> file.
         </para>
        </listitem>
       </varlistentry>
--- 1660,1668 ----
  
         <para>
          This option can only be set at server start or in the
!         <filename>postgresql.conf</filename> file.  If this option
!         is <literal>off</>, consider also turning off
!         <xref linkend="guc-full-page-writes">.
         </para>
        </listitem>
       </varlistentry>
***************
*** 1687,1692 ****
--- 1689,1725 ----
        </listitem>
       </varlistentry>
       
+      <varlistentry id="guc-full-page-writes" xreflabel="full_page_writes">
+       <indexterm>
+        <primary><varname>full_page_writes</> configuration parameter</primary>
+       </indexterm>
+       <term><varname>full_page_writes</varname> (<type>boolean</type>)</term>
+       <listitem>
+        <para>
+         A page write in process during an operating system crash might
+         be only partially written to disk, leading to an on-disk page
+         that contains a mix of old and new data. During recovery, the
+         row changes stored in WAL are not enough to completely restore
+         the page.
+        </para>
+ 
+        <para>
+         When this option is on, the <productname>PostgreSQL</> server
+         writes full pages to WAL when they first modified after a checkpoint
+         so full recovery is possible. Turning this option off might lead
+         to a corrupt system after an operating system crash because
+         uncorrected partial pages might contain inconsistent or corrupt
+         data. The risks are less but similar to <varname>fsync</>.
+        </para>
+ 
+        <para>
+         This option can only be set at server start or in the
+         <filename>postgresql.conf</filename> file.  The default is
+         <literal>on</>.
+        </para>
+       </listitem>
+      </varlistentry>
+      
       <varlistentry id="guc-wal-buffers" xreflabel="wal_buffers">
        <term><varname>wal_buffers</varname> (<type>integer</type>)</term>
        <indexterm>
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.206
diff -c -c -r1.206 xlog.c
*** src/backend/access/transam/xlog.c	4 Jul 2005 04:51:44 -0000	1.206
--- src/backend/access/transam/xlog.c	5 Jul 2005 23:15:36 -0000
***************
*** 103,108 ****
--- 103,109 ----
  char	   *XLogArchiveCommand = NULL;
  char	   *XLOG_sync_method = NULL;
  const char	XLOG_sync_method_default[] = DEFAULT_SYNC_METHOD_STR;
+ bool		fullPageWrites = true;
  
  #ifdef WAL_DEBUG
  bool		XLOG_DEBUG = false;
***************
*** 594,600 ****
  				{
  					/* OK, put it in this slot */
  					dtbuf[i] = rdt->buffer;
! 					if (XLogCheckBuffer(rdt, &(dtbuf_lsn[i]), &(dtbuf_xlg[i])))
  					{
  						dtbuf_bkp[i] = true;
  						rdt->data = NULL;
--- 595,603 ----
  				{
  					/* OK, put it in this slot */
  					dtbuf[i] = rdt->buffer;
! 					/* Back up the page only if full_page_writes is on. */
! 					if (fullPageWrites &&
! 						XLogCheckBuffer(rdt, &(dtbuf_lsn[i]), &(dtbuf_xlg[i])))
  					{
  						dtbuf_bkp[i] = true;
  						rdt->data = NULL;
Index: src/backend/utils/misc/guc.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/misc/guc.c,v
retrieving revision 1.272
diff -c -c -r1.272 guc.c
*** src/backend/utils/misc/guc.c	4 Jul 2005 04:51:51 -0000	1.272
--- src/backend/utils/misc/guc.c	5 Jul 2005 23:15:39 -0000
***************
*** 83,88 ****
--- 83,89 ----
  extern int	CommitDelay;
  extern int	CommitSiblings;
  extern char *default_tablespace;
+ extern bool	fullPageWrites;
  
  static const char *assign_log_destination(const char *value,
  					   bool doit, GucSource source);
***************
*** 483,488 ****
--- 484,501 ----
  		false, NULL, NULL
  	},
  	{
+ 		{"full_page_writes", PGC_SIGHUP, WAL_SETTINGS,
+ 			gettext_noop("Writes full pages to WAL when first modified after a checkpoint."),
+ 			gettext_noop("A page write in process during an operating system crash might be "
+ 						 "only partially written to disk.  During recovery, the row changes "
+ 						 "stored in WAL are not enough to recover.  This option writes "
+ 						 "pages when first modified after a checkpoint to WAL so full recovery "
+ 						 "is possible.")
+ 		},
+ 		&fullPageWrites,
+ 		true, NULL, NULL
+ 	},
+ 	{
  		{"silent_mode", PGC_POSTMASTER, LOGGING_WHEN,
  			gettext_noop("Runs the server silently."),
  			gettext_noop("If this parameter is set, the server will automatically run in the "
Index: src/backend/utils/misc/postgresql.conf.sample
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/misc/postgresql.conf.sample,v
retrieving revision 1.151
diff -c -c -r1.151 postgresql.conf.sample
*** src/backend/utils/misc/postgresql.conf.sample	2 Jul 2005 18:46:45 -0000	1.151
--- src/backend/utils/misc/postgresql.conf.sample	5 Jul 2005 23:15:39 -0000
***************
*** 121,126 ****
--- 121,127 ----
  #wal_sync_method = fsync	# the default varies across platforms:
  				# fsync, fdatasync, fsync_writethrough,
  				# open_sync, open_datasync
+ #full_page_writes = on		# recover from partial page writes
  #wal_buffers = 8		# min 4, 8KB each
  #commit_delay = 0		# range 0-100000, in microseconds
  #commit_siblings = 5		# range 1-1000
#9 Michael Paesold
mpaesold@gmx.at
In reply to: Bruce Momjian (#8)
Re: Disable page writes when fsync off, add GUC

Bruce Momjian wrote:

Bruce Momjian wrote:

This also adds a full_page_writes GUC to turn off page writes to WAL.
Some people might not want full_page_writes.

Fsync linkage removed, patch attached and applied.

...
+     When this option is on, the <productname>PostgreSQL</> server
+     writes full pages to WAL when they first modified after a checkpoint
+     so full recovery is possible.

I believe this should be "when they _are_ first modified after".

Perhaps you should also mention power failure, not only an operating system
crash as disaster scenario, even if the latter includes the former.

Best Regards,
Michael Paesold

#10 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Michael Paesold (#9)
1 attachment(s)
Re: Disable page writes when fsync off, add GUC

Michael Paesold wrote:

Bruce Momjian wrote:

Bruce Momjian wrote:

This also adds a full_page_writes GUC to turn off page writes to WAL.
Some people might not want full_page_writes.

Fsync linkage removed, patch attached and applied.

...
+     When this option is on, the <productname>PostgreSQL</> server
+     writes full pages to WAL when they first modified after a checkpoint
+     so full recovery is possible.

I believe this should be "when they _are_ first modified after".

Perhaps you should also mention power failure, not only an operating system
crash as disaster scenario, even if the latter includes the former.

Thanks. Done.

Attachments:

/bjm/diff (text/plain)
Index: doc/src/sgml/runtime.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v
retrieving revision 1.336
diff -c -c -r1.336 runtime.sgml
*** doc/src/sgml/runtime.sgml	5 Jul 2005 23:18:09 -0000	1.336
--- doc/src/sgml/runtime.sgml	6 Jul 2005 14:40:15 -0000
***************
*** 1705,1715 ****
  
         <para>
          When this option is on, the <productname>PostgreSQL</> server
!         writes full pages to WAL when they first modified after a checkpoint
!         so full recovery is possible. Turning this option off might lead
!         to a corrupt system after an operating system crash because
!         uncorrected partial pages might contain inconsistent or corrupt
!         data. The risks are less but similar to <varname>fsync</>.
         </para>
  
         <para>
--- 1705,1716 ----
  
         <para>
          When this option is on, the <productname>PostgreSQL</> server
!         writes full pages to WAL when they are first modified after a
!         checkpoint so full recovery is possible. Turning this option off
!         might lead to a corrupt system after an operating system crash
!         or power failure because uncorrected partial pages might contain
!         inconsistent or corrupt data. The risks are less but similar to
!         <varname>fsync</>.
         </para>
  
         <para>