WAL documentation changes

Started by Michael Renner about 17 years ago, 10 messages
#1 Michael Renner
michael.renner@amd.co.at

Hi,

the comment WRT WAL recovery and FS journals [1] is a bit misleading in
its current form.

First, none of the general purpose filesystems I've seen so far do data
journalling by default, since it's a huge performance penalty, even for
non-RDBMS workloads. The feature you talk about is ext3 specific (and
should be pointed out as such) and only disables write ordering, meaning
that metadata and file content updates are not synchronized.

best regards,
Michael

[1]: 64b3d98baaf96afea815b0c37ff918f02fda11c9

#2 Bruce Momjian
bruce@momjian.us
In reply to: Michael Renner (#1)
1 attachment(s)
Re: WAL documentation changes

Michael Renner wrote:

> Hi,
>
> the comment WRT WAL recovery and FS journals [1] is a bit misleading in
> its current form.
>
> First, none of the general purpose filesystems I've seen so far do data
> journalling by default, since it's a huge performance penalty, even for
> non-RDBMS workloads. The feature you talk about is ext3 specific (and
> should be pointed out as such) and only disables write ordering, meaning
> that metadata and file content updates are not synchronized.

You are right that my docs were misleading. I have improved them by
mentioning that it is the _data_ flush as part of journalling that can
be a problem, and documented that the mount option listed is
ext3-specific, not Linux-specific.
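
For anyone who wants to see which mode an ext3 mount is actually using, here is a minimal sketch, assuming mount table text in the /proc/mounts format. The function name, the sample devices, and the mount points are made up for illustration and are not part of any PostgreSQL or Linux tool. ext3 accepts data=journal, data=ordered and data=writeback; when the option is not listed, the filesystem default applies.

```python
# Sketch: report the ext3 data journaling mode for each mount listed in
# /proc/mounts-style text. ext3_data_modes() and SAMPLE_MOUNTS are
# illustrative names; the devices and mount points are invented.

SAMPLE_MOUNTS = """\
/dev/sda1 / ext3 rw,relatime,data=ordered 0 0
/dev/sdb1 /var/lib/pgsql ext3 rw,noatime,data=writeback 0 0
/dev/sdc1 /backup xfs rw,noatime 0 0
"""


def ext3_data_modes(mounts_text):
    """Return {mount_point: data_mode} for the ext3 entries only.

    data_mode is the value of the data= mount option, or None when the
    option is not listed (the filesystem default applies).
    """
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "ext3":
            continue
        mount_point = fields[1]
        mode = None
        for option in fields[3].split(","):
            if option.startswith("data="):
                mode = option[len("data="):]
        modes[mount_point] = mode
    return modes


if __name__ == "__main__":
    for mount_point, mode in ext3_data_modes(SAMPLE_MOUNTS).items():
        print(mount_point, mode or "(default)")
```

On a real system the text would come from reading /proc/mounts instead of the sample string.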

Updated docs attached. Please let me know if I can improve it some
more.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Attachments:

/rtmp/diff (text/x-diff)
Index: doc/src/sgml/wal.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/wal.sgml,v
retrieving revision 1.54
diff -c -c -r1.54 wal.sgml
*** doc/src/sgml/wal.sgml	6 Dec 2008 21:34:27 -0000	1.54
--- doc/src/sgml/wal.sgml	10 Dec 2008 11:04:08 -0000
***************
*** 139,151 ****
      <para>
       Because <acronym>WAL</acronym> restores database file
       contents after a crash, it is not necessary to use a
!      journaled filesystem;  in fact, journaling overhead can
!      reduce performance.  For best performance, turn off
!      <emphasis>data</emphasis> journaling as a filesystem mount
!      option, e.g. use <literal>data=writeback</> on Linux.
!      Meta-data journaling (e.g.  file creation and directory
!      modification) is still desirable for faster rebooting after
!      a crash.
      </para>
     </tip>
  
--- 139,151 ----
      <para>
       Because <acronym>WAL</acronym> restores database file
       contents after a crash, it is not necessary to use a
!      journaled filesystem for reliability.  In fact, journaling
!      overhead can reduce performance, especially if journaling
!      causes file system <emphasis>data</emphasis> to be flushed
!      to disk.  Fortunately, data flushing during journaling can
!      often be disabled with a filesystem mount option, e.g.
!      <literal>data=writeback</> on a Linux ext3 file system.
!      Journaled file systems do improve boot speed after a crash.
      </para>
     </tip>
  
#3 Josh Berkus
josh@agliodbs.com
In reply to: Bruce Momjian (#2)
Re: WAL documentation changes

> > First, none of the general purpose filesystems I've seen so far do data
> > journalling by default, since it's a huge performance penalty, even for
> > non-RDBMS workloads. The feature you talk about is ext3 specific (and
> > should be pointed out as such) and only disables write ordering, meaning
> > that metadata and file content updates are not synchronized.
>
> You are right that my docs were misleading. I have improved them by
> mentioning that it is the _data_ flush as part of journalling that can
> be a problem, and documented that the mount option listed is
> ext3-specific, not Linux-specific.

Actually, I think that some of the other journalling filesystems allow
data journalling (I know ReiserFS does), they just don't default to it.
For that matter, a few (ZFS in particular) have data journalling which
can't be turned off. While it's not a tuning parameter, users should be
warned that they'll take a performance hit from it.

--Josh

#4 Bruce Momjian
bruce@momjian.us
In reply to: Josh Berkus (#3)
Re: WAL documentation changes

Josh Berkus wrote:

> > > First, none of the general purpose filesystems I've seen so far do data
> > > journalling by default, since it's a huge performance penalty, even for
> > > non-RDBMS workloads. The feature you talk about is ext3 specific (and
> > > should be pointed out as such) and only disables write ordering, meaning
> > > that metadata and file content updates are not synchronized.
> >
> > You are right that my docs were misleading. I have improved them by
> > mentioning that it is the _data_ flush as part of journalling that can
> > be a problem, and documented that the mount option listed is
> > ext3-specific, not Linux-specific.
>
> Actually, I think that some of the other journalling filesystems allow
> data journalling (I know ReiserFS does), they just don't default to it.
> For that matter, a few (ZFS in particular) have data journalling which
> can't be turned off. While it's not a tuning parameter, users should be
> warned that they'll take a performance hit from it.

So I assume you are saying the docs are fine now.

#5 Tatsuo Ishii
ishii@postgresql.org
In reply to: Bruce Momjian (#4)
Re: WAL documentation changes

Bruce,

In your document change, which one can be placed on a non-journalling
file system? Data? WAL? Or both?

To me it is not clear.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

#6 Bruce Momjian
bruce@momjian.us
In reply to: Tatsuo Ishii (#5)
1 attachment(s)
Re: WAL documentation changes

Tatsuo Ishii wrote:

> Bruce,
>
> In your document change, which one can be placed on a non-journalling
> file system? Data? WAL? Or both?

Both. I have updated the docs to mention this, patch attached.

Attachments:

/rtmp/diff (text/x-diff)
Index: doc/src/sgml/wal.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/wal.sgml,v
retrieving revision 1.55
diff -c -c -r1.55 wal.sgml
*** doc/src/sgml/wal.sgml	10 Dec 2008 11:05:49 -0000	1.55
--- doc/src/sgml/wal.sgml	18 Dec 2008 22:15:53 -0000
***************
*** 138,145 ****
     <tip>
      <para>
       Because <acronym>WAL</acronym> restores database file
!      contents after a crash, it is not necessary to use a
!      journaled filesystem for reliability.  In fact, journaling
       overhead can reduce performance, especially if journaling
       causes file system <emphasis>data</emphasis> to be flushed
       to disk.  Fortunately, data flushing during journaling can
--- 138,145 ----
     <tip>
      <para>
       Because <acronym>WAL</acronym> restores database file
!      contents after a crash, journaled filesystems are necessary for
!      reliable storage of the data files or WAL files.  In fact, journaling
       overhead can reduce performance, especially if journaling
       causes file system <emphasis>data</emphasis> to be flushed
       to disk.  Fortunately, data flushing during journaling can
#7 Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Bruce Momjian (#6)
Re: WAL documentation changes

Bruce Momjian <bruce@momjian.us> wrote:

> Tatsuo Ishii wrote:
>
> > In your document change, which one can be placed on a non-journalling
> > file system? Data? WAL? Or both?
>
> Both. I have updated the docs to mention this, patch attached.

Did you mean to say that journaled file systems are *not* necessary?

-Kevin

#8 Bruce Momjian
bruce@momjian.us
In reply to: Kevin Grittner (#7)
Re: WAL documentation changes

Kevin Grittner wrote:

> Bruce Momjian <bruce@momjian.us> wrote:
>
> > Tatsuo Ishii wrote:
> >
> > > In your document change, which one can be placed on a non-journalling
> > > file system? Data? WAL? Or both?
> >
> > Both. I have updated the docs to mention this, patch attached.
>
> Did you mean to say that journaled file systems are *not* necessary?

Yes, not needed for database reliability. The patch text was attached;
was it unclear?

#9 Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Bruce Momjian (#8)
Re: WAL documentation changes

Bruce Momjian <bruce@momjian.us> wrote:

> Kevin Grittner wrote:
>
> > Did you mean to say that journaled file systems are *not*
> > necessary?
>
> Yes, not needed for database reliability. The patch text was
> attached; was it unclear?

I think you accidentally left out the word "not".

-Kevin

#10 Bruce Momjian
bruce@momjian.us
In reply to: Kevin Grittner (#9)
Re: WAL documentation changes

Kevin Grittner wrote:

> Bruce Momjian <bruce@momjian.us> wrote:
>
> > Kevin Grittner wrote:
> >
> > > Did you mean to say that journaled file systems are *not*
> > > necessary?
> >
> > Yes, not needed for database reliability. The patch text was
> > attached; was it unclear?
>
> I think you accidentally left out the word "not".

Oops, right, added. Good catch. Warping that sentence into something
that allowed the mention of WAL and data files was obviously too much
for me. ;-)
