no universally correct setting for fsync
Someone just posted to the -admin list with a database corrupted
while running with fsync=off. I was all set to refer him to the
documentation explaining why he should stop doing this, but to my
surprise the documentation waffles on the issue way past what I
think is reasonable.
http://www.postgresql.org/docs/8.4/interactive/runtime-config-wal.html#GUC-FSYNC
There are dire-sounding statements interspersed with:
| using fsync results in a performance penalty
| Due to the risks involved, there is no universally correct setting
| for fsync.
| If you trust your operating system, your hardware, and your
| utility company (or your battery backup), you can consider
| disabling fsync.
Isn't this a little too rosy a picture to paint?
-Kevin
On Fri, May 7, 2010 at 9:47 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
Someone just posted to the -admin list with a database corrupted
while running with fsync=off. I was all set to refer him to the
documentation explaining why he should stop doing this, but to my
surprise the documentation waffles on the issue way past what I
think is reasonable.
http://www.postgresql.org/docs/8.4/interactive/runtime-config-wal.html#GUC-FSYNC
There are dire-sounding statements interspersed with:
| using fsync results in a performance penalty
| Due to the risks involved, there is no universally correct setting
| for fsync.
| If you trust your operating system, your hardware, and your
| utility company (or your battery backup), you can consider
| disabling fsync.
Isn't this a little too rosy a picture to paint?
I agree. I've always thought this part of the documentation made
setting fsync=off much more reasonable than I feel it to be.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On Fri, May 7, 2010 at 16:00, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, May 7, 2010 at 9:47 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
Someone just posted to the -admin list with a database corrupted
while running with fsync=off. I was all set to refer him to the
documentation explaining why he should stop doing this, but to my
surprise the documentation waffles on the issue way past what I
think is reasonable.
http://www.postgresql.org/docs/8.4/interactive/runtime-config-wal.html#GUC-FSYNC
There are dire-sounding statements interspersed with:
| using fsync results in a performance penalty
| Due to the risks involved, there is no universally correct setting
| for fsync.
| If you trust your operating system, your hardware, and your
| utility company (or your battery backup), you can consider
| disabling fsync.
Isn't this a little too rosy a picture to paint?
I agree. I've always thought this part of the documentation made
setting fsync=off much more reasonable than I feel it to be.
+1, definitely. fsync=off should only be done if you *really*
understand what it means, and that requires a lot more explanation
than that...
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
| If you trust your operating system, your hardware, and your
| utility company (or your battery backup), you can consider
| disabling fsync.
Isn't this a little too rosy a picture to paint?
I think that statement is true as far as it goes, but I agree with
rejiggering the surrounding text. The whole thing was written back
when Postgres was by far the least reliable component of the stack.
It isn't anymore. We should make it clear that fsync=off is not ever
recommended for production.
regards, tom lane
Kevin Grittner wrote:
There are dire-sounding statements interspersed with:
| using fsync results in a performance penalty
| Due to the risks involved, there is no universally correct setting
| for fsync.
| If you trust your operating system, your hardware, and your
| utility company (or your battery backup), you can consider
| disabling fsync.
Isn't this a little too rosy a picture to paint?
I think the critical question is really whether you are prepared to lose
your database.
I have a customer who rotates databases in and out of line, and
processes major updates on the out of line database. If they lose the
database occasionally they are prepared to wear that risk for the
performance gain they get from running with fsync off. It just means
that they have to recover and so the inline database will get a bit
staler than usual while they do.
So I think it's true that there is no universally right answer. Maybe the
criteria mentioned in the last para need tweaking some, though. It's not
just a matter of trusting hardware etc. I have seen mishaps when idiots
knock out power cords and the like. The unexpected does sometimes happen,
despite the best planning.
cheers
andrew
Andrew Dunstan <andrew@dunslane.net> wrote:
I think the critical question is really whether you are prepared
to lose your database.
Precisely; and the docs don't make that at all clear. They mention
the possibility of database corruption, but downplay it:
| When fsync is disabled, the operating system is allowed to do its
| best in buffering, ordering, and delaying writes. This can result
| in significantly improved performance. However, if the system
| crashes, the results of the last few committed transactions might
| be lost in part or whole. In the worst case, unrecoverable data
| corruption might occur.
[valid use case for fsync=off]
So I think it's true that there is no universally right answer.
Maybe the criteria mentioned in the last para need tweaking some,
though.
I think it goes beyond "tweaking" -- I think we should have a bald
statement like "don't turn this off unless you're OK with losing the
entire contents of the database cluster." A brief listing of some
cases where that is OK might be illustrative.
I never meant to suggest any statement in that section is factually
wrong; it's just all too rosy, leading people to believe it's no big
deal to turn it off.
-Kevin
I never meant to suggest any statement in that section is factually
wrong; it's just all too rosy, leading people to believe it's no big
deal to turn it off.
Yeah, that section is overdue for an update. I'll write some new text
and post it to pgsql-docs.
--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
On 7 May 2010 09:48:53 -0500 Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
I think it goes beyond "tweaking" -- I think we should have a bald
statement like "don't turn this off unless you're OK with losing the
entire contents of the database cluster." A brief listing of some
cases where that is OK might be illustrative.
+1
I never meant to suggest any statement in that section is factually
wrong; it's just all too rosy, leading people to believe it's no big
deal to turn it off.
I think one mistake in this paragraph is the passing mention of
"performance". I've seen installations in the past with fsync=off only
because the admin was pressured to get instantly "more speed" out of the
database (think of "fast_mode=on"). In my opinion, phrases like
"performance penalty" are misleading if you need that setting for
reliable operation in 99% of all use cases.
I've recently even started to wonder if the performance gain with fsync=off
is still that large on modern hardware. While testing large migration
procedures to a new version some time ago (on admittedly fast storage), I
forgot now and then to turn it off, without a significant degradation in
performance.
--
Thanks
Bernd
Bernd Helmle <mailings@oopsware.de> writes:
I've recently even started to wonder if the performance gain with fsync=off
is still that large on modern hardware. While testing large migration
procedures to a new version some time ago (on admittedly fast storage), I
forgot now and then to turn it off, without a significant degradation in
performance.
That says to me either that you're using a battery-backed write cache,
or your fsyncs don't really work (no write barriers or something like
that).
regards, tom lane
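One rough way to check the point above is to time how many write-plus-fsync cycles a volume can complete per second. The following is a minimal sketch, not a definitive benchmark: the function name `fsync_rate` and its parameters are hypothetical, and the temporary file should be placed on the same filesystem as the data directory for the number to mean anything. As a rule of thumb, a single rotating disk without a write cache cannot complete many more synchronous rewrites per second than it makes rotations (about 120 for a 7200 rpm drive), so a much higher figure suggests either a battery-backed cache or fsyncs that are not reaching stable storage.

```python
import os
import tempfile
import time


def fsync_rate(directory=None, writes=50, size=8192):
    """Return the observed number of write+fsync cycles per second.

    `directory` is where the scratch file is created; the default is the
    system temp directory, which may be a RAM-backed filesystem and is
    then not representative of the database volume.
    """
    block = b"\0" * size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            # Rewrite the same block each time so every cycle must be
            # flushed before the next one can start.
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, block)
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return writes / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    print(f"{fsync_rate():.0f} fsyncs/second")
```

The result only hints at what the storage stack is doing; it cannot prove that data survives a power loss.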
Folks,
This is what I have to replace the current fsync entry in config.sgml.
I believe that the note about needing fsync for Warm Standby to work
correctly is true, but could someone verify it?
=========================
<varlistentry id="guc-fsync" xreflabel="fsync">
<indexterm>
<primary><varname>fsync</> configuration parameter</primary>
</indexterm>
<term><varname>fsync</varname> (<type>boolean</type>)</term>
<listitem>
<para>
If this parameter is on, the <productname>PostgreSQL</> server
will try to make sure that updates are physically written to
disk, by issuing <function>fsync()</> system calls or various
equivalent methods (see <xref linkend="guc-wal-sync-method">).
This ensures that the database cluster can recover to a
consistent state after an operating system or hardware crash.
</para>
<para>
While turning off <varname>fsync</varname> is often a performance
benefit, this can result in unrecoverable data corruption in the
event of an unexpected shutdown. Thus it is only advisable to turn off
<varname>fsync</varname> if you can easily recreate
your entire database from external data. <varname>fsync</varname>
must be on for WAL archiving to work correctly
(see <xref linkend="continuous-archiving">).
</para>
<para>
In many situations, turning off
<xref linkend="guc-synchronous-commit">
for noncritical transactions can provide much of the potential
performance benefit of turning off <varname>fsync</varname>, without
the attendant risks of data corruption.
</para>
<para>
<varname>fsync</varname> can only be set in the
<filename>postgresql.conf</> file or on the server command line.
If you turn this parameter off, also consider turning off
<xref linkend="guc-full-page-writes">.
</para>
</listitem>
</varlistentry>
--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Josh Berkus <josh@agliodbs.com> writes:
This is what I have to replace the current fsync entry in config.sgml.
s/unexpected shutdown/system crash/, perhaps. The wording you have
suggests that a forced Postgres stoppage produces a problem, which it
doesn't. It takes a failure at the OS level or below to cause a
problem.
I believe that the note about needing fsync for Warm Standby to work
correctly is true, but could someone verify it?
AFAIK that's nonsense. The filesystem state that pg_standby could see
will be updated in any case; pg_standby has no direct access to the bits
on the platters, any more than Postgres does.
regards, tom lane
On 7 May 2010 19:49:15 -0400 Tom Lane <tgl@sss.pgh.pa.us> wrote:
Bernd Helmle <mailings@oopsware.de> writes:
I've recently even started to wonder if the performance gain with
fsync=off is still that large on modern hardware. While testing large
migration procedures to a new version some time ago (on admittedly
fast storage), I forgot now and then to turn it off, without a
significant degradation in performance.
That says to me either that you're using a battery-backed write cache,
or your fsyncs don't really work (no write barriers or something like
that).
Well, yes, BBU present and proven storage. Maybe I'm wrong, but it seems
battery-backed write caches aren't that rare even in low-end systems
nowadays.
--
Thanks
Bernd
On 5/7/10 5:13 PM, Tom Lane wrote:
Josh Berkus <josh@agliodbs.com> writes:
This is what I have to replace the current fsync entry in config.sgml.
s/unexpected shutdown/system crash/, perhaps. The wording you have
suggests that a forced Postgres stoppage produces a problem, which it
doesn't. It takes a failure at the OS level or below to cause a
problem.
I actually meant "unexpected *system* shutdown", i.e. power-out. A lot
of people think "crash" just means kernel dump, whereas a UPS failure or
tripped power cord is a lot more likely (except maybe on Windows).
Revised:
==================
<varlistentry id="guc-fsync" xreflabel="fsync">
<indexterm>
<primary><varname>fsync</> configuration parameter</primary>
</indexterm>
<term><varname>fsync</varname> (<type>boolean</type>)</term>
<listitem>
<para>
If this parameter is on, the <productname>PostgreSQL</> server
will try to make sure that updates are physically written to
disk, by issuing <function>fsync()</> system calls or various
equivalent methods (see <xref linkend="guc-wal-sync-method">).
This ensures that the database cluster can recover to a
consistent state after an operating system or hardware crash.
</para>
<para>
While turning off <varname>fsync</varname> is often a performance
benefit, this can result in unrecoverable data corruption in the
event of an unexpected system shutdown or crash. Thus it is only
advisable to turn off <varname>fsync</varname> if you can easily
recreate your entire database from external data.
</para>
<para>
In many situations, turning off
<xref linkend="guc-synchronous-commit">
for noncritical transactions can provide much of the potential
performance benefit of turning off <varname>fsync</varname>, without
the attendant risks of data corruption.
</para>
<para>
<varname>fsync</varname> can only be set in the
<filename>postgresql.conf</> file or on the server command line.
If you turn this parameter off, also consider turning off
<xref linkend="guc-full-page-writes">.
</para>
</listitem>
</varlistentry>
--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
On 8/05/2010 1:56 AM, Josh Berkus wrote:
I never meant to suggest any statement in that section is factually
wrong; it's just all too rosy, leading people to believe it's no big
deal to turn it off.
Yeah, that section is overdue for an update. I'll write some new text
and post it to pgsql-docs.
It's probably worth mentioning that people who want to turn off fsync to
gain a performance boost should instead look at a RAID controller with a
BBU so they can safely enable write-back caching, getting most of the
benefits of fsync=off safely.
--
Craig Ringer
Josh Berkus wrote:
I believe that the note about needing fsync for Warm Standby to
work correctly is true, but could someone verify it?
It couldn't really affect the archiving of the WAL files, but if your
warm standby is there for recovery purposes, it might not make a lot
of sense to turn off fsync on the standby -- if that is something
which has an effect during the recovery phase. Does it?
Also, perhaps the issue deserves some mention in the PITR recovery
section:
http://www.postgresql.org/docs/9.0/static/continuous-archiving.html#BACKUP-PITR-RECOVERY
Step 6 says:
| If you have unarchived WAL segment files that you saved in step 2,
| copy them into pg_xlog/. (It is best to copy them, not move them,
| so you still have the unmodified files if a problem occurs and you
| have to start over.)
If the recovery is happening because of OS or hardware failure on the
source, and it was running with fsync off, this might be unwise.
-Kevin
On 05/08/2010 04:07 AM, Craig Ringer wrote:
It's probably worth mentioning that people who want to turn off fsync to
gain a performance boost should instead look at a RAID controller with a
BBU so they can safely enable write-back caching, getting most of the
benefits of fsync=off safely.
Which options specifically should be set if a BBU is in use? Obviously
fsync should be on always, but can full_page_writes be disabled? Are
there other tweaks that can be done?
It would be great to see some practical hints in the documentation while
the fsync part is getting changed.
-- m. tharp
Michael Tharp wrote:
On 05/08/2010 04:07 AM, Craig Ringer wrote:
It's probably worth mentioning that people who want to turn off fsync to
gain a performance boost should instead look at a RAID controller with a
BBU so they can safely enable write-back caching, getting most of the
benefits of fsync=off safely.
Which options specifically should be set if a BBU is in use? Obviously
fsync should be on always, but can full_page_writes be disabled? Are
there other tweaks that can be done?
It would be great to see some practical hints in the documentation while
the fsync part is getting changed.
Uh, our docs have:
Turning this parameter off speeds normal operation, but might
lead to a corrupt database after an operating system crash or
power failure. The risks are similar to turning off
<varname>fsync</>, though smaller. It might be safe to turn
off this parameter if you have hardware (such as a battery-backed
disk controller) or file-system software that reduces the risk
of partial page writes to an acceptably low level (e.g., ZFS).
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
On Mon, May 10, 2010 at 11:12 AM, Bruce Momjian <bruce@momjian.us> wrote:
Michael Tharp wrote:
On 05/08/2010 04:07 AM, Craig Ringer wrote:
It's probably worth mentioning that people who want to turn off fsync to
gain a performance boost should instead look at a RAID controller with a
BBU so they can safely enable write-back caching, getting most of the
benefits of fsync=off safely.
Which options specifically should be set if a BBU is in use? Obviously
fsync should be on always, but can full_page_writes be disabled? Are
there other tweaks that can be done?
It would be great to see some practical hints in the documentation while
the fsync part is getting changed.
Uh, our docs have:
Turning this parameter off speeds normal operation, but might
lead to a corrupt database after an operating system crash or
power failure. The risks are similar to turning off
<varname>fsync</>, though smaller. It might be safe to turn
off this parameter if you have hardware (such as a battery-backed
disk controller) or file-system software that reduces the risk
of partial page writes to an acceptably low level (e.g., ZFS).
"It might be safe" is a bit of a waffle. It would be nice if we could
provide some more clear guidance as to whether it is or is not, or how
someone could go about testing their hardware to find out.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Robert Haas wrote:
On Mon, May 10, 2010 at 11:12 AM, Bruce Momjian <bruce@momjian.us> wrote:
Michael Tharp wrote:
On 05/08/2010 04:07 AM, Craig Ringer wrote:
It's probably worth mentioning that people who want to turn off fsync to
gain a performance boost should instead look at a RAID controller with a
BBU so they can safely enable write-back caching, getting most of the
benefits of fsync=off safely.
Which options specifically should be set if a BBU is in use? Obviously
fsync should be on always, but can full_page_writes be disabled? Are
there other tweaks that can be done?
It would be great to see some practical hints in the documentation while
the fsync part is getting changed.
Uh, our docs have:
Turning this parameter off speeds normal operation, but might
lead to a corrupt database after an operating system crash or
power failure. The risks are similar to turning off
<varname>fsync</>, though smaller. It might be safe to turn
off this parameter if you have hardware (such as a battery-backed
disk controller) or file-system software that reduces the risk
of partial page writes to an acceptably low level (e.g., ZFS).
"It might be safe" is a bit of a waffle. It would be nice if we could
provide some more clear guidance as to whether it is or is not, or how
someone could go about testing their hardware to find out.
Agreed. Is it "safe" for us to be definitive here?
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
Robert Haas <robertmhaas@gmail.com> wrote:
"It might be safe" is a bit of a waffle. It would be nice if we
could provide some more clear guidance as to whether it is or is
not, or how someone could go about testing their hardware to find
out.
I think that the issue is that you could have corruption if some,
but not all, disk sectors from a page were written from OS cache to
controller cache when a failure occurred. The window would be small
for a RAM-to-RAM write, but it wouldn't be entirely *safe* unless
there's some OS/driver environment where you could count on all the
sectors making it or none of them making it for every single page.
Does such an environment exist?
-Kevin
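The partial-write scenario described above can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a model of any real storage stack: the 8192-byte page matches PostgreSQL's default block size, while the 512-byte sector size, the in-order sector writes, and the helper name `torn_page` are assumptions made for illustration only.

```python
PAGE_SIZE = 8192    # PostgreSQL's default block size
SECTOR_SIZE = 512   # assumed sector size; real devices may differ


def torn_page(old, new, sectors_written):
    """Return the on-disk image if only the first `sectors_written`
    sectors of `new` replaced `old` before a failure.

    Assumes sectors are written in order, which real hardware does not
    guarantee.
    """
    cut = sectors_written * SECTOR_SIZE
    return new[:cut] + old[cut:]


old = b"A" * PAGE_SIZE   # page contents before the update
new = b"B" * PAGE_SIZE   # page contents after the update
sectors_per_page = PAGE_SIZE // SECTOR_SIZE

# A failure after 5 of the 16 sectors leaves a page that matches
# neither the old version nor the new one.
torn = torn_page(old, new, 5)
print(torn != old and torn != new)
```

Only when none or all of the sectors land does the page match a valid version, which is the all-or-nothing guarantee the question above asks about.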