Why O_SYNC is faster than fsync on ext3

Started by Yusuf Goolamabbas, almost 22 years ago, 3 messages
#1Yusuf Goolamabbas
yusufg@outblaze.com

I sent this to Bruce but forgot to cc pgsql-hackers. The patches are
likely to go into 2.6.6. People interested in extremely safe fsync
writes should also follow the "IDE barrier" thread and the "true fsync()
in Linux on IDE" thread.

----- Forwarded message from Yusuf Goolamabbas <yusufg@outblaze.com> -----

Date: Sat, 20 Mar 2004 20:52:34 +0800
From: Yusuf Goolamabbas <yusufg@outblaze.com>
To: Bruce Momjian <pgman@candle.pha.pa.us>
Subject: Your fsync thread on hackers
Message-ID: <20040320125234.GA11221@outblaze.com>

Bruce, haven't followed the thread completely. Accessing the web archive
is slow from Hong Kong but I just wanted to point you to this lkml post
which shows why O_SYNC is much faster than fsync (at least on ext3)

http://marc.theaimsgroup.com/?l=linux-kernel&m=107959907410443&w=2

There are some pending fsync speedups on XFS also. You might want to
consider pointing Tom to do this so he can get the Redhat/Fedora guys to
look at the patches

Hope this helps, Regards, Yusuf

----- End forwarded message -----

--
If you're not using Firefox, you're not surfing the web
you're suffering it
http://www.mozilla.org/products/firefox/why/

#2Manfred Spraul
manfred@colorfullife.com
In reply to: Yusuf Goolamabbas (#1)
Re: Why O_SYNC is faster than fsync on ext3

Yusuf Goolamabbas wrote:

> I sent this to Bruce but forgot to cc pgsql-hackers, The patches are
> likely to go into 2.6.6. People interested in extremely safe fsync
> writes should also follow the IDE barrier thread and the true fsync() in
> Linux on IDE thread

Actually the most interesting part of the thread was the initial post
from Peter Zaitsev about fcntl(fd, F_FULLFSYNC, NULL). He wrote that this
is necessary on Mac OS X to force a flush of the write caches in the
disks. Unfortunately I can't find anything about this flag with Google.
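
For illustration, a minimal sketch of how a program might request that kind of flush, assuming the flag is the one macOS defines as F_FULLFSYNC. The helper name full_sync is hypothetical, and the fallback to plain fsync on other platforms (or when the filesystem rejects the request) is an assumption of this sketch, not something stated in the thread.

```python
import fcntl
import os


def full_sync(fd):
    """Flush fd as far down as the platform allows; return the method used."""
    # F_FULLFSYNC is only defined on macOS, so look it up defensively
    # to keep this importable on other Unix systems.
    full_fsync = getattr(fcntl, "F_FULLFSYNC", None)
    if full_fsync is not None:
        try:
            fcntl.fcntl(fd, full_fsync)
            return "F_FULLFSYNC"
        except OSError:
            # Assumption: some filesystems reject the request; fall back.
            pass
    os.fsync(fd)
    return "fsync"
```

The return value only reports which call was made; it says nothing about whether the drive actually honoured the flush.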

Another interesting point is that right now, IDE write caches must be
disabled for reliable fsync operations with Linux. Recent SUSE kernels
contain partial support. If the existing patches are completed and
merged, it will be safe to enable write caching.

Perhaps Bruce's cache flush test could be modified slightly to check
that the OS isn't lying about fsync: if fsync completes faster than the
rotational delay of the disks, then the setup is not suitable for
PostgreSQL. This could be recommended as a setup test in the
installation documentation.
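
A rough sketch of such a check, under stated assumptions: it rewrites the same 8 kB block repeatedly and treats one full platter revolution as the lower bound for an honest fsync on a rotating disk. The function names, the block size, and the choice of a full revolution as the threshold are all hypothetical. The comparison is meaningless on SSDs or behind a battery-backed controller cache.

```python
import os
import time


def rotation_ms(rpm):
    """Time for one full platter revolution, in milliseconds."""
    return 60000.0 / rpm


def fsync_latencies_ms(path, rounds=50):
    """Rewrite the same block `rounds` times, timing write+fsync each time."""
    block = b"\0" * 8192  # assumed block size for this sketch
    samples = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        for _ in range(rounds):
            os.lseek(fd, 0, os.SEEK_SET)
            start = time.perf_counter()
            os.write(fd, block)
            os.fsync(fd)
            samples.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
    return samples


def fsync_looks_cached(median_ms, rpm):
    """True if fsync returns faster than the platter can come around again."""
    return median_ms < rotation_ms(rpm)
```

At 7200 rpm one revolution takes about 8.3 ms, so a median write+fsync time well below that on a plain IDE disk would suggest the write cache is absorbing the flush.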

--
Manfred

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Yusuf Goolamabbas (#1)
Re: Why O_SYNC is faster than fsync on ext3

Yusuf Goolamabbas <yusufg@outblaze.com> writes:

> Bruce, haven't followed the thread completely. Accessing the web archive
> is slow from Hong Kong but I just wanted to point you to this lkml post
> which shows why O_SYNC is much faster than fsync (at least on ext3)
> http://marc.theaimsgroup.com/?l=linux-kernel&m=107959907410443&w=2

That patch is broken on its face. If O_SYNC doesn't take longer than
O_DSYNC, and likewise fsync longer than fdatasync, then the Unix
filesystem semantics are not being honored because the file mod time
isn't being updated.
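
The expected ordering can be probed empirically. The following is only a sketch under assumptions: the helper names are hypothetical, each method overwrites the same 8 kB block in place, and methods the platform does not offer are skipped. Timings vary widely by filesystem and hardware, so the numbers are a hint about behaviour, not proof that the semantics are honoured.

```python
import os
import time


def time_writes(path, open_flags=0, sync=None, rounds=20):
    """Average milliseconds per 8 kB in-place write for one sync method."""
    block = b"\0" * 8192  # assumed block size for this sketch
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | open_flags, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(rounds):
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, block)
            if sync is not None:
                sync(fd)  # explicit flush; the O_* variants flush in write()
        return (time.perf_counter() - start) * 1000.0 / rounds
    finally:
        os.close(fd)


def compare_sync_methods(path, rounds=20):
    """Return {method name: average ms per write} for the available methods."""
    methods = {"fsync": (0, os.fsync)}
    if hasattr(os, "fdatasync"):
        methods["fdatasync"] = (0, os.fdatasync)
    if hasattr(os, "O_SYNC"):
        methods["O_SYNC"] = (os.O_SYNC, None)
    if hasattr(os, "O_DSYNC"):
        methods["O_DSYNC"] = (os.O_DSYNC, None)
    return {
        name: time_writes(path, flags, sync, rounds)
        for name, (flags, sync) in methods.items()
    }
```

If the data-only variants (fdatasync, O_DSYNC) come out no cheaper than their full counterparts, either the filesystem treats them identically or the metadata update is not costing anything, which is the case the argument above is suspicious of.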

regards, tom lane