fsync with sync, and Win32 unlink
I have talked to Tom today and he is willing to implement the discussed
method of doing fsync on every file modified between checkpoints, and
add unlink handling for open files for Win32.
Here are the implementation details:
1) Create a list in shared memory that holds a fixed number of dirty
files and files that win32 can't delete because they are open.
The list will need to be locked for each insertion. Periodically, the
background writer will lock and empty the list and store it in its own
local memory. Each entry will contain dbid, relfilenode, and the offset
or extent number of the file. As an optimization, inserts will check to
see if the previous entry already matches.
2) Checkpoint behavior will be moved into the background writer.
3) On checkpoint request, either by the user or postmaster, the
background writer will create a subprocess, do a sync(), wait, then do
fsync of all files that were marked as dirty. The sync() should flush
out most of the dirty files in an optimal manner.
4) Losing the shared memory list of delete files during a crash will
not be a problem. In case of a crash, the WAL logs contain information
about deleted files (new in 7.5), and those files will be deleted on
recovery.
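
A minimal sketch of the list described in item 1, for illustration only. The
type and function names, the capacity, and the pending_unlink flag are all
hypothetical; real shared-memory allocation and locking are left out, and the
caller is assumed to hold the list lock around each call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIRTY_LIST_CAPACITY 1024        /* hypothetical fixed size */

/* One entry: a file segment that must be fsync'd (or unlinked) later. */
typedef struct DirtyFileEntry
{
    uint32_t dbid;
    uint32_t relfilenode;
    uint32_t segno;             /* extent number of the file */
    bool     pending_unlink;    /* Win32: delete once nobody has it open */
} DirtyFileEntry;

typedef struct DirtyFileList
{
    size_t         count;
    DirtyFileEntry entries[DIRTY_LIST_CAPACITY];
} DirtyFileList;

static bool
entry_equal(const DirtyFileEntry *a, const DirtyFileEntry *b)
{
    return a->dbid == b->dbid &&
           a->relfilenode == b->relfilenode &&
           a->segno == b->segno &&
           a->pending_unlink == b->pending_unlink;
}

/* Insert an entry; returns false if the list is full. Caller holds the lock. */
bool
dirty_list_insert(DirtyFileList *list, DirtyFileEntry e)
{
    assert(list != NULL);

    /* Optimization from item 1: skip if the previous entry already matches. */
    if (list->count > 0 && entry_equal(&list->entries[list->count - 1], &e))
        return true;

    if (list->count >= DIRTY_LIST_CAPACITY)
        return false;

    list->entries[list->count++] = e;
    return true;
}

/*
 * Background writer side: copy up to local_cap entries into local memory
 * and remove them from the shared list. Returns the number copied.
 */
size_t
dirty_list_drain(DirtyFileList *shared, DirtyFileEntry *local, size_t local_cap)
{
    assert(shared != NULL && local != NULL);

    size_t n = shared->count < local_cap ? shared->count : local_cap;
    size_t remaining = shared->count - n;

    memcpy(local, shared->entries, n * sizeof(DirtyFileEntry));
    memmove(shared->entries, shared->entries + n,
            remaining * sizeof(DirtyFileEntry));
    shared->count = remaining;
    return n;
}
```

Because the list is fixed-size, a real implementation would also have to
decide what a backend does when dirty_list_insert() reports that the list is
full; this sketch only reports the condition.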
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
I have talked to Tom today and he is willing to implement the
discussed method of doing fsync on every file modified
between checkpoints, and add unlink handling for open files for Win32.
Great news. I'm sure this will benefit Unix platforms as well, when
taking into account the discussions previously.
<snip>
3) On checkpoint request, either by the user or postmaster,
the background writer will create a subprocess, do a sync(),
wait, then do fsync of all files that were marked as dirty.
The sync() should flush out most of the dirty files in an
optimal manner.
Please make a config variable to make this sync() call optional. This
goes for Unix as well, and not just win32 (which doesn't support it).
Consider either a box with many different postgresql instances, or one
that runs both postgresql and other software. Issuing sync() in that
situation will cause sync of a lot of data that probably doesn't need
syncing.
But it'd probably be a very good thing on a dedicated server, giving the
kernel the chance to optimise.
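
Roughly what such a switch could look like, as a sketch only. The function
name and the use_sync parameter are hypothetical (no such setting exists in
the proposal yet), the dirty files are passed as plain paths for simplicity,
and the subprocess-and-wait step from item 3 is reduced to a direct sync()
call.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700       /* for sync() and fsync() declarations */
#endif

#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Flush the files recorded as dirty since the last checkpoint.
 * If use_sync is true, first ask the kernel to schedule a system-wide
 * flush, then fsync each file individually. Returns the number of files
 * that could not be opened or fsync'd.
 */
int
checkpoint_flush(const char *const *paths, size_t npaths, bool use_sync)
{
    int failures = 0;

    assert(paths != NULL || npaths == 0);

    if (use_sync)
        sync();                 /* system-wide; affects every filesystem */

    for (size_t i = 0; i < npaths; i++)
    {
        int fd = open(paths[i], O_RDWR);

        if (fd < 0)
        {
            failures++;
            continue;
        }
        if (fsync(fd) != 0)     /* per-file guarantee, needed either way */
            failures++;
        close(fd);
    }
    return failures;
}
```

With use_sync set to false, only the listed files are forced to disk, which
is the behaviour wanted on a shared box.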
//Magnus
Consider either a box with many different postgresql instances, or one
that runs both postgresql and other software. Issuing sync() in that
situation will cause sync of a lot of data that probably doesn't need
syncing.
But it'd probably be a very good thing on a dedicated server, giving the
kernel the chance to optimise.
It is not like the sync is done every few seconds! It is currently done
every 5 minutes (I actually think this is too frequent now that we have
bgwriter, 10 - 20 min would be sufficient). So imho even on a heavily
otherwise used system the sync will be better.
Andreas
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
Consider either a box with many different postgresql instances, or one
that runs both postgresql and other software. Issuing sync() in that
situation will cause sync of a lot of data that probably doesn't need
syncing.
But it'd probably be a very good thing on a dedicated server, giving the
kernel the chance to optimise.
It is not like the sync is done every few seconds! It is currently done
every 5 minutes (I actually think this is too frequent now that we have
bgwriter, 10 - 20 min would be sufficient). So imho even on a heavily
otherwise used system the sync will be better.
Well, the further apart it is the more dangerous it is to be calling sync...
I've seen some pretty severe damage caused by calling sync(2) on a loaded
system. The system in question was in the process of copying data to an NFS
mounted archival site. When the sync hit basically everything stopped until
the buffered network writes could be synced. The live database was basically
frozen for a few seconds and the web site nearly crashed. The sysadmin had to
send out a notice asking everybody to refrain from using sync until the
archival process had completed.
Now that's not a common situation, but I think it shows how doing things that
cause system-wide effects is unwise.
--
greg
Greg Stark <gsstark@mit.edu> writes:
I've seen some pretty severe damage caused by calling sync(2) on a loaded
system. The system in question was in the process of copying data to an NFS
mounted archival site. When the sync hit basically everything stopped until
the buffered network writes could be synced. The live database was basically
frozen for a few seconds and the web site nearly crashed. The sysadmin had to
send out a notice asking everybody to refrain from using sync until the
archival process had completed.
This seems, um, hard to believe. Did he shut down the standard syncer
daemon? I have never seen a Unix system that would allow more than
thirty seconds' worth of unwritten buffers to accumulate, and would not
care to use one if it existed.
regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:
This seems, um, hard to believe. Did he shut down the standard syncer
daemon? I have never seen a Unix system that would allow more than
thirty seconds' worth of unwritten buffers to accumulate, and would not
care to use one if it existed.
Well it was Solaris so it didn't have the BSD 30s sync style strategy. But
this was a large NFS file transfer to another host on a 100Mb/s network. In
30s there could be a lot of writes buffered up.
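
As a rough back-of-the-envelope check (my own arithmetic, not a measurement
from that system): a 100Mb/s link moves at most 12.5 MB/s, so a backlog
takes at least its size divided by that rate to drain. The helper below is a
hypothetical illustration and ignores protocol overhead.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Lower bound, in seconds, on the time needed to push backlog_bytes of
 * buffered writes over a link of link_bits_per_sec, ignoring overhead.
 */
double
drain_seconds(uint64_t backlog_bytes, uint64_t link_bits_per_sec)
{
    assert(link_bits_per_sec > 0);
    return (double) (backlog_bytes * 8) / (double) link_bits_per_sec;
}
```

For example, drain_seconds(375000000, 100000000) is 30.0: filling the link
for 30 seconds corresponds to about 375 MB of buffered writes.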
I'm not saying the behaviour was ideal, and I don't know exactly why it
interfered with anything else. But I'm not entirely surprised either.
--
greg