RE: RE: xlog checkpoint depends on sync() ... seems unsafe

Started by Mikheev, Vadim, almost 25 years ago (5 messages)
#1Mikheev, Vadim
vmikheev@SECTORBASE.COM

>> to re-write smgr. I don't know how useful is second sync() call, but
>> on Solaris (and I believe on many other *NIXes) rc0 calls it
>> three times, -:) Why?

> The idea is, that by the time the last sync has run, the
> first sync will be done flushing the buffers to disk. - this is what
> we were told by the IBM engineers when I worked tier-2/3 AIX support
> at IBM.

I was told the same a long time ago about FreeBSD. How much can we count on
this undocumented sync() feature?

Vadim

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mikheev, Vadim (#1)
Re: RE: xlog checkpoint depends on sync() ... seems unsafe

"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:

>> The idea is, that by the time the last sync has run, the
>> first sync will be done flushing the buffers to disk. - this is what
>> we were told by the IBM engineers when I worked tier-2/3 AIX support
>> at IBM.

> I was told the same a long time ago about FreeBSD. How much can we count on
> this undocumented sync() feature?

Sounds quite unreliable to me. Unless there's some interlock ... like,
say, the second sync not being able to advance past a buffer page that's
as yet unwritten by the first sync. But would all Unixen share such a
strange detail of implementation?

regards, tom lane

#3Doug McNaught
doug@wireboard.com
In reply to: Mikheev, Vadim (#1)
Re: RE: xlog checkpoint depends on sync() ... seems unsafe

Tom Lane <tgl@sss.pgh.pa.us> writes:

"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:

>>> The idea is, that by the time the last sync has run, the
>>> first sync will be done flushing the buffers to disk. - this is what
>>> we were told by the IBM engineers when I worked tier-2/3 AIX support
>>> at IBM.

>> I was told the same a long time ago about FreeBSD. How much can we count on
>> this undocumented sync() feature?

> Sounds quite unreliable to me. Unless there's some interlock ... like,
> say, the second sync not being able to advance past a buffer page that's
> as yet unwritten by the first sync. But would all Unixen share such a
> strange detail of implementation?

I'm pretty sure it has no basis in fact, it's just one of these habits
that gives sysadmins a warm fuzzy feeling. ;) It's apparently been
around a long time, though I don't remember where I read about it--it
was quite a few years ago.

-Doug

#4Giles Lean
giles@nemeton.com.au
In reply to: Tom Lane (#2)
Re: RE: xlog checkpoint depends on sync() ... seems unsafe

> Sounds quite unreliable to me. Unless there's some interlock ... like,
> say, the second sync not being able to advance past a buffer page that's
> as yet unwritten by the first sync. But would all Unixen share such a
> strange detail of implementation?

I heard Kirk McKusick tell this story in a 4.4BSD internals class.
His explanation was that having an *operator* type 'sync' three times
provided enough time for the first sync to do the work before the
operator powered the system down or reset it or whatever.

I've not heard of any filesystem implementation where the number of
sync() system calls issued makes a difference, and imagine that any
programmer who has written code to call sync three times has only
heard part of the story. :-)

Regards,

Giles

#5Matthew Kirkwood
matthew@hairy.beasts.org
In reply to: Tom Lane (#2)
Re: RE: xlog checkpoint depends on sync() ... seems unsafe

On Tue, 13 Mar 2001, Tom Lane wrote:

>> I was told the same a long time ago about FreeBSD. How much can we count on
>> this undocumented sync() feature?

> Sounds quite unreliable to me. Unless there's some interlock ...
> like, say, the second sync not being able to advance past a buffer
> page that's as yet unwritten by the first sync. But would all Unixen
> share such a strange detail of implementation?

The Linux manpage says:

NAME
       sync - commit buffer cache to disk.
       [..]

DESCRIPTION
       sync first commits inodes to buffers, and then buffers to
       disk.
       [..]

CONFORMING TO
       SVr4, SVID, X/OPEN, BSD 4.3

BUGS
       According to the standard specification (e.g., SVID),
       sync() schedules the writes, but may return before the
       actual writing is done. However, since version 1.3.20
       Linux does actually wait. (This still does not guarantee
       data integrity: modern disks have large caches.)

And it's still true. On a fast system, if you do:

$ cp /dev/zero /tmp & sleep 1; sync

the sync will often never finish. (Of course, that's
just an implementation detail really.)

Matthew.
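
For comparison with the sync() behaviour quoted from the manpage above
(it may schedule the writes and return before they are done), POSIX
specifies that fsync() on a single descriptor does not return until the
transfer has completed or an error is detected. A minimal sketch in C of
that per-file approach follows. The helper name durable_write is
hypothetical and chosen only for illustration; it is not taken from the
PostgreSQL source, and, as the manpage warns, a drive's own write cache
can still sit between fsync() and the platter.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * durable_write: hypothetical helper, for illustration only.
 * Writes len bytes from buf to path, then calls fsync() on that one
 * descriptor instead of relying on a system-wide sync().
 * Returns 0 on success, -1 on failure with errno set.
 */
int durable_write(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;

    /* write() may be partial or interrupted, so loop until done. */
    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted before writing: retry */
            int saved = errno;
            close(fd);
            errno = saved;          /* report the write() error, not close()'s */
            return -1;
        }
        p += n;
        len -= (size_t) n;
    }

    /* Blocks until the kernel reports the data transferred, or fails. */
    if (fsync(fd) != 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    return close(fd);
}
```

A caller would check the result, for example
if (durable_write("checkpoint.tmp", data, size) != 0) perror("durable_write");
where the file name is again only an example. Unlike typing sync three
times, the return value says whether the flush actually succeeded.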