Thread: RE: RE: xlog checkpoint depends on sync() ... seems unsafe

RE: RE: xlog checkpoint depends on sync() ... seems unsafe

From

"Mikheev, Vadim"

Date:

13 March 2001, 00:13:49

> > to re-write smgr. I don't know how useful is second sync() call, but
> > on Solaris (and I believe on many other *NIXes) rc0 calls it
> > three times, -:) Why?
> 
> The idea is, that by the time the last sync has run, the 
> first sync will be done flushing the buffers to disk. - this is what
> we were told by the IBM engineers when I worked tier-2/3 AIX support
> at IBM.

I was told the same long ago about FreeBSD. How much can we count on
this undocumented sync() feature?

Vadim

Re: RE: xlog checkpoint depends on sync() ... seems unsafe

From

Tom Lane

Date:

13 March 2001, 00:22:35

"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
>> The idea is, that by the time the last sync has run, the 
>> first sync will be done flushing the buffers to disk. - this is what
>> we were told by the IBM engineers when I worked tier-2/3 AIX support
>> at IBM.

> I was told the same long ago about FreeBSD. How much can we count on
> this undocumented sync() feature?

Sounds quite unreliable to me.  Unless there's some interlock ... like,
say, the second sync not being able to advance past a buffer page that's
as yet unwritten by the first sync.  But would all Unixen share such a
strange detail of implementation?
        regards, tom lane

Re: RE: xlog checkpoint depends on sync() ... seems unsafe

From

Doug McNaught

Date:

13 March 2001, 00:43:05

Tom Lane <tgl@sss.pgh.pa.us> writes:

> "Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
> >> The idea is, that by the time the last sync has run, the 
> >> first sync will be done flushing the buffers to disk. - this is what
> >> we were told by the IBM engineers when I worked tier-2/3 AIX support
> >> at IBM.
> 
> > I was told the same long ago about FreeBSD. How much can we count on
> > this undocumented sync() feature?
> 
> Sounds quite unreliable to me.  Unless there's some interlock ... like,
> say, the second sync not being able to advance past a buffer page that's
> as yet unwritten by the first sync.  But would all Unixen share such a
> strange detail of implementation?

I'm pretty sure it has no basis in fact, it's just one of these habits 
that gives sysadmins a warm fuzzy feeling.  ;)  It's apparently been
around a long time, though I don't remember where I read about it--it
was quite a few years ago.

-Doug

Re: RE: xlog checkpoint depends on sync() ... seems unsafe

From

Giles Lean

Date:

13 March 2001, 01:48:39

> Sounds quite unreliable to me.  Unless there's some interlock ... like,
> say, the second sync not being able to advance past a buffer page that's
> as yet unwritten by the first sync.  But would all Unixen share such a
> strange detail of implementation?

I heard Kirk McKusick tell this story in a 4.4BSD internals class.
His explanation was that having an *operator* type 'sync' three times
provided enough time for the first sync to do the work before the
operator powered the system down or reset it or whatever.

I've not heard of any filesystem implementation where the number of
sync() system calls issued makes a difference, and imagine that any
programmer who has written code to call sync three times has only
heard part of the story. :-)

Regards,

Giles

Re: RE: xlog checkpoint depends on sync() ... seems unsafe

From

Matthew Kirkwood

Date:

13 March 2001, 05:54:54

On Tue, 13 Mar 2001, Tom Lane wrote:

> > I was told the same long ago about FreeBSD. How much can we count on
> > this undocumented sync() feature?
> 
> Sounds quite unreliable to me.  Unless there's some interlock ...
> like, say, the second sync not being able to advance past a buffer
> page that's as yet unwritten by the first sync.  But would all Unixen
> share such a strange detail of implementation?

The Linux manpage says:

NAME
       sync - commit buffer cache to disk.
[..]

DESCRIPTION
       sync first commits inodes to buffers, and then buffers to disk.
[..]

CONFORMING TO
       SVr4, SVID, X/OPEN, BSD 4.3

BUGS
       According to the standard specification (e.g., SVID), sync()
       schedules the writes, but may return before the actual writing
       is done.  However, since version 1.3.20 Linux does actually
       wait.  (This still does not guarantee data integrity: modern
       disks have large caches.)

And it's still true.  On a fast system, if you do:

$ cp /dev/zero /tmp & sleep 1; sync

the sync will often never finish.  (Of course, that's
just an implementation detail really.)

Matthew.
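
Editorial illustration (not part of the original messages): the man page
excerpt quoted above says that sync() may only schedule writes, so calling
it several times proves nothing about completion. A common alternative,
sketched here under stated assumptions, is to call fsync() on the specific
file, which is specified to block until that file's data has been handed
to the storage device. The function name write_durably and the file name
checkpoint.dat are hypothetical, and Python is used only for brevity; this
is not what the PostgreSQL code does. As the man page warns, a disk's own
write cache can still defeat this.

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> int:
    """Write data to path and return only after fsync() on that file returns.

    Hypothetical helper for illustration; returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        # os.write() may write fewer bytes than requested, so loop.
        while written < len(data):
            written += os.write(fd, data[written:])
        # Unlike sync(), fsync() is scoped to one file descriptor and
        # blocks until the kernel reports the transfer as complete.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "checkpoint.dat")  # hypothetical file name
        print(write_durably(target, b"checkpoint record\n"))
```

The point of the sketch is only that the wait is explicit and tied to one
file, instead of relying on how many times a global sync() was issued.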