Thread: -F option

-F option

From
newsreader@mediaone.net
Date:
The manual page states that

       -F     Disable an automatic fsync()  call  after  each  transaction.   This  option
              improves  performance,  but an operating system crash while a transaction is
              in progress may cause the loss of the most recently  entered  data.  Without
              the  fsync()  call the data is buffered by the operating system, and written
              to disk sometime later.

What I would like to know is what 'sometime later' means.
Is it one hour? 30 seconds? 30 minutes?  24 hours?  I really don't
mind losing the last 3 minutes or so of data.  If we are talking about 10
hours or so then I will not use that switch.  I'm not looking
for an answer with millisecond accuracy; an upper bound, give or take
5 minutes, will be OK.

My usage is frequent lookups of small pieces of information, occasional
inserts of small pieces of information, and even less frequent updates of small pieces
of information.  I have 7.0.3 on Linux 2.2.18, accessed through Apache::DBI
under mod_perl.  I have about three tables, the largest being over 1000 rows
and growing. Is Postgres overkill for this amount of data?

Thanks

Re: -F option

From
Charles Curley
Date:
On Mon, Dec 11, 2000 at 06:45:22PM -0500, newsreader@mediaone.net wrote:
> manual page states that
>
>        -F     Disable an automatic fsync()  call  after  each  transaction.   This  option
>               improves  performance,  but an operating system crash while a transaction is
>               in progress may cause the loss of the most recently  entered  data.  Without
>               the  fsync()  call the data is buffered by the operating system, and written
>               to disk sometime later.
>
> What I would like to know is what 'sometime later' means.
> Is it one hour? 30 seconds? 30 minutes?  24 hours?  I really don't
> mind losing the last 3 minutes or so of data.  If we are talking about 10
> hours or so then I will not use that switch.  I'm not looking
> for an answer with millisecond accuracy; an upper bound, give or take
> 5 minutes, will be OK.

Read "man fsync".

The answer is, "When your operating system gets around to it." The answer
depends on a great many factors: hard drive speed, drive bus speed (SCSI
or IDE), RAID overhead, if any, file system overhead, processor
load, and whether the proper sacrifices to Murphy were made, just to list
the ones that come easily to mind.

Having written disk drivers, I can tell you that you should be OK with a 5
minute latency requirement. In fact, on a reasonably loaded system, I
would hope all the data would be written within a minute. But I shan't be
greatly concerned unless latency is routinely greater than a minute.

I am assuming these are disk writes. With network writes (say, via NFS)
all bets are off.



--

        -- C^2

No windows were crashed in the making of this email.

Looking for fine software and/or web pages?
http://w3.trib.com/~ccurley

Re: -F option

From
newsreader@mediaone.net
Date:
Thank you. I'm doing disk writes, not NFS -- a plain IDE drive in
a plain Dell Celeron box.  I see though that
'man fsync' does not come down to my level of literacy.

I'm quite reassured by your email and will turn on
the -F switch in no time. Coupled with Linux stability,
I have a feeling that the probability of my losing any data due to
an OS crash is smaller than that due to some natural disaster
striking the server.

kz


On Mon, Dec 11, 2000 at 05:20:18PM -0700, Charles Curley wrote:
> On Mon, Dec 11, 2000 at 06:45:22PM -0500, newsreader@mediaone.net wrote:
> > manual page states that
> >
> >        -F     Disable an automatic fsync()  call  after  each  transaction.   This  option
> Read "man fsync".
>
> The answer is, "When your operating system gets around to it." The answer
> depends on a great many factors: hard drive speed, drive bus speed (SCSI


Re: -F option

From
Tom Lane
Date:
newsreader@mediaone.net writes:
> What I would like to know is what 'sometime later' means.
> Is it one hour? 30 seconds? 30 minutes?  24 hours?

On a typical unix setup it's the cycle length of your syncer daemon
(typically 30 seconds), plus however long it physically takes the
OS to push the data out to the drive and then the drive to get around
to writing it.  The nearby estimate of 1 minute sounds good to me as
a (fairly conservative) upper bound, at least under normal conditions.

The standard advice about -F is that it's cool if you trust your OS,
your hardware, and your UPS.  You do *not* need to worry about Postgres
crashes --- the backend will write the data to the kernel at commit
in any case.  The only question is whether we try to encourage the
kernel to push the data down to disk before we report that the
transaction has been committed.
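
To make the distinction above concrete, here is a minimal Python sketch of the two commit styles. It is not Postgres code; the function name commit_record and the do_fsync flag are made up for illustration. Without the fsync() call the data has only been handed to the kernel, so it survives a crash of the writing process but not necessarily a crash of the OS.

```python
import os
import tempfile


def commit_record(fd: int, record: bytes, do_fsync: bool) -> int:
    """Write one record to fd; optionally wait for the kernel to flush it.

    Hypothetical helper for illustration only, not how Postgres is written.
    """
    # After write() returns, the data is in the kernel's buffer cache.
    # It survives a crash of this process, but not necessarily an OS crash.
    written = os.write(fd, record)
    if do_fsync:
        # Ask the kernel to push the data to the drive before we report
        # success. The drive itself may still buffer it for a while.
        os.fsync(fd)
    return written


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    try:
        commit_record(fd, b"row 1\n", do_fsync=True)   # default behaviour
        commit_record(fd, b"row 2\n", do_fsync=False)  # roughly what -F does
    finally:
        os.close(fd)
        os.remove(path)
```

Either way the record is readable by other processes immediately; the flag only changes how long the caller waits and what an OS crash or power failure can lose.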

There was a long thread on pghackers recently to the effect that even
without -F, you are at the mercy of disk drive and power supply
failures, because fsync() only guarantees that the kernel has given the
data to the disk drive; modern disk drives may buffer the data for
a while before they plop it down onto the platter.  So, you probably
want a UPS in any case.  Beyond that, how many kernel crashes and
hardware failures have you seen lately?

> My usage is frequent lookups of small pieces of information,
> occasional inserts of small pieces of information, and even less
> frequent updates of small pieces of information.

OTOH, if you are not doing a lot of insert/update/delete then -F gains
little performance anyway...

            regards, tom lane

Re: -F option

From
newsreader@mediaone.net
Date:
On Mon, Dec 11, 2000 at 09:02:54PM -0500, Tom Lane wrote:
> newsreader@mediaone.net writes:
> > What I would like to know is what 'sometime later' means.
> > Is it one hour? 30 seconds? 30 minutes?  24 hours?
>
> to writing it.  The nearby estimate of 1 minute sounds good to me as
> > occasional inserts of small pieces of information, and even less
> > frequent updates of small pieces of information.
>
> OTOH, if you are not doing a lot of insert/update/delete then -F gains
> little performance anyway...
>
>             regards, tom lane


No OS or hardware crashes for as long as I can remember on this box.
I also have a UPS running properly.

Anyway, before I started using Postgres -- just last week -- I was reading
data from a dbm file.  The main purpose was just to look up one record out of over
1000, but I felt that as that file grew I might start to see a performance
hit in the future.   As far as I can see, Apache::DBI is keeping the backends
alive.  Whenever I check top I see that there is always exactly the same number
of Apache children as Postgres backends.  So... my semi-educated
guess is that data is being retrieved at least as fast as opening a dbm file and picking
out a record. No?

In any case, the real reason I started using Postgres was that after moving
my server to mod_perl I was having a nightmare getting my previously working (?)
file locking mechanisms to work right.  With Postgres I just don't have this
kind of problem.  As a side effect, my code is much cleaner now because
of the sheer power of Postgres.

Thanks much

kz