Thread: Ensuring data integrity with fsync=off

Ensuring data integrity with fsync=off

From
"Benjamin Arai"
Date:
I have been working on optimizing a PostgreSQL server for weekly updates where data is only updated once a week then for the remaining portion of the week the data is static.  So far I have set fsync to off and increased the segment size among other things.  I need to ensure that at the end of the update each week the data is in state where a crash will not kill the database. 
 
Right now I run "sync" afte the updates have finished to ensure that the data is synced to disk but I am concerned about the segment data and anything else I am missing that PostgreSQL explicitly handles.  Is there something I can do in addition to sync to tell PostgreSQL exlplicitly that it is time to ensure everything is stored in its final destionation and etc?
 
Benjamin

Re: Ensuring data integrity with fsync=off

From
Tom Lane
Date:
"Benjamin Arai" <barai@cs.ucr.edu> writes:
> Right now I run "sync" afte the updates have finished to ensure that the
> data is synced to disk but I am concerned about the segment data and
> anything else I am missing that PostgreSQL explicitly handles.  Is there
> something I can do in addition to sync to tell PostgreSQL exlplicitly that
> it is time to ensure everything is stored in its final destionation and etc?

You need to give PG a CHECKPOINT command to flush stuff out of its
internal buffers.  After that finishes, a manual "sync" commnd will
push everything down to disk.

You realize, of course, that a system failure while the updates are
running might leave your database corrupt?  As long as you are prepared
to restore from scratch, this might be a good tradeoff ... but don't
let yourself get caught without an up-to-date backup ...

            regards, tom lane

Re: Ensuring data integrity with fsync=off

From
"Jim C. Nasby"
Date:
On Sat, Jan 14, 2006 at 01:41:43PM -0500, Tom Lane wrote:
> "Benjamin Arai" <barai@cs.ucr.edu> writes:
> > Right now I run "sync" afte the updates have finished to ensure that the
> > data is synced to disk but I am concerned about the segment data and
> > anything else I am missing that PostgreSQL explicitly handles.  Is there
> > something I can do in addition to sync to tell PostgreSQL exlplicitly that
> > it is time to ensure everything is stored in its final destionation and etc?
>
> You need to give PG a CHECKPOINT command to flush stuff out of its
> internal buffers.  After that finishes, a manual "sync" commnd will
> push everything down to disk.
>
> You realize, of course, that a system failure while the updates are
> running might leave your database corrupt?  As long as you are prepared
> to restore from scratch, this might be a good tradeoff ... but don't
> let yourself get caught without an up-to-date backup ...

Another alternative that may (or may not) be simpler would be to run
everything in one transaction and just let that commit at the end. Also,
there is ongoing work towards allowing certain operations to occur
without generating any log writes. Currently there is code submitted
that allows COPY into a table that was created in the same transaction
to go un-logged, though I think it's only in HEAD. In any case, there
should be some features that could be very useful to you in 8.2.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461