Thread: how unsafe (or worst scenarios) when setting fsync OFF for postgresql

how unsafe (or worst scenarios) when setting fsync OFF for postgresql

From
"Guoping Zhang"
Date:
Hi,.

We are new to Postgresql. I am appreciated if the following question can be
answered.

Our application has a strict speed requirement for DB operation. Our tests
show that it takes about 10secs for the operation when setting fsync off,
but takes about 70 seconds when setting fsync ON (with other WAL related
parametered tuned).

We have to looking at setting fsync OFF option for performance reason,
our questions are

  a) if we set fsync OFF and anything (very low chance though) like OS
crash, loss of power, or hardware fault happened, can postgresql rolls back
to the state that the last checkpoint was done ( but all the operations
after that is lost)

  b) Does this roll back to last checkpoint can ensure the database back to
consistent state?

  c) What is worst scenarios if setting fsync OFF in term of database
safety. We try to avoid to restore the database from nightly backup.

We view our application is not that data loss critical, say loss of five
minutes of data and operation occasionally, but the database integrity and
consistency must be kept.

Can we set fsync OFF for the performance benefit, have the risk of only 5
minutes data loss or much worse?

Thanks in advance.

Regards,

Guoping


Re: how unsafe (or worst scenarios) when setting fsync

From
Simon Riggs
Date:
On Thu, 2006-04-27 at 16:31 +1000, Guoping Zhang wrote:

> We have to looking at setting fsync OFF option for performance reason,
> our questions are
>
>   a) if we set fsync OFF and anything (very low chance though) like OS
> crash, loss of power, or hardware fault happened, can postgresql rolls back
> to the state that the last checkpoint was done ( but all the operations
> after that is lost)

There is no rollback, only a rollforward from the checkpoint.

>   b) Does this roll back to last checkpoint can ensure the database back to
> consistent state?

Therefore no consistent state guaranteed if some WAL is missing

>   c) What is worst scenarios if setting fsync OFF in term of database
> safety. We try to avoid to restore the database from nightly backup.

Losing some DDL changes, probably. You'd need to be wary of things like
ANALYZE, VACUUM etc, since these make catalog changes also.

> We view our application is not that data loss critical, say loss of five
> minutes of data and operation occasionally, but the database integrity and
> consistency must be kept.
>
> Can we set fsync OFF for the performance benefit, have the risk of only 5
> minutes data loss or much worse?

Thats up to you.

fsync can be turned on and off, so you can make critical changes with
fsync on, then continue with fsync off.

The risk and the decision, are yours. You are warned.

--
  Simon Riggs
  EnterpriseDB          http://www.enterprisedb.com/


Re: how unsafe (or worst scenarios) when setting fsync OFF for postgresql

From
"Mikael Carneholm"
Date:
Get a SCSI controller with a battery backed cache, and mount the disks
with data=writeback (if you use ext3). If you loose power in the middle
of a transaction, the battery will ensure that the write operation still
completes. With asynch writing setup like this, fsync operations will
return almost immidiately giving you performance close to that of
running with fsync off.

Regards,
Mikael

-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Guoping
Zhang
Sent: den 27 april 2006 08:31
To: pgsql-performance@postgresql.org
Cc: Guoping Zhang (E-mail)
Subject: [PERFORM] how unsafe (or worst scenarios) when setting fsync
OFF for postgresql

Hi,.

We are new to Postgresql. I am appreciated if the following question can
be answered.

Our application has a strict speed requirement for DB operation. Our
tests show that it takes about 10secs for the operation when setting
fsync off, but takes about 70 seconds when setting fsync ON (with other
WAL related parametered tuned).

We have to looking at setting fsync OFF option for performance reason,
our questions are

  a) if we set fsync OFF and anything (very low chance though) like OS
crash, loss of power, or hardware fault happened, can postgresql rolls
back to the state that the last checkpoint was done ( but all the
operations after that is lost)

  b) Does this roll back to last checkpoint can ensure the database back
to consistent state?

  c) What is worst scenarios if setting fsync OFF in term of database
safety. We try to avoid to restore the database from nightly backup.

We view our application is not that data loss critical, say loss of five
minutes of data and operation occasionally, but the database integrity
and consistency must be kept.

Can we set fsync OFF for the performance benefit, have the risk of only
5 minutes data loss or much worse?

Thanks in advance.

Regards,

Guoping


---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster


Re: how unsafe (or worst scenarios) when setting fsync OFF for postgresql

From
"Guillaume Smet"
Date:
Guoping,

On 4/27/06, Guoping Zhang <guoping.zhang@nec.com.au> wrote:
> We have to looking at setting fsync OFF option for performance reason,

Did you try the other wal sync methods (fdatasync in particular)? I
saw a few posts lately explaining how changing sync method can affect
performances in specific cases.

--
Guillaume

"Guoping Zhang" <guoping.zhang@nec.com.au> writes:
> Our application has a strict speed requirement for DB operation. Our tests
> show that it takes about 10secs for the operation when setting fsync off,
> but takes about 70 seconds when setting fsync ON (with other WAL related
> parametered tuned).

I can't believe that a properly tuned application would have an fsync
penalty that large.  Are you performing that "operation" as several
thousand small transactions, or some such?  Try grouping the operations
into one (or at most a few) transactions.  Also, what wal_buffers and
wal_sync_method settings are you using, and have you experimented with
alternatives?   What sort of platform is this on?  What PG version?

            regards, tom lane

Re: how unsafe (or worst scenarios) when setting fsync

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> On Thu, 2006-04-27 at 16:31 +1000, Guoping Zhang wrote:
>> Can we set fsync OFF for the performance benefit, have the risk of only 5
>> minutes data loss or much worse?

> Thats up to you.

> fsync can be turned on and off, so you can make critical changes with
> fsync on, then continue with fsync off.

I think it would be a mistake to assume that the behavior would be
nice clean "we only lost recent changes".  Things could get arbitrarily
badly corrupted if some writes make it to disk and some don't.

            regards, tom lane

Re: how unsafe (or worst scenarios) when setting fsync OFF for postgresql

From
"Guoping Zhang"
Date:
Hi, Tom,

Thanks for the reply.

a) The tests consists of ten thousands very small transactions, which are
not grouped, that is why so slow with compare to set fsync off.
b) we are using Solaris 10 on a SUN Fire 240 SPARC machine with a latest
postgresql release (8.1.3)
c) wal_sync_method is set to 'open_datasync', which is fastest among the
four, right?
d) wal_buffers set to 32

Looks like, if we have to set fsync be true, we need to modify our
application.

Thanks and regards,
Guoping

-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org]On Behalf Of Tom Lane
Sent: 2006Äê4ÔÂ28ÈÕ 0:53
To: guoping.zhang@nec.com.au
Cc: pgsql-performance@postgresql.org; Guoping Zhang (E-mail)
Subject: Re: [PERFORM] how unsafe (or worst scenarios) when setting
fsync OFF for postgresql


"Guoping Zhang" <guoping.zhang@nec.com.au> writes:
> Our application has a strict speed requirement for DB operation. Our tests
> show that it takes about 10secs for the operation when setting fsync off,
> but takes about 70 seconds when setting fsync ON (with other WAL related
> parametered tuned).

I can't believe that a properly tuned application would have an fsync
penalty that large.  Are you performing that "operation" as several
thousand small transactions, or some such?  Try grouping the operations
into one (or at most a few) transactions.  Also, what wal_buffers and
wal_sync_method settings are you using, and have you experimented with
alternatives?   What sort of platform is this on?  What PG version?

            regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend


"Guoping Zhang" <guoping.zhang@nec.com.au> writes:
> a) The tests consists of ten thousands very small transactions, which are
> not grouped, that is why so slow with compare to set fsync off.

Yup.

> c) wal_sync_method is set to 'open_datasync', which is fastest among the
> four, right?

Well, is it?  You shouldn't assume that without testing.

> Looks like, if we have to set fsync be true, we need to modify our
> application.

Yes, you should definitely look into batching your operations into
larger transactions.  On normal hardware you can't expect to commit
transactions faster than one per disk revolution (unless they're coming
from multiple clients, where there's a hope of ganging several parallel
commits per revolution).

Or buy a disk controller with battery-backed write cache and put your
faith in that cache surviving a machine crash.  But don't turn off fsync
if you care about your data.

            regards, tom lane

Re: how unsafe (or worst scenarios) when setting fsync

From
"Guoping Zhang"
Date:
Hi, Simon/tom,

Thanks for the reply.

It appears to me that we have to set fsync ON, as a badly corrupted database
by any chance in production line
will lead a serious problem.

However, when try the differnt 'wal_sync_method' setting, lead a quite
different operation time (open_datasync is best for speed).

But altering the commit_delay from 1 to 100000, I observed that there is no
time difference for the operation. Why is that? As our tests consists of
10000 small transactions which completed in 66 seconds, that is, about 160
transactions per second. When commit_delay set to 100000 (i.e., 0.1 second),
that in theory, shall group around 16 transactions into one commit, but
result is same from the repeated test. Am I mistaken something here?

Cheers and Regards,
Guoping

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: 2006Äê4ÔÂ28ÈÕ 0:58
To: Simon Riggs
Cc: guoping.zhang@nec.com.au; pgsql-performance@postgresql.org; Guoping
Zhang (E-mail)
Subject: Re: [PERFORM] how unsafe (or worst scenarios) when setting
fsync


Simon Riggs <simon@2ndquadrant.com> writes:
> On Thu, 2006-04-27 at 16:31 +1000, Guoping Zhang wrote:
>> Can we set fsync OFF for the performance benefit, have the risk of only 5
>> minutes data loss or much worse?

> Thats up to you.

> fsync can be turned on and off, so you can make critical changes with
> fsync on, then continue with fsync off.

I think it would be a mistake to assume that the behavior would be
nice clean "we only lost recent changes".  Things could get arbitrarily
badly corrupted if some writes make it to disk and some don't.

            regards, tom lane


Re: how unsafe (or worst scenarios) when setting fsync

From
Tom Lane
Date:
"Guoping Zhang" <guoping.zhang@nec.com.au> writes:
> But altering the commit_delay from 1 to 100000, I observed that there is no
> time difference for the operation. Why is that? As our tests consists of
> 10000 small transactions which completed in 66 seconds, that is, about 160
> transactions per second. When commit_delay set to 100000 (i.e., 0.1 second),
> that in theory, shall group around 16 transactions into one commit, but
> result is same from the repeated test. Am I mistaken something here?

commit_delay can only help if there are multiple clients issuing
transactions concurrently, so that there are multiple commits pending at
the same instant.  If you are issuing one serial stream of transactions,
it's useless.

If you do have multiple active clients, then we need to look more closely;
but your statement does not indicate that.

            regards, tom lane

Re: how unsafe (or worst scenarios) when setting fsync OFF for postgresql

From
"Guoping Zhang"
Date:
Hi, Guillaume,

Thanks for the reply.

I am using wal_sync_methods be open_datasync, which appear much faster than
'fdatasync'.

Regards,
Guoping

-----Original Message-----
From: Guillaume Smet [mailto:guillaume.smet@gmail.com]
Sent: 2006Äê4ÔÂ27ÈÕ 19:26
To: guoping.zhang@nec.com.au
Cc: pgsql-performance@postgresql.org; Guoping Zhang (E-mail)
Subject: Re: [PERFORM] how unsafe (or worst scenarios) when setting
fsync OFF for postgresql


Guoping,

On 4/27/06, Guoping Zhang <guoping.zhang@nec.com.au> wrote:
> We have to looking at setting fsync OFF option for performance reason,

Did you try the other wal sync methods (fdatasync in particular)? I
saw a few posts lately explaining how changing sync method can affect
performances in specific cases.

--
Guillaume


Re: how unsafe (or worst scenarios) when setting fsync OFF for postgresql

From
"Guoping Zhang"
Date:
Hi, Tom

Many thanks for quick replies and that helps a lot.

Just in case, anyone out there can recommend a good but cost effective
battery-backed write cache SCSI for Solaris SPARC platform? How well does it
work with UFS or newer ZFS for solaris?

Cheers and regards,
Guoping




-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: 2006Äê4ÔÂ28ÈÕ 14:57
To: guoping.zhang@nec.com.au
Cc: pgsql-performance@postgresql.org; 'Guoping Zhang (E-mail)'
Subject: Re: [PERFORM] how unsafe (or worst scenarios) when setting
fsync OFF for postgresql


"Guoping Zhang" <guoping.zhang@nec.com.au> writes:
> a) The tests consists of ten thousands very small transactions, which are
> not grouped, that is why so slow with compare to set fsync off.

Yup.

> c) wal_sync_method is set to 'open_datasync', which is fastest among the
> four, right?

Well, is it?  You shouldn't assume that without testing.

> Looks like, if we have to set fsync be true, we need to modify our
> application.

Yes, you should definitely look into batching your operations into
larger transactions.  On normal hardware you can't expect to commit
transactions faster than one per disk revolution (unless they're coming
from multiple clients, where there's a hope of ganging several parallel
commits per revolution).

Or buy a disk controller with battery-backed write cache and put your
faith in that cache surviving a machine crash.  But don't turn off fsync
if you care about your data.

            regards, tom lane


Re: how unsafe (or worst scenarios) when setting fsync OFF for postgresql

From
"Mikael Carneholm"
Date:
a) I have absolutely no idea regarding price tags when it comes to SUN hardware, last time I worked with SUN gear was
in´01 so you'll have to check with your local (SUN-)supplier for uptodate prices. 

b) Same here (no idea). But I'd be surprised if UFS (and ZFS) was unable to take advantage of battery backed write
cache...

Regards,
Mikael


-----Original Message-----
From: Guoping Zhang [mailto:guoping.zhang@nec.com.au]
Sent: den 28 april 2006 07:35
To: Mikael Carneholm; pgsql-performance@postgresql.org
Cc: Guoping Zhang (E-mail)
Subject: RE: [PERFORM] how unsafe (or worst scenarios) when setting fsync OFF for postgresql

Hi, Mikael,

We have not looked at this option yet, but very good direction though.

Two issues are unsure:
a) we are on SUN SPARC platform, unsure what the price tag for such a hardware device with SUN brand?

b) how well does UFS (or a new ZFS) work with the device (as ext3 can mount with data=writeback)?

Cheers and regards,
Guoping


-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org]On Behalf Of Mikael Carneholm
Sent: 2006Äê4ÔÂ27ÈÕ 17:43
To: guoping.zhang@nec.com.au; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] how unsafe (or worst scenarios) when setting fsync OFF for postgresql


Get a SCSI controller with a battery backed cache, and mount the disks with data=writeback (if you use ext3). If you
loosepower in the middle of a transaction, the battery will ensure that the write operation still completes. With
asynchwriting setup like this, fsync operations will return almost immidiately giving you performance close to that of
runningwith fsync off. 

Regards,
Mikael

-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Guoping Zhang
Sent: den 27 april 2006 08:31
To: pgsql-performance@postgresql.org
Cc: Guoping Zhang (E-mail)
Subject: [PERFORM] how unsafe (or worst scenarios) when setting fsync OFF for postgresql

Hi,.

We are new to Postgresql. I am appreciated if the following question can be answered.

Our application has a strict speed requirement for DB operation. Our tests show that it takes about 10secs for the
operationwhen setting fsync off, but takes about 70 seconds when setting fsync ON (with other WAL related parametered
tuned).

We have to looking at setting fsync OFF option for performance reason, our questions are

  a) if we set fsync OFF and anything (very low chance though) like OS crash, loss of power, or hardware fault
happened,can postgresql rolls back to the state that the last checkpoint was done ( but all the operations after that
islost) 

  b) Does this roll back to last checkpoint can ensure the database back to consistent state?

  c) What is worst scenarios if setting fsync OFF in term of database safety. We try to avoid to restore the database
fromnightly backup. 

We view our application is not that data loss critical, say loss of five minutes of data and operation occasionally,
butthe database integrity and consistency must be kept. 

Can we set fsync OFF for the performance benefit, have the risk of only
5 minutes data loss or much worse?

Thanks in advance.

Regards,

Guoping


---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings


Re: how unsafe (or worst scenarios) when setting fsync

From
Markus Schaber
Date:
Hk, Guoping,

Guoping Zhang wrote:

> a) The tests consists of ten thousands very small transactions, which are
> not grouped, that is why so slow with compare to set fsync off.

If those transactions are submitted by concurrent applications over
several simulataneous connections, playing with commit_delay and
commit_siblins may improve your situation.

HTH,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org