Thread: PostgreSQL Reliability when fsync = false on Linux-XFS

PostgreSQL Reliability when fsync = false on Linux-XFS

From
Federico Sevilla III
Date:
(Please follow Mail-Followup-To, I'm not on the pgsql-performance
mailing list but am on the Linux-XFS mailing list. My apologies too for
the cross-post. I'm cc'ing the Linux-XFS mailing list in case people
there will be interested in this, too.)


Hi,

We have a server running PostgreSQL v7.3.3 on Debian GNU/Linux with
Linux kernel 2.4.21-xfs. The PostgreSQL data is stored on an XFS[1]
partition mounted with the options "rw,noatime,logbufs=8". The machine
is an Intel Pentium III 733MHz with 512MB RAM and a four-disk hardware
IDE RAID-5 array with a 3ware controller.

Among other databases, we have a centralized Snort[2] database that is
analyzed by ACIDLab[3]. I noticed performance problems during SELECT and
INSERT operations when the tables reach around 200,000 records. Because
of timeout issues, the PHP-based ACIDLab can't be used properly.

I read the performance section of the ACID FAQ[4] as well as the
PostgreSQL "Managing Kernel Resources" document , and so far have tuned
my system by setting /proc/sys/kernel/{shmall,shmmax} to 134217728.

I also turned off fsync in /etc/postgresql/postgresql.conf.

The latter did a LOT to improve INSERT performance, which is now
CPU-bound instead of I/O-bound. However, as expected, I am concerned
about the reliability penalty this will cause. Our server has been up
and running without problems for 67 days since the last reboot, but this
doesn't mean it will never hiccup either because of some random problem
or because of an extended power outage.

Would anyone have "authoritative" information with respect to:

 - the way PostgreSQL expects data to be written to disk without the
   fsync calls for things not to get corrupted in the event of a crash,
   and

 - the way XFS writes data to disk without the fsync calls that
   PostgreSQL normally does and how this will affect PostgreSQL data
   integrity in the event of a system crash?

I know that at the end of the day, if I value my data, I must (1) back
it up regularly, and (2) keep fsync enabled in PostgreSQL. However given
the significance performance hit (at least as far as massive INSERT or
UPDATE operations are concerned) and the journalling component of XFS,
it would be great to find out just how bad the odds are if the system
goes down unexpectedly.

Thank you very much for your time. :)

 --> Jijo

Note- I should also have selected RAID10 instead of RAID5, but that's a
change I can't afford to do at this point so I have to explore other
options.

[1] http://oss.sgi.com/projects/xfs/
[2] http://www.snort.org
[3] http://acidlab.sourceforge.net
[4] http://www.andrew.cmu.edu/~rdanyliw/snort/acid_faq.html#faq_c9
[5] http://developer.postgresql.org/docs/postgres/kernel-resources.html

--
Federico Sevilla III  : http://jijo.free.net.ph      : When we speak of free
Network Administrator : The Leather Collection, Inc. : software we refer to
GnuPG Key ID          : 0x93B746BE                   : freedom, not price.

Re: PostgreSQL Reliability when fsync = false on Linux-XFS

From
Rod Taylor
Date:
>  - the way PostgreSQL expects data to be written to disk without the
>    fsync calls for things not to get corrupted in the event of a crash,
>    and

If you want the filesystem to deal with this, I believe it is necessary
for it to write the data out in the same order the write requests are
supplied in between ALL PostgreSQL processes. If you can accomplish
this, you do not need WAL.

There are shortcuts which can be taken in the above, which is where WAL
comes in. WAL writes are ordered between processes and WAL of a single
process always hits disk prior to commit -- fsync forces both of these.
Due to WAL being in place, data can be written at almost any time.  The
benefit to WAL is a single file fsync rather than the entire database
requiring one (PostgreSQL pre-7.1 method).

> I know that at the end of the day, if I value my data, I must (1) back
> it up regularly, and (2) keep fsync enabled in PostgreSQL. However given
> the significance performance hit (at least as far as massive INSERT or

If you want good performance, invest in a SCSI controller that has
battery backed write cache. A few megs will do it. You will find
performance similar to fsync being off (you don't wait for disk
rotation) but without the whole dataloss issue. Another alternative is
to buy a small 15krpm disk dedicated for WAL. In theory you can achieve
one commit per rotation.

I assume your inserts are not supplied in Bulk. The fsync overhead is
per transaction, not per insert.

Attachment

Re: PostgreSQL Reliability when fsync = false on

From
Hannu Krosing
Date:
Rod Taylor kirjutas N, 04.09.2003 kell 06:36:
> Another alternative is
> to buy a small 15krpm disk dedicated for WAL. In theory you can achieve
> one commit per rotation.

One commit per rotation would still be only 15000/60. = 250 tps, but
fortunately you can get better results if you use multiple concurrent
backends, then in the best case you can get one commit per backend per
rotation.

-----------------
Hannu


Re: PostgreSQL Reliability when fsync = false on Linux-XFS

From
"Shridhar Daithankar"
Date:
On 3 Sep 2003 at 23:36, Rod Taylor wrote:

> >  - the way PostgreSQL expects data to be written to disk without the
> >    fsync calls for things not to get corrupted in the event of a crash,
> >    and
>
> If you want the filesystem to deal with this, I believe it is necessary
> for it to write the data out in the same order the write requests are
> supplied in between ALL PostgreSQL processes. If you can accomplish
> this, you do not need WAL.
>
> There are shortcuts which can be taken in the above, which is where WAL
> comes in. WAL writes are ordered between processes and WAL of a single
> process always hits disk prior to commit -- fsync forces both of these.
> Due to WAL being in place, data can be written at almost any time.  The
> benefit to WAL is a single file fsync rather than the entire database
> requiring one (PostgreSQL pre-7.1 method).
>
> > I know that at the end of the day, if I value my data, I must (1) back
> > it up regularly, and (2) keep fsync enabled in PostgreSQL. However given
> > the significance performance hit (at least as far as massive INSERT or
>
> If you want good performance, invest in a SCSI controller that has
> battery backed write cache. A few megs will do it. You will find
> performance similar to fsync being off (you don't wait for disk
> rotation) but without the whole dataloss issue. Another alternative is
> to buy a small 15krpm disk dedicated for WAL. In theory you can achieve
> one commit per rotation.

Just wonderin. What if you symlink WAL to a directory which is on mounted USB
RAM drive?

Will that increase any throughput? I am sure a 256/512MB flash drive will cost
lot less than a SCSI disk. May be even a GB on flash drive would do..

Just a thought..

Bye
 Shridhar

--
Ambition, n:    An overmastering desire to be vilified by enemies while    living and
made ridiculous by friends when dead.        -- Ambrose Bierce


Re: PostgreSQL Reliability when fsync = false on Linux-XFS

From
Sean Chittenden
Date:
> Just wonderin. What if you symlink WAL to a directory which is on
> mounted USB RAM drive?

USB 2.0 you mean?  It supposedly runs at 1394 speeds, but USB 1.0/1.1
runs at 1MB/s under ideal circumstances... that's slower than even old
IDE drives.

> Will that increase any throughput?

Probably not...

> I am sure a 256/512MB flash drive will cost lot less than a SCSI
> disk. May be even a GB on flash drive would do..

That's true... but on a per $$/MB, you're better off investing in RAM
and increasing your effective_cache_size.  If dd to a flash card is
faster than to an IDE drive, please let me know.  :) -sc

--
Sean Chittenden
UNIX(TM), a BSD like Operating System

Re: PostgreSQL Reliability when fsync = false on Linux-XFS

From
Tom Lane
Date:
Sean Chittenden <sean@chittenden.org> writes:
>> Just wonderin. What if you symlink WAL to a directory which is on
>> mounted USB RAM drive?

> USB 2.0 you mean?  It supposedly runs at 1394 speeds, but USB 1.0/1.1
> runs at 1MB/s under ideal circumstances... that's slower than even old
> IDE drives.

>> Will that increase any throughput?

> Probably not...

Also, doesn't flash memory have a very limited lifetime in write cycles?
Using it as WAL, you'd wear it out PDQ.

            regards, tom lane

Re: PostgreSQL Reliability when fsync = false on Linux-XFS

From
"scott.marlowe"
Date:
On Thu, 4 Sep 2003, Federico Sevilla III wrote:

> (Please follow Mail-Followup-To, I'm not on the pgsql-performance
> mailing list but am on the Linux-XFS mailing list. My apologies too for
> the cross-post. I'm cc'ing the Linux-XFS mailing list in case people
> there will be interested in this, too.)
>
>
> Hi,
>
> We have a server running PostgreSQL v7.3.3 on Debian GNU/Linux with
> Linux kernel 2.4.21-xfs. The PostgreSQL data is stored on an XFS[1]

Two points.

1:  7.3.3 has a data loss issue fixed in 7.3.4.  You should upgrade to
avoid the pain associated with this problem.

2:  When you turn off fsync, all bets are off.  If the data doesn't get
written in the right order, your database may be corrupted if power is
shut off.