Re: Filesystem options for storing pg_data - Mailing list pgsql-general

From Marco Colombo
Subject Re: Filesystem options for storing pg_data
Msg-id Pine.LNX.4.61.0504211943220.27506@Megathlon.ESI
In response to Re: Filesystem options for storing pg_data  (Scott Marlowe <smarlowe@g2switchworks.com>)
Responses Re: Filesystem options for storing pg_data  (Scott Marlowe <smarlowe@g2switchworks.com>)
List pgsql-general
On Thu, 21 Apr 2005, Scott Marlowe wrote:

> References:
>
> http://archives.postgresql.org/pgsql-performance/2005-01/msg00131.php
> http://archives.postgresql.org/pgsql-performance/2004-05/msg00130.php
> http://archives.postgresql.org/pgsql-performance/2003-08/msg00191.php
>
> http://groups-beta.google.com/group/comp.os.linux.misc/msg/b299a71fd540c2b8?q=ext2+corrupt+%22power+failure%22&hl=en&lr=&ie=UTF-8&rnum=9
> http://oss.sgi.com/projects/xfs/papers/filesystem-perf-tm.pdf
> http://www.oracle.com/technology/oramag/webcolumns/2002/techarticles/scalzo_linux02.html
> http://jamesthornton.com/hotlist/linux-filesystems/
>
> It took me all of about 10 minutes to find all of those.  But I've got
> work to do, so I'll leave further research here to the rest of the list.

Thanks for your precious time, but when I say I searched the archives
I really mean it. If you cared to read _my_ message, I was looking for
any benchmark (or comment) under the following conditions:

1) PostgreSQL load - that is, a benchmark based on PostgreSQL, or,
    alternatively, on another database, or on artificial write+fsync load.
    Any other (cached) write load is _meaningless_ to our purposes.
2) the author was aware of mount options, and actually used them.
    I think there's enough evidence that ext3 default mount options
    are on the safe side (_safer_ than other fses, it seems), so there's
    no point in comparing default ext3 alone (comparing all modes
    _is_ interesting, tho).
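
As an illustration of condition 1), a minimal Python sketch of an
artificial write+fsync load might look like the following. The function
name, the 8 kB block size and the write count are assumptions chosen for
illustration; they are not taken from any of the benchmarks discussed in
this thread.

```python
import os
import tempfile
import time


def write_load(directory, writes=100, block_size=8192, sync=True):
    """Append `writes` blocks to a scratch file in `directory`.

    Returns (elapsed_seconds, bytes_written).
    With sync=True every write is followed by os.fsync(), roughly
    mimicking a commit-heavy database load. With sync=False the writes
    may only reach the page cache, i.e. the "cached" load.
    """
    block = b"x" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, block)
            if sync:
                os.fsync(fd)  # force data to stable storage before continuing
        elapsed = time.perf_counter() - start
        written = os.fstat(fd).st_size
    finally:
        os.close(fd)
        os.unlink(path)  # remove the scratch file
    return elapsed, written


if __name__ == "__main__":
    # Assumption: replace this with a directory on the filesystem under test.
    target = tempfile.gettempdir()
    for sync in (False, True):
        secs, _ = write_load(target, sync=sync)
        print(f"sync={sync}: {secs:.4f} s for 100 writes")
```

Running it with sync=False and sync=True on the same mount gives a rough
idea of how much of a filesystem's measured speed comes from caching
rather than from getting data to disk.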

I've spent much more than 10 minutes of my time, and found nothing,
but the links that _I_ posted.

I'll invest more time, and comment on the links you posted
(which I had read already, of course):

http://archives.postgresql.org/pgsql-performance/2005-01/msg00131.php
it's not clear at all, it possibly fails both 1) and 2). The author
says nothing about a write+fsync benchmark or about ext3 mount options.

http://archives.postgresql.org/pgsql-performance/2004-05/msg00130.php
that's the one I got Bert Scalzo's article from. The other links
there fail to meet 1), and some 2) as well. Note that fsync is likely to
disrupt most optimizations. The fact that a filesystem "scales better"
under normal (cached) load, means nothing when it comes to fsyncing.

http://archives.postgresql.org/pgsql-performance/2003-08/msg00191.php
this _defends_ ext2 from the accusation of being buggy. The author
prefers XFS, "but I only have fuzzy reasons, as opposed to metrics."
I was looking for metrics. It says nothing about ext3, so it does not
apply.

These are not from postgresql lists, but anyway:


http://groups-beta.google.com/group/comp.os.linux.misc/msg/b299a71fd540c2b8?q=ext2+corrupt+%22power+failure%22&hl=en&lr=&ie=UTF-8&rnum=9
"People are referring to the old ext2 filesystem here. The new ext3 is very
resistant to this issue."
If you're referring to what "Jinny" said, well all the evidence
is "...recently I have come to know from a reliable group that Linux
is not so stable". Does not meet 1) and 2), sorry.

http://oss.sgi.com/projects/xfs/papers/filesystem-perf-tm.pdf
Yes, surprisingly enough I've read this one, too. The only interesting
part is "[XFS] Perfomance features include asynchronous write ahead
logging (similar to Ext2 " - no, ext3 - " with data=writeback), ...". This
confirms my comment about comparing apples and oranges, and completely
justifies my requirement 2) - and comes from an XFS paper!
It's not clear at all if what they call OLTP Workload really performs
fsync after write. Anyway, there's only _one_ graph in the results
(how weird) and all filesystems are pretty close. No tests with
data=journal. All other graphs in the Appendix fail requirement 1).

http://www.oracle.com/technology/oramag/webcolumns/2002/techarticles/scalzo_linux02.html
thanks, this is the link that _I_ posted. Have _you_ read it?
It shows that EXT3 is almost twice as fast as JFS. Too bad there's no
XFS here.
BTW, this meets 1), I'm not sure about 2), but the options they used
seem enough to outperform JFS.

http://jamesthornton.com/hotlist/linux-filesystems/
this is just a collection of links. It's not clear which one would
back up your claim of XFS and JFS being _generally_ considered superior
for PostgreSQL or other database usage.

Let's see:
http://www-106.ibm.com/developerworks/linux/library/l-fs8.html
"data=ordered mode effectively solves the corruption problem found in
  data=writeback mode and _most other journaled filesystems_, and it does
  so without requiring full data journaling"

(emphasis mine) interestingly enough, most journaled filesystems do have
a corruption problem, ext3 in default mode doesn't.
But this does not really apply to us, this refers to normal writes not
write+fsyncs. I think any fs has to be badly broken if it loses data
after fsync, anyway.
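
For condition 2), it helps to know which journaling mode a mount is
actually using. A small Python sketch that looks for the data= option in
the text of /proc/mounts (Linux) could be written as below. The helper
name and the sample device names and mount points are hypothetical, and
a missing data= entry may simply mean the default mode is in effect.

```python
def data_mode(mounts_text, mount_point):
    """Return the data= option listed for mount_point, or None.

    mounts_text is expected to be the content of /proc/mounts.
    None means no data= option is listed for that mount point
    (or the mount point is not present at all).
    """
    mode = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        mode = None  # a later entry for the same mount point wins
        for option in fields[3].split(","):
            if option.startswith("data="):
                mode = option[len("data="):]
    return mode


# Hypothetical sample; on a real system read the text from /proc/mounts.
SAMPLE = """\
/dev/sda1 / ext3 rw,data=ordered 0 0
/dev/sdb1 /var/lib/pgsql ext3 rw,noatime,data=journal 0 0
"""

if __name__ == "__main__":
    print(data_mode(SAMPLE, "/var/lib/pgsql"))
```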

http://www-106.ibm.com/developerworks/library/l-fs9.html
  "Other than that, XFS performance was very close to that of ReiserFS and
   generally surpasses that of ext3... "

uh, this sounds interesting... but wait...

  "... One of the nicest things about XFS is that, like ReiserFS, it doesn't
   generate a lot of unnecessary disk activity. XFS tries to cache as much
   data in memory as possible, and generally only writes things out to disk
   when memory pressure dictates that it do so."

so, if a benchmark shows XFS is faster, it's a matter of better caching,
right? But it comes at a price of possible (data) corruption...
Thankfully, it's pretty useless to us, with every write followed by a sync.


I'm sorry, but with the links _you_ selected, applying my filter
conditions 1) and 2), which are necessary for a fair comparison,
one could say there's general consensus on EXT3 being far superior
to other filesystems, not the opposite.

Note that I'm not interested in supporting such a claim. As I already
wrote, I think FS selection generally has a minimal impact on PostgreSQL
performance.

But again, what was your original claim
  "Generally XFS and JFS are considered superior to ext2/3."
based upon?

I apologize if I sound somewhat harsh, it's not really intended.
But next time please assume that:
- I'm able to do a 10 minute search;
- I've got some work to do, too, but I'm willing to spend more than
   10 minutes on this research (it already took me more than 2 hours
   actually, of my spare time);
- if I say I've searched the lists and read many messages, I've
   really done so.

You're absolutely entitled to have your opinion, if you like XFS and
JFS go ahead and use them, because of their name, the names of their
sponsors (IBM and SGI), or their features, or your personal experience,
or whatever. Just please don't claim that's general consensus for the
pgsql lists. There's _no_ general consensus. There's _no_ clear winner.
And if you do want a winner anyway, it's ext3, so far.

This "ext3 is not as good as XFS or JFS" is a recurring subject, just
like "ext3 is buggy". _Every single time_ I've asked for
references to back up such claims, nothing valuable was presented.
On the contrary, the only references I've found are on the
"ext3 is equal or better" side.

Now, feel free to prove me wrong.

.TM.
--
       ____/  ____/   /
      /      /       /            Marco Colombo
     ___/  ___  /   /              Technical Manager
    /          /   /             ESI s.r.l.
  _____/ _____/  _/               Colombo@ESI.it
