Re: WAL sync behaviour - Mailing list pgsql-performance

From Tom Lane
Subject Re: WAL sync behaviour
Date
Msg-id 9791.1131640774@sss.pgh.pa.us
In response to Re: WAL sync behaviour  (Scott Marlowe <smarlowe@g2switchworks.com>)
Responses Re: WAL sync behaviour  (Scott Marlowe <smarlowe@g2switchworks.com>)
Re: WAL sync behaviour  (mark@mark.mielke.cc)
List pgsql-performance
Scott Marlowe <smarlowe@g2switchworks.com> writes:
> On Thu, 2005-11-10 at 08:43, Michael Stone wrote:
>> There's no reason to use a journaled filesystem for the wal. Use ext2 in
>> preference to ext3.

> Not from what I understood.  Ext2 can't guarantee that your data will
> even be there in any form after a crash.  I believe only metadata
> journaling is needed though.

No, Mike is right: for WAL you shouldn't need any journaling.  This is
because we zero out *and fsync* an entire WAL file before we ever
consider putting live WAL data in it.  During live use of a WAL file,
its metadata is not changing: its size and block allocation were fixed
when it was zero-filled, so WAL writes only overwrite existing blocks.
As long as the filesystem follows the minimal rule of syncing metadata
about a file when it fsyncs the file, all the live WAL files should
survive crashes OK.
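
As a rough illustration of the zero-fill-then-fsync idea described above
(a sketch only, not PostgreSQL's actual code; the function names, the
block size, and the segment size are assumptions chosen for the example),
the pattern looks something like this:

```python
import os

BLOCK = 8192                 # illustrative write unit (assumption)
SEGMENT_SIZE = 1024 * 1024   # illustrative size; not PostgreSQL's real setting


def preallocate_segment(path, size=SEGMENT_SIZE):
    """Create a file of `size` zero bytes and fsync it before any live use."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        zeros = bytes(BLOCK)
        written = 0
        while written < size:
            # os.write may write fewer bytes than requested, so track the count.
            written += os.write(fd, zeros[: min(BLOCK, size - written)])
        # One fsync after the file has reached its final length.
        os.fsync(fd)
    finally:
        os.close(fd)


def write_record(path, offset, payload):
    """Overwrite bytes inside the preallocated file; the length never changes."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        os.write(fd, payload)
        os.fsync(fd)
    finally:
        os.close(fd)
```

The point of the sketch is that write_record never extends the file, so
the only thing each later fsync has to make durable is data in blocks
that were already allocated.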

We can afford to do this mainly because WAL files can normally be
recycled instead of created afresh, so the zero-out overhead doesn't
get paid during normal operation.
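
A hedged sketch of why recycling avoids the zero-out cost: a retired
file of the right size is simply renamed into place, and only when none
is available does a new one get zero-filled and fsynced.  The helper
name obtain_segment and the list of retired paths are hypothetical,
made up for this example.

```python
import os


def obtain_segment(directory, new_name, retired, size):
    """Return "recycled" or "created" depending on how the segment was obtained.

    `retired` is a list of paths to old, already-allocated segment files
    (a hypothetical bookkeeping structure for this sketch).
    """
    new_path = os.path.join(directory, new_name)
    if retired:
        # Reuse an existing file: its blocks are already allocated,
        # so no zero-fill pass is needed.
        os.rename(retired.pop(), new_path)
        return "recycled"
    # Slow path: create the file, zero-fill it, and fsync it.
    with open(new_path, "xb") as f:
        f.write(bytes(size))
        f.flush()
        os.fsync(f.fileno())
    return "created"
```

In steady state the retired list is rarely empty, so the slow path with
its full-file write is mostly a startup cost.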

You do need metadata journaling for all non-WAL PG files, since we don't
fsync them every time we extend them.  Without journaling, the
filesystem could lose track of which disk blocks belong to such a file
after a crash.

            regards, tom lane
