Thread: Need Help Recovering from Botched Upgrade Attempt

Need Help Recovering from Botched Upgrade Attempt

From
Rich Shepard
Date:
   Despite trying to be careful, I managed to mess up the upgrade from -8.1.4
to -8.3.3 on my Slackware-11.0 server/workstation. I expect that someone
here will see my error and point me in the right direction to recover a
working dbms.

   Here's what I did:

   1.) As a user, I ran pg_dumpall on version 8.1.4 and had that written to
/usr4/postgres-backups/.

   2.) Created /usr4/pgsql_old/, and copied all of /var/lib/pgsql/ there ('cp
-a /var/lib/pgsql/* .')

   3.) In /usr4/pgsql_old/data/postgresql.conf, changed port to 5466.

   4.) As root, ran '/etc/rc.d/rc.postgresql stop', which reported that's
what it did.

   5.) Built postgresql-8.3.3 using the SlackBuild script, then ran
'upgradepkg postgresql-8.3.3*tgz'; other than reporting not finding an
expected pid file, that went smoothly.

   6.) Tried starting the new version, /etc/rc.d/rc.postgresql start. It
reported that the server was already running, but that it started anyway.

   7.) A search of the process list could not find the postmaster process, or
any other indication of pgsql running. Trying to run pgsql as a user also
failed. Tried, as user postgres, to run 'pg_sql start' but that also failed.

TIA,

Rich

--
Richard B. Shepard, Ph.D.               |  Integrity            Credibility
Applied Ecosystem Services, Inc.        |            Innovation
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

Re: Need Help Recovering from Botched Upgrade Attempt

From
Klint Gore
Date:
Rich Shepard wrote:
>    Despite trying to be careful, I managed to mess up the upgrade from -8.1.4
> to -8.3.3 on my Slackware-11.0 server/workstation. I expect that someone
> here will see my error and point me in the right direction to recover a
> working dbms.
>
>    Here's what I did:
>
>    1.) As a user, I ran pg_dumpall on version 8.1.4 and had that written to
> /usr4/postgres-backups/.
>
>    2.) Created /usr4/pgsql_old/, and copied all of /var/lib/pgsql/ there ('cp
> -a /var/lib/pgsql/* .')
>
>    3.) In /usr4/pgsql_old/data/postgresql.conf, changed port to 5466.
>
>    4.) As root, ran '/etc/rc.d/rc.postgresql stop', which reported that's
> what it did.
>
You copied the files without stopping the database?  move 4 to 2.


>    5.) Built postgresql-8.3.3 using the SlackBuild script, then ran
> 'upgradepkg postgresql-8.3.3*tgz'; other than reporting not finding an
> expected pid file, that went smoothly.
>
Is there an initdb in here somewhere?  Or is the 8.3 server trying to
start with an 8.1 file structure?

>    6.) Tried starting the new version, /etc/rc.d/rc.postgresql start. It
> reported that the server was already running, but that it started anyway.
>
klint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4@une.edu.au


Re: Need Help Recovering from Botched Upgrade Attempt

From
Rich Shepard
Date:
On Wed, 18 Jun 2008, Klint Gore wrote:

> You copied the files without stopping the database?  move 4 to 2.

Klint,

   Yes, actually. There was no activity on any of the databases.

> Is there an initdb in here somewhere?  Or is the 8.3 server trying to start
> with an 8.1 file structure?

   Ah, yes. I missed stating that. I switched to user postgres and ran:

postgres@salmo:/var/lib/pgsql$ initdb -D ./data/
The files belonging to this database system will be owned by user
"postgres".
This user must also own the server process.

The database cluster will be initialized with locales
   COLLATE:  C
   CTYPE:    en_US
   MESSAGES: en_US
   MONETARY: en_US
   NUMERIC:  en_US
   TIME:     en_US
could not determine encoding for locale "en_US": codeset is "ANSI_X3.4-1968"
initdb: could not find suitable encoding for locale en_US
Rerun initdb with the -E option.
Try "initdb --help" for more information.

   But, using the -E option with en_US tells me that it's not a valid locale.

Thanks,

Rich

--
Richard B. Shepard, Ph.D.               |  Integrity            Credibility
Applied Ecosystem Services, Inc.        |            Innovation
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

Re: Need Help Recovering from Botched Upgrade Attempt

From
Rich Shepard
Date:
On Wed, 18 Jun 2008, Klint Gore wrote:

>>    5.) Built postgresql-8.3.3 using the SlackBuild script, then ran
>> 'upgradepkg postgresql-8.3.3*tgz'; other than reporting not finding an
>> expected pid file, that went smoothly.
>>
> Is there an initdb in here somewhere?  Or is the 8.3 server trying to start
> with an 8.1 file structure?

Klint,

   Backed up a couple of steps, and tried again. Removed postgresql-8.3.3;
deleted all contents of /var/lib/pgsql/data (because initdb is supposed to
create the contents, if I correctly read the Postgresql book); re-installed
postgresql-8.3.3; ran (as user postgres) 'initdb -D /var/lib/pgsql/data'.
Nothing!

   I've really FUBARed this and don't understand how, or what to do to
recover.

Thanks,

Rich

--
Richard B. Shepard, Ph.D.               |  Integrity            Credibility
Applied Ecosystem Services, Inc.        |            Innovation
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

Re: Need Help Recovering from Botched Upgrade Attempt

From
Adrian Klaver
Date:
On Tuesday 17 June 2008 7:18 pm, Rich Shepard wrote:
> On Wed, 18 Jun 2008, Klint Gore wrote:
> >>    5.) Built postgresql-8.3.3 using the SlackBuild script, then ran
> >> 'upgradepkg postgresql-8.3.3*tgz'; other than reporting not finding an
> >> expected pid file, that went smoothly.
> >
> > Is there an initdb in here somewhere?  Or is the 8.3 server trying to
> > start with an 8.1 file structure?
>
> Klint,
>
>    Backed up a couple of steps, and tried again. Removed postgresql-8.3.3;
> deleted all contents of /var/lib/pgsql/data (because initdb is supposed to
> create the contents, if I correctly read the Postgresql book); re-installed
> postgresql-8.3.3; ran (as user postgres) 'initdb -D /var/lib/pgsql/data'.
> Nothing!

Define nothing. When you ran initdb there where no messages? Also when in
doubt I use the full path /var/lib/pgsql/bin/initdb as you have an old
version of initdb present in the old version directory you copied. When you
have two versions  present at the same time it is easy to get cross reference
problems.

>
>    I've really FUBARed this and don't understand how, or what to do to
> recover.
>
> Thanks,
>
> Rich
>
> --
> Richard B. Shepard, Ph.D.               |  Integrity            Credibility
> Applied Ecosystem Services, Inc.        |            Innovation
> <http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

--
Adrian Klaver
aklaver@comcast.net

Re: Need Help Recovering from Botched Upgrade Attempt

From
Klint Gore
Date:
Rich Shepard wrote:
> On Wed, 18 Jun 2008, Klint Gore wrote:
>
> >>    5.) Built postgresql-8.3.3 using the SlackBuild script, then ran
> >> 'upgradepkg postgresql-8.3.3*tgz'; other than reporting not finding an
> >> expected pid file, that went smoothly.
> >>
> > Is there an initdb in here somewhere?  Or is the 8.3 server trying to start
> > with an 8.1 file structure?
>
> Klint,
>
>    Backed up a couple of steps, and tried again. Removed postgresql-8.3.3;
> deleted all contents of /var/lib/pgsql/data (because initdb is supposed to
> create the contents, if I correctly read the Postgresql book); re-installed
> postgresql-8.3.3; ran (as user postgres) 'initdb -D /var/lib/pgsql/data'.
> Nothing!
>
>    I've really FUBARed this and don't understand how, or what to do to
> recover.
>
> Thanks,
>
>
Make sure that initdb is the version you want
   initdb --version

then
   initdb -E UTF8 -D /var/lib/pgsql/data

then post the output of that.

kllint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4@une.edu.au


Re: Need Help Recovering from Botched Upgrade Attempt

From
Craig Ringer
Date:
Klint Gore wrote:
> Rich Shepard wrote:
>>    Despite trying to be careful, I managed to mess up the upgrade from
>> -8.1.4
>> to -8.3.3 on my Slackware-11.0 server/workstation. I expect that someone
>> here will see my error and point me in the right direction to recover a
>> working dbms.
>>
>>    Here's what I did:
>>
>>    1.) As a user, I ran pg_dumpall on version 8.1.4 and had that
>> written to
>> /usr4/postgres-backups/.
>>
>>    2.) Created /usr4/pgsql_old/, and copied all of /var/lib/pgsql/
>> there ('cp
>> -a /var/lib/pgsql/* .')

I hope you mean cp -aR , because you need those subdirectories if you're
ever going to try to use the _old copy. Even if you actually did a
recursive copy, if you really copied the data directories with the DB
server running and without executing:

    select pg_start_backup('migrate');

or similar before starting the copy then you're going to have problems
using that data. You can copy a working postgresql instance's data
directories, but only if you've enabled WAL logging and you tell Pg
about it so it can write appropriate markers for recovery.



It would probably have been better to:

- pg_dumpall

- STOP THE 8.1 DATABASE SERVER and make sure it's stopped (no postmaster
or postgres processes hanging around).

- Make the old DB server binaries non-executable with chmod and/or
remove them from the PATH

- mv /var/lib/pgsql to /usr4/pgsql_old

- /path/to/8.3/bin/initdb -D /var/lib/pgsql/data

- Adjust postgresql.conf and pg_hba.conf as required

- Start the 8.3 server

- pg_restore all databases from dumps



Right now, you probably need to make REALLY sure you've put a copy of
those dumps somewhere safe, because I suspect your _old copy will be
useless. Then use 8.3's initdb on a new, empty directory, verify that
the config files are correct, and start the 8.3 server.

--
Craig Ringer


Re: Need Help Recovering from Botched Upgrade Attempt

From
Rich Shepard
Date:
On Wed, 18 Jun 2008, Craig Ringer wrote:

> I hope you mean cp -aR , because you need those subdirectories if you're
> ever going to try to use the _old copy. Even if you actually did a
> recursive copy, if you really copied the data directories with the DB
> server running and without executing:

Craig,

   According to the cp man page here, 'cp -a' is equivalent to 'cp -dpR'.

>    select pg_start_backup('migrate');
>
> or similar before starting the copy then you're going to have problems
> using that data. You can copy a working postgresql instance's data
> directories, but only if you've enabled WAL logging and you tell Pg about
> it so it can write appropriate markers for recovery.

   This is interesting. I've not read about this before. As I'm the only one
using the databases (primarily for the accounting data) I know that nothing
was changed during the copy operation.

> Right now, you probably need to make REALLY sure you've put a copy of
> those dumps somewhere safe, because I suspect your _old copy will be
> useless. Then use 8.3's initdb on a new, empty directory, verify that the
> config files are correct, and start the 8.3 server.

   Every file from /var/lib/pgsql/ before I started this is on the weekly
backup tape from last Friday night. If need be I can restore from that and
start over.

Thanks,

Rich

--
Richard B. Shepard, Ph.D.               |  Integrity            Credibility
Applied Ecosystem Services, Inc.        |            Innovation
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

Re: Need Help Recovering from Botched Upgrade Attempt

From
Rich Shepard
Date:
On Wed, 18 Jun 2008, Klint Gore wrote:

> Make sure that initdb is the version you want
>  initdb --version

Klint,

   Yes, it is: 8.3.

> then
>  initdb -E UTF8 -D /var/lib/pgsql/data
> then post the output of that.

   Very interesting. While en_US is not accepted, UTF8 is.

postgres@salmo:/var/lib/pgsql$ initdb -E UTF8 -D /var/lib/pgsql/data
The files belonging to this database system will be owned by user
"postgres".
This user must also own the server process.

The database cluster will be initialized with locales
   COLLATE:  C
   CTYPE:    en_US
   MESSAGES: en_US
   MONETARY: en_US
   NUMERIC:  en_US
   TIME:     en_US
could not determine encoding for locale "en_US": codeset is "ANSI_X3.4-1968"
The default text search configuration will be set to "english".

fixing permissions on existing directory /var/lib/pgsql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers/max_fsm_pages ... 24MB/153600
creating configuration files ... ok
creating template1 database in /var/lib/pgsql/data/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... WARNING:  could not determine encoding
for locale "en_US": codeset is "ANSI_X3.4-1968"
DETAIL:  Please report this to <pgsql-bugs@postgresql.org>.
ok
copying template1 to postgres ... WARNING:  could not determine encoding for
locale "en_US": codeset is "ANSI_X3.4-1968"
DETAIL:  Please report this to <pgsql-bugs@postgresql.org>.
ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the -A option the
next time you run initdb.

Success. You can now start the database server using:

     postgres -D /var/lib/pgsql/data
or
     pg_ctl -D /var/lib/pgsql/data -l logfile start

   Notice the warning detail about us_EN encoding. Is there a protocol for
what to include in a report to pgsql-bugs?

   I will now try a new dumpall using the 8.3.3 version, once I find how to
specify a different $PGDATA on the command line.

Rich

--
Richard B. Shepard, Ph.D.               |  Integrity            Credibility
Applied Ecosystem Services, Inc.        |            Innovation
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

Re: Need Help Recovering from Botched Upgrade Attempt

From
Rich Shepard
Date:
On Tue, 17 Jun 2008, Adrian Klaver wrote:

> Define nothing. When you ran initdb there where no messages? Also when in
> doubt I use the full path /var/lib/pgsql/bin/initdb as you have an old
> version of initdb present in the old version directory you copied. When
> you have two versions present at the same time it is easy to get cross
> reference problems.

Adrian,

   No messages, other than that us_EN was not an encoding found in the
locale.

   Re-running initdb specifying UTF8 worked. I've no idea if this affects the
existing databases.

Thanks,

Rich

--
Richard B. Shepard, Ph.D.               |  Integrity            Credibility
Applied Ecosystem Services, Inc.        |            Innovation
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

Re: Need Help Recovering from Botched Upgrade Attempt

From
Craig Ringer
Date:
Rich Shepard wrote:

>   According to the cp man page here, 'cp -a' is equivalent to 'cp -dpR'.

You're quite right. I was thinking "aah, a BSD-ism" but no, it's true
for Linux too. Sorry.

>>    select pg_start_backup('migrate');
>>
>> or similar before starting the copy then you're going to have problems
>> using that data. You can copy a working postgresql instance's data
>> directories, but only if you've enabled WAL logging and you tell Pg about
>> it so it can write appropriate markers for recovery.
>
>   This is interesting. I've not read about this before. As I'm the only one
> using the databases (primarily for the accounting data) I know that nothing
> was changed during the copy operation.

However, that may not be sufficient. Just because you didn't perform any
DB operations does *not* mean that Pg didn't write anything. Autovacuum,
for example, could be working away. With 8.1 that may not be such a
concern, but it's not the only thing that might be touching the DB files.

You need to stop the DB or use WAL archiving and pg_start_backup if you
want to be reasonably safe.

>   Every file from /var/lib/pgsql/ before I started this is on the weekly
> backup tape from last Friday night. If need be I can restore from that and
> start over.

Well, no worries then. I'm sure you can understand that for many people
- way TOO many people - that is not the case, so it's well worth
stressing the point.

--
Craig Ringer

Re: Need Help Recovering from Botched Upgrade Attempt

From
Alan Hodgson
Date:
On Wednesday 18 June 2008, Craig Ringer <craig@postnewspapers.com.au> wrote:
> >   Every file from /var/lib/pgsql/ before I started this is on the
> > weekly backup tape from last Friday night. If need be I can restore
> > from that and start over.
>
> Well, no worries then. I'm sure you can understand that for many people
> - way TOO many people - that is not the case, so it's well worth
> stressing the point.

If the database was in use when _that_ backup was taken, it may also not be
usable.

You can't just backup a live database from the filesystem level and expect
it to work ...

--
Alan

Re: Need Help Recovering from Botched Upgrade Attempt

From
Craig Ringer
Date:
Alan Hodgson wrote:
> On Wednesday 18 June 2008, Craig Ringer <craig@postnewspapers.com.au> wrote:
>>>   Every file from /var/lib/pgsql/ before I started this is on the
>>> weekly backup tape from last Friday night. If need be I can restore
>>> from that and start over.
>> Well, no worries then. I'm sure you can understand that for many people
>> - way TOO many people - that is not the case, so it's well worth
>> stressing the point.
>
> If the database was in use when _that_ backup was taken, it may also not be
> usable.
>
> You can't just backup a live database from the filesystem level and expect
> it to work ...

It should be OK, if less than ideal, if:

- You have fsync enabled (which you do if you care about your data); and
- Your filesystem supports consistent snapshots

If you can take a point-in-time snapshot at the filesystem level and
copy that, it should be OK. Pg will still have to do a bunch of work
when started up off the restored data, though.

It makes much more sense to just warn Pg about the copy about to be
taken, or use pg_dump . Any decent backup system will provide hooks to
run pre- and post- scripts to do this sort of thing.

--
Craig Ringer

On Wed, Jun 18, 2008 at 11:24:16PM +0800, Craig Ringer wrote:
> Alan Hodgson wrote:
> >You can't just backup a live database from the filesystem level and expect
> >it to work ...
>
> It should be OK, if less than ideal, if:
>
> - You have fsync enabled (which you do if you care about your data);

I've always wondered about this!  Isn't fsync only a side-effect of
having a write-back cache between programs and the disk?  This means
it's only purpose is to ensure that the cache is consistent with what's
on disk.  Because all programs running within a system are running on
top of the cache they don't know or care whether the cache actually
matches up to the disk.

Therefore, if I understand things correctly, the state of fsync
shouldn't matter in this use case.  It's equally borken independent to
the state of fsync.


  Sam

Re: Need Help Recovering from Botched Upgrade Attempt

From
Rich Shepard
Date:
On Wed, 18 Jun 2008, Alan Hodgson wrote:

> If the database was in use when _that_ backup was taken, it may also not
> be usable.
>
> You can't just backup a live database from the filesystem level and expect
> it to work ...

Alan,

   The only database in the cluster that has seen any use recently is the one
for SQL-Ledger. When daily incremental and weekly full backups are made,
there is no activity by any user. All users are logged out of the system
because it's the middle of the night.

   Also, while I'm glad to learn more than I knew before how to go about
making backups and upgrading the PostgreSQL installation, having folks
telling me all I did incorrectly is not as helpful to me as guidance on
getting the cluster correctly transferred from the 8.1.4 version to the
8.3.3 version. My readings and research obviously did not sufficiently
educate me on the proper process.

Thanks,

Rich

--
Richard B. Shepard, Ph.D.               |  Integrity            Credibility
Applied Ecosystem Services, Inc.        |            Innovation
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

Re: Need Help Recovering from Botched Upgrade Attempt

From
Sam Mason
Date:
On Wed, Jun 18, 2008 at 08:48:41AM -0700, Rich Shepard wrote:
>   while I'm glad to learn more than I knew before how to go about
> making backups and upgrading the PostgreSQL installation, having folks
> telling me all I did incorrectly is not as helpful to me as guidance on
> getting the cluster correctly transferred from the 8.1.4 version to the
> 8.3.3 version. My readings and research obviously did not sufficiently
> educate me on the proper process.

What I'd try doing is this: find a 8.1 version of PG (8.1.4 or later)
and run this against the data you saved off, once this is running you
can then run 8.3's version of pg_dump against it, then you can restore
this dump into the new version of PG.


  Sam

Re: Need Help Recovering from Botched Upgrade Attempt

From
Rich Shepard
Date:
On Wed, 18 Jun 2008, Sam Mason wrote:

> What I'd try doing is this: find a 8.1 version of PG (8.1.4 or later) and
> run this against the data you saved off, once this is running you can then
> run 8.3's version of pg_dump against it, then you can restore this dump
> into the new version of PG.

Sam,

   OK. I'm downloading 8.1.13. I'll build this to install in /usr/local.

   If I cd to /usr/local/bin/ and specify 'postgres -D /usr4/pgsql_old/data',
then I can specify /usr/bin/pg_dumpall to use the 8.3.3 version against the
running 8.1.13 server. (Writing to myself, unless there's an egregious error
here. In that case, please let me know!)

   Results when I have them.

Thanks,

Rich

--
Richard B. Shepard, Ph.D.               |  Integrity            Credibility
Applied Ecosystem Services, Inc.        |            Innovation
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

On Wed, 18 Jun 2008, Sam Mason wrote:

> Isn't fsync only a side-effect of having a write-back cache between
> programs and the disk?  This means it's only purpose is to ensure that
> the cache is consistent with what's on disk.  Because all programs
> running within a system are running on top of the cache they don't know
> or care whether the cache actually matches up to the disk.

Most programs don't.  PostgreSQL writes to the database in two stages:
the WAL, followed by an fsync, then later to the main database files.
You can't trust the WAL will be around for recovery until the first fsync
returns.  The checkpoint process makes sure everything that went into the
WAL then made it to the main database files, and again it doesn't trust
that it's really on disk until the fsync returns.

> Therefore, if I understand things correctly, the state of fsync
> shouldn't matter in this use case.  It's equally borken independent to
> the state of fsync.

Quote borken indeed, and fsync has nothing to do with it.  The theory
proposed is that since no writes were done, the backup should be
consistant.  This is quite wrong.  The most obvious case showing that is
one where a time-driven checkpoint occured (as happens every 5 minutes by
default) while you were in the middle of backing up.  Let's say the main
database files are backed up before the checkpoint, but the backup is
still going on some giant archival table.  The checkpoint happens; it
updates the earlier files already in the backup.  The checkpoint finishes,
and erases the WAL logs.  Now the backup makes it way to the WAL files.
You're screwed when you try and recover this database from the backup.
The database doesn't have the latest updates, and the WAL can't recover
them because it already cleared its copy of them out thinking they weren't
needed anymore.  You'll be lucky to get the database to start at all, it's
missing data you thought was commited before the backup started, and who
knows what subtle corruption you'll find.

Now, in reality, even time-driven checkpoints don't do anything if there
hasn't been activity, so it may very well be the case that any one
database backup is fine.  But you can't ignore the requirement to do a
pg_start_backup before making a filesystem level backup and expect you'll
get that lucky--sooner or later you will get a backup that won't restore
if you keep that up.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

On Wed, Jun 18, 2008 at 02:17:00PM -0400, Greg Smith wrote:
> On Wed, 18 Jun 2008, Sam Mason wrote:
>
> >Isn't fsync only a side-effect of having a write-back cache between
> >programs and the disk?  This means it's only purpose is to ensure that
> >the cache is consistent with what's on disk.  Because all programs
> >running within a system are running on top of the cache they don't know
> >or care whether the cache actually matches up to the disk.
>
> Most programs don't.  PostgreSQL writes to the database in two stages:
> the WAL, followed by an fsync, then later to the main database files.

Sorry, I wasn't being clear.  When I said "they don't know or care" I
meant that if you've got a PG process writing it's database files and a
backup process running on the same machine then the backup process will
see the data written by PG independently of whether fsync is called or
not.

> You can't trust the WAL will be around for recovery until the first fsync
> returns.  The checkpoint process makes sure everything that went into the
> WAL then made it to the main database files, and again it doesn't trust
> that it's really on disk until the fsync returns.

Yes

> >Therefore, if I understand things correctly, the state of fsync
> >shouldn't matter in this use case.  It's equally borken independent to
> >the state of fsync.
>
> Quote borken indeed, and fsync has nothing to do with it.

My original note was mainly in response to Craig's comment that implied
fsync doing far more than it actually does.  I remember seeing a few
comments recently saying similar things about fsync, so sorry for
picking specifically on you Craig.  Device/filesystem level snapshotting
is exactly what's needed and is independent of any fsync settings.


  Sam

Re: Understanding fsync

From
Craig Ringer
Date:
Sam Mason wrote:

> My original note was mainly in response to Craig's comment that implied
> fsync doing far more than it actually does.  I remember seeing a few
> comments recently saying similar things about fsync, so sorry for
> picking specifically on you Craig.  Device/filesystem level snapshotting
> is exactly what's needed and is independent of any fsync settings.

Good point. I guess fsync would only matter for certain if you were
using snapshots on shared storage (a SAN or similar) where the current
host's view of the yet-to-be-flushed-to-storage FS state is not known.

The main snapshot system I use is LVM, which takes a snapshot of the
block device underlying the file system. It's not actually filesystem
level at all, but volume/block level. I'm just not sure whether
filesystems like ext3 and the underlying Linux VFS will always pass
blocks written by applications through to the block layer immediately
and in order. After all, they do have tricks like delayed allocation
that I'd expect to result in data not being passed to the file system.
If they don't always pass writes to the block layer in the order they're
issued then fsync would also be significant for LVM snapshots. Any idea?

ext3's data=ordered refers to data being written before metadata, not
the order in which different data write requests are processed, so
that's no help.

--
Craig Ringer

Re: Understanding fsync

From
Sam Mason
Date:
On Thu, Jun 19, 2008 at 08:56:20AM +0800, Craig Ringer wrote:
> >My original note was mainly in response to Craig's comment that implied
> >fsync doing far more than it actually does.  I remember seeing a few
> >comments recently saying similar things about fsync, so sorry for
> >picking specifically on you Craig.  Device/filesystem level snapshotting
> >is exactly what's needed and is independent of any fsync settings.
>
> Good point. I guess fsync would only matter for certain if you were
> using snapshots on shared storage (a SAN or similar) where the current
> host's view of the yet-to-be-flushed-to-storage FS state is not known.

Hum that's a fun one!  I've also just realized my original statement
was wrong---I combined device and filesystem snapshotting together
inappropriately.

I think users should care about fsync precisely when they care about the
underlying block device being in a consistent state.  This applies to
the normal case that people worry about of power loss, as well as the
case of block level backups and modern filesystems that maintain a lot
of state independently of the OS's buffer cache.  To summarize, fsync
can be turned off when the user is unconcerned with power loss and when
backups are always performed at the filesystem level.


  Sam