Re: Mount options for Ext3? - Mailing list pgsql-performance

From Kevin Brown
Subject Re: Mount options for Ext3?
Msg-id 20030125021159.GC28252@filer
In response to Re: Mount options for Ext3?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Mount options for Ext3?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Tom Lane wrote:
> > Otherwise you'd have to worry about write transactions
> > to the transaction log committing before the writes to the data files
> > during a savepoint,
>
> Actually, the other way around is the problem.  The WAL algorithm works
> so long as log writes hit disk before the data-file changes they
> describe (that's why it's called write *ahead* log).

Hmm...a case where the transaction data gets written to the files
before the transaction itself even manages to get written to the log?
True.  But I was thinking about the following:

I was presuming that when a savepoint occurs, a marker is written to
the log indicating which transactions had been committed to the data
files, and that this marker was paid attention to during database
startup.

So suppose the marker makes it to the log but not all of the data the
marker refers to makes it to the data files.  Then the system crashes.

When the database starts back up, the savepoint marker in the
transaction log shows that the transactions had already been committed
to disk.  But because the OS wrote the requested data (including the
savepoint marker) out of order, the savepoint marker made it to the
disk before some of the data made it to the data files.  And so, the
database is in an inconsistent state and it has no way to know about
it.
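
To make that scenario concrete, here is a minimal Python sketch. It is a
toy model only, not PostgreSQL's actual recovery code: the record layout,
the function name recover_trusting_marker, and the sample data are all
made up for illustration. It assumes recovery trusts the savepoint marker
and replays only what follows it.

```python
# Toy model of the crash scenario above; NOT PostgreSQL code.
# Record layout and all names here are hypothetical.

def recover_trusting_marker(log, data_files):
    """Replay only the log records that follow the last savepoint marker."""
    recovered = dict(data_files)
    start = 0
    for i, rec in enumerate(log):
        if rec[0] == "savepoint":
            start = i + 1  # everything before the marker is assumed on disk
    for rec in log[start:]:
        if rec[0] == "write":
            _, key, value = rec
            recovered[key] = value
    return recovered

# Log as found on disk after the crash: the marker made it out.
log_on_disk = [
    ("write", "a", 1),
    ("write", "b", 2),
    ("savepoint",),
    ("write", "c", 3),
]

# Data files as found on disk: the OS reordered writes, so "b" never landed.
data_on_disk = {"a": 1}

state = recover_trusting_marker(log_on_disk, data_on_disk)
expected = {"a": 1, "b": 2, "c": 3}
missing = sorted(set(expected) - set(state))
print(missing)
```

In this model the write to "b" is silently lost, because the marker claims
it was already in the data files.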

But then, I guess the easy way around the above problem is to always
commit all the transactions in the log to disk when the database comes
up, which renders the savepoint marker moot...and leads back to the
scenario you were referring to...
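
A rough sketch of that workaround, again as a toy Python model rather
than anything PostgreSQL actually does (recover_replaying_all and the
sample data are hypothetical, and it assumes replaying a record twice is
harmless):

```python
# Toy model: ignore the savepoint marker and replay every logged write.
# NOT PostgreSQL code; names and record layout are hypothetical.

def recover_replaying_all(log, data_files):
    """Replay every write record in the log, ignoring savepoint markers."""
    recovered = dict(data_files)
    for rec in log:
        if rec[0] == "write":
            _, key, value = rec
            recovered[key] = value  # assumed idempotent: replaying is safe
    return recovered

log_on_disk = [
    ("write", "a", 1),
    ("write", "b", 2),
    ("savepoint",),
    ("write", "c", 3),
]

# "b" never reached the data files, but "d" did, and its log record
# never reached the log: the ordering problem Tom describes.
data_on_disk = {"a": 1, "d": 4}

state = recover_replaying_all(log_on_disk, data_on_disk)
logged_keys = {rec[1] for rec in log_on_disk if rec[0] == "write"}
unlogged = sorted(set(state) - logged_keys)
print(state)
print(unlogged)
```

Replaying everything recovers "b", but the change to "d" sits in the data
files with nothing in the log describing it, so recovery cannot tell
whether it belongs to a committed transaction.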

If the savepoint only commits the older transactions in the log (and
not all of them) to disk, the possibility of the situation you're
referring to would, I'd think, be reduced (possibly quite considerably).



...or is my understanding of how all this works completely off?




--
Kevin Brown                          kevin@sysexperts.com
