Re: BBU Cache vs. spindles - Mailing list pgsql-performance

From Tom Lane
Subject Re: BBU Cache vs. spindles
Msg-id 13038.1287720327@sss.pgh.pa.us
In response to Re: BBU Cache vs. spindles  (Greg Smith <greg@2ndquadrant.com>)
Responses Re: BBU Cache vs. spindles
List pgsql-performance
Greg Smith <greg@2ndquadrant.com> writes:
> At this point, you now have a torn 8K page, with 1/2 old and 1/2 new
> data.

Right.

> Without a full page write in the WAL, is it always possible to
> restore its original state now?  In theory, I think it is.  Since the
> delta in the WAL should be overwriting all of the bytes that changed
> between the old and new version of the page, applying it on top of any
> of the four possible states here:

You've got entirely too simplistic a view of what the "delta" might be,
I fear.  In particular there are various sorts of changes that involve
inserting the data carried in the WAL record and shifting pre-existing
data around to make room, or removing an item and moving remaining data
around.  If you try to replay that type of action against a torn page,
you'll get corrupted results.
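
To make the failure concrete, here is a minimal sketch in Python.  It is a
toy model only: the 16-byte "page", the 8-byte "sector", and the
apply_insert() helper are all made up for illustration and are not
PostgreSQL's page layout or WAL record format.  The record carries just an
offset and the inserted bytes, and replay shifts whatever is already on the
page to make room.

```python
PAGE_SIZE = 16   # toy page size (assumption; real pages are 8K)
SECTOR = 8       # toy unit of atomic write (assumption)


def apply_insert(page: bytes, offset: int, data: bytes) -> bytes:
    """Replay a toy 'insert' record.

    The record holds only (offset, data).  Existing bytes from `offset`
    onward are shifted right to make room; bytes pushed past the end of
    the page are dropped (they stand in for free space).
    """
    shifted = page[:offset] + data + page[offset:]
    return shifted[:PAGE_SIZE]


old = b"ABCDEFGHIJKL...."            # page before the change
record = (2, b"xy")                  # minimal record: offset + new bytes
new = apply_insert(old, *record)     # b"ABxyCDEFGHIJKL.."

# Two torn states: one sector made it to disk, the other did not.
torn_first_new = new[:SECTOR] + old[SECTOR:]
torn_second_new = old[:SECTOR] + new[SECTOR:]

replayed_old = apply_insert(old, *record)
replayed_torn_a = apply_insert(torn_first_new, *record)
replayed_torn_b = apply_insert(torn_second_new, *record)

for label, result in [
    ("old page", replayed_old),
    ("torn, first sector new", replayed_torn_a),
    ("torn, second sector new", replayed_torn_b),
]:
    status = "ok" if result == new else "CORRUPT"
    print(f"{label:24} -> {result!r} {status}")
```

Replay against the intact old page reproduces the new page.  Replay against
either torn page shifts data that was already partly shifted, so bytes get
duplicated or lost.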

We could possibly redefine the WAL records so that they weren't just the
minimum amount of data but carried every byte that'd changed on the
page, and then I think what you're envisioning would work.  But the
records would be a lot bulkier.  It's not clear to me that this would be
a net savings over the current design, particularly not if there's
a long interval between checkpoints.
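
A rough sketch of that tradeoff, again in Python and again a toy: the
16-byte page, the helper names, and the "record" (a plain dict of
offset -> byte) are assumptions for illustration, not anything in the
PostgreSQL source.  It builds a record holding every byte that changed,
applies it to all four page states (old, new, and the two torn mixes), and
compares its size with the minimal insert record.

```python
PAGE_SIZE = 16   # toy page size (assumption; real pages are 8K)
SECTOR = 8       # toy unit of atomic write (assumption)


def insert_shift(page: bytes, offset: int, data: bytes) -> bytes:
    """Toy insert: shift existing bytes right, drop overflow at the end."""
    return (page[:offset] + data + page[offset:])[:PAGE_SIZE]


def changed_bytes(old: bytes, new: bytes) -> dict:
    """Hypothetical 'bulky' record: every offset whose byte differs."""
    return {i: new[i] for i in range(len(old)) if old[i] != new[i]}


def apply_changed_bytes(page: bytes, changes: dict) -> bytes:
    """Replay by overwriting each recorded offset; no shifting involved."""
    buf = bytearray(page)
    for offset, value in changes.items():
        buf[offset] = value
    return bytes(buf)


old = b"ABCDEFGHIJKL...."
inserted = b"xy"
new = insert_shift(old, 2, inserted)

full_delta = changed_bytes(old, new)

# The four possible on-disk states after a crash mid-write.
states = [
    old,
    new,
    new[:SECTOR] + old[SECTOR:],
    old[:SECTOR] + new[SECTOR:],
]
results = [apply_changed_bytes(s, full_delta) for s in states]
all_recovered = all(r == new for r in results)

minimal_payload = len(inserted)   # bytes carried by the minimal record
full_payload = len(full_delta)    # bytes carried by the bulky record

print("all four states recovered:", all_recovered)
print("minimal record payload:", minimal_payload, "bytes")
print("every-changed-byte payload:", full_payload, "bytes")
```

Because every byte on a torn page is either the old or the new value, plain
overwrites land on the right answer from any of the four states.  The cost
is that a two-byte insert near the front of the page drags along every byte
it displaced.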

            regards, tom lane
