Re: Serious problem: media recovery fails after system or PostgreSQL crash - Mailing list pgsql-hackers

"MauMau" <maumau307@gmail.com> writes:
> I'm using PostgreSQL 9.1.6 on Linux.  I encountered a serious problem: 
> media recovery failed with the following message:
> FATAL:  archive file "000000010000008000000028" has wrong size: 7340032 
> instead of 16777216

Well, that's unfortunate, but it's not clear that automatic recovery is
possible.  The only way out of it would be if an undamaged copy of the
segment were in pg_xlog/ ... but if I recall the logic correctly, we'd
not even be trying to fetch from the archive if we had a local copy.
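
For illustration, the size check behind that FATAL message can be run by
hand against an archive directory before attempting recovery.  The sketch
below is not PostgreSQL code: the function name and the directory layout
are assumptions, and it assumes the default 16 MB segment size (16777216
bytes, the figure in the error message).  The reported 7340032 bytes is
exactly 7 MB, i.e. a copy that stopped partway through.

```python
import os
import re

# Default WAL segment size; matches the 16777216 in the error message.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024

# WAL segment file names are 24 uppercase hex digits,
# e.g. 000000010000008000000028.
SEGMENT_NAME = re.compile(r"^[0-9A-F]{24}$")


def find_wrong_size_segments(archive_dir, expected_size=WAL_SEGMENT_SIZE):
    """Return (name, size) pairs for segments whose size is not expected_size.

    archive_dir is a hypothetical flat directory of archived segments.
    Files that are not plain segment names (.history, .backup, ...) are
    skipped, since they are legitimately smaller.
    """
    wrong = []
    for name in sorted(os.listdir(archive_dir)):
        if not SEGMENT_NAME.match(name):
            continue
        size = os.path.getsize(os.path.join(archive_dir, name))
        if size != expected_size:
            wrong.append((name, size))
    return wrong
```

Running this periodically would at least surface a truncated segment
before a restore depends on it.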

> Therefore, I think postgres must continue recovery by fetching files from 
> pg_xlog/ when it encounters a partially filled archive file.  In addition, 
> it may be necessary to remove the partially filled archive files, because 
> they might prevent media recovery in the future (is this true?).  I mean we 
> need the following fix.  What do you think?

I think having PG automatically destroy archive files is bordering on
insane.  It might be reasonable for the archiving process to do
something like this, if it has a full-size copy of the file available
to replace the damaged copy with.  But otherwise you're just throwing
away what's probably the only copy of useful data.
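
A rough sketch of what such an archiving-side step could look like,
under stated assumptions: the function name, the ".damaged" and ".tmp"
suffixes, and the flat archive directory are all hypothetical, and this
is not how any existing archive_command works.  It replaces a wrong-size
archived copy only when the source is full size, and sets the damaged
copy aside instead of deleting it.

```python
import filecmp
import os
import shutil

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # default segment size, 16777216 bytes


def archive_segment(src_path, archive_dir, expected_size=WAL_SEGMENT_SIZE):
    """Archive src_path; return True if the archive ends up with a good copy.

    Never overwrites a full-size archived file, and never replaces anything
    unless the source itself is full size.
    """
    dest = os.path.join(archive_dir, os.path.basename(src_path))

    # A short source cannot repair anything, so refuse it outright.
    if os.path.getsize(src_path) != expected_size:
        return False

    if os.path.exists(dest):
        if os.path.getsize(dest) == expected_size:
            # Full-size copy already archived: succeed only if identical.
            return filecmp.cmp(src_path, dest, shallow=False)
        # Wrong-size copy: keep it under another name rather than destroy it.
        os.replace(dest, dest + ".damaged")

    # Copy to a temporary name, sync, then rename, so a crash mid-copy
    # cannot leave a partial file under the final segment name.
    tmp = dest + ".tmp"
    with open(src_path, "rb") as fin, open(tmp, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()
        os.fsync(fout.fileno())
    os.replace(tmp, dest)
    return True
```

The copy-then-rename step is the part that matters most here: it is one
way to keep a system crash from leaving a 7 MB file under the name of a
16 MB segment in the first place.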

> I've heard that the next minor release is scheduled for this weekend.  I 
> really hope this problem will be fixed in that release.  If you wish, I'll 
> post the patch tomorrow or the next day.  Could you include the fix in the 
> weekend release?

Even if this were a good and uncontroversial idea, I'm afraid you're
several days too late.  The release is today, not next week.
        regards, tom lane


