On 8.12.2012 03:08, Jeff Janes wrote:
> On Thu, Dec 6, 2012 at 3:52 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>> Hi,
>>
>> On 6.12.2012 23:45, MauMau wrote:
>>> From: "Tom Lane" <tgl@sss.pgh.pa.us>
>>>> Well, that's unfortunate, but it's not clear that automatic recovery is
>>>> possible. The only way out of it would be if an undamaged copy of the
>>>> segment was in pg_xlog/ ... but if I recall the logic correctly, we'd
>>>> not even be trying to fetch from the archive if we had a local copy.
>>>
>>> No, PG will try to fetch the WAL file from pg_xlog when it cannot get it
>>> from archive. XLogFileReadAnyTLI() does that. Also, PG manual contains
>>> the following description:
>>>
>>> http://www.postgresql.org/docs/9.1/static/continuous-archiving.html#BACKUP-ARCHIVING-WAL
>>>
>>>
>>> WAL segments that cannot be found in the archive will be sought in
>>> pg_xlog/; this allows use of recent un-archived segments. However,
>>> segments that are available from the archive will be used in preference
>>> to files in pg_xlog/.
>>
>> So why don't you use an archive command that does not create such
>> incomplete files? I mean something like this:
>>
>> archive_command = 'cp %p /arch/%f.tmp && mv /arch/%f.tmp /arch/%f'
>>
>> Until the file is renamed, it's considered 'incomplete'.
>
> Wouldn't having the incomplete file be preferable over having none of it at all?
>
> It seems to me you need considerable expertise to figure out how to do
> optimal recovery (i.e. losing the least transactions) in this
> situation, and that that expertise cannot be automated. Do you trust
> a partial file from a good hard drive, or a complete file from a
> partially melted pg_xlog?

It clearly is a rather complex issue, no doubt about that. And yes, the
reliability of the devices holding pg_xlog is an important detail.
Although if the WAL is not written in a reliable way, you're hosed
anyway, I guess.

The recommended archive command is based on the assumption that the
local pg_xlog is intact (e.g. because it's located on a reliable RAID1
array), which seems to be the assumption of the OP too.

In my opinion you are more likely to run into an incomplete copy of a
WAL segment in the archive than a corrupted local one. And if the local
copy really is corrupted, that would be detected during replay.
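
To make the copy-then-rename idea a bit more concrete, here is a minimal
sketch of a wrapper script, not a tested production tool. The function
name archive_wal and the ARCHIVE_DIR variable are names made up for
illustration; /arch is just the directory from the example command. Note
that it does not fsync anything, so it only guards against an
interrupted copy, not against unflushed data lost in a power failure.

```shell
#!/bin/sh
# Hypothetical wrapper, meant to be called with %p and %f as arguments.
# ARCHIVE_DIR is an illustrative variable; it defaults to /arch.

archive_wal() {
    src="$1"                      # %p: path of the WAL segment
    name="$2"                     # %f: file name of the segment
    dir="${ARCHIVE_DIR:-/arch}"

    # Refuse to overwrite a segment that is already in the archive.
    [ ! -e "$dir/$name" ] || return 1

    # Copy under a temporary name; an interrupted copy leaves only
    # a .tmp file behind, never a short file under the final name.
    cp "$src" "$dir/$name.tmp" || return 1

    # A rename within one filesystem is atomic, so the final name
    # shows up only after the copy has finished.
    mv "$dir/$name.tmp" "$dir/$name"
}

# Run only when both arguments were supplied.
if [ "$#" -eq 2 ]; then
    archive_wal "$1" "$2"
fi
```

Saved as, say, /usr/local/bin/archive_wal.sh (again a made-up path), it
would be used as archive_command = '/usr/local/bin/archive_wal.sh %p %f'.
A non-zero exit status tells the server that archiving failed, so it
will retry the segment later.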

Tomas