Serious problem: media recovery fails after system or PostgreSQL crash - Mailing list pgsql-hackers

From MauMau
Subject Serious problem: media recovery fails after system or PostgreSQL crash
Msg-id A70482CA20CD460CB1053F2177CD7789@maumau
Responses Re: Serious problem: media recovery fails after system or PostgreSQL crash
List pgsql-hackers
Hello,

Although this might belong on pgsql-bugs or pgsql-general, let me
ask here because the problem probably requires a fix in PostgreSQL's code.


[Problem]
I'm using PostgreSQL 9.1.6 on Linux.  I encountered a serious problem:
media recovery failed with the following message:

FATAL:  archive file "000000010000008000000028" has wrong size: 7340032 instead of 16777216

I'm using the plain cp command to archive WAL files.  That is:
   archive_command = '/path/to/my_script.sh "%p" "/backup/archive_log/%f"'

<<my_script.sh>>
--------------------------------------------------
#!/bin/sh
some processing...
cp "$1" "$2"
other processing...
--------------------------------------------------


The media recovery became necessary because of a power failure: the disk
drive that stored $PGDATA failed after the power went out.  So I replaced
the failed disk and performed media recovery by creating recovery.conf and
running pg_ctl start.  However, pg_ctl failed with the above error message.



[Cause]
The cause is clear from the message.  PostgreSQL refuses to continue media 
recovery when it finds an archived WAL file whose size is not 16 MB.  The 
relevant code is in src/backend/access/transam/xlog.c:

--------------------------------------------------
	if (expectedSize > 0 && stat_buf.st_size != expectedSize)
	{
		int			elevel;

		/*
		 * If we find a partial file in standby mode, we assume it's
		 * because it's just being copied to the archive, and keep
		 * trying.
		 *
		 * Otherwise treat a wrong-sized file as FATAL to ensure the
		 * DBA would notice it, but is that too strong? We could try
		 * to plow ahead with a local copy of the file ... but the
		 * problem is that there probably isn't one, and we'd
		 * incorrectly conclude we've reached the end of WAL and we're
		 * done recovering ...
		 */
		if (StandbyMode && stat_buf.st_size < expectedSize)
			elevel = DEBUG1;
		else
			elevel = FATAL;
		ereport(elevel,
				(errmsg("archive file \"%s\" has wrong size: %lu instead of %lu",
						xlogfname,
						(unsigned long) stat_buf.st_size,
						(unsigned long) expectedSize)));
		return false;
	}
--------------------------------------------------
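
Since the server applies this size check only at recovery time, a short file can sit in the archive unnoticed until it is needed. The following is a minimal sketch, not part of PostgreSQL, of how an archive directory could be scanned for such files ahead of time. The function name list_wrong_size and the variable EXPECTED_SIZE are hypothetical, and the sketch assumes the default 16 MB segment size and the usual 24-hex-digit WAL segment file names.

```shell
#!/bin/sh
# Hypothetical helper: report archived WAL segments whose size differs
# from the expected segment size (assumed to be the 16 MB default).

EXPECTED_SIZE=16777216   # 16 * 1024 * 1024 bytes

# Print one line per wrongly sized WAL segment found in directory $1.
list_wrong_size() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        name=$(basename "$f")
        # Only WAL segments (24 upper-case hex digits) have a fixed size;
        # skip other files such as .history and .backup files.
        case $name in
            *[!0-9A-F]*) continue ;;
        esac
        [ "${#name}" -eq 24 ] || continue
        # wc -c is portable; stat(1) options differ between platforms.
        size=$(wc -c < "$f" | tr -d ' ')
        if [ "$size" -ne "$EXPECTED_SIZE" ]; then
            echo "$name: $size instead of $EXPECTED_SIZE"
        fi
    done
}
```

Calling list_wrong_size /backup/archive_log would print one line for every segment that recovery would reject with the FATAL message above.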


[How to fix]
Archived files can end up smaller than their expected size for at least
two reasons:

1. The power fails while archive_command is copying files (as in my case).
2. An immediate shutdown (pg_ctl stop -mi) is performed while archive_command
is copying files.  In this case, cp or an equivalent copy command is
cancelled by the SIGQUIT sent by postmaster.
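
Independent of any server-side change, the second case can be made less likely on the archiving side. The following is a minimal sketch of an archive script that copies under a temporary name and renames afterwards, so that an interrupted cp never leaves a short file under the final name. The function name archive_wal and the temporary-name suffix are hypothetical. The sketch does not flush the data to disk, so it does not by itself protect against the power-failure case.

```shell
#!/bin/sh
# Sketch of an archive script body (hypothetical function name).
# Assumed use from postgresql.conf:
#   archive_command = '/path/to/my_script.sh "%p" "/backup/archive_log/%f"'
# with the script ending in:  archive_wal "$1" "$2"

archive_wal() {
    src=$1
    dst=$2
    tmp="$dst.tmp.$$"      # temporary name in the same directory as $dst

    # Refuse to overwrite a segment that is already archived.
    [ ! -e "$dst" ] || return 1

    # Copy under the temporary name.  If cp fails, remove the partial copy
    # and report failure so that the server retries later.
    cp "$src" "$tmp" || { rm -f "$tmp"; return 1; }

    # Renaming within one file system is atomic, so the final name appears
    # only after the copy has completed.
    mv "$tmp" "$dst"
}
```

If the copy is killed by a signal, only the temporary file is left behind. Recovery never requests that name, so it cannot trigger the wrong-size error.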

Therefore, I think postgres must continue recovery by fetching files from
pg_xlog/ when it encounters a partially filled archive file.  In addition,
it may be necessary to remove the partially filled archive file, because
it might prevent media recovery in the future (is this true?).  I mean we
need the following fix.  What do you think?

--------------------------------------------------
	if (expectedSize > 0 && stat_buf.st_size != expectedSize)
	{
		int			elevel;

...
		if (StandbyMode && stat_buf.st_size < expectedSize)
			elevel = DEBUG1;
		else
		{
			elevel = LOG;
			unlink(xlogpath);
		}
		ereport(elevel,
				(errmsg("archive file \"%s\" has wrong size: %lu instead of %lu",
						xlogfname,
						(unsigned long) stat_buf.st_size,
						(unsigned long) expectedSize)));
		return false;
	}
--------------------------------------------------
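
The behaviour this change aims for, treating a short archived file as if it were missing so that recovery falls back to pg_xlog/, could be approximated without touching the server by a restore_command wrapper. The following is a minimal sketch under that assumption. The names restore_wal, is_wal_segment and WAL_SEG_SIZE are hypothetical, the default 16 MB segment size is assumed, and the concern in the code comment above still applies: if pg_xlog/ holds no copy either, recovery ends at that point.

```shell
#!/bin/sh
# Sketch of a restore_command wrapper body (hypothetical names).
# Assumed use from recovery.conf:
#   restore_command = '/path/to/restore_wal.sh "/backup/archive_log/%f" "%p"'
# with the script ending in:  restore_wal "$1" "$2"

WAL_SEG_SIZE=16777216   # default WAL segment size in bytes

# Succeed if $1 looks like a WAL segment file name (24 hex digits).
is_wal_segment() {
    case $1 in
        *[!0-9A-F]*) return 1 ;;
    esac
    [ "${#1}" -eq 24 ]
}

restore_wal() {
    src=$1
    dst=$2
    [ -f "$src" ] || return 1      # file is not in the archive

    if is_wal_segment "$(basename "$src")"; then
        size=$(wc -c < "$src" | tr -d ' ')
        if [ "$size" -ne "$WAL_SEG_SIZE" ]; then
            echo "skipping $src: $size instead of $WAL_SEG_SIZE bytes" >&2
            return 1               # report a short segment as "not found"
        fi
    fi
    cp "$src" "$dst"
}
```

Unlike the proposed patch, this sketch leaves the short file in the archive, so the DBA can still inspect it or replace it later.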


I've heard that the next minor release is scheduled for this weekend.  I
really hope this problem can be fixed in that release.  If you wish, I'll
post the patch tomorrow or the next day.  Could you include the fix in the
weekend release?


Regards
MauMau



