warm standby issues - Mailing list pgsql-admin

From kevin kempter
Subject warm standby issues
Msg-id 932B6BCD-0917-49BD-96B7-91395A4EC397@kevinkempterllc.com
Hi List;

I'm setting up a warm standby server on version 8.1.9.

I set up a recovery.sh script to keep the standby cluster in recovery
mode, waiting for the next WAL segment. Everything works fine as long
as the standby server is in recovery mode: I see the recovery taking
place in the postgres log of the standby server. I've set the script up
to exit if it sees a trigger file (to bring the standby online).

When I create the trigger file I see this:

copy /home/postgres/healthCareCoding/WAL/000000010000000000000004 pg_xlog/RECOVERYXLOG
`/home/postgres/healthCareCoding/WAL/000000010000000000000004' -> `pg_xlog/RECOVERYXLOG'
LOG:  restored log file "000000010000000000000004" from archive
LOG:  could not open file "pg_xlog/000000010000000000000005" (log file 0, segment 5): No such file or directory
LOG:  redo done at 0/4FFFE90
PANIC:  could not open file "pg_xlog/000000010000000000000004" (log file 0, segment 4): No such file or directory
LOG:  startup process (PID 9348) was terminated by signal 6
LOG:  aborting startup due to startup process failure
LOG:  logger shutting down


I see that log 000000010000000000000004 was restored, but once the
redo is reported as done, the server complains that it cannot find
that same log.
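
One way to narrow this down is to run the restore script by hand while the trigger file exists and see what the server would see. The helper below is only a diagnostic sketch: `check_restore` is a made-up name, and it assumes the script takes the destination path as its first argument and the WAL file name as its second, as the log output above suggests.

```shell
#!/bin/bash
# check_restore SCRIPT DEST NAME   (hypothetical helper, not part of PostgreSQL)
# Runs SCRIPT with bash, passing DEST and NAME the way the restore script
# receives them, then prints the exit status and whether DEST was created.
check_restore() {
    script=$1
    dest=$2
    name=$3

    rm -f "$dest"                      # start from a clean destination
    status=0
    bash "$script" "$dest" "$name" >/dev/null 2>&1 || status=$?

    if [ -f "$dest" ]; then
        delivered=yes
    else
        delivered=no
    fi
    echo "exit=$status delivered=$delivered"
}
```

For example, after creating the trigger file, `check_restore ./recovery.sh /tmp/RECOVERYXLOG 000000010000000000000004` shows whether segment 4 is still handed over. A result of `exit=0 delivered=no` would mean the script reports success without copying anything, which would be consistent with the PANIC above, although that link is a guess and not something the log proves.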

If I restart the standby cluster and select count(*) from my test
table (the table I'm inserting data into to fill the logs), I get this:

postgres=# select count(*) from t1 ;
ERROR:  xlog flush request 0/113730C is not satisfied --- flushed only to 0/1135018
CONTEXT:  writing block 1500 of relation 1663/10819/16384
postgres=#



My recovery.sh code is below, thanks in advance for any help...

/Kevin




#!/bin/bash

#DELAY=400000
DELAY=100
SEG_SIZE=16777216
TRIGGERED=0
TRIGGER_FILE="/home/postgres/healthCareCoding/trigger"
WAL_PATH="/home/postgres/healthCareCoding/WAL"
COPY_FLAG=0
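# $2 is the WAL file name the server asks for; $1 is the path to restore
# it to (pg_xlog/RECOVERYXLOG in the log output above).
RESTORE_FROM="${WAL_PATH}/$2"
RESTORE_TO=$1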


echo "$1" | grep -i history
if [ $? -eq 0 ]
then
     if [ -f "$RESTORE_FROM" ]
     then
         echo "copy $RESTORE_FROM $RESTORE_TO"
         cp -v -i $RESTORE_FROM $RESTORE_TO
     fi
     exit
fi

while [ ! -f "$TRIGGER_FILE"  -a  $COPY_FLAG -eq 0 ]
do
     usleep $DELAY;
     if [ -f "$RESTORE_FROM" ]
     then
         fs=`ls -l $RESTORE_FROM`
         set - $fs
         echo "size= [$5]"
         if [ "$5" == "$SEG_SIZE" ]
         then
             COPY_FLAG=1
         fi
     fi
done

if [ -f "$TRIGGER_FILE" ]
then
     exit
fi

if [ ! -f "$TRIGGER_FILE" ]
then
     echo "copy $RESTORE_FROM $RESTORE_TO"
     cp -v -i $RESTORE_FROM $RESTORE_TO
     exit
fi
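
A possible revision of the script above, offered as a sketch under stated assumptions and not as a confirmed fix. The idea is that a complete segment found in the archive is always copied, even after the trigger file appears, and that the script returns a non-zero status whenever it copies nothing. The function name `restore_file`, the `POLL_SECONDS` variable (plain `sleep` instead of `usleep`), the environment overrides, and matching `*.history` on the file name are all choices made for this illustration.

```shell
#!/bin/bash
# Sketch of a restore script that keeps serving archived files after the
# trigger file appears. Defaults are the paths from the original script;
# the environment overrides exist only to make the sketch easy to try out.
TRIGGER_FILE="${TRIGGER_FILE:-/home/postgres/healthCareCoding/trigger}"
WAL_PATH="${WAL_PATH:-/home/postgres/healthCareCoding/WAL}"
SEG_SIZE="${SEG_SIZE:-16777216}"    # bytes in a complete WAL segment
POLL_SECONDS="${POLL_SECONDS:-1}"   # assumption: poll once per second

# restore_file DEST NAME
# Returns 0 only if NAME was copied to DEST, non-zero otherwise.
restore_file() {
    dest=$1
    src="$WAL_PATH/$2"

    # History files are optional: copy one if present, else fail at once.
    case "$2" in
        *.history)
            [ -f "$src" ] && cp "$src" "$dest"
            return $?
            ;;
    esac

    while :; do
        # A complete segment is restored whether or not the trigger exists.
        if [ -f "$src" ]; then
            size=$(wc -c < "$src" | tr -d ' ')
            if [ "$size" -eq "$SEG_SIZE" ]; then
                cp "$src" "$dest"
                return $?
            fi
        fi
        # Nothing complete to restore and failover requested: not found.
        if [ -f "$TRIGGER_FILE" ]; then
            return 1
        fi
        sleep "$POLL_SECONDS"
    done
}

# Act as a restore script only when called with the two expected arguments.
if [ $# -eq 2 ]; then
    restore_file "$1" "$2"
    exit $?
fi
```

The argument order matches the original script: destination path first, WAL file name second. Every exit path carries an explicit status, so a bare `exit` can no longer pass along the result of an unrelated test.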


