Re: Some problem with warm standby server - Mailing list pgsql-general

From Nico Sabbi
Subject Re: Some problem with warm standby server
Msg-id 46409FD5.70607@officinedigitali.it
In response to Re: Some problem with warm standby server  ("Simon Riggs" <simon@2ndquadrant.com>)
List pgsql-general
Simon Riggs wrote:

>>then I updated the master with a batch of inserts, but after a while the
>>slave stopped with
>>these messages:
>>
>>LOG:  restored log file "000000010000000000000021" from archive
>>LOG:  record with zero length at 0/21000048
>>LOG:  invalid primary checkpoint record
>>LOG:  restored log file "000000010000000000000020" from archive
>>LOG:  restored log file "000000010000000000000021" from archive
>>LOG:  invalid resource manager ID in secondary checkpoint record
>>PANIC:  could not locate a valid checkpoint record
>>LOG:  startup process (PID 19619) was terminated by signal 6
>>LOG:  aborting startup due to startup process failure
>>
>>
>
>Please run pg_controldata to print out the control file.
>
>

Hi, sorry for the long delay.
First of all, I had to stop postgres with pg_ctl stop -m immediate; otherwise it
wouldn't die because of the ongoing replication.

This is the output of pg_controldata:

postgres@www3:/usr/local/postgres_replica/data$ pg_controldata /usr/local/postgres_replica/data/
pg_control version number:            812
Catalog version number:               200510211
Database system identifier:           5001030714849737714
Database cluster state:               in recovery
pg_control last modified:             Fri 27 Apr 2007 13:20:46 CEST
Current log file ID:                  0
Next log file segment:                26
Latest checkpoint location:           0/190C7E04
Prior checkpoint location:            0/190C7DC0
Latest checkpoint's REDO location:    0/190C7E04
Latest checkpoint's UNDO location:    0/0
Latest checkpoint's TimeLineID:       1
Latest checkpoint's NextXID:          3698809
Latest checkpoint's NextOID:          68745
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Time of latest checkpoint:            Fri 27 Apr 2007 11:53:47 CEST
Maximum data alignment:               4
Database block size:                  8192
Blocks per segment of large relation: 131072
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Date/time type storage:               floating-point numbers
Maximum length of locale name:        128
LC_COLLATE:                           C
LC_CTYPE:                             C
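
For anyone who wants to compare control files from several attempts, here is a minimal sketch that turns pg_controldata output into a dictionary. It assumes each line has the "label: value" shape shown above; the function name parse_controldata and the SAMPLE text are my own illustration, not part of PostgreSQL.

```python
# Hypothetical helper: parse "label: value" lines as printed by pg_controldata.
# Only the first colon splits the line, so timestamps such as "13:20:46" survive.

def parse_controldata(text):
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon at all
            fields[key.strip()] = value.strip()
    return fields


# A few lines copied from the output above, used as sample input.
SAMPLE = """\
Database cluster state:               in recovery
pg_control last modified:             Fri 27 Apr 2007 13:20:46 CEST
Latest checkpoint location:           0/190C7E04
Prior checkpoint location:            0/190C7DC0
Bytes per WAL segment:                16777216
"""

if __name__ == "__main__":
    info = parse_controldata(SAMPLE)
    print(info["Database cluster state"])
    print(info["Latest checkpoint location"])
```

In practice the text would come from running pg_controldata against the data directory and capturing its standard output.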


>Backup all the files in case we need to inspect them.
>
>

ok

>What was the ending log sequence number (e.g. x/xxxx) from the previous
>recovery? I'll see if I can re-create this.
>
>

Judging from the logs, I guess it is 0/190C7E04:
LOG:  restored log file "000000010000000000000019.000C7E04.backup" from archive
LOG:  restored log file "000000010000000000000019" from archive
LOG:  checkpoint record is at 0/190C7E04
LOG:  redo record is at 0/190C7E04; undo record is at 0/0; shutdown FALSE
LOG:  next transaction ID: 3698809; next OID: 68745
LOG:  next MultiXactId: 1; next MultiXactOffset: 0
LOG:  automatic recovery in progress
LOG:  redo starts at 0/190C7E48
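
To double-check which archived file a given location belongs to, here is a rough sketch of the arithmetic, assuming the 16 MB segment size reported by pg_controldata and the file naming visible in the logs above (timeline, log file ID and segment number, each as 8 hex digits). The function names are made up for illustration. The .backup name is derived from a backup's start location, which in these logs coincides with the checkpoint location.

```python
# Sketch: map a log location such as "0/190C7E04" to the WAL file that holds it.
# Assumes "Bytes per WAL segment: 16777216" as reported by pg_controldata.

WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_location(location):
    """Split "logid/offset" (both hex) into two integers."""
    log_id_hex, offset_hex = location.split("/")
    return int(log_id_hex, 16), int(offset_hex, 16)


def wal_segment_name(timeline, location, segment_size=WAL_SEGMENT_SIZE):
    """Name of the WAL segment file containing the given location."""
    log_id, offset = parse_location(location)
    segment = offset // segment_size
    return "%08X%08X%08X" % (timeline, log_id, segment)


def backup_history_name(timeline, location, segment_size=WAL_SEGMENT_SIZE):
    """Name of the .backup file for a backup starting at the given location."""
    _, offset = parse_location(location)
    within_segment = offset % segment_size
    return "%s.%08X.backup" % (
        wal_segment_name(timeline, location, segment_size),
        within_segment,
    )


if __name__ == "__main__":
    print(wal_segment_name(1, "0/190C7E04"))
    print(backup_history_name(1, "0/190C7E04"))
    print(wal_segment_name(1, "0/21000048"))
```

With timeline 1, the location 0/190C7E04 falls in segment 19 (hex), and 0/21000048 falls in segment 21 (hex), which agrees with the file names the standby reported restoring.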


>
>
>>What did I do wrong? Is there any other procedure to follow to restart a
>>stopped replication?
>>
>>
>
>You're right, using the trigger is not the right way to stop/start the
>standby. Just stop/start the standby server normally.
>
>

As above: a plain stop hangs.

>The trigger means that you'd like to perform a failover.
>
>There is a patch not yet applied which will make a new version of
>pg_standby. pg_standby's official status right now is beta, so please
>expect, look for and report any issues you find. Thanks.
>
>
>
thank you
