Re: Assertion failure at standby promotion - Mailing list pgsql-hackers

From Amit Langote
Subject Re: Assertion failure at standby promotion
Msg-id CA+HiwqFOrBT2iURbzrUYA6bWA4-hs39H7qwHtFuz5TBKBwb8qw@mail.gmail.com
In response to Assertion failure at standby promotion  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: Assertion failure at standby promotion  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
Hello,

I tried reproducing the scenario. Note that I did not archive xlogs
(that is, archive_command = '/bin/true' and the corresponding
restore_command = '/bin/false'). I performed the steps you mentioned
and found the following:

****** Log on Standby-1:
[Standby-1]LOG:  database system was interrupted; last known up at
2013-05-05 14:05:08 IST
[Standby-1]LOG:  creating missing WAL directory "pg_xlog/archive_status"
[Standby-1]LOG:  entering standby mode
[Standby-1]LOG:  started streaming WAL from primary at 0/2000000 on timeline 1
[Standby-1]LOG:  redo starts at 0/2000024
[Standby-1]LOG:  consistent recovery state reached at 0/20000DC
[Standby-1]LOG:  database system is ready to accept read only connections
[Standby-1]LOG:  received promote request
[Standby-1]FATAL:  terminating walreceiver process due to administrator command
[Standby-1]LOG:  invalid magic number 0000 in log segment
000000010000000000000003, offset 5316608
[Standby-1]LOG:  redo done at 0/350F0B8
[Standby-1]LOG:  last completed transaction was at log time 2013-05-05
14:05:14.571492+05:30
[Standby-1]LOG:  selected new timeline ID: 2
[Standby-1]LOG:  archive recovery complete
>> [Standby-1]ERROR:  server switched off timeline 1 at 0/3510B14, but walsender already streamed up to 0/3512000
[Standby-1]LOG:  database system is ready to accept connections
[Standby-1]LOG:  autovacuum launcher started
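
The positions in the log above can be cross-checked with a little LSN arithmetic. What follows is only a minimal sketch in Python, assuming the default 16 MB WAL segment size; the helper names (parse_lsn, format_lsn, segment_offset_to_lsn) are made up for illustration and are not PostgreSQL functions. It maps the "invalid magic number" position (segment 000000010000000000000003, offset 5316608) to an LSN and measures how far the walsender position lies beyond the timeline switch point.

```python
# Assumption: default 16 MB WAL segments (server not built with a custom size).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(text):
    """Convert an LSN printed as 'hi/lo' (hexadecimal) into an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def format_lsn(value):
    """Format an integer position the way the server log prints LSNs."""
    return "%X/%X" % (value >> 32, value & 0xFFFFFFFF)


def segment_offset_to_lsn(segment_name, offset):
    """Map a 24-hex-digit WAL file name plus a byte offset to an LSN.

    The name is timeline (8 digits), log id (8 digits), segment (8 digits).
    """
    log_id = int(segment_name[8:16], 16)
    seg_id = int(segment_name[16:24], 16)
    return (log_id << 32) + seg_id * WAL_SEGMENT_SIZE + offset


# Position of the "invalid magic number" message on Standby-1.
bad_page = segment_offset_to_lsn("000000010000000000000003", 5316608)
print(format_lsn(bad_page))  # 0/3512000

# Distance between the timeline switch point and the walsender position.
gap = parse_lsn("0/3512000") - parse_lsn("0/3510B14")
print(gap)  # 5356
```

Under that segment-size assumption, the offset in the "invalid magic number" message corresponds to 0/3512000, the same position the walsender reports having streamed up to, which is 5356 bytes past the point where timeline 1 was switched off (0/3510B14). This is only a consistency check on the logged numbers, not an explanation of the failure.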


****** Log on Standby-2:
[Standby-2]LOG:  database system was interrupted while in recovery at
log time 2013-05-05 14:05:07 IST
[Standby-2]HINT:  If this has occurred more than once some data might
be corrupted and you might need to choose an earlier recovery target.
[Standby-2]LOG:  creating missing WAL directory "pg_xlog/archive_status"
[Standby-2]LOG:  entering standby mode
[Standby-2]LOG:  started streaming WAL from primary at 0/2000000 on timeline 1
[Standby-2]LOG:  redo starts at 0/2000024
[Standby-2]LOG:  consistent recovery state reached at 0/3000000
[Standby-2]LOG:  database system is ready to accept read only connections
>> [Standby-2]FATAL:  could not receive data from WAL stream: ERROR:  server switched off timeline 1 at 0/3510B14, but walsender already streamed up to 0/3512000
[Standby-2]LOG:  invalid magic number 0000 in log segment
000000010000000000000003, offset 5316608
[Standby-2]LOG:  fetching timeline history file for timeline 2 from
primary server
[Standby-2]LOG:  started streaming WAL from primary at 0/3000000 on timeline 1
[Standby-2]LOG:  replication terminated by primary server
[Standby-2]DETAIL:  End of WAL reached on timeline 1 at 0/3510B14
[Standby-2]LOG:  restarted WAL streaming at 0/3000000 on timeline 1
[Standby-2]LOG:  replication terminated by primary server
[Standby-2]DETAIL:  End of WAL reached on timeline 1 at 0/3510B14
[Standby-2]LOG:  restarted WAL streaming at 0/3000000 on timeline 1
[Standby-2]LOG:  replication terminated by primary server
[Standby-2]DETAIL:  End of WAL reached on timeline 1 at 0/3510B14
[Standby-2]LOG:  restarted WAL streaming at 0/3000000 on timeline 1
[Standby-2]LOG:  replication terminated by primary server
[Standby-2]DETAIL:  End of WAL reached on timeline 1 at 0/3510B14
[Standby-2]LOG:  restarted WAL streaming at 0/3000000 on timeline 1
[Standby-2]LOG:  replication terminated by primary server
[Standby-2]DETAIL:  End of WAL reached on timeline 1 at 0/3510B14
[Standby-2]LOG:  restarted WAL streaming at 0/3000000 on timeline 1
[Standby-2]LOG:  replication terminated by primary server
[Standby-2]DETAIL:  End of WAL reached on timeline 1 at 0/3510B14
[Standby-2]LOG:  restarted WAL streaming at 0/3000000 on timeline 1
[Standby-2]LOG:  replication terminated by primary server
[Standby-2]DETAIL:  End of WAL reached on timeline 1 at 0/3510B14
[Standby-2]LOG:  restarted WAL streaming at 0/3000000 on timeline 1
[Standby-2]LOG:  replication terminated by primary server
[Standby-2]DETAIL:  End of WAL reached on timeline 1 at 0/3510B14
...
...
...

****** Also, in the ps output, the following is the state of the wal sender
(standby-1) and the wal receiver (standby-2):

amit      8084  5675  0 14:13 ?        00:00:00 postgres: wal receiver
process   restarting at 0/3000000
amit      8085  5648  0 14:13 ?        00:00:00 postgres: wal sender
process amit [local] idle


Is this related to the assertion failure that you have reported?

--

Amit Langote


