After restarting a PostgreSQL 9.3.1 hot standby that uses streaming
replication, the database would not come up again, and the logs showed
"index contains unexpected zero page at block 0" errors, as shown below:
LOG: entering standby mode
LOG: redo starts at 6/55A16990
LOG: consistent recovery state reached at 6/56D5FFF0
LOG: database system is ready to accept read only connections
LOG: invalid record length at 6/56D5FFF0
LOG: started streaming WAL from primary at 6/56000000 on timeline 1
state=XX002,user=repmgr,db=repmgr FATAL: index
"pg_amproc_fam_proc_index" contains unexpected zero page at block 0
state=XX002,user=repmgr,db=repmgr HINT: Please REINDEX it.
state=XX002,user=postgres,db=foo ERROR: index "foo_pkey" contains
unexpected zero page at block 0
state=XX002,user=postgres,db=foo HINT: Please REINDEX it.
WARNING: page 1 of relation base/37706/11821 is uninitialized
CONTEXT: xlog redo vacuum: rel 1663/37706/11821; blk 2,
lastBlockVacuumed 0
PANIC: WAL contains references to invalid pages
CONTEXT: xlog redo vacuum: rel 1663/37706/11821; blk 2,
lastBlockVacuumed 0
What could cause "index contains unexpected zero page at block 0"
errors as shown above on a hot standby?
As this happened only on a standby, there is no need to recover any
data. Instead, the point is to understand what could cause this and to
prevent it from happening again (on a master at least). Any hints on
how to investigate this? Could the source of the error be on the master
side (shipping invalid WAL)? Or might it be an issue outside of
PostgreSQL's control on the standby (such as the filesystem)?
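One way to start investigating is to check which pages of the affected
relation file are really all zeros on disk. The following is only a
minimal sketch, not an official tool. It assumes the default 8 kB block
size and that "rel 1663/37706/11821" in the log corresponds to the file
base/37706/11821 under the data directory (1663 being the default
tablespace). The function name find_zero_pages is made up for
illustration.

```python
import io

# Assumption: the server was built with the default 8 kB block size.
BLOCK_SIZE = 8192


def find_zero_pages(stream, block_size=BLOCK_SIZE):
    """Return the block numbers whose bytes are all zero.

    `stream` is any binary file-like object, for example an opened
    copy of a relation file. A short trailing read is still counted
    as a block.
    """
    zero_blocks = []
    block_number = 0
    while True:
        page = stream.read(block_size)
        if not page:
            break
        # any() is False only when every byte in the page is 0.
        if not any(page):
            zero_blocks.append(block_number)
        block_number += 1
    return zero_blocks


# Small in-memory demonstration: block 0 is zeroed, block 1 is not.
demo = io.BytesIO(b"\x00" * BLOCK_SIZE + b"\x01" * BLOCK_SIZE)
print(find_zero_pages(demo))  # prints [0]
```

Against real data this would be run on a copy of the file taken while
the standby is stopped, for example with
open("base/37706/11821", "rb") relative to the data directory.
Comparing the result with the same file on the master might help tell
whether the zero pages exist only on the standby.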
Also, should one be concerned about log messages such as "invalid
record length at 6/56D5FFF0", as shown at the beginning of the log
snippet above? Searching the mailing list archives seems to suggest
that such log messages might just indicate that PostgreSQL reached the
end of the locally available WAL and will start streaming from the
master (see for example
<http://www.postgresql.org/message-id/CAHGQGwFvv0pxaf_iZ1FU1H=d=exhPUoM0ss-9GkDWRP=FureMg@mail.gmail.com>),
but I couldn't find confirmation of this elsewhere.
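The positions in the log are at least consistent with that reading. The
"invalid record length" position 6/56D5FFF0 is the same as the
consistent recovery point, and streaming starts at 6/56000000, which is
the start of the WAL segment containing that position. A small sketch of
the arithmetic, assuming the default 16 MB WAL segment size (the helper
names are made up for illustration):

```python
# Assumption: default WAL segment size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(text):
    """Convert an LSN such as '6/56D5FFF0' into a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def format_lsn(value):
    """Convert a 64-bit integer back into the 'X/X' log notation."""
    return "%X/%X" % (value >> 32, value & 0xFFFFFFFF)


def segment_start(lsn, segment_size=WAL_SEGMENT_SIZE):
    """Round an LSN down to the start of its WAL segment."""
    return lsn - (lsn % segment_size)


end_of_local_wal = parse_lsn("6/56D5FFF0")
print(format_lsn(segment_start(end_of_local_wal)))  # prints 6/56000000
```

This only shows that the two log lines agree with each other. It does
not by itself confirm that the message is harmless.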