Setup: a primary and a secondary based on streaming replication.
A few days ago our primary started to throw the error messages below, indicating corruption in the database. It crashed at times and showed a panic message in the logs:
2013-03-25 07:30:39.704 PDT LOG: server process (PID 8715) was terminated by signal 6: Aborted
2013-03-25 07:30:39.704 PDT LOG: terminating any other active server processes
In the days before it started to crash, it logged the error messages below:
[d: u:postgres p:2498 7] ERROR: could not access status of transaction 837550133
DETAIL: Could not open file "pg_clog/031E": No such file or directory. [u:postgres p:2498 9]
[d: u:radio p:31917 242] ERROR: could not open file "base/16384/114846.39" (target block 360448000): No such file or directory [d: u:radio p:31917 243]
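As a sanity check on the two errors above, here is a minimal sketch of how a transaction ID maps to a pg_clog segment file and how a block number maps to a relation segment. It assumes a default PostgreSQL build (8 kB blocks, 1 GB relation segments, 32 pages per clog segment); the function names are made up for illustration and are not part of PostgreSQL.

```python
# Assumptions: default build constants; a custom-compiled server may differ.
BLCKSZ = 8192                    # bytes per page
CLOG_XACTS_PER_BYTE = 4          # 2 status bits per transaction
CLOG_PAGES_PER_SEGMENT = 32      # pages per pg_clog segment file
RELSEG_SIZE = 131072             # 8 kB blocks per 1 GB relation segment


def clog_segment_for_xid(xid: int) -> str:
    """Return the pg_clog file name expected to hold this xid's status."""
    xacts_per_segment = BLCKSZ * CLOG_XACTS_PER_BYTE * CLOG_PAGES_PER_SEGMENT
    return f"{xid // xacts_per_segment:04X}"


def relation_segment_for_block(block: int) -> int:
    """Return the segment number (the .N file suffix) that would hold this block."""
    return block // RELSEG_SIZE


print(clog_segment_for_xid(837550133))        # 031E
print(relation_segment_for_block(360448000))  # 2750
```

Under these assumptions the first result matches the missing pg_clog/031E file, so that error is at least self-consistent. The second result puts the target block in segment 2750, far past the .39 file named in the message, which may mean the block number itself came from a corrupt pointer rather than from a segment file that was lost.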
On top of that, our secondaries have now crashed and will not start up. They show the error messages below in the PostgreSQL logs:
2013-03-27 11:00:47.281 PDT LOG: recovery restart point at 161A/17108AA8
2013-03-27 11:00:47.281 PDT DETAIL: last completed transaction was at log time 2013-03-27 11:00:47.241236-07