Re: Streaming replication slave crash - Mailing list pgsql-general

From Quentin Hartman
Subject Re: Streaming replication slave crash
Date
Msg-id CAJ48qNZMgEv03O_rYS_K_E60CKWfanTSDpr34LjKhh_w67Uh9g@mail.gmail.com
In response to Re: Streaming replication slave crash  (Lonni J Friedman <netllama@gmail.com>)
List pgsql-general
On Fri, Mar 29, 2013 at 10:23 AM, Lonni J Friedman <netllama@gmail.com> wrote:
Looks like you've got some form of corruption:
page 1441792 of relation base/63229/63370 does not exist

Thanks for the insight. I thought that might be it, but never having seen this before I'm glad to have some confirmation.

The question is whether it was corrupted on the master and then
replicated to the slave, or if it was corrupted on the slave.  I'd
guess that the pg_dump tried to read from that page and barfed.  It
would be interesting to try re-running the pg_dump again to see if
this crash can be replicated.  If so, does it also replicate if you
run pg_dump against the master?  If not, then the corruption is
isolated to the slave, and you might have a hardware problem which is
causing the data to get corrupted.
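
The check described above can be sketched roughly as follows. This is only an illustration, not something either poster ran: the function names `dump_succeeds` and `localize`, the host names in the usage comment, and the wording of the verdicts are all assumptions made for the example. It assumes `pg_dump` is on the PATH and that credentials come from the environment or `~/.pgpass`.

```python
import os
import subprocess


def dump_succeeds(host: str, dbname: str) -> bool:
    """Return True if pg_dump can read the whole database on `host`."""
    # The dump itself is discarded; only the exit status matters here.
    # -w makes pg_dump fail instead of prompting for a password.
    result = subprocess.run(
        ["pg_dump", "-h", host, "-w", "-f", os.devnull, dbname],
        stderr=subprocess.PIPE,
        text=True,
    )
    if result.returncode != 0:
        print(f"{host}: pg_dump failed: {result.stderr.strip()}")
    return result.returncode == 0


def localize(slave_ok: bool, master_ok: bool) -> str:
    """Map the two dump outcomes onto the cases discussed in the thread."""
    if slave_ok and master_ok:
        return "not reproducible"
    if not slave_ok and master_ok:
        # Fails only on the slave: suspect the slave's storage or hardware.
        return "isolated to slave"
    if not slave_ok and not master_ok:
        # Fails on both: the damage may have originated on the master.
        return "present on master"
    # Master fails while the slave is clean; the thread does not cover this.
    return "master only"


# Hypothetical usage (host and database names are placeholders):
#   verdict = localize(dump_succeeds("slave.example.com", "mydb"),
#                      dump_succeeds("master.example.com", "mydb"))
#   print(verdict)
```

A clean dump only shows that every page pg_dump read was readable at that moment; it does not prove the data files are free of damage.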

Yes, we've gotten several clean dumps from the slave since then without crashing. We're running these machines on EC2, so sadly we have no control over the hardware. With your confirmation and an apparently clean state now, I'm inclined to chalk this up to an EC2 hiccup that Postgres caught, and get on with life.

Thanks!

QH
