Re: DATA corruption after promoting slave to master - Mailing list pgsql-general

From Shaun Thomas
Subject Re: DATA corruption after promoting slave to master
Date
Msg-id 53AB258C.108@optionshouse.com
Whole thread Raw
In response to DATA corruption after promoting slave to master  (Karthik Iyer <karthik.i@directi.com>)
Responses Re: DATA corruption after promoting slave to master
List pgsql-general
On 06/25/2014 06:29 AM, Karthik Iyer wrote:

> [2]  We also have a daemon process which copies the latest partial WAL
> log file (which postgres is currently writing to, under pg_xlog/) every
> 3 secs to a different location.

No. No, no, no, no no. No.

Also, no. Partial WAL files are not valid for recovery. In fact, I'm
surprised the standby node even applied it at all.

> We are seeing these problems in the newly promoted master now:
>
>      1. when we run queries on primary key, we don't get the rows even
> if it exist in db. However if we force query not to use index, we get
> those entries.
>      2. there are duplicate values for primary keys

This is no surprise. Your slave has partial data commits, which means
your table, index, or possibly both, are corrupt.

The first thing you need to do is back up any tables you've noticed are
having this behavior. Second, try to reindex the tables that are having
problems. The errors you are seeing are due to the data and indexes
being out of sync. If you get an error that says the reindex fails due
to duplicate values, you can do this:

SET enable_indexscan TO false;
SET enable_bitmapscan TO false;

SELECT primary_key, count(1)
   FROM broken_table
  GROUP BY 1
HAVING count(1) > 1;

For any ID that comes back, do this:

SELECT ctid, *
   FROM broken_table
  WHERE primary_key = [value(s) from above];

Then you need to delete one of the bad rows after deciding which. Use
the CTID of the row you want to delete:

DELETE FROM broken_table
  WHERE ctid = 'whatever';

Then reindex the table so the correct values are properly covered. Doing
this for all of your corrupt tables may take a while depending on how
many there are.

--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email


pgsql-general by date:

Previous
From: Merlin Moncure
Date:
Subject: Re: Alternative to psql -c ?
Next
From: Jeff Janes
Date:
Subject: Re: python modul pre-import to avoid importing each time