Re: Corrupted Data ? - Mailing list pgsql-general

From	Ioana Danes
Subject	Re: Corrupted Data ?
Date	August 12, 2016 16:09:48
Msg-id	CAPg0s+5V1b72xx7yT8N_9jz1nsUdYHHaWG4nchaHqE3eghce6A@mail.gmail.com
In response to	Re: Corrupted Data ? (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses	Re: Corrupted Data ? (Melvin Davidson <melvin6925@gmail.com>) Re: Corrupted Data ? (Francisco Olarte <folarte@peoplecall.com>)
List	pgsql-general

Hello Everyone,

I have new information on this case. I also opened a post on the Bucardo list because I am still not sure what triggers this problem.

The problem happened again on the same table, but on another field. A few days ago I started a fourth database, called drdb, which is a PITR slave of db3.

- The record was created on db1 and replicated to db2 and db3
August 11 @ 2:30

- db1, db2 and db3 are in sync (I have a script that compares the data for all 3 dbs every night @ 2:30 am)

August 12 @ 2:30

- db3 is out of sync because of this field (drawid)

- drdb (which is PITRed from db3) is in sync with db1 and db2?????
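
For reference, a nightly comparison of this kind can be sketched as a per-table checksum that is run on each database and compared afterwards. This is only an illustration under stated assumptions, not the actual compare script: the table name my_table and the key column id are hypothetical.

```sql
-- Run on db1, db2, db3 and drdb; the results should be identical
-- when the table is in sync. my_table and id are hypothetical names.
SELECT count(*) AS row_count,
       md5(string_agg(md5(t::text), '' ORDER BY t.id)) AS table_md5
  FROM my_table AS t;

-- If the table-level checksums differ, list per-row checksums
-- and diff the outputs to find the affected rows.
SELECT t.id,
       md5(t::text) AS row_md5
  FROM my_table AS t
 ORDER BY t.id;
```

On a large table the aggregate in the first query holds all row checksums in memory, so comparing in key ranges may be more practical.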

Because drdb (a PITR slave of db3) is in sync with db1 and db2, and because the base backup was taken before the record in question was created, I believe that the xlogs are fine and that db3 has some kind of data corruption in the data file for that table. It must have happened after August 11 @ 2:30, because the compare script found the dbs in sync at that point...

Also, the index on db3 is correct, as the record in question (with drawid = 318216) is retrieved if I filter by drawid = 318220.
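
One way to check whether the heap and the index on db3 disagree is to run the same filter twice, once forcing a sequential scan and once steering the planner towards the index. The following is a rough sketch only; my_table is a hypothetical name for the affected table, and the planner settings are discouragements rather than guarantees, so the plan should be confirmed with EXPLAIN.

```sql
-- my_table is a hypothetical table name; drawid is the column from this post.

BEGIN;
-- Disable index-based plans so the filter is evaluated against heap values.
SET LOCAL enable_indexscan = off;
SET LOCAL enable_bitmapscan = off;
SET LOCAL enable_indexonlyscan = off;
SELECT ctid, xmin, xmax, drawid
  FROM my_table
 WHERE drawid = 318220;
ROLLBACK;

BEGIN;
-- Discourage sequential scans so the row is located through the index.
SET LOCAL enable_seqscan = off;
SELECT ctid, xmin, xmax, drawid
  FROM my_table
 WHERE drawid = 318220;
ROLLBACK;
```

If the second query returns the row and the first does not, the index entry and the value stored in the heap differ. The ctid shows which block holds the tuple.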

Any help is greatly appreciated,

Thank you

On Mon, Aug 8, 2016 at 1:25 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 08/08/2016 10:06 AM, Ioana Danes wrote:

On Mon, Aug 8, 2016 at 12:55 PM, Adrian Klaver> 75315811???

Corrupted index on db3?

yes

Might want to look in the db3 logs to see if anything pops out.

I checked the logs, no traces of errors or corruption.

I just do not know enough about Bucardo to be of much help beyond that.

It is trigger based: it saves the ids of the inserted records in a delta
table and then, on sync, it creates copy commands to the slave. Even if
there is a bug or corruption in that process, I don't see how that
corrupts the index on db3...

It seems to do more than that:

https://bucardo.org/wiki/Bucardo/Documentation/Overview

That is why I suggested the post to the Bucardo list. Folks there will have a better idea of what goes on under the hood.

There is also this from a previous post:

"Only one master is active at one time the other one is in stand by that is a topic for another discussion but in our case that works well."

Have no idea how that interaction plays out.

At this point what I see is:

1) Data is entered on a master and is correct there.

2) Data is replicated to a single standby from one of two possible sources via Bucardo and is no longer correct.

3) Now Bucardo uses Postgres to do its work so it is possible that something in Postgres is at fault. Still the fact that the data is good on the master but not in the standby tends to indicate that the act of replication is the issue.

4) Exactly how that replication is accomplished is not obvious to me.

So it is either replication bug + index corruption on db3 or data
corruption on db3...

In response to Melvin, the query returns no rows:

SELECT n.nspname,
       i.relname,
       i.indexrelname,
       CASE WHEN idx.indisprimary THEN 'pkey'
            WHEN idx.indisunique  THEN 'uidx'
            ELSE 'idx'
       END AS type,
       'INVALID'
  FROM pg_stat_all_indexes i
  JOIN pg_class c     ON (c.oid = i.relid)
  JOIN pg_namespace n ON (n.oid = c.relnamespace)
  JOIN pg_index idx   ON (idx.indexrelid = i.indexrelid)
 WHERE idx.indisvalid = FALSE
 ORDER BY 1, 2;

nspname | relname | indexrelname | type | ?column?
---------+---------+--------------+------+----------
(0 rows)
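
Note that an empty result here only shows that no index is flagged as invalid in pg_index (a flag left behind, for example, by a failed CREATE INDEX CONCURRENTLY); it does not examine the contents of the index or of the table. As a rough sketch of a closer look, assuming the pageinspect extension can be installed and superuser access is available, the tuple headers of one heap block can be listed. The table name my_table and block number 0 are hypothetical placeholders; the block number would come from the first part of the ctid of the affected row.

```sql
-- pageinspect ships with PostgreSQL contrib; its functions require superuser.
CREATE EXTENSION IF NOT EXISTS pageinspect;

-- my_table and block 0 are hypothetical placeholders.
-- Lists every line pointer in the block with the inserting (t_xmin) and
-- deleting/updating (t_xmax) transaction ids, which shows whether the
-- row was rewritten after it was first replicated.
SELECT lp, t_xmin, t_xmax, t_ctid
  FROM heap_page_items(get_raw_page('my_table', 0));
```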

Thank you for your thoughts,
ioana

Thanks,
ioana

--
Adrian Klaver
adrian.klaver@aklaver.com

