Re: replication consistency checking - Mailing list pgsql-admin

From hydra
Subject Re: replication consistency checking
Date
Msg-id CAG6MAzQNQPRSsRL6629id-5i2jv6AQZ0y25=YYU2M3CJAOCiTg@mail.gmail.com
Whole thread Raw
In response to Re: replication consistency checking  (Igor Neyman <ineyman@perceptron.com>)
Responses Re: replication consistency checking  (Igor Neyman <ineyman@perceptron.com>)
List pgsql-admin


On Fri, Jun 5, 2015 at 3:42 PM, Igor Neyman <ineyman@perceptron.com> wrote:

 

 

From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of hydra
Sent: Friday, June 05, 2015 12:33 AM
To: pgsql-admin
Subject: Re: [ADMIN] replication consistency checking

 

On Fri, Jun 5, 2015 at 4:40 AM, Sergey Konoplev <gray.ru@gmail.com> wrote:

On Wed, Jun 3, 2015 at 9:49 PM, hydra <hydrapolic@gmail.com> wrote:
> After setting up streaming replication, is it possible to check whether the
> slave has the same data as the master?
>
> In the MySQL world there is the percona-toolkit with pt-table-checksum that
> does this job:
> https://www.percona.com/doc/percona-toolkit/2.2/pt-table-checksum.html

I believe you only need
http://www.postgresql.org/docs/9.4/static/app-initdb.html#APP-INITDB-DATA-CHECKSUMS.

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (499) 346-7196, +7 (988) 888-1979
gray.ru@gmail.com



Thank you all for replies,
while looking for replication information I found this:
http://thebuild.com/presentations/worst-day-fosdem-2014.pdf

It's a real life experience of hitting this replication bug:
https://wiki.postgresql.org/wiki/Nov2013ReplicationIssue

The primary symptom of this corruption is rows that:

  • are present on the master, but missing on the replica
  • have been deleted on the master still appear to be visible on the replica
  • have been updated, and their old versions appear alongside the new, updated versions on the replica

How can I verify whether I already have this corruption?
There is no known way to identify that the issue has affected a standby in the past but comparing the data from the primary with the standby.

 

So hands up who still think PostgreSQL doesn't need some way of checking the data consistency between master-standby? :)

 

 

My hand is up.

From the wiki page that you referenced you forgot to quote this:

 

This is an issue, discovered Nov. 18, 2013., which can cause data corruption on a Hot-Standby replica when it is (re-)started, by marking committed transactions as uncommitted. This issue is fixed in the December 5th 2013 update releases.”

 

So, what’s your point?  Yes, any software can (and does) have bugs.  What’s important is how quickly the bug is discovered and fixed.

 

You wish to write a utility that compares data on 2 different clusters – sure, by all means, but I believe time could be better spent on something else.

But, how is to say that this utility will be 100% error-free?

 

Regards,

Igor Neyman

 

 


Hello Igor,
no code is without bugs, yes, we are humans. That is why I'm so shocked that everybody blindly trusts the replication (which is code again, plus network and hardware).

My point was to show that such a tool would be beneficial even for PostgreSQL. Yes, even the checking utility can have bugs, but at least you are better than now. Because now everybody relies on the fact that it should be ok. But as we see from the bug report, such problems exist here also. So from the situation where you believe everything should be ok you would have a tool reporting "checksums ok" and would allow your to trust your standby data more.

pgsql-admin by date:

Previous
From: Igor Neyman
Date:
Subject: Re: replication consistency checking
Next
From: Igor Neyman
Date:
Subject: Re: replication consistency checking