Re: Why we lost Uber as a user - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Why we lost Uber as a user
Date
Msg-id CACjxUsMmXzurHeHgfW8PtLAy37zyGJXihX1c7gJxO+WFnNfZWQ@mail.gmail.com
In response to Re: Why we lost Uber as a user  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Why we lost Uber as a user  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
List pgsql-hackers
On Wed, Aug 3, 2016 at 2:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Joshua D. Drake" <jd@commandprompt.com> writes:
>> On 08/03/2016 11:23 AM, Tom Lane wrote:
>>> I think the realistic answer if you suffer replication-induced corruption
>>> is usually going to be "re-clone that slave", and logical rep doesn't
>>> really offer much gain in that.
>
>> Yes, it actually does. The ability to unsubscribe a set of tables,
>> truncate them and then resubscribe them is vastly superior to having to
>> take a base backup.
>
> True, *if* you can circumscribe the corruption to a relatively small
> part of your database, logical rep might provide more support for a
> partial re-clone.

When I worked with Wisconsin Courts to migrate their databases to
PostgreSQL, we had a DBMS-agnostic logical replication system, and
we had a compare program that could be run off-hours or left as a
background activity for the replication software to work on during
idle time.  Either way, a range of rows based on
primary key was read on each side and hashed, the hashes compared,
and if they didn't match there was a column-by-column compare for
each row in the range, with differences listed.  This is how we
discovered issues like the non-standard handling of backslash
mangling our data.
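
To make the compare step above concrete, here is a minimal sketch in
Python.  It is not the Wisconsin Courts program; the function names, the
in-memory lists of dicts standing in for the two databases, the "id" key
column, and the choice of SHA-256 are all assumptions made for
illustration.  It hashes each primary-key range on both sides and only
falls back to a column-by-column compare when the hashes differ.

```python
import hashlib


def range_hash(rows):
    """Hash a list of rows (dicts) using a deterministic serialization."""
    h = hashlib.sha256()
    for row in rows:
        # Sort columns by name so dict ordering cannot change the hash.
        h.update(repr(sorted(row.items())).encode("utf-8"))
    return h.hexdigest()


def compare_range(source_rows, target_rows, key="id"):
    """Return (pk, column, source_value, target_value) for each difference."""
    # Cheap check first: identical hashes mean the range needs no more work.
    if range_hash(source_rows) == range_hash(target_rows):
        return []
    src = {r[key]: r for r in source_rows}
    tgt = {r[key]: r for r in target_rows}
    diffs = []
    for pk in sorted(src.keys() | tgt.keys()):
        s, t = src.get(pk), tgt.get(pk)
        if s is None or t is None:
            # Row exists on one side only; column is reported as None.
            diffs.append((pk, None, s, t))
            continue
        for col in sorted(s.keys() | t.keys()):
            if s.get(col) != t.get(col):
                diffs.append((pk, col, s.get(col), t.get(col)))
    return diffs


def compare_tables(source, target, key="id", range_size=1000):
    """Walk both sides in primary-key ranges and collect all differences."""
    src = {r[key]: r for r in source}
    tgt = {r[key]: r for r in target}
    keys = sorted(src.keys() | tgt.keys())
    diffs = []
    for i in range(0, len(keys), range_size):
        chunk = keys[i:i + range_size]
        # Rows on each side are listed in primary-key order for hashing.
        s_rows = [src[k] for k in chunk if k in src]
        t_rows = [tgt[k] for k in chunk if k in tgt]
        diffs.extend(compare_range(s_rows, t_rows, key))
    return diffs


if __name__ == "__main__":
    # Hypothetical data: the target lost a backslash in one value.
    source = [{"id": 1, "name": "a\\b"}, {"id": 2, "name": "c"}]
    target = [{"id": 1, "name": "ab"}, {"id": 2, "name": "c"}]
    for diff in compare_tables(source, target, range_size=2):
        print(diff)
```

In a real deployment each side would presumably be read with a query
bounded by the primary-key range, and the hash could be computed inside
each database so that only the digests cross the network; the sketch
skips both to stay self-contained.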

Personally, I can't imagine running logical replication of
supposedly matching sets of data without something equivalent.

Certainly, the courts had source documents to use for resolving any
question of the correct value on a mismatch, and I would imagine
that many environments would.  If you have a meaningful primary key
(like a court case number, by which the file folder is physically
located), seeing the different values for a specific column in a
specific row makes fixes pretty straightforward.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


