Re: Why we lost Uber as a user - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Why we lost Uber as a user
Msg-id CA+TgmoaC4SgcjSJuSESXE_O5f_4ufsbjjLL1dgiLe2rqRDzp1w@mail.gmail.com
In response to Re: Why we lost Uber as a user  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Why we lost Uber as a user  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, Aug 2, 2016 at 5:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Stephen Frost <sfrost@snowman.net> writes:
>> With physical replication, there is the concern that a bug in *just* the
>> physical (WAL) side of things could cause corruption.
>
> Right.  But with logical replication, there's the same risk that the
> master's state could be fine but a replication bug creates corruption on
> the slave.
>
> Assuming that the logical replication works by issuing valid SQL commands
> to the slave, one could hope that this sort of "corruption" only extends
> to having valid data on the slave that fails to match the master.
> But that's still not a good state to be in.  And to the extent that
> performance concerns lead the implementation to bypass some levels of the
> SQL engine, you can easily lose that guarantee too.
>
> In short, I think Uber's position that logical replication is somehow more
> reliable than physical is just wishful thinking.  If anything, my money
> would be on the other way around: there's a lot less mechanism that can go
> wrong in physical replication.  Which is not to say there aren't good
> reasons to use logical replication; I just do not believe that one.

I don't think they are saying that logical replication is more
reliable than physical replication, nor do I believe that to be true.
I think they are saying that if logical corruption happens, you can
fix it by typing SQL statements to UPDATE, INSERT, or DELETE the
affected rows, whereas if physical corruption happens, there's no
equally clear path to recovery.  If an index is damaged, you can
recreate it; if a heap page is damaged such that you can no longer
scan the table, you're going to need expert assistance.

And I think there's some point to that.  I agree with the general
sentiment that they could have gotten further and been more successful
with PostgreSQL if they had some expert advice, but I think it's
indisputable that recovering a physically corrupted database is
generally a lot more painful than recovering one where you only have to
fix up some damaged data.  Whether we really have data-corrupting WAL-replay
bugs sufficiently frequently to make this an ongoing issue rather than
a one-time event is also debatable, but nonetheless I don't think
their point is completely invalid.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


