Re: Why we lost Uber as a user - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Why we lost Uber as a user
Msg-id 20160817133535.GA4293@momjian.us
In response to Re: Why we lost Uber as a user  (Craig Ringer <craig@2ndquadrant.com>)
Responses Re: Why we lost Uber as a user  (Craig Ringer <craig@2ndquadrant.com>)
List pgsql-hackers
On Wed, Aug 17, 2016 at 01:27:18PM +0800, Craig Ringer wrote:
> It's really bugging me that people are talking about "statement based"
> replication in MySQL as if it's just sending SQL text around. MySQL's statement
> based replication is a lot smarter than that, and in the
> actually-works-properly form it's a hybrid of row and statement based
> replication ("MIXED" mode). As I understand it, it lobs around something closer
> to parsetrees with some values pre-computed rather than SQL text where
> possible. It stores some computed values of volatile functions in the binlog
> and reads them from there rather than computing them again when running the
> statement on replicas, which is why AUTO_INCREMENT etc works. It also falls
> back to row based replication where necessary for correctness. Even then it has
> a significant list of caveats, but it's pretty damn impressive. I didn't
> realise how clever the hybrid system was until recently.
> 
> I can see it being desirable to do something like that eventually as an
> optimisation to logical decoding based replication. Where we can show that the
> statement is safe or make it safe by doing things like evaluating and
> substituting volatile function calls, xlog a modified parsetree with oids
> changed to qualified object names etc, send that when decoding, and execute
> that on the downstream(s). If there's something we can't show to be safe then
> replay the logical rows instead. That's way down the track though; I think it's
> more important to focus on completing logical row-based replication to the
> point where we handle table rewrites seamlessly and it "just works" first.

That was very interesting, and good to know.  I assume it also covers
concurrent activity issues which I wrote about in this thread, e.g.

> I saw from the Uber article that they weren't going to per-row logical
> replication but _statement_ replication, which is very hard to do
> because typical SQL doesn't record what concurrent transactions
> committed before a new statement's transaction snapshot is taken, and
> doesn't record lock order for row updates blocked by concurrent activity
> --- both of which affect the final result from the query.

I assume they can do SQL-level replication when there is no other
concurrent activity on the table, and row-based in other cases?
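
As a rough illustration of the concern above, here is a toy Python model. It is
not how MySQL or PostgreSQL implement replication: the dict-based "table", the
helper names, and the choose_format() rule are all hypothetical, and the rule
simply restates my guess rather than any documented MySQL behaviour. It shows
that replaying statements in a different effective order diverges from the
primary, while replaying logged row after-images does not.

```python
# Toy model only: a "table" is a dict mapping row id -> value.

def double(rows):
    """Stands in for: UPDATE t SET x = x * 2"""
    return {k: v * 2 for k, v in rows.items()}

def add_ten(rows):
    """Stands in for: UPDATE t SET x = x + 10"""
    return {k: v + 10 for k, v in rows.items()}

def replay_statements(rows, statements):
    """Statement-based replay: re-execute each statement in log order."""
    for stmt in statements:
        rows = stmt(rows)
    return rows

def replay_rows(rows, row_images):
    """Row-based replay: overwrite rows with the logged after-images."""
    rows = dict(rows)
    for image in row_images:
        rows.update(image)
    return rows

def choose_format(concurrent_writers):
    """Hypothetical rule: statement format only when nothing else writes."""
    return "statement" if concurrent_writers == 0 else "row"

primary_start = {1: 5}

# On the primary, the effective order was double, then add_ten.
after_double = double(primary_start)
primary_end = add_ten(after_double)

# A replica that re-executes the statements in the other order diverges.
replica_stmt = replay_statements(primary_start, [add_ten, double])

# A replica that applies the logged after-images matches the primary.
replica_rows = replay_rows(primary_start, [after_double, primary_end])
```

Here replica_stmt ends up different from primary_end, while replica_rows
matches it, which is why the statement format only looks safe when the
execution order cannot matter.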

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +


