Re: Why we lost Uber as a user - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: Why we lost Uber as a user
Date
Msg-id CAMsr+YFXG_Y8gnhXd2_FLvpqRBLV0LTHYFHcKvfWg8rt_Yv-iA@mail.gmail.com
In response to Re: Why we lost Uber as a user  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Responses Re: Why we lost Uber as a user  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On 17 August 2016 at 08:36, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
Something I didn't see mentioned that I think is a critical point: last I looked, HOT standby (and presumably SR) replays full page writes.

Yes, that's right, all WAL-based physical replication replays FPWs.

We could, at the cost of increased WAL size, retain both the original WAL buffer that triggered the FPW and the FPW page image. That's what wal_level = logical does in some cases. I'm not sure it's that compelling though; it just introduces another redo path that can go wrong.

Ultimately, people really need to understand the trade-offs to the different solutions so they can make an informed decision on which ones (yes, plural) they want to use. The same can be said about pg_upgrade vs something else, and the different ways of doing backups.

Right.

It's really bugging me that people are talking about "statement based" replication in MySQL as if it's just sending SQL text around. MySQL's statement based replication is a lot smarter than that, and in the actually-works-properly form it's a hybrid of row and statement based replication ("MIXED" mode). As I understand it, it lobs around something closer to parsetrees with some values pre-computed, rather than SQL text, where possible. It stores some computed values of volatile functions in the binlog and reads them from there rather than computing them again when running the statement on replicas, which is why AUTO_INCREMENT etc. work. It also falls back to row based replication where necessary for correctness. Even then it has a significant list of caveats, but it's pretty damn impressive. I didn't realise how clever the hybrid system was until recently.

I can see it being desirable to do something like that eventually as an optimisation to logical decoding based replication: where we can show that the statement is safe, or make it safe by doing things like evaluating and substituting volatile function calls, we'd xlog a modified parsetree with oids changed to qualified object names etc., send that when decoding, and execute it on the downstream(s). If there's something we can't show to be safe, we'd replay the logical rows instead. That's way down the track though; I think it's more important to focus on completing logical row-based replication to the point where we handle table rewrites seamlessly and it "just works" first.
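
To make the hybrid idea concrete, here is a toy sketch in Python. It is not how MySQL or PostgreSQL implement anything: the names (VOLATILE, is_safe, substitute_volatile, log_change, StatementEvent, RowEvent) are all made up for illustration, the "safety check" is a deliberately crude stand-in for real analysis, and it works on SQL text rather than a parsetree just to keep it short. It only shows the shape of the decision: pre-evaluate volatile calls and ship the statement when that's safe, otherwise ship the changed rows.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

# Hypothetical registry of volatile functions the toy "primary" pre-evaluates.
VOLATILE: Dict[str, Callable[[], object]] = {
    "random()": random.random,
    "now()": lambda: "2016-08-17 08:36:00",  # fixed value to keep the demo deterministic
}


@dataclass
class StatementEvent:
    sql: str  # statement with volatile calls replaced by literal values


@dataclass
class RowEvent:
    table: str
    rows: List[Tuple]  # the rows actually changed on the primary


def is_safe(sql: str) -> bool:
    """Crude stand-in for a real safety analysis (assumption for this sketch):
    a LIMIT without ORDER BY may pick different rows on each replica."""
    upper = sql.upper()
    return not ("LIMIT" in upper and "ORDER BY" not in upper)


def substitute_volatile(sql: str) -> str:
    """Evaluate each volatile call once on the primary and inline the result."""
    for call, fn in VOLATILE.items():
        while call in sql:
            sql = sql.replace(call, repr(fn()), 1)
    return sql


def log_change(sql: str, table: str, changed_rows: List[Tuple]) -> Union[StatementEvent, RowEvent]:
    """Emit a statement event when replay is deterministic, else fall back to rows."""
    if is_safe(sql):
        return StatementEvent(substitute_volatile(sql))
    return RowEvent(table, changed_rows)
```

With this, log_change("INSERT INTO t VALUES (1, now())", "t", [...]) produces a StatementEvent whose SQL carries the literal timestamp, so every replica inserts the same value, while log_change("DELETE FROM t LIMIT 1", "t", [(1,)]) falls back to a RowEvent. The hard part in practice is the safety analysis, which this sketch waves away entirely.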

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
