Re: Timeline following for logical slots - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: Timeline following for logical slots
Date
Msg-id CAMsr+YHToEyhGkLTAOJXvJo1ddqM0A+E=jHBwm_QhTTRHKJycA@mail.gmail.com
Whole thread Raw
In response to Re: Timeline following for logical slots  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Timeline following for logical slots  (Andres Freund <andres@anarazel.de>)
Re: Timeline following for logical slots  (Oleksii Kliukin <alexk@hintbits.com>)
Re: Timeline following for logical slots  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 5 April 2016 at 04:00, Robert Haas <robertmhaas@gmail.com> wrote:

In general, I think we'd be a lot better off if we got some kind of
logical replication into core first and then worked on lifting these
types of limitations afterwards.

First, I'd like to remind everyone that logical decoding is useful for more than replication. You can consume the change stream for audit logging/archival, to feed into a different DBMS, etc etc. This is not just about replicating from one PostgreSQL to another, though to be sure that'll be the main use in the near term.

The Zalando guys at least are already using it for other things, and interest in the json support suggests they're not alone.

Right now if you're doing any kind of logical deocoding from a master server that fails over to a standby the client just dies. The slot vanishes. You're stuffed. Gee, I hope you didn't need all that nice consistent ordering, because you're going to have to start from scratch and somehow reconcile the data in the new master with what you've already received ... and haven't.

We could certainly require clients to jump through all sorts of extra hoops to make sure they can follow replay over physical failover. Or we could say that you shouldn't actually expect to use logical decoding in real world environments where HA is a necessity until we get around to supporting realistic, usable logical-rep based failover in a few years. Or we could make it "just work" for the physical failover systems everyone already uses and relies on, just like sequences, indexes, and everything else in PostgreSQL that's expected to survive failover.

Would you tell someone to use unlogged tables and then do failover with Londiste? 'cos that's not too far from what's being talked about here. Though frankly, Londiste is more capable than the replication currently delivered with pglogical anyway, and it can follow physical failover too.

I don't understand why it seems to be considered OK for logical slots to just vanish on failover. The only other things I can think of where that's considered OK are unlogged tables (because that's the point and we have failover-safe ones too) and the old hash indexes nobody's quite willing to remove yet.

If I had to pick an order in which
to do the things you list, I'd focus first on the one you list second:
being able to stream and begin applying transactions before they've
committed is a really big deal for large transactions, and lots of
people have some large transactions.

I guess. We can manage DDL externally, however ugly that may be, but we can't do much about big xacts, and all but the very purest OLTP systems do some batch work sometimes.

It's still 9.7 material at very, very best, and things like parallel apply then stack on top of it. 


DDL replication is nice, but
realistically, there are a lot of people who simply don't change their
schema all that often, and who could (and might even prefer to) manage
that process in other ways - e.g. change nodes one by one while they
are off-line, then bring them on-line.

Yeah, it's a bit more complex than that. Schema changes *must* be made at a specific point in replay. You can't generally just ALTER TABLE ... DROP COLUMN on the master then do the same thing on the replica. The replica probably still has un-replayed changes from the master that have the now-dropped column in their change stream, but now it can't apply them to the new table structure on the downstream. This particular case can be worked around, but column type changes, addition of non-null columns etc cannot.

You can only do DDL safely by either:

* Freezing all writes to replicated tables on the master and waiting until all logical slots are replayed up to date, then applying the DDL on each node; or

* Passing the desired DDL to a function that inserts it into a special replicated table on the master then executes it on the master. The replica monitors the replicated table for inserts and executes the same DDL on the replica as it replays the change stream. The insert serves as a barrier between old and new table structures.

pglogical supports either approach, but both have major foot-guns attached.

Also, a growing portion of the world uses generated schemas and migrations and can't easily pipe that through some arbitrary function we provide. They may not be too keen on stopping writes to their entire database as an alternative either (and we don't provide a true read-only mode for them to use if they did want to). I know it's fashionable around here to just write them off as idiots for using ORMs and so on, but they're rather widespread idiots who're just as likely to be interested in geographically distributed selective replication, change stream extraction, etc. This has been a persistent problem with people who want to use BDR, too.

There are workarounds available, though, and we can't fix everything at once. So despite the problems I agree that DDL replication is less crucial.
 
I don't think that we need every
physical replication feature plus some before logical replication can
start to be useful to PostgreSQL users generally. 

That much I agree with, though I think our views on the set of features we do need will differ.

Sequences, for example. We don't support those at all right now. We can kind-of support OWNED BY sequences client-side, or use a "snapshot and apply with a fudge factor" approach like Londiste does, so it's not the end of the world. But that's just one of many.

OTOH, I'm sure we'll both agree that not replicating pg_largeobject isn't exactly something to shed tears over.
 
We do, however,
need the functionality to be accessible to people who are using only
the PostgreSQL core distribution.  The thing that is going to get
people excited about making logical replication better is getting to a
point where they can use it at all - and that is not going to be true
as long as you can't use it without having to download something from
an external website.

I guess it's harder for me to see that because I've been working on it for so long.

Review and test responses have been pretty underwhelming for pglogical, and quite a bit seem to have boiled down to "this should live as an extension, we don't need it in core". It often feels like we can't win: if we seek to get it into core we're told it's not wanted/needed, but if we try to focus on solving issues in core to make it work better and let it live as an extension we're told we shouldn't bother until it's in core.

(I do specifically want to thank Andres Freund and Tomasz Rybak for detailed, constructive and helpful review on pglogical though).

Do you want to get a logical replication system into core that doesn't work properly with lots of the other features in PostgreSQL? That's historically not how we've done things here, and sometimes massive amounts of work have been required to make new feature X work with obscure/awkward existing feature Y. Inheritance comes strongly to mind as an exciting challenge for many new features to support properly. Arguments are usually raised that reference things like the mess created by MySQL's storage engines and how we're better off making everything work right once and for all.

OTOH, there's precedent there: inheritance still doesn't work with FKs or properly support UNIQUE constraints on parent relations, for example. And we're increasingly getting to the point where doing everything all at once is just unmanageable.

Still, I don't really want to block work on making logical decoding more real-world usable on inclusion of a logical replication system for PostgreSQL, especially one that'll be lucky to get in for 9.7 at the earliest.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

pgsql-hackers by date:

Previous
From: Abhijit Menon-Sen
Date:
Subject: Re: dealing with extension dependencies that aren't quite 'e'
Next
From: David Rowley
Date:
Subject: Re: Combining Aggregates