Re: Add CREATE support to event triggers - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Add CREATE support to event triggers
Date
Msg-id 20141113124517.GK13473@awork2.anarazel.de
Whole thread Raw
In response to Re: Add CREATE support to event triggers  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Add CREATE support to event triggers  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2014-11-13 07:17:32 -0500, Robert Haas wrote:
> On Wed, Nov 12, 2014 at 4:58 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > That's already the situation today with all the logical replication
> > solutions. They *constantly* break in the field. Most commonly because
> > of DDL differences.
> 
> Right.  And that's why it's cool that logical decoding can operate
> through DDL differences.  The apply side might not be able to cope
> with what pops out, but that's not logical decoding's fault, and
> different apply-sides can adopt different policies as to how to deal
> with whatever problems crop up.

I think pretty much all of the solutions just say "oops, you're on your
own". And I can't blame them for that. Once there's a schema difference
and it causes problem there's really not much that can be done.

> > I don't understand why you think it's likely for logical replication to
> > break due to this? You mean because deparse yielded a invalid statement?
> > In a normal single master setup there really shouldn't be scenarios
> > where that happens? Except bugs - but as you know we had more than in
> > HS/SR as well?
> 
> I don't know exactly what you mean by "a normal single master setup".
> Surely the point of logical decoding is that the replica might not be
> identical to the master.

I actually think that that's not primary the point if you talk about
individual objects. The majority of objects will be exactly the same on
all nodes. If you actually want to have differening objects on the nodes
you'll have to opt out/in (depending on your solution) of ddl
replication for those objects.

> And if it isn't, then a command that
> succeeded on the master might fail on the standby - for example,
> because an object by that name already exists there, or because a type
> doesn't exist there.  (Even if you replicate a CREATE EXTENSION
> command, there's no guarantee that the .so exists on the target.) Then
> what?

Sure. There's reasons logical replication isn't always a win. But I
don't see why that's a reason not to make it as robust as possible.

Btw, the .so problem exists for wal shipping as as well.

> This is basically the same problem as multi-master replication
> conflicts, except with DDL.  Resolving replication conflicts is not a
> very easy thing to get right even if you're only concerned about the
> rows in the tables.  It's probably harder if you're worried about the
> schema, too.

I don't think it's a sane thing to do multimaster with differing schemas
for individual relations, except maybe additional nonunique indexes.

> > We are thinking about extending 2PC to be usable across logical
> > decoding. That's a relatively simple patch. Then it's possible to do the
> > DDL on the primary, ship it to the standby, apply it there, and only
> > afterwards commit the prepared xact if that was successfull.  That's
> > quite cool - but somewhat in the remit of the replication solution.
> 
> That would certainly make the user aware of a quite a few kinds of
> errors that might otherwise go undetected.

It is (or rather would be) a generally quite cool feature imo ;)

> > The solution here doesn't force you to do that, does it? It's something
> > that can be used by more than replication solution?
> 
> In theory, yes.

What's the practical point here?

> > I just don't see the alternative you're proposing? I've so far not even
> > seen a credible *sketch* of an alternative design that also can handle
> > ALTER.  The only current alternatives are 1) the user inserts some
> > events into the queue manually. If they depend on any local state you're
> > screwed. If they have syntax errors they're often screwed. 2). The user
> > does all actions on the standby first. Then on the primary. That's hard
> > for ALTER ADD COLUMN and similar, and just about impossible for renaming
> > things.
> 
> It's a really hard problem.
> 
> I don't think it's possible to make statement-based replication
> no-fail.

I think generally logical replication has more failure cases than
physical ones. Which you seem to agree with.

> Physical replication is basically no-fail because it just
> says, hey, go write these bytes into this page, and we can pretty much
> always do that. But statement-based logical replication means
> basically executing arbitrary chunks of code all over the backend, and
> there is just no way to guarantee that code won't throw an error.  So
> the best we can do is to hope that those errors will get reported back
> to the user, which is going to require some kind of distributed
> transaction.  Your idea to just run the replicated DDL statements on
> the standby before committing on the master is one approach to that
> problem, and probably the simplest one, but not the only one - one can
> imagine something that resembles true clustering, for example.

I think that's generally not what people need for primary/standby
cases (of subsets of tables). In practice there aren't many failures
like that besides schema differences. And there's just many usecases
that aren't doable with physical replication, so we can't advise people
doing that.

> By the way, the fact that you're planning to do log-based replication
> of DML and trigger-based replication of DDL scares the crap out of me.
> I'm not sure how that's going to work at all if the two are
> interleaved in the same transaction.

Maybe that's based on a misunderstanding. All the event trigger does is
insert a event into a (local) queue. That's then shipped to the other
side using the same replication mechanisms as used for rows. Then, when
receiving changes in that ddl queue the standby performs those actions
before continuing with the replay.
That makes the interleaving on the standby to be pretty much the same as
on the primary.

> Also, relying on triggers for
> replication is generally not awesome, because it increases
> administrative complexity.  Event triggers are probably better in that
> regard than ordinary triggers, because they're database-wide, but I
> don't think they solve the problem completely.

Maybe. It's not that hard to make the DDL solution record dependencies
against the event triggers to prohibit the user from dropping them or
such if that's your worry.

We can easily (additionally) define "implicit" event triggers that are
declared using a hook. Then some replication extension can just force
them to be applied.

> But the thing that
> scares me even more is that the DDL replication is not only
> trigger-based, but statement-based.  Why don't we do logical
> replication by recording all of the SQL statements that run on the
> master and re-executing them on the standby?  Well, because we all
> know that there will be plenty of important cases where that doesn't
> yield the same results on both servers.  There's no intrinsic reason
> why that shouldn't also be a problem for DDL replication, and indeed
> it is.  This patch set is trying to patch around that by finding a way
> to emit a revised DDL statement that is guaranteed to do exactly the
> same thing on both machines, and it's probably possible to do that in
> most cases, but it's probably not possible to do that in all cases.

Yes, it's not trivial. And I think there's some commands where you might
not want to try but either scream ERROR or just rereplicate the whole
relation or such.

I very strongly feel that we (as postgres devs) have a *much* higher
chance of recognizing these cases than either some random users (that
write slonik scripts or similar) or some replication solution authors
that aren't closely involved with -hackers.

> These are just ruminations on the difficulty of doing truly robust DDL
> replication.

Right, it's far from easy. But imo that's an argument for providing the
tools to do it as robust as we can in core.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: group locking: incomplete patch, just for discussion
Next
From: Petr Jelinek
Date:
Subject: Re: tracking commit timestamps