Thread: Feature Request - DDL deployment with logical replication

Feature Request - DDL deployment with logical replication

From
Jeremy Finzel
Date:
Hello!

I have not seen much discussion about what the plans are for being able to manage schema changes when using logical replication.  In our own infrastructure, mechanisms that have been provided to manage DDL statements at the same transactional point as they happen on the master have been immensely useful to us, such as replicate_ddl_command from pglogical.

Although we are thrilled with some of the features already in logical replication, this missing feature is the #1 reason that we don't plan to take a serious look at built-in logical replication even for pg11, because we have been able to use pglogical with our own extension pgl_ddl_deploy in order to broadly deploy logical replication without serious overhauls to our SDLC process, having schema changes managed well.  We really want a mechanism to put through DDL changes at the same transactional point on the subscribers as we do on the publishers, which also answers any complexities around deploying master-first or slave-first in some interesting cases.

Is there any particular vision for how the community might address this need in the future?

Thank you!
Jeremy

Re: Feature Request - DDL deployment with logical replication

From
Konstantin Knizhnik
Date:

On 29.03.2018 20:21, Jeremy Finzel wrote:
> Hello!
>
> I have not seen much discussion about what the plans are for being 
> able to manage schema changes when using logical replication.  In our 
> own infrastructure, mechanisms that have been provided to manage DDL 
> statements at the same transactional point as they happen on the 
> master have been immensely useful to us, such as replicate_ddl_command 
> from pglogical.
>
> Although we are thrilled with some of the features already in logical 
> replication, this missing feature is the #1 reason that we don't plan 
> to take a serious look at built-in logical replication even for pg11, 
> because we have been able to use pglogical with our own extension 
> pgl_ddl_deploy in order to broadly deploy logical replication without 
> serious overhauls to our SDLC process, having schema changes managed 
> well.  We really want a mechanism to put through DDL changes at the 
> same transactional point on the subscribers as we do on the 
> publishers, which also answers any complexities around deploying 
> master-first or slave-first in some interesting cases.
>
> Is there any particular vision for how the community might address 
> this need in the future?
>
> Thank you!
> Jeremy

Our team in PostgresPro is also very interested in this discussion, 
because we are using logical decoding in multimaster.
Right now in multimaster DDLs are replicated in this same way as in 
pglogical, using "logical messages" with correspondent DDL statements.
There are a couple of problems related with this approach, for example 
mix of DDL with DML (create table as...) which we have addressed in 
multimaster.

As an alternative we have considered possibility to replicate updates of 
catalog tables. In this case we need  to address the following issues:
1. Map OIDs
2. Perform direct manipulation with files (i.e. truncate, alter table,...)
3. Send invalidations

Looks like it will be not so easy to implement this approach because we 
have to repeat or somehow reuse substantial part of executor code for 
this DDL nodes.
And definitely this approach works only for Postgres-to-Postgres 
replication.

So it will be very interesting for us to know roadmap for DDL support in 
logical replication.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Feature Request - DDL deployment with logical replication

From
Chris Travers
Date:


On Fri, Mar 30, 2018 at 9:26 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:


Our team in PostgresPro is also very interested in this discussion, because we are using logical decoding in multimaster.
Right now in multimaster DDLs are replicated in this same way as in pglogical, using "logical messages" with correspondent DDL statements.
There are a couple of problems related with this approach, for example mix of DDL with DML (create table as...) which we have addressed in multimaster.

As an alternative we have considered possibility to replicate updates of catalog tables. In this case we need  to address the following issues:
1. Map OIDs
2. Perform direct manipulation with files (i.e. truncate, alter table,...)

One very important note here is that since truncate is not replicated there are cases where people are relying on current behaviour.  For example, we are going to use logical replication specifically for this reason as a way of replacing our current FDW-based push approach.  If truncate is ever replicated, it would be really nice to be able to specify not to replicate it on the subscription level.
 
3. Send invalidations

Looks like it will be not so easy to implement this approach because we have to repeat or somehow reuse substantial part of executor code for this DDL nodes.
And definitely this approach works only for Postgres-to-Postgres replication.

So it will be very interesting for us to know roadmap for DDL support in logical replication.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





--
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com 
Saarbrücker Straße 37a, 10405 Berlin

Re: Feature Request - DDL deployment with logical replication

From
David Fetter
Date:
On Fri, Mar 30, 2018 at 09:38:44AM +0200, Chris Travers wrote:
> On Fri, Mar 30, 2018 at 9:26 AM, Konstantin Knizhnik <
> k.knizhnik@postgrespro.ru> wrote:
> 
> > Our team in PostgresPro is also very interested in this discussion,
> > because we are using logical decoding in multimaster.
> > Right now in multimaster DDLs are replicated in this same way as in
> > pglogical, using "logical messages" with correspondent DDL statements.
> > There are a couple of problems related with this approach, for example mix
> > of DDL with DML (create table as...) which we have addressed in multimaster.
> >
> > As an alternative we have considered possibility to replicate updates of
> > catalog tables. In this case we need  to address the following issues:
> > 1. Map OIDs
> > 2. Perform direct manipulation with files (i.e. truncate, alter table,...)
> >
> 
> One very important note here is that since truncate is not replicated there
> are cases where people are relying on current behaviour.  For example, we
> are going to use logical replication specifically for this reason as a way
> of replacing our current FDW-based push approach.  If truncate is ever
> replicated, it would be really nice to be able to specify not to replicate
> it on the subscription level.

Do you have some kind of API in mind for describing subscriptions?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Feature Request - DDL deployment with logical replication

From
Peter Eisentraut
Date:
On 3/29/18 13:21, Jeremy Finzel wrote:
> Although we are thrilled with some of the features already in logical
> replication, this missing feature is the #1 reason that we don't plan to
> take a serious look at built-in logical replication even for pg11,
> because we have been able to use pglogical with our own extension
> pgl_ddl_deploy in order to broadly deploy logical replication without
> serious overhauls to our SDLC process, having schema changes managed
> well.  We really want a mechanism to put through DDL changes at the same
> transactional point on the subscribers as we do on the publishers, which
> also answers any complexities around deploying master-first or
> slave-first in some interesting cases.
> 
> Is there any particular vision for how the community might address this
> need in the future?

I think nobody has completely figured this out yet.  Whatever is in
pglogical and bdr and similar external projects are the best current
compromises.  But they have lots of problems, so I don't know if anyone
is ready to propose something for in core yet.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Feature Request - DDL deployment with logical replication

From
Jeremy Finzel
Date:


On Fri, Mar 30, 2018 at 10:16 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 3/29/18 13:21, Jeremy Finzel wrote:
> Although we are thrilled with some of the features already in logical
> replication, this missing feature is the #1 reason that we don't plan to
> take a serious look at built-in logical replication even for pg11,
> because we have been able to use pglogical with our own extension
> pgl_ddl_deploy in order to broadly deploy logical replication without
> serious overhauls to our SDLC process, having schema changes managed
> well.  We really want a mechanism to put through DDL changes at the same
> transactional point on the subscribers as we do on the publishers, which
> also answers any complexities around deploying master-first or
> slave-first in some interesting cases.
>
> Is there any particular vision for how the community might address this
> need in the future?

I think nobody has completely figured this out yet.  Whatever is in
pglogical and bdr and similar external projects are the best current
compromises.  But they have lots of problems, so I don't know if anyone
is ready to propose something for in core yet.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

IMO, it would be an acceptable and good first step to provide a function that will replicate a SQL command through the replication stream at the right point, even if there is still no automation around it.

Re: Feature Request - DDL deployment with logical replication

From
Jeremy Finzel
Date:


On Fri, Mar 30, 2018 at 2:26 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:

Our team in PostgresPro is also very interested in this discussion, because we are using logical decoding in multimaster.
Right now in multimaster DDLs are replicated in this same way as in pglogical, using "logical messages" with correspondent DDL statements.
There are a couple of problems related with this approach, for example mix of DDL with DML (create table as...) which we have addressed in multimaster.

IMO, I wouldn't even allow CREATE TABLE AS especially as an initial feature.  In our environment, I have rarely if even seen an application migration that uses CREATE TABLE AS, and it's an acceptable compromise for us to simply disallow it.  The way I have written pgl_ddl_deploy is to disallow CREATE TABLE AS from being replicated.  IMO that's a very small price to pay.

My whole point is that in most architectures, DBAs decide to deploy the same SQL on providers and subscribers.  Yes it isn't perfect, but IMO, it is very helpful to try to automate that idea, as opposed to trying to actually replicate DDL at the low level.  The latter is better, yes, but seems to have proven extremely difficult.  Hence, why you see the advent of functions to pipe DDL through the replication stream.

Thanks,
Jeremy

Re: Feature Request - DDL deployment with logical replication

From
Craig Ringer
Date:
On 31 March 2018 at 01:03, Jeremy Finzel <finzelj@gmail.com> wrote:


On Fri, Mar 30, 2018 at 10:16 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 3/29/18 13:21, Jeremy Finzel wrote:
> Although we are thrilled with some of the features already in logical
> replication, this missing feature is the #1 reason that we don't plan to
> take a serious look at built-in logical replication even for pg11,
> because we have been able to use pglogical with our own extension
> pgl_ddl_deploy in order to broadly deploy logical replication without
> serious overhauls to our SDLC process, having schema changes managed
> well.  We really want a mechanism to put through DDL changes at the same
> transactional point on the subscribers as we do on the publishers, which
> also answers any complexities around deploying master-first or
> slave-first in some interesting cases.
>
> Is there any particular vision for how the community might address this
> need in the future?

I think nobody has completely figured this out yet.  Whatever is in
pglogical and bdr and similar external projects are the best current
compromises.  But they have lots of problems, so I don't know if anyone
is ready to propose something for in core yet.
 
IMO, it would be an acceptable and good first step to provide a function that will replicate a SQL command through the replication stream at the right point, even if there is still no automation around it.


I agree.

Using a regular table as a command queue like pglogical does probably won't be acceptable in core. But I'd suggest using a very similar approach, just with logical WAL messages as the replication mechanism. This is actually a regression in terms of how visible/debuggable it is; the queue table is great in that you can see what's happened recently. But it means we wouldn't need the hack pglogical does of special casing inserts into that table during apply of the normal change stream.

I guess we could add a system catalog that special cased by logical decoding, invoking a plugin callback for DDL. That way we'd still have a change history. But really, it's not like there's a DDL history table for postgres normally, and there's no particular reason logical replication should be different, even if it's very convenient in pglogical.

So I suggest a logical WAL message.

This would be a practical patch for you to tackle for Pg 12 if you're interested.

Presently some of the people involved in in-core logical rep have been able to dedicate less time due to other core patch needs and various other priorities.

If I get time to work on logical rep for pg12, I expect to be looking mainly at integrating hooks and extension points based on the work we've done in pglogical3, so we can use in-core logical rep as the core transport and wrap faster-iterating new functionality like multimaster around it. The pace at which we're able to iterate in core is presently a bit of a challenge in terms of logical replication enhancements.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Feature Request - DDL deployment with logical replication

From
Nikhil Sontakke
Date:
Hi Jeremy,

> My whole point is that in most architectures, DBAs decide to deploy the same
> SQL on providers and subscribers.  Yes it isn't perfect, but IMO, it is very
> helpful to try to automate that idea, as opposed to trying to actually
> replicate DDL at the low level.  The latter is better, yes, but seems to
> have proven extremely difficult.  Hence, why you see the advent of functions
> to pipe DDL through the replication stream.
>

The community is currently working on in the current commitfest to try
and get logical decoding of 2PC in into the core.

Once something like that gets in, for a majority of subset of DDLs
(which works inside transaction blocks), one of the use cases of that
functionality could be to trap these DDLs and convert them into
implicit 2PC commands. Details need to be worked out, but we would get
all the logical replication cluster nodes in sync with each other and
issue a PREPARE transaction involving this DDL on all nodes in the
logical replication cluster. If any of the nodes is not able to
successfully prepare this DDL, then we can rollback or else commit the
2PC, thus moving the entire logical cluster consistently in terms of
schema changes.

Regards,
Nikhils
-- 
 Nikhil Sontakke                   http://www.2ndQuadrant.com/
 PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services


Re: Feature Request - DDL deployment with logical replication

From
Craig Ringer
Date:
On 31 March 2018 at 15:53, Nikhil Sontakke <nikhils@2ndquadrant.com> wrote:
Hi Jeremy,

> My whole point is that in most architectures, DBAs decide to deploy the same
> SQL on providers and subscribers.  Yes it isn't perfect, but IMO, it is very
> helpful to try to automate that idea, as opposed to trying to actually
> replicate DDL at the low level.  The latter is better, yes, but seems to
> have proven extremely difficult.  Hence, why you see the advent of functions
> to pipe DDL through the replication stream.
>

The community is currently working on in the current commitfest to try
and get logical decoding of 2PC in into the core.

Once something like that gets in, for a majority of subset of DDLs
(which works inside transaction blocks), one of the use cases of that
functionality could be to trap these DDLs and convert them into
implicit 2PC commands. Details need to be worked out, but we would get
all the logical replication cluster nodes in sync with each other and
issue a PREPARE transaction involving this DDL on all nodes in the
logical replication cluster. If any of the nodes is not able to
successfully prepare this DDL, then we can rollback or else commit the
2PC, thus moving the entire logical cluster consistently in terms of
schema changes.


We'll still need a mechanism to transport them to downstreams (like WAL messages) and to send responses upstream. For responses I think we will finally want to add a backchannel to the logical replication protocol as I've wanted for a long while: downstream can send a COPY message on COPY BOTH proto back to upstream, which passes it to a callback on the output plugin for the output plugin to act on.

The main issue I had when I tried to prototype this before was IIRC not knowing how to set up the right snapshot in which to execute the callback.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Feature Request - DDL deployment with logical replication

From
Andres Freund
Date:
On 2018-03-31 22:13:42 +0800, Craig Ringer wrote:
> We'll still need a mechanism to transport them to downstreams (like WAL
> messages) and to send responses upstream. For responses I think we will
> finally want to add a backchannel to the logical replication protocol as
> I've wanted for a long while: downstream can send a COPY message on COPY
> BOTH proto back to upstream, which passes it to a callback on the output
> plugin for the output plugin to act on.

Not necessarily?  You can just send out the prepare, wait for all
clients to ack it, and then commit/rollback prepared.

Greetings,

Andres Freund


Re: Feature Request - DDL deployment with logical replication

From
Craig Ringer
Date:
On 1 April 2018 at 00:57, Andres Freund <andres@anarazel.de> wrote:
On 2018-03-31 22:13:42 +0800, Craig Ringer wrote:
> We'll still need a mechanism to transport them to downstreams (like WAL
> messages) and to send responses upstream. For responses I think we will
> finally want to add a backchannel to the logical replication protocol as
> I've wanted for a long while: downstream can send a COPY message on COPY
> BOTH proto back to upstream, which passes it to a callback on the output
> plugin for the output plugin to act on.

Not necessarily?  You can just send out the prepare, wait for all
clients to ack it, and then commit/rollback prepared.

We then lack any mechanism by which you can NACK, saying "I can't apply this".

So upstream will wait indefinitely. I guess we just expect the user to intervene and ROLLBACK if they decide a replica isn't going to get the job done, or have checked the replica's logs and found it can't apply it for some hopefully-sane reason. 

It's not like we'd auto-ROLLBACK PREPARED in response to a nack from a downstream anyway, so all we're missing is probably info in the upstream logs about which replica(s) cannot apply it and why. 

OK. So it'd be a nice-to-have, but not vital.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Feature Request - DDL deployment with logical replication

From
Andres Freund
Date:
Hi,

On 2018-04-02 23:07:17 +0800, Craig Ringer wrote:
> We then lack any mechanism by which you can NACK, saying "I can't apply
> this".

Sure, but nothing forces this mechanism to be in-band.


> So upstream will wait indefinitely. I guess we just expect the user to
> intervene and ROLLBACK if they decide a replica isn't going to get the job
> done, or have checked the replica's logs and found it can't apply it for
> some hopefully-sane reason.
> 
> It's not like we'd auto-ROLLBACK PREPARED in response to a nack from a
> downstream anyway, so all we're missing is probably info in the upstream
> logs about which replica(s) cannot apply it and why.
> 
> OK. So it'd be a nice-to-have, but not vital.

I'm not sure that an in-band mechanism that's the same for all potential
users is flexible enough (actually unsure, not intimating it's wrong).
It doesn't seem crazy to do these checks over a separate
connection. That'd allow more flexible error handling etc.

Greetings,

Andres Freund