Thread: Feature Request - DDL deployment with logical replication
Hello!
I have not seen much discussion about what the plans are for being able to manage schema changes when using logical replication. In our own infrastructure, mechanisms that manage DDL statements at the same transactional point as they occur on the master, such as replicate_ddl_command from pglogical, have been immensely useful to us.
Although we are thrilled with some of the features already in logical replication, this missing feature is the #1 reason that we don't plan to take a serious look at built-in logical replication even for pg11. Using pglogical together with our own extension pgl_ddl_deploy, we have been able to deploy logical replication broadly, with schema changes managed well and without serious overhauls to our SDLC process. We really want a mechanism to put through DDL changes at the same transactional point on the subscribers as on the publishers, which also resolves the complexities of deploying master-first or slave-first in some interesting cases.
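[For readers unfamiliar with the pglogical mechanism referred to above, usage looks roughly like the following; the table and column names are hypothetical. The DDL string is executed on the provider and queued so that subscribers replay it at the same transactional point. Note that pglogical requires fully schema-qualified names in the command text.]

```sql
-- Sketch: execute DDL locally and replicate it through the pglogical
-- change stream at the current transactional point.
SELECT pglogical.replicate_ddl_command(
    'ALTER TABLE public.mytable ADD COLUMN note text'
);
```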
Is there any particular vision for how the community might address this need in the future?
Thank you!
Jeremy
On 29.03.2018 20:21, Jeremy Finzel wrote:
> Hello!
>
> I have not seen much discussion about what the plans are for being able to manage schema changes when using logical replication. In our own infrastructure, mechanisms that have been provided to manage DDL statements at the same transactional point as they happen on the master have been immensely useful to us, such as replicate_ddl_command from pglogical.
>
> Although we are thrilled with some of the features already in logical replication, this missing feature is the #1 reason that we don't plan to take a serious look at built-in logical replication even for pg11, because we have been able to use pglogical with our own extension pgl_ddl_deploy in order to broadly deploy logical replication without serious overhauls to our SDLC process, having schema changes managed well. We really want a mechanism to put through DDL changes at the same transactional point on the subscribers as we do on the publishers, which also answers any complexities around deploying master-first or slave-first in some interesting cases.
>
> Is there any particular vision for how the community might address this need in the future?
>
> Thank you!
> Jeremy

Our team in PostgresPro is also very interested in this discussion, because we are using logical decoding in multimaster.

Right now in multimaster, DDL is replicated in the same way as in pglogical: using "logical messages" carrying the corresponding DDL statements. There are a couple of problems with this approach, for example the mix of DDL with DML (CREATE TABLE AS ...), which we have addressed in multimaster.

As an alternative we have considered the possibility of replicating updates of catalog tables. In this case we need to address the following issues:
1. Map OIDs
2. Perform direct manipulation of files (i.e. truncate, alter table, ...)
3. Send invalidations

It looks like it will not be so easy to implement this approach, because we would have to repeat, or somehow reuse, a substantial part of the executor code for these DDL nodes. And this approach definitely works only for Postgres-to-Postgres replication.

So it would be very interesting for us to know the roadmap for DDL support in logical replication.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Fri, Mar 30, 2018 at 9:26 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> Our team in PostgresPro is also very interested in this discussion, because we are using logical decoding in multimaster.
> Right now in multimaster, DDL is replicated in the same way as in pglogical: using "logical messages" carrying the corresponding DDL statements.
> There are a couple of problems with this approach, for example the mix of DDL with DML (CREATE TABLE AS ...), which we have addressed in multimaster.
>
> As an alternative we have considered the possibility of replicating updates of catalog tables. In this case we need to address the following issues:
> 1. Map OIDs
> 2. Perform direct manipulation of files (i.e. truncate, alter table, ...)
One very important note here: since TRUNCATE is not replicated, there are cases where people rely on the current behaviour. For example, we are going to use logical replication for precisely this reason, as a way of replacing our current FDW-based push approach. If TRUNCATE is ever replicated, it would be really nice to be able to opt out of replicating it at the subscription level.
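[As a side note: publications already support choosing which operations to publish via the publish parameter, and PostgreSQL 11 extended that list to cover TRUNCATE, so an opt-out exists at the publication rather than the subscription level. A sketch, with a hypothetical table name:]

```sql
-- PostgreSQL 11 sketch: publish DML but opt out of TRUNCATE replication
-- at the publication level.
CREATE PUBLICATION pub_no_truncate FOR TABLE public.mytable
    WITH (publish = 'insert, update, delete');
```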
> 3. Send invalidations
>
> It looks like it will not be so easy to implement this approach, because we would have to repeat, or somehow reuse, a substantial part of the executor code for these DDL nodes.
> And this approach definitely works only for Postgres-to-Postgres replication.
> So it would be very interesting for us to know the roadmap for DDL support in logical replication.
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
Best Regards,
Chris Travers
Database Administrator
Saarbrücker Straße 37a, 10405 Berlin
On Fri, Mar 30, 2018 at 09:38:44AM +0200, Chris Travers wrote:
> On Fri, Mar 30, 2018 at 9:26 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>> Our team in PostgresPro is also very interested in this discussion, because we are using logical decoding in multimaster.
>> Right now in multimaster DDLs are replicated in the same way as in pglogical, using "logical messages" with the corresponding DDL statements.
>> There are a couple of problems related to this approach, for example the mix of DDL with DML (create table as ...), which we have addressed in multimaster.
>>
>> As an alternative we have considered the possibility of replicating updates of catalog tables. In this case we need to address the following issues:
>> 1. Map OIDs
>> 2. Perform direct manipulation with files (i.e. truncate, alter table, ...)
>
> One very important note here is that since truncate is not replicated there are cases where people are relying on current behaviour. For example, we are going to use logical replication specifically for this reason as a way of replacing our current FDW-based push approach. If truncate is ever replicated, it would be really nice to be able to specify not to replicate it on the subscription level.

Do you have some kind of API in mind for describing subscriptions?

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On 3/29/18 13:21, Jeremy Finzel wrote:
> Although we are thrilled with some of the features already in logical replication, this missing feature is the #1 reason that we don't plan to take a serious look at built-in logical replication even for pg11, because we have been able to use pglogical with our own extension pgl_ddl_deploy in order to broadly deploy logical replication without serious overhauls to our SDLC process, having schema changes managed well. We really want a mechanism to put through DDL changes at the same transactional point on the subscribers as we do on the publishers, which also answers any complexities around deploying master-first or slave-first in some interesting cases.
>
> Is there any particular vision for how the community might address this need in the future?

I think nobody has completely figured this out yet. Whatever is in pglogical and BDR and similar external projects are the best current compromises. But they have lots of problems, so I don't know if anyone is ready to propose something for in core yet.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Mar 30, 2018 at 10:16 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> On 3/29/18 13:21, Jeremy Finzel wrote:
>> Although we are thrilled with some of the features already in logical
>> replication, this missing feature is the #1 reason that we don't plan to
>> take a serious look at built-in logical replication even for pg11,
>> because we have been able to use pglogical with our own extension
>> pgl_ddl_deploy in order to broadly deploy logical replication without
>> serious overhauls to our SDLC process, having schema changes managed
>> well. We really want a mechanism to put through DDL changes at the same
>> transactional point on the subscribers as we do on the publishers, which
>> also answers any complexities around deploying master-first or
>> slave-first in some interesting cases.
>>
>> Is there any particular vision for how the community might address this
>> need in the future?
>
> I think nobody has completely figured this out yet. Whatever is in
> pglogical and bdr and similar external projects are the best current
> compromises. But they have lots of problems, so I don't know if anyone
> is ready to propose something for in core yet.
> --
> Peter Eisentraut http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
IMO, it would be an acceptable and good first step to provide a function that will replicate a SQL command through the replication stream at the right point, even if there is still no automation around it.
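[A sketch of what such a first-step function might look like, modeled loosely on pglogical's replicate_ddl_command. The function name and signature here are purely hypothetical, invented for illustration; they are not an existing or proposed API:]

```sql
-- Hypothetical in-core function (name invented for illustration):
-- execute the DDL locally and insert it into the logical change
-- stream at the current transactional point, so subscribers apply
-- it in the same position relative to surrounding DML.
SELECT pg_replicate_ddl_command(
    'ALTER TABLE public.mytable ADD COLUMN note text'
);
```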
On Fri, Mar 30, 2018 at 2:26 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> Our team in PostgresPro is also very interested in this discussion, because we are using logical decoding in multimaster.
> Right now in multimaster DDLs are replicated in the same way as in pglogical, using "logical messages" with the corresponding DDL statements.
> There are a couple of problems related to this approach, for example the mix of DDL with DML (create table as ...), which we have addressed in multimaster.
IMO, I wouldn't even allow CREATE TABLE AS, especially as an initial feature. In our environment, I have rarely if ever seen an application migration that uses CREATE TABLE AS, and it is an acceptable compromise for us simply to disallow it. The way I have written pgl_ddl_deploy is to disallow CREATE TABLE AS from being replicated. IMO that's a very small price to pay.
My whole point is that in most architectures, DBAs decide to deploy the same SQL on providers and subscribers. Yes, it isn't perfect, but IMO it is very helpful to try to automate that idea, as opposed to trying to actually replicate DDL at the low level. The latter is better, yes, but has proven extremely difficult. Hence the advent of functions to pipe DDL through the replication stream.
Thanks,
Jeremy
On 31 March 2018 at 01:03, Jeremy Finzel <finzelj@gmail.com> wrote:
> On Fri, Mar 30, 2018 at 10:16 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
>> On 3/29/18 13:21, Jeremy Finzel wrote:
>>> Although we are thrilled with some of the features already in logical
>>> replication, this missing feature is the #1 reason that we don't plan to
>>> take a serious look at built-in logical replication even for pg11,
>>> because we have been able to use pglogical with our own extension
>>> pgl_ddl_deploy in order to broadly deploy logical replication without
>>> serious overhauls to our SDLC process, having schema changes managed
>>> well. We really want a mechanism to put through DDL changes at the same
>>> transactional point on the subscribers as we do on the publishers, which
>>> also answers any complexities around deploying master-first or
>>> slave-first in some interesting cases.
>>>
>>> Is there any particular vision for how the community might address this
>>> need in the future?
>>
>> I think nobody has completely figured this out yet. Whatever is in
>> pglogical and bdr and similar external projects are the best current
>> compromises. But they have lots of problems, so I don't know if anyone
>> is ready to propose something for in core yet.
>
> IMO, it would be an acceptable and good first step to provide a function that will replicate a SQL command through the replication stream at the right point, even if there is still no automation around it.
I agree.
Using a regular table as a command queue, as pglogical does, probably won't be acceptable in core, so I'd suggest a very similar approach, just with logical WAL messages as the transport. That is actually a regression in terms of visibility and debuggability (the queue table is great in that you can see what has happened recently), but it means we wouldn't need the hack pglogical uses of special-casing inserts into that table during apply of the normal change stream.
I guess we could add a system catalog that is special-cased by logical decoding, invoking a plugin callback for DDL. That way we'd still have a change history. But really, there is no DDL history table in Postgres normally, and there's no particular reason logical replication should be different, even if it's very convenient in pglogical.
So I suggest a logical WAL message.
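[The logical WAL message mechanism referred to here is exposed at the SQL level as pg_logical_emit_message(). A rough sketch of how a DDL string could ride the change stream; the table name is hypothetical, and the 'ddl' prefix is an arbitrary label that an output plugin's message callback would be written to recognize:]

```sql
-- Sketch: emit the DDL as a transactional logical WAL message and
-- apply it locally in the same transaction. A decoding plugin on the
-- subscriber side could intercept messages with the 'ddl' prefix and
-- execute their content at the same point in the change stream.
BEGIN;
SELECT pg_logical_emit_message(true, 'ddl',
       'ALTER TABLE public.mytable ADD COLUMN note text');
ALTER TABLE public.mytable ADD COLUMN note text;
COMMIT;
```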
This would be a practical patch for you to tackle for Pg 12 if you're interested.
Presently, some of the people involved in in-core logical replication have had less time to dedicate to it due to other core patch needs and various other priorities.
If I get time to work on logical rep for pg12, I expect to be looking mainly at integrating hooks and extension points based on the work we've done in pglogical3, so we can use in-core logical rep as the core transport and wrap faster-iterating new functionality like multimaster around it. The pace at which we're able to iterate in core is presently a bit of a challenge in terms of logical replication enhancements.
Hi Jeremy,

> My whole point is that in most architectures, DBAs decide to deploy the same
> SQL on providers and subscribers. Yes it isn't perfect, but IMO, it is very
> helpful to try to automate that idea, as opposed to trying to actually
> replicate DDL at the low level. The latter is better, yes, but seems to
> have proven extremely difficult. Hence, why you see the advent of functions
> to pipe DDL through the replication stream.

The community is currently working, in the current commitfest, to get logical decoding of 2PC into core. Once something like that gets in, then for the majority of DDL commands (those that work inside transaction blocks), one use case of that functionality could be to trap these DDLs and convert them into implicit 2PC commands. Details need to be worked out, but we would get all the logical replication cluster nodes in sync with each other and issue a PREPARE TRANSACTION involving this DDL on all nodes in the logical replication cluster. If any node is not able to successfully prepare this DDL, then we can roll back; otherwise we commit the 2PC, thus moving the entire logical cluster forward consistently in terms of schema changes.

Regards,
Nikhils
--
Nikhil Sontakke http://www.2ndQuadrant.com/
PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services
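[The flow Nikhil describes can be illustrated with plain SQL; the transaction identifier and table name are hypothetical, and the decode-and-prepare step on the subscribers is exactly the part the proposed 2PC decoding work would add:]

```sql
-- Sketch of the proposed flow: wrap the DDL in a prepared transaction,
-- wait for every node in the cluster to prepare it, then resolve it
-- the same way everywhere.
BEGIN;
ALTER TABLE public.mytable ADD COLUMN note text;
PREPARE TRANSACTION 'ddl_deploy_42';
-- ... each subscriber decodes and prepares the same transaction ...
COMMIT PREPARED 'ddl_deploy_42';   -- or ROLLBACK PREPARED if any node fails
```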
On 31 March 2018 at 15:53, Nikhil Sontakke <nikhils@2ndquadrant.com> wrote:
> Hi Jeremy,
>
>> My whole point is that in most architectures, DBAs decide to deploy the same
>> SQL on providers and subscribers. Yes it isn't perfect, but IMO, it is very
>> helpful to try to automate that idea, as opposed to trying to actually
>> replicate DDL at the low level. The latter is better, yes, but seems to
>> have proven extremely difficult. Hence, why you see the advent of functions
>> to pipe DDL through the replication stream.
>
> The community is currently working, in the current commitfest, to get
> logical decoding of 2PC into core. Once something like that gets in,
> then for the majority of DDL commands (those that work inside
> transaction blocks), one use case of that functionality could be to
> trap these DDLs and convert them into implicit 2PC commands. Details
> need to be worked out, but we would get all the logical replication
> cluster nodes in sync with each other and issue a PREPARE TRANSACTION
> involving this DDL on all nodes in the logical replication cluster. If
> any node is not able to successfully prepare this DDL, then we can
> roll back; otherwise we commit the 2PC, thus moving the entire logical
> cluster forward consistently in terms of schema changes.
We'll still need a mechanism to transport them to downstreams (like WAL messages) and to send responses upstream. For responses, I think we will finally want to add a backchannel to the logical replication protocol, as I've wanted for a long while: the downstream sends a COPY message on the COPY BOTH protocol back to the upstream, which passes it to a callback on the output plugin for the output plugin to act on.
The main issue I had when I tried to prototype this before was, IIRC, not knowing how to set up the right snapshot in which to execute the callback.
On 2018-03-31 22:13:42 +0800, Craig Ringer wrote:
> We'll still need a mechanism to transport them to downstreams (like WAL
> messages) and to send responses upstream. For responses I think we will
> finally want to add a backchannel to the logical replication protocol as
> I've wanted for a long while: downstream can send a COPY message on COPY
> BOTH proto back to upstream, which passes it to a callback on the output
> plugin for the output plugin to act on.

Not necessarily? You can just send out the prepare, wait for all clients to ack it, and then commit/rollback prepared.

Greetings,

Andres Freund
On 1 April 2018 at 00:57, Andres Freund <andres@anarazel.de> wrote:
> On 2018-03-31 22:13:42 +0800, Craig Ringer wrote:
>> We'll still need a mechanism to transport them to downstreams (like WAL
>> messages) and to send responses upstream. For responses I think we will
>> finally want to add a backchannel to the logical replication protocol as
>> I've wanted for a long while: downstream can send a COPY message on COPY
>> BOTH proto back to upstream, which passes it to a callback on the output
>> plugin for the output plugin to act on.
>
> Not necessarily? You can just send out the prepare, wait for all
> clients to ack it, and then commit/rollback prepared.
We then lack any mechanism by which you can NACK, saying "I can't apply this".
So upstream will wait indefinitely. I guess we just expect the user to intervene and ROLLBACK if they decide a replica isn't going to get the job done, or have checked the replica's logs and found it can't apply it for some hopefully-sane reason.
It's not like we'd auto-ROLLBACK PREPARED in response to a nack from a downstream anyway, so all we're missing is probably info in the upstream logs about which replica(s) cannot apply it and why.
OK. So it'd be a nice-to-have, but not vital.
Hi,

On 2018-04-02 23:07:17 +0800, Craig Ringer wrote:
> We then lack any mechanism by which you can NACK, saying "I can't apply
> this".

Sure, but nothing forces this mechanism to be in-band.

> So upstream will wait indefinitely. I guess we just expect the user to
> intervene and ROLLBACK if they decide a replica isn't going to get the job
> done, or have checked the replica's logs and found it can't apply it for
> some hopefully-sane reason.
>
> It's not like we'd auto-ROLLBACK PREPARED in response to a nack from a
> downstream anyway, so all we're missing is probably info in the upstream
> logs about which replica(s) cannot apply it and why.
>
> OK. So it'd be a nice-to-have, but not vital.

I'm not sure that an in-band mechanism that's the same for all potential users is flexible enough (actually unsure, not intimating it's wrong). It doesn't seem crazy to do these checks over a separate connection. That'd allow more flexible error handling etc.

Greetings,

Andres Freund