Thread: How to shorten a chain of logically replicated servers
Hi, I'm trying to figure out how to shorten a chain of logically replicating servers. Right now we have three servers replicating like so: A --> B --> C And I'd like to remove B from the chain of replication so that I only have: A --> C Of course, doing this without losing data is the goal. If the replication to C breaks temporarily, that's fine, so long as all the changes on A make it to C eventually. I'm not sure how to proceed with this. My best theory is: 1. In a transaction, DISABLE the replication from A to B and start a new PUBLICATION on A that C will subscribe to in step ③ below. The hope is that this will simultaneously stop sending changes to B while starting a log of new changes that can later be sent to C. 2. Let any changes queued on B flush to C. (How to know when they're all flushed?) 3. Subscribe C to the new PUBLICATION created in step ①. Create the subscription with copy_data=False. This should send all changes to C that hadn't been sent to B, without sending the complete tables. 4. DROP all replication to/from B (this is just cleanup; the incoming changes to B were disabled in step ①, and outgoing changes from B were flushed in step ②). Does this sound even close to the right approach? Logical replication can be a bit finicky, so I'd love to have some validation of the general approach before I go down this road. Thanks everybody and happy new year, Mike
Hi, I don't usually like to bump messages on this list, but since I sent mine on New Year's Eve, I figured I'd better. Anybody have any ideas about how to accomplish this? I'm pretty stumped (as you can probably see). On Tue, Dec 31, 2019 at 3:51 PM Mike Lissner <mlissner@michaeljaylissner.com> wrote: > > Hi, I'm trying to figure out how to shorten a chain of logically > replicating servers. Right now we have three servers replicating like > so: > > A --> B --> C > > And I'd like to remove B from the chain of replication so that I only have: > > A --> C > > Of course, doing this without losing data is the goal. If the > replication to C breaks temporarily, that's fine, so long as all the > changes on A make it to C eventually. > > I'm not sure how to proceed with this. My best theory is: > > 1. In a transaction, DISABLE the replication from A to B and start a > new PUBLICATION on A that C will subscribe to in step ③ below. The > hope is that this will simultaneously stop sending changes to B while > starting a log of new changes that can later be sent to C. > > 2. Let any changes queued on B flush to C. (How to know when they're > all flushed?) > > 3. Subscribe C to the new PUBLICATION created in step ①. Create the > subscription with copy_data=False. This should send all changes to C > that hadn't been sent to B, without sending the complete tables. > > 4. DROP all replication to/from B (this is just cleanup; the incoming > changes to B were disabled in step ①, and outgoing changes from B were > flushed in step ②). > > Does this sound even close to the right approach? Logical replication > can be a bit finicky, so I'd love to have some validation of the > general approach before I go down this road. > > Thanks everybody and happy new year, > > Mike
On Tue, 2019-12-31 at 15:51 -0800, Mike Lissner wrote: > Hi, I'm trying to figure out how to shorten a chain of logically > replicating servers. Right now we have three servers replicating like > so: > > A --> B --> C > > And I'd like to remove B from the chain of replication so that I only have: > > A --> C > > Of course, doing this without losing data is the goal. If the > replication to C breaks temporarily, that's fine, so long as all the > changes on A make it to C eventually. > > I'm not sure how to proceed with this. My best theory is: > > 1. In a transaction, DISABLE the replication from A to B and start a > new PUBLICATION on A that C will subscribe to in step ③ below. The > hope is that this will simultaneously stop sending changes to B while > starting a log of new changes that can later be sent to C. > > 2. Let any changes queued on B flush to C. (How to know when they're > all flushed?) > > 3. Subscribe C to the new PUBLICATION created in step ①. Create the > subscription with copy_data=False. This should send all changes to C > that hadn't been sent to B, without sending the complete tables. > > 4. DROP all replication to/from B (this is just cleanup; the incoming > changes to B were disabled in step ①, and outgoing changes from B were > flushed in step ②). > > Does this sound even close to the right approach? Logical replication > can be a bit finicky, so I'd love to have some validation of the > general approach before I go down this road. I don't think that will work. Any changes on A that take place between step 1 and step 3 wouldn't be replicated to C. You'd have to suspend all data modification on A in that interval. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
> You'd have to suspend all data modification on A in that interval. I know how to stop the DB completely, but I can't think of any obvious ways to make sure that it doesn't get any data modification for a period of time. Is there a trick here? This is feeling a bit hopeless. Thanks for the response, Laurenz. On Tue, Jan 7, 2020 at 3:11 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > On Tue, 2019-12-31 at 15:51 -0800, Mike Lissner wrote: > > Hi, I'm trying to figure out how to shorten a chain of logically > > replicating servers. Right now we have three servers replicating like > > so: > > > > A --> B --> C > > > > And I'd like to remove B from the chain of replication so that I only have: > > > > A --> C > > > > Of course, doing this without losing data is the goal. If the > > replication to C breaks temporarily, that's fine, so long as all the > > changes on A make it to C eventually. > > > > I'm not sure how to proceed with this. My best theory is: > > > > 1. In a transaction, DISABLE the replication from A to B and start a > > new PUBLICATION on A that C will subscribe to in step ③ below. The > > hope is that this will simultaneously stop sending changes to B while > > starting a log of new changes that can later be sent to C. > > > > 2. Let any changes queued on B flush to C. (How to know when they're > > all flushed?) > > > > 3. Subscribe C to the new PUBLICATION created in step ①. Create the > > subscription with copy_data=False. This should send all changes to C > > that hadn't been sent to B, without sending the complete tables. > > > > 4. DROP all replication to/from B (this is just cleanup; the incoming > > changes to B were disabled in step ①, and outgoing changes from B were > > flushed in step ②). > > > > Does this sound even close to the right approach? Logical replication > > can be a bit finicky, so I'd love to have some validation of the > > general approach before I go down this road. > > I don't think that will work. > > Any changes on A that take place between step 1 and step 3 wouldn't be > replicated to C. > > You'd have to suspend all data modification on A in that interval. > > Yours, > Laurenz Albe > -- > Cybertec | https://www.cybertec-postgresql.com >
On Tue, 2020-01-07 at 23:17 -0800, Mike Lissner wrote: > > You'd have to suspend all data modification on A in that interval. > > I know how to stop the DB completely, but I can't think of any obvious > ways to make sure that it doesn't get any data modification for a > period of time. Is there a trick here? This is feeling a bit hopeless. The simplest solution would be to stop the applications that use PostgreSQL. You could block client connections using a "pg_hba.conf" entry (and kill the established connections). Another option can be to set "default_transaction_read_only = on", but that will only work if the clients don't override it explicitly. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
That's a good trick, thanks again for the help. Boy, this promises to be a dumb process! I'm unqualified to guess at what might make this easier, but it does seem like something that should have some kind of low-level tools that could do the job. On Wed, Jan 8, 2020 at 1:53 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > On Tue, 2020-01-07 at 23:17 -0800, Mike Lissner wrote: > > > You'd have to suspend all data modification on A in that interval. > > > > I know how to stop the DB completely, but I can't think of any obvious > > ways to make sure that it doesn't get any data modification for a > > period of time. Is there a trick here? This is feeling a bit hopeless. > > The simplest solution would be to stop the applications that use PostgreSQL. > > You could block client connections using a "pg_hba.conf" entry > (and kill the established connections). > > Another option can be to set "default_transaction_read_only = on", > but that will only work if the clients don't override it explicitly. > > Yours, > Laurenz Albe > -- > Cybertec | https://www.cybertec-postgresql.com >