Thread: Logical Replication - behavior of TRUNCATE ... CASCADE
Hi,
In apply_handle_truncate, the following comment before ExecuteTruncateGuts says that it defaults to RESTRICT even if the CASCADE option has been specified in publisher's TRUNCATE command.
/*
* Even if we used CASCADE on the upstream primary we explicitly default
* to replaying changes without further cascading. This might be later
* changeable with a user specified option.
*/
I tried the following use case to see if that's actually true:
1) Created two tables tbl_pk (primary key), tbl_fk(references tbl_pk primary key via foreign key) on both publisher and subscriber.
2) In general, TRUNCATE tbl_pk; or TRUNCATE tbl_pk RESTRICT; would fail because tbl_fk is dependent on tbl_pk.
3) TRUNCATE tbl_pk, tbl_fk; would work because the dependent table is specified in the command.
4) TRUNCATE tbl_pk CASCADE; would work because of the CASCADE option and both tbl_pk and tbl_fk are truncated. When this command is run on the publisher, the CASCADE option is sent to the subscriber, see DecodeTruncate. But the apply worker ignores it and passes DROP_RESTRICT to ExecuteTruncateGuts. Therefore, the expectation(per the comment) is that on the subscriber, the behavior should be equivalent to TRUNCATE tbl_pk;, so an error is expected. But we are also receiving the tbl_fk in the remote rels along with tbl_pk, so the behavior is equivalent to (3) and both tbl_pk and tbl_fk are truncated.
Does the comment still hold true? Does ignoring the CASCADE option make sense in apply_handle_truncate, as we are receiving all the dependent relations in the remote rels from the publisher? Am I missing something?
In apply_handle_truncate, the following comment before ExecuteTruncateGuts says that it defaults to RESTRICT even if the CASCADE option has been specified in publisher's TRUNCATE command.
/*
* Even if we used CASCADE on the upstream primary we explicitly default
* to replaying changes without further cascading. This might be later
* changeable with a user specified option.
*/
I tried the following use case to see if that's actually true:
1) Created two tables tbl_pk (primary key), tbl_fk(references tbl_pk primary key via foreign key) on both publisher and subscriber.
2) In general, TRUNCATE tbl_pk; or TRUNCATE tbl_pk RESTRICT; would fail because tbl_fk is dependent on tbl_pk.
3) TRUNCATE tbl_pk, tbl_fk; would work because the dependent table is specified in the command.
4) TRUNCATE tbl_pk CASCADE; would work because of the CASCADE option and both tbl_pk and tbl_fk are truncated. When this command is run on the publisher, the CASCADE option is sent to the subscriber, see DecodeTruncate. But the apply worker ignores it and passes DROP_RESTRICT to ExecuteTruncateGuts. Therefore, the expectation(per the comment) is that on the subscriber, the behavior should be equivalent to TRUNCATE tbl_pk;, so an error is expected. But we are also receiving the tbl_fk in the remote rels along with tbl_pk, so the behavior is equivalent to (3) and both tbl_pk and tbl_fk are truncated.
Does the comment still hold true? Does ignoring the CASCADE option make sense in apply_handle_truncate, as we are receiving all the dependent relations in the remote rels from the publisher? Am I missing something?
The commit id of the feature "Logical replication support for TRUNCATE" is 039eb6e92f, and adding relevant people in cc.
On Mon, May 3, 2021 at 10:42 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Hi, > > In apply_handle_truncate, the following comment before ExecuteTruncateGuts says that it defaults to RESTRICT even if theCASCADE option has been specified in publisher's TRUNCATE command. > /* > * Even if we used CASCADE on the upstream primary we explicitly default > * to replaying changes without further cascading. This might be later > * changeable with a user specified option. > */ > I tried the following use case to see if that's actually true: > 1) Created two tables tbl_pk (primary key), tbl_fk(references tbl_pk primary key via foreign key) on both publisher andsubscriber. > 2) In general, TRUNCATE tbl_pk; or TRUNCATE tbl_pk RESTRICT; would fail because tbl_fk is dependent on tbl_pk. > 3) TRUNCATE tbl_pk, tbl_fk; would work because the dependent table is specified in the command. > 4) TRUNCATE tbl_pk CASCADE; would work because of the CASCADE option and both tbl_pk and tbl_fk are truncated. When thiscommand is run on the publisher, the CASCADE option is sent to the subscriber, see DecodeTruncate. But the apply workerignores it and passes DROP_RESTRICT to ExecuteTruncateGuts. Therefore, the expectation(per the comment) is that onthe subscriber, the behavior should be equivalent to TRUNCATE tbl_pk;, so an error is expected. But we are also receivingthe tbl_fk in the remote rels along with tbl_pk, so the behavior is equivalent to (3) and both tbl_pk and tbl_fkare truncated. > > Does the comment still hold true? Does ignoring the CASCADE option make sense in apply_handle_truncate, as we are receivingall the dependent relations in the remote rels from the publisher? Am I missing something? > > The commit id of the feature "Logical replication support for TRUNCATE" is 039eb6e92f, and adding relevant people in cc. Assume this case publisher: tbl_pk -> tbl_fk_pub subscriber: tbl_pk-> tbl_fk_sub Now, in this case, this comment is true right because we are not supposed to truncate tbl_fk_sub on the subscriber side and this should error out. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Mon, May 3, 2021 at 11:59 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, May 3, 2021 at 10:42 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > Hi, > > > > In apply_handle_truncate, the following comment before ExecuteTruncateGuts says that it defaults to RESTRICT even ifthe CASCADE option has been specified in publisher's TRUNCATE command. > > /* > > * Even if we used CASCADE on the upstream primary we explicitly default > > * to replaying changes without further cascading. This might be later > > * changeable with a user specified option. > > */ > > I tried the following use case to see if that's actually true: > > 1) Created two tables tbl_pk (primary key), tbl_fk(references tbl_pk primary key via foreign key) on both publisher andsubscriber. > > 2) In general, TRUNCATE tbl_pk; or TRUNCATE tbl_pk RESTRICT; would fail because tbl_fk is dependent on tbl_pk. > > 3) TRUNCATE tbl_pk, tbl_fk; would work because the dependent table is specified in the command. > > 4) TRUNCATE tbl_pk CASCADE; would work because of the CASCADE option and both tbl_pk and tbl_fk are truncated. When thiscommand is run on the publisher, the CASCADE option is sent to the subscriber, see DecodeTruncate. But the apply workerignores it and passes DROP_RESTRICT to ExecuteTruncateGuts. Therefore, the expectation(per the comment) is that onthe subscriber, the behavior should be equivalent to TRUNCATE tbl_pk;, so an error is expected. But we are also receivingthe tbl_fk in the remote rels along with tbl_pk, so the behavior is equivalent to (3) and both tbl_pk and tbl_fkare truncated. > > > > Does the comment still hold true? Does ignoring the CASCADE option make sense in apply_handle_truncate, as we are receivingall the dependent relations in the remote rels from the publisher? Am I missing something? > > > > The commit id of the feature "Logical replication support for TRUNCATE" is 039eb6e92f, and adding relevant people incc. > > Assume this case > publisher: tbl_pk -> tbl_fk_pub > subscriber: tbl_pk-> tbl_fk_sub > > Now, in this case, this comment is true right because we are not > supposed to truncate tbl_fk_sub on the subscriber side and this should > error out. Here's what I tried, let me know if I'm wrong: On publisher: CREATE TABLE tbl_pk(id int primary key); CREATE TABLE tbl_fk(fkey int references tbl_pk(id)); INSERT INTO tbl_pk (SELECT x FROM generate_series(1,10) x); INSERT INTO tbl_fk (SELECT x % 10 + 1 FROM generate_series(5,25) x); DROP PUBLICATION testpub; CREATE PUBLICATION testpub FOR TABLE tbl_pk, tbl_fk; On subscriber: CREATE TABLE tbl_pk(id int primary key); CREATE TABLE tbl_fk(fkey int references tbl_pk(id)); DROP SUBSCRIPTION testsub; CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost dbname=postgres user=bharath port=5432' PUBLICATION testpub; On both publisher and subscriber to ensure that the initial rows were replicated: SELECT count(id) FROM tbl_pk; -- non zero SELECT count(fkey) FROM tbl_fk; -- non zero On publisher: TRUNCATE tbl_pk CASCADE; SELECT count(id) FROM tbl_pk; -- 0 SELECT count(fkey) FROM tbl_fk; -- 0 On subscriber also we get to see 0 rows: SELECT count(id) FROM tbl_pk; -- 0 SELECT count(fkey) FROM tbl_fk; -- 0 But the comment says that tbl_fk shouldn't be truncated as it doesn't pass the cascade option to ExecuteTruncateGuts even though it was received from the publisher. This behaviour is not in accordance with the comment, right? If we see why this is so: the publisher sends both tbl_pk and tbl_fk rels to the subscriber and the TRUNCATE tbl_pk, tbl_fk; is allowed (see the code in heap_truncate_check_FKs) even if RESTRICT option is specified. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
sh,On Mon, May 3, 2021 at 12:37 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Mon, May 3, 2021 at 11:59 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Mon, May 3, 2021 at 10:42 AM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > Hi, > > > > > > In apply_handle_truncate, the following comment before ExecuteTruncateGuts says that it defaults to RESTRICT even ifthe CASCADE option has been specified in publisher's TRUNCATE command. > > > /* > > > * Even if we used CASCADE on the upstream primary we explicitly default > > > * to replaying changes without further cascading. This might be later > > > * changeable with a user specified option. > > > */ > > > I tried the following use case to see if that's actually true: > > > 1) Created two tables tbl_pk (primary key), tbl_fk(references tbl_pk primary key via foreign key) on both publisherand subscriber. > > > 2) In general, TRUNCATE tbl_pk; or TRUNCATE tbl_pk RESTRICT; would fail because tbl_fk is dependent on tbl_pk. > > > 3) TRUNCATE tbl_pk, tbl_fk; would work because the dependent table is specified in the command. > > > 4) TRUNCATE tbl_pk CASCADE; would work because of the CASCADE option and both tbl_pk and tbl_fk are truncated. Whenthis command is run on the publisher, the CASCADE option is sent to the subscriber, see DecodeTruncate. But the applyworker ignores it and passes DROP_RESTRICT to ExecuteTruncateGuts. Therefore, the expectation(per the comment) is thaton the subscriber, the behavior should be equivalent to TRUNCATE tbl_pk;, so an error is expected. But we are also receivingthe tbl_fk in the remote rels along with tbl_pk, so the behavior is equivalent to (3) and both tbl_pk and tbl_fkare truncated. > > > > > > Does the comment still hold true? Does ignoring the CASCADE option make sense in apply_handle_truncate, as we are receivingall the dependent relations in the remote rels from the publisher? Am I missing something? > > > > > > The commit id of the feature "Logical replication support for TRUNCATE" is 039eb6e92f, and adding relevant people incc. > > > > Assume this case > > publisher: tbl_pk -> tbl_fk_pub > > subscriber: tbl_pk-> tbl_fk_sub > > > > Now, in this case, this comment is true right because we are not > > supposed to truncate tbl_fk_sub on the subscriber side and this should > > error out. > > Here's what I tried, let me know if I'm wrong: > > On publisher: > CREATE TABLE tbl_pk(id int primary key); > CREATE TABLE tbl_fk(fkey int references tbl_pk(id)); > INSERT INTO tbl_pk (SELECT x FROM generate_series(1,10) x); > INSERT INTO tbl_fk (SELECT x % 10 + 1 FROM generate_series(5,25) x); > DROP PUBLICATION testpub; > CREATE PUBLICATION testpub FOR TABLE tbl_pk, tbl_fk; > > On subscriber: > CREATE TABLE tbl_pk(id int primary key); > CREATE TABLE tbl_fk(fkey int references tbl_pk(id)); > DROP SUBSCRIPTION testsub; > CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost dbname=postgres > user=bharath port=5432' PUBLICATION testpub; > > On both publisher and subscriber to ensure that the initial rows were > replicated: > SELECT count(id) FROM tbl_pk; -- non zero > SELECT count(fkey) FROM tbl_fk; -- non zero > > On publisher: > TRUNCATE tbl_pk CASCADE; > SELECT count(id) FROM tbl_pk; -- 0 > SELECT count(fkey) FROM tbl_fk; -- 0 > > On subscriber also we get to see 0 rows: > SELECT count(id) FROM tbl_pk; -- 0 > SELECT count(fkey) FROM tbl_fk; -- 0 > > But the comment says that tbl_fk shouldn't be truncated as it doesn't > pass the cascade option to ExecuteTruncateGuts even though it was > received from the publisher. This behaviour is not in accordance with > the comment, right? I think you are comparing the user-exposed behavior with the internal code comments. The meaning of the comments is that it should not truncate any table on subscriber using cascade, because there might be some subscriber-specific relations that depend upon the primary table and those should not get truncated as a side-effect of the cascade. For example, you can slightly change your example as below > On subscriber: > CREATE TABLE tbl_pk(id int primary key); > CREATE TABLE tbl_fk_sub(fkey int references tbl_pk(id)); -> this table doesn't refer to tbl_pk on the publisher So now as part of the truncate tbl_pk the tbl_fk_subould not get truncated and that is what the comment is trying to say. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Mon, May 3, 2021 at 1:02 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > I think you are comparing the user-exposed behavior with the internal > code comments. The meaning of the comments is that it should not > truncate any table on subscriber using cascade, because there might be > some subscriber-specific relations that depend upon the primary table > and those should not get truncated as a side-effect of the cascade. > > For example, you can slightly change your example as below > > On subscriber: > > CREATE TABLE tbl_pk(id int primary key); > > CREATE TABLE tbl_fk_sub(fkey int references tbl_pk(id)); -> this table doesn't refer to tbl_pk on the publisher > > So now as part of the truncate tbl_pk the tbl_fk_subould not get > truncated and that is what the comment is trying to say. Thanks. I was of the thinking that the subscriber table can not have references to other subscriber-local tables and they should also be having the same column constraints as the publisher table columns. But I was wrong. I tried the use case [1] where the subscriber table tbl_pk, that was subscribed to the changes from the publisher, is being referenced by another subscriber-local table tbl_fk. In this case, the comment and the code that sends only RESTRICT behaviour ignoring the upstream CASCADE option make sense. Having said that, isn't it good if we can provide a subscription (CREATE/ALTER) level option say "cascade"(similar to other options such as binary, synchronous_commit, stream) default being false, when set to true, we send upstream CASCADE option to ExecuteTruncateGuts in apply_handle_truncate? It will be useful to truncate all the dependent tables in the subscriber. Users will have to use it with caution though. Note that the comment already says this: /* * Even if we used CASCADE on the upstream primary we explicitly default * to replaying changes without further cascading. This might be later * changeable with a user specified option. */ Thoughts? [1] On publisher: DROP TABLE tbl_pk CASCADE; CREATE TABLE tbl_pk(id int primary key); INSERT INTO tbl_pk (SELECT x FROM generate_series(1,10) x); DROP PUBLICATION testpub; CREATE PUBLICATION testpub FOR TABLE tbl_pk; On subscriber: DROP TABLE tbl_pk CASCADE; CREATE TABLE tbl_pk(id int primary key); DROP TABLE tbl_fk; CREATE TABLE tbl_fk(id1 int references tbl_pk(id)); DROP SUBSCRIPTION testsub; CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost dbname=postgres user=bharath port=5432' PUBLICATION testpub; INSERT INTO tbl_fk (SELECT x FROM generate_series(1,10) x); On publisher: TRUNCATE tbl_pk CASCADE; SELECT count(id) FROM tbl_pk; -- 0 On subscriber we get error, because the RESTRICT option is passed to ExecuteTruncateGuts in logical apply worker and the table tbl_pk is referenced by tbl_fk, so tbl_pk is not truncated. SELECT count(id) FROM tbl_pk; -- non-zero With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
On Mon, May 3, 2021 at 6:08 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Having said that, isn't it good if we can provide a subscription > (CREATE/ALTER) level option say "cascade"(similar to other options > such as binary, synchronous_commit, stream) default being false, when > set to true, we send upstream CASCADE option to ExecuteTruncateGuts in > apply_handle_truncate? It will be useful to truncate all the dependent > tables in the subscriber. Users will have to use it with caution > though. I think this could be a useful feature in some cases. Suppose subscriber has some table that is dependent on the subscribed table, in such case if the main table gets truncated it will always error out in subscriber, which is fine. But if user doesn’t want error and he is fine even if the dependent table gets truncated so I feel there should be some option to set that. I think the documentation should clearly say the impact of setting this to true. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Fri, May 7, 2021 at 6:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, May 3, 2021 at 6:08 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > Having said that, isn't it good if we can provide a subscription > > (CREATE/ALTER) level option say "cascade"(similar to other options > > such as binary, synchronous_commit, stream) default being false, when > > set to true, we send upstream CASCADE option to ExecuteTruncateGuts in > > apply_handle_truncate? It will be useful to truncate all the dependent > > tables in the subscriber. Users will have to use it with caution > > though. > > I think this could be a useful feature in some cases. Suppose > subscriber has some table that is dependent on the subscribed table, > in such case if the main table gets truncated it will always error out > in subscriber, which is fine. But if user doesn’t want error and he > is fine even if the dependent table gets truncated so I feel there > should be some option to set that. > Such a case is possible in theory but why would the user need it? We generally recommend having the same schema for relations between publishers and subscribers, so won't that mean that there is less chance of such cases? And after we have DDL replication, won't defining a different schema for replicated objects be difficult to maintain. Having said that, I see a different use case of such an option which is related to the proposal [1] where the patch provides a truncate option to truncate tables before initial sync. The cascade option could be useful in that feature to resolve some of the PK-FK issues raised in that thread. [1] - https://www.postgresql.org/message-id/CF3B6672-2A43-4204-A60A-68F359218A9B%40endpoint.com -- With Regards, Amit Kapila.
On Thu, May 20, 2021 at 5:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, May 7, 2021 at 6:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Mon, May 3, 2021 at 6:08 PM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > Having said that, isn't it good if we can provide a subscription > > > (CREATE/ALTER) level option say "cascade"(similar to other options > > > such as binary, synchronous_commit, stream) default being false, when > > > set to true, we send upstream CASCADE option to ExecuteTruncateGuts in > > > apply_handle_truncate? It will be useful to truncate all the dependent > > > tables in the subscriber. Users will have to use it with caution > > > though. > > > > I think this could be a useful feature in some cases. Suppose > > subscriber has some table that is dependent on the subscribed table, > > in such case if the main table gets truncated it will always error out > > in subscriber, which is fine. But if user doesn’t want error and he > > is fine even if the dependent table gets truncated so I feel there > > should be some option to set that. > > > > Such a case is possible in theory but why would the user need it? We > generally recommend having the same schema for relations between > publishers and subscribers, so won't that mean that there is less > chance of such cases? And after we have DDL replication, won't > defining a different schema for replicated objects be difficult to > maintain. This proposal can be useful even after the DDL replication feature gets in. I think having a use case like the following is quite possible(in theory) without any differences in the DDLs in both the publisher and subscriber tables: a subscriber table which is subscribed to a publisher, can act as primary key table for other subscriber tables called foreign key tables, refer to the use case I specified at [1]. In that case, TRUNCATE ... CASCADE on the publisher table, will not truncate the subscriber foreign key tables. The proposal here is to pass on the CASCADE option to the subscriber so that the subscriber foreign key tables too get truncated. And I agree that this is a very narrow problem to solve. I came across the comment before ExecuteTruncateGuts in logical/worker.c, so I thought it's worth doing it. [1] - https://www.postgresql.org/message-id/CALj2ACWDFU%3DqawNQkapKeWzVnN6x5zAUzH6hq_3Yybsn6Q89DA%40mail.gmail.com > Having said that, I see a different use case of such an option which > is related to the proposal [1] where the patch provides a truncate > option to truncate tables before initial sync. The cascade option > could be useful in that feature to resolve some of the PK-FK issues > raised in that thread. Thanks. I will respond in that thread. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
On Fri, May 21, 2021 at 3:39 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Having said that, I see a different use case of such an option which > > is related to the proposal [1] where the patch provides a truncate > > option to truncate tables before initial sync. The cascade option > > could be useful in that feature to resolve some of the PK-FK issues > > raised in that thread. > > Thanks. I will respond in that thread. Just for the records: I responded with my thoughts at [1]. [1] - https://www.postgresql.org/message-id/CALj2ACUOkYk9WBHLmqEp9JDM333jZXCvqNQnS%3D0ukRdnTOcftg%40mail.gmail.com With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
On Thu, May 20, 2021 at 5:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, May 7, 2021 at 6:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Mon, May 3, 2021 at 6:08 PM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > Having said that, isn't it good if we can provide a subscription > > > (CREATE/ALTER) level option say "cascade"(similar to other options > > > such as binary, synchronous_commit, stream) default being false, when > > > set to true, we send upstream CASCADE option to ExecuteTruncateGuts in > > > apply_handle_truncate? It will be useful to truncate all the dependent > > > tables in the subscriber. Users will have to use it with caution > > > though. > > > > I think this could be a useful feature in some cases. Suppose > > subscriber has some table that is dependent on the subscribed table, > > in such case if the main table gets truncated it will always error out > > in subscriber, which is fine. But if user doesn’t want error and he > > is fine even if the dependent table gets truncated so I feel there > > should be some option to set that. > > > > Such a case is possible in theory but why would the user need it? We > generally recommend having the same schema for relations between > publishers and subscribers, so won't that mean that there is less > chance of such cases? And after we have DDL replication, won't > defining a different schema for replicated objects be difficult to > maintain. I agree we suggest having the same schema but we still allow something extra on the subscriber side, e.g if the publisher table has T(a,b) then we allow the subscriber to have T(a,b,c) right? Since this is logical replication is always good that we don't enforce on what schema/dependent table subscriber can have unless it is of utmost necessity. That's the reason we enforce that at least the table name of the publisher should match on subscriber and at least the column name of the publisher should be there on the subscriber, but there is no restriction on the subscriber to have a few extra columns. I think if we allow subscribers to have some extra FK table on the subscribed table then that will only improve the flexibility. In the current case since we don't want to truncate the subscriber's local table, we are restricting the cascade but then we are restricting the subscriber to have any FK table on the subscribed table which I think is a restriction and it will be good to allow this based on some parameter. Sad that, if we assume that the subscriber schema/dependent table should always be the same as primary then shouldn't we just do the CASCADE truncate if the publisher is doing that. We are not doing that to protect the truncation of some of the local tables on subscribers. I think DDL replication should not change this. DDL replication will replicate all object from publisher to subscriber but that doesn't mean that subscriber can not create some extra tables which are dependent on the replicated table. I think such an extension is good for logical replication. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Sat, May 22, 2021 at 10:33 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, May 20, 2021 at 5:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, May 7, 2021 at 6:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Mon, May 3, 2021 at 6:08 PM Bharath Rupireddy > > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > > > Having said that, isn't it good if we can provide a subscription > > > > (CREATE/ALTER) level option say "cascade"(similar to other options > > > > such as binary, synchronous_commit, stream) default being false, when > > > > set to true, we send upstream CASCADE option to ExecuteTruncateGuts in > > > > apply_handle_truncate? It will be useful to truncate all the dependent > > > > tables in the subscriber. Users will have to use it with caution > > > > though. > > > > > > I think this could be a useful feature in some cases. Suppose > > > subscriber has some table that is dependent on the subscribed table, > > > in such case if the main table gets truncated it will always error out > > > in subscriber, which is fine. But if user doesn’t want error and he > > > is fine even if the dependent table gets truncated so I feel there > > > should be some option to set that. > > > > > > > Such a case is possible in theory but why would the user need it? We > > generally recommend having the same schema for relations between > > publishers and subscribers, so won't that mean that there is less > > chance of such cases? And after we have DDL replication, won't > > defining a different schema for replicated objects be difficult to > > maintain. > > I agree we suggest having the same schema but we still allow something > extra on the subscriber side, e.g if the publisher table has T(a,b) > then we allow the subscriber to have T(a,b,c) right? Since this is > logical replication is always good that we don't enforce on what > schema/dependent table subscriber can have unless it is of utmost > necessity. That's the reason we enforce that at least the table name > of the publisher should match on subscriber and at least the column > name of the publisher should be there on the subscriber, but there is > no restriction on the subscriber to have a few extra columns. I think > if we allow subscribers to have some extra FK table on the subscribed > table then that will only improve the flexibility. In the current > case since we don't want to truncate the subscriber's local table, we > are restricting the cascade but then we are restricting the subscriber > to have any FK table on the subscribed table which I think is a > restriction and it will be good to allow this based on some parameter. > I don't deny that this can allow some additional cases than we allow today but was just not sure whether users really need it. If we want to go with such an option then as mentioned earlier, we should consider another proposal for subscriber-side truncate [1] because we might need a cascade operation there as well but for a slightly different purpose. > Sad that, if we assume that the subscriber schema/dependent table > should always be the same as primary then shouldn't we just do the > CASCADE truncate if the publisher is doing that. We are not doing > that to protect the truncation of some of the local tables on > subscribers. > > I think DDL replication should not change this. DDL replication will > replicate all object from publisher to subscriber but that doesn't > mean that subscriber can not create some extra tables which are > dependent on the replicated table. I think such an extension is good > for logical replication. > Fair enough. I mentioned DDL replication because once we have that then probably lesser users want to manually perform schema alterations for the replicated objects. [1] - https://www.postgresql.org/message-id/CF3B6672-2A43-4204-A60A-68F359218A9B%40endpoint.com -- With Regards, Amit Kapila.
On Mon, May 24, 2021 at 11:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > I don't deny that this can allow some additional cases than we allow > today but was just not sure whether users really need it. If we want > to go with such an option then as mentioned earlier, we should > consider another proposal for subscriber-side truncate [1] because we > might need a cascade operation there as well but for a slightly > different purpose. I'm thinking how we can utilize the truncate option proposed at [1] for the idea here. Because, currently the truncate option(proposed at [1]) is boolean, (of course we can change this to take "cascade", "restrict" options). But how can we differentiate the usage of the truncate option at [1] for two purposes 1) for before copy data/initial table sync operation and 2) for the replication of TRUNCATE command as proposed here in this thread. Any thoughts? [1] - https://www.postgresql.org/message-id/CALj2ACVWNBttS-YcMYuUPK25tsuqtbpuNjT-CBRQLUHXezWERw%40mail.gmail.com With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
On Mon, May 24, 2021 at 2:18 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Mon, May 24, 2021 at 11:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I don't deny that this can allow some additional cases than we allow > > today but was just not sure whether users really need it. If we want > > to go with such an option then as mentioned earlier, we should > > consider another proposal for subscriber-side truncate [1] because we > > might need a cascade operation there as well but for a slightly > > different purpose. > > I'm thinking how we can utilize the truncate option proposed at [1] > for the idea here. Because, currently the truncate option(proposed at > [1]) is boolean, (of course we can change this to take "cascade", > "restrict" options). But how can we differentiate the usage of the > truncate option at [1] for two purposes 1) for before copy > data/initial table sync operation and 2) for the replication of > TRUNCATE command as proposed here in this thread. Any thoughts? > I think we can do this as a separate patch. Let's not try to combine both patches. -- With Regards, Amit Kapila.