Thread: Logical Replication - behavior of TRUNCATE ... CASCADE

Logical Replication - behavior of TRUNCATE ... CASCADE

From
Bharath Rupireddy
Date:
Hi,

In apply_handle_truncate, the following comment before ExecuteTruncateGuts says that it defaults to RESTRICT even if the CASCADE option has been specified in publisher's TRUNCATE command.
    /*
     * Even if we used CASCADE on the upstream primary we explicitly default
     * to replaying changes without further cascading. This might be later
     * changeable with a user specified option.
     */
I tried the following use case to see if that's actually true:
1) Created two tables tbl_pk (primary key), tbl_fk(references tbl_pk primary key via foreign key) on both publisher and subscriber.
2) In general, TRUNCATE tbl_pk; or TRUNCATE tbl_pk RESTRICT; would fail because tbl_fk is dependent on tbl_pk.
3) TRUNCATE tbl_pk, tbl_fk; would work because the dependent table is specified in the command.
4) TRUNCATE tbl_pk CASCADE; would work because of the CASCADE option and both tbl_pk and tbl_fk are truncated. When this command is run on the publisher, the CASCADE option is sent to the subscriber, see DecodeTruncate. But the apply worker ignores it and passes DROP_RESTRICT to ExecuteTruncateGuts. Therefore, the expectation(per the comment) is that on the subscriber, the behavior should be equivalent to TRUNCATE tbl_pk;, so an error is expected. But we are also receiving the tbl_fk in the remote rels along with tbl_pk, so the behavior is equivalent to (3) and both tbl_pk and tbl_fk are truncated.

Does the comment still hold true? Does ignoring the CASCADE option make sense in apply_handle_truncate, as we are receiving all the dependent relations in the remote rels from the publisher? Am I missing something?

The commit id of the feature "Logical replication support for TRUNCATE" is 039eb6e92f, and adding relevant people in cc.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: Logical Replication - behavior of TRUNCATE ... CASCADE

From
Dilip Kumar
Date:
On Mon, May 3, 2021 at 10:42 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> Hi,
>
> In apply_handle_truncate, the following comment before ExecuteTruncateGuts says that it defaults to RESTRICT even if
theCASCADE option has been specified in publisher's TRUNCATE command. 
>     /*
>      * Even if we used CASCADE on the upstream primary we explicitly default
>      * to replaying changes without further cascading. This might be later
>      * changeable with a user specified option.
>      */
> I tried the following use case to see if that's actually true:
> 1) Created two tables tbl_pk (primary key), tbl_fk(references tbl_pk primary key via foreign key) on both publisher
andsubscriber. 
> 2) In general, TRUNCATE tbl_pk; or TRUNCATE tbl_pk RESTRICT; would fail because tbl_fk is dependent on tbl_pk.
> 3) TRUNCATE tbl_pk, tbl_fk; would work because the dependent table is specified in the command.
> 4) TRUNCATE tbl_pk CASCADE; would work because of the CASCADE option and both tbl_pk and tbl_fk are truncated. When
thiscommand is run on the publisher, the CASCADE option is sent to the subscriber, see DecodeTruncate. But the apply
workerignores it and passes DROP_RESTRICT to ExecuteTruncateGuts. Therefore, the expectation(per the comment) is that
onthe subscriber, the behavior should be equivalent to TRUNCATE tbl_pk;, so an error is expected. But we are also
receivingthe tbl_fk in the remote rels along with tbl_pk, so the behavior is equivalent to (3) and both tbl_pk and
tbl_fkare truncated. 
>
> Does the comment still hold true? Does ignoring the CASCADE option make sense in apply_handle_truncate, as we are
receivingall the dependent relations in the remote rels from the publisher? Am I missing something? 
>
> The commit id of the feature "Logical replication support for TRUNCATE" is 039eb6e92f, and adding relevant people in
cc.

Assume this case
publisher: tbl_pk -> tbl_fk_pub
subscriber: tbl_pk-> tbl_fk_sub

Now, in this case, this comment is true right because we are not
supposed to truncate tbl_fk_sub on the subscriber side and this should
error out.


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication - behavior of TRUNCATE ... CASCADE

From
Bharath Rupireddy
Date:
On Mon, May 3, 2021 at 11:59 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, May 3, 2021 at 10:42 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > Hi,
> >
> > In apply_handle_truncate, the following comment before ExecuteTruncateGuts says that it defaults to RESTRICT even
ifthe CASCADE option has been specified in publisher's TRUNCATE command. 
> >     /*
> >      * Even if we used CASCADE on the upstream primary we explicitly default
> >      * to replaying changes without further cascading. This might be later
> >      * changeable with a user specified option.
> >      */
> > I tried the following use case to see if that's actually true:
> > 1) Created two tables tbl_pk (primary key), tbl_fk(references tbl_pk primary key via foreign key) on both publisher
andsubscriber. 
> > 2) In general, TRUNCATE tbl_pk; or TRUNCATE tbl_pk RESTRICT; would fail because tbl_fk is dependent on tbl_pk.
> > 3) TRUNCATE tbl_pk, tbl_fk; would work because the dependent table is specified in the command.
> > 4) TRUNCATE tbl_pk CASCADE; would work because of the CASCADE option and both tbl_pk and tbl_fk are truncated. When
thiscommand is run on the publisher, the CASCADE option is sent to the subscriber, see DecodeTruncate. But the apply
workerignores it and passes DROP_RESTRICT to ExecuteTruncateGuts. Therefore, the expectation(per the comment) is that
onthe subscriber, the behavior should be equivalent to TRUNCATE tbl_pk;, so an error is expected. But we are also
receivingthe tbl_fk in the remote rels along with tbl_pk, so the behavior is equivalent to (3) and both tbl_pk and
tbl_fkare truncated. 
> >
> > Does the comment still hold true? Does ignoring the CASCADE option make sense in apply_handle_truncate, as we are
receivingall the dependent relations in the remote rels from the publisher? Am I missing something? 
> >
> > The commit id of the feature "Logical replication support for TRUNCATE" is 039eb6e92f, and adding relevant people
incc. 
>
> Assume this case
> publisher: tbl_pk -> tbl_fk_pub
> subscriber: tbl_pk-> tbl_fk_sub
>
> Now, in this case, this comment is true right because we are not
> supposed to truncate tbl_fk_sub on the subscriber side and this should
> error out.

Here's what I tried, let me know if I'm wrong:

On publisher:
CREATE TABLE tbl_pk(id int primary key);
CREATE TABLE tbl_fk(fkey int references tbl_pk(id));
INSERT INTO tbl_pk (SELECT x FROM generate_series(1,10) x);
INSERT INTO tbl_fk (SELECT x % 10 + 1 FROM generate_series(5,25) x);
DROP PUBLICATION testpub;
CREATE PUBLICATION testpub FOR TABLE tbl_pk, tbl_fk;

On subscriber:
CREATE TABLE tbl_pk(id int primary key);
CREATE TABLE tbl_fk(fkey int references tbl_pk(id));
DROP SUBSCRIPTION testsub;
CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost dbname=postgres
user=bharath port=5432' PUBLICATION testpub;

On both publisher and subscriber to ensure that the initial rows were
replicated:
SELECT count(id) FROM tbl_pk; -- non zero
SELECT count(fkey) FROM tbl_fk; -- non zero

On publisher:
TRUNCATE tbl_pk CASCADE;
SELECT count(id) FROM tbl_pk; -- 0
SELECT count(fkey) FROM tbl_fk; -- 0

On subscriber also we get to see 0 rows:
SELECT count(id) FROM tbl_pk; -- 0
SELECT count(fkey) FROM tbl_fk; -- 0

But the comment says that tbl_fk shouldn't be truncated as it doesn't
pass the cascade option to ExecuteTruncateGuts even though it was
received from the publisher. This behaviour is not in accordance with
the comment, right?
If we see why this is so: the publisher sends both tbl_pk and tbl_fk
rels to the subscriber and the TRUNCATE tbl_pk, tbl_fk; is allowed
(see the code in heap_truncate_check_FKs) even if RESTRICT option is
specified.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication - behavior of TRUNCATE ... CASCADE

From
Dilip Kumar
Date:
sh,On Mon, May 3, 2021 at 12:37 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Mon, May 3, 2021 at 11:59 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, May 3, 2021 at 10:42 AM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > In apply_handle_truncate, the following comment before ExecuteTruncateGuts says that it defaults to RESTRICT even
ifthe CASCADE option has been specified in publisher's TRUNCATE command. 
> > >     /*
> > >      * Even if we used CASCADE on the upstream primary we explicitly default
> > >      * to replaying changes without further cascading. This might be later
> > >      * changeable with a user specified option.
> > >      */
> > > I tried the following use case to see if that's actually true:
> > > 1) Created two tables tbl_pk (primary key), tbl_fk(references tbl_pk primary key via foreign key) on both
publisherand subscriber. 
> > > 2) In general, TRUNCATE tbl_pk; or TRUNCATE tbl_pk RESTRICT; would fail because tbl_fk is dependent on tbl_pk.
> > > 3) TRUNCATE tbl_pk, tbl_fk; would work because the dependent table is specified in the command.
> > > 4) TRUNCATE tbl_pk CASCADE; would work because of the CASCADE option and both tbl_pk and tbl_fk are truncated.
Whenthis command is run on the publisher, the CASCADE option is sent to the subscriber, see DecodeTruncate. But the
applyworker ignores it and passes DROP_RESTRICT to ExecuteTruncateGuts. Therefore, the expectation(per the comment) is
thaton the subscriber, the behavior should be equivalent to TRUNCATE tbl_pk;, so an error is expected. But we are also
receivingthe tbl_fk in the remote rels along with tbl_pk, so the behavior is equivalent to (3) and both tbl_pk and
tbl_fkare truncated. 
> > >
> > > Does the comment still hold true? Does ignoring the CASCADE option make sense in apply_handle_truncate, as we are
receivingall the dependent relations in the remote rels from the publisher? Am I missing something? 
> > >
> > > The commit id of the feature "Logical replication support for TRUNCATE" is 039eb6e92f, and adding relevant people
incc. 
> >
> > Assume this case
> > publisher: tbl_pk -> tbl_fk_pub
> > subscriber: tbl_pk-> tbl_fk_sub
> >
> > Now, in this case, this comment is true right because we are not
> > supposed to truncate tbl_fk_sub on the subscriber side and this should
> > error out.
>
> Here's what I tried, let me know if I'm wrong:
>
> On publisher:
> CREATE TABLE tbl_pk(id int primary key);
> CREATE TABLE tbl_fk(fkey int references tbl_pk(id));
> INSERT INTO tbl_pk (SELECT x FROM generate_series(1,10) x);
> INSERT INTO tbl_fk (SELECT x % 10 + 1 FROM generate_series(5,25) x);
> DROP PUBLICATION testpub;
> CREATE PUBLICATION testpub FOR TABLE tbl_pk, tbl_fk;
>
> On subscriber:
> CREATE TABLE tbl_pk(id int primary key);
> CREATE TABLE tbl_fk(fkey int references tbl_pk(id));
> DROP SUBSCRIPTION testsub;
> CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost dbname=postgres
> user=bharath port=5432' PUBLICATION testpub;
>
> On both publisher and subscriber to ensure that the initial rows were
> replicated:
> SELECT count(id) FROM tbl_pk; -- non zero
> SELECT count(fkey) FROM tbl_fk; -- non zero
>
> On publisher:
> TRUNCATE tbl_pk CASCADE;
> SELECT count(id) FROM tbl_pk; -- 0
> SELECT count(fkey) FROM tbl_fk; -- 0
>
> On subscriber also we get to see 0 rows:
> SELECT count(id) FROM tbl_pk; -- 0
> SELECT count(fkey) FROM tbl_fk; -- 0
>
> But the comment says that tbl_fk shouldn't be truncated as it doesn't
> pass the cascade option to ExecuteTruncateGuts even though it was
> received from the publisher. This behaviour is not in accordance with
> the comment, right?

I think you are comparing the user-exposed behavior with the internal
code comments.  The meaning of the comments is that it should not
truncate any table on subscriber using cascade, because there might be
some subscriber-specific relations that depend upon the primary table
and those should not get truncated as a side-effect of the cascade.

For example, you can slightly change your example as below
> On subscriber:
> CREATE TABLE tbl_pk(id int primary key);
> CREATE TABLE tbl_fk_sub(fkey int references tbl_pk(id));   -> this table doesn't refer to tbl_pk on the publisher

So now as part of the truncate tbl_pk the tbl_fk_subould not get
truncated and that is what the comment is trying to say.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication - behavior of TRUNCATE ... CASCADE

From
Bharath Rupireddy
Date:
On Mon, May 3, 2021 at 1:02 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I think you are comparing the user-exposed behavior with the internal
> code comments.  The meaning of the comments is that it should not
> truncate any table on subscriber using cascade, because there might be
> some subscriber-specific relations that depend upon the primary table
> and those should not get truncated as a side-effect of the cascade.
>
> For example, you can slightly change your example as below
> > On subscriber:
> > CREATE TABLE tbl_pk(id int primary key);
> > CREATE TABLE tbl_fk_sub(fkey int references tbl_pk(id));   -> this table doesn't refer to tbl_pk on the publisher
>
> So now as part of the truncate tbl_pk the tbl_fk_subould not get
> truncated and that is what the comment is trying to say.

Thanks. I was of the thinking that the subscriber table can not have
references to other subscriber-local tables and they should also be
having the same column constraints as the publisher table columns.
But I was wrong. I tried the use case [1] where the subscriber table
tbl_pk, that was subscribed to the changes from the publisher, is
being referenced by another subscriber-local table tbl_fk. In this
case, the comment and the code that sends only RESTRICT behaviour
ignoring the upstream CASCADE option make sense.

Having said that, isn't it good if we can provide a subscription
(CREATE/ALTER) level option say "cascade"(similar to other options
such as binary, synchronous_commit, stream)  default being false, when
set to true, we send upstream CASCADE option to ExecuteTruncateGuts in
apply_handle_truncate? It will be useful to truncate all the dependent
tables in the subscriber. Users will have to use it with caution
though.

Note that the comment already says this:
    /*
     * Even if we used CASCADE on the upstream primary we explicitly default
     * to replaying changes without further cascading. This might be later
     * changeable with a user specified option.
     */

Thoughts?

[1]
On publisher:
DROP TABLE tbl_pk CASCADE;
CREATE TABLE tbl_pk(id int primary key);
INSERT INTO tbl_pk (SELECT x FROM generate_series(1,10) x);
DROP PUBLICATION testpub;
CREATE PUBLICATION testpub FOR TABLE tbl_pk;

On subscriber:
DROP TABLE tbl_pk CASCADE;
CREATE TABLE tbl_pk(id int primary key);
DROP TABLE tbl_fk;
CREATE TABLE tbl_fk(id1 int references tbl_pk(id));
DROP SUBSCRIPTION testsub;
CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost dbname=postgres
user=bharath port=5432' PUBLICATION testpub;
INSERT INTO tbl_fk (SELECT x FROM generate_series(1,10) x);

On publisher:
TRUNCATE tbl_pk CASCADE;
SELECT count(id) FROM tbl_pk; -- 0

On subscriber we get error, because the RESTRICT option is passed to
ExecuteTruncateGuts in logical apply worker and the table tbl_pk is
referenced by tbl_fk, so tbl_pk is not truncated.
SELECT count(id) FROM tbl_pk; -- non-zero

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication - behavior of TRUNCATE ... CASCADE

From
Dilip Kumar
Date:
On Mon, May 3, 2021 at 6:08 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> Having said that, isn't it good if we can provide a subscription
> (CREATE/ALTER) level option say "cascade"(similar to other options
> such as binary, synchronous_commit, stream)  default being false, when
> set to true, we send upstream CASCADE option to ExecuteTruncateGuts in
> apply_handle_truncate? It will be useful to truncate all the dependent
> tables in the subscriber. Users will have to use it with caution
> though.

I think this could be a useful feature in some cases.  Suppose
subscriber has some table that is dependent on the subscribed table,
in such case if the main table gets truncated it will always error out
in subscriber, which is fine.  But if user doesn’t want error and he
is fine even if the dependent table gets truncated so I feel there
should be some option to set that.  I think the documentation should
clearly say the impact of setting this to true.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication - behavior of TRUNCATE ... CASCADE

From
Amit Kapila
Date:
On Fri, May 7, 2021 at 6:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, May 3, 2021 at 6:08 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > Having said that, isn't it good if we can provide a subscription
> > (CREATE/ALTER) level option say "cascade"(similar to other options
> > such as binary, synchronous_commit, stream)  default being false, when
> > set to true, we send upstream CASCADE option to ExecuteTruncateGuts in
> > apply_handle_truncate? It will be useful to truncate all the dependent
> > tables in the subscriber. Users will have to use it with caution
> > though.
>
> I think this could be a useful feature in some cases.  Suppose
> subscriber has some table that is dependent on the subscribed table,
> in such case if the main table gets truncated it will always error out
> in subscriber, which is fine.  But if user doesn’t want error and he
> is fine even if the dependent table gets truncated so I feel there
> should be some option to set that.
>

Such a case is possible in theory but why would the user need it? We
generally recommend having the same schema for relations between
publishers and subscribers, so won't that mean that there is less
chance of such cases? And after we have DDL replication, won't
defining a different schema for replicated objects be difficult to
maintain.

Having said that, I see a different use case of such an option which
is related to the proposal [1] where the patch provides a truncate
option to truncate tables before initial sync. The cascade option
could be useful in that feature to resolve some of the PK-FK issues
raised in that thread.

[1] - https://www.postgresql.org/message-id/CF3B6672-2A43-4204-A60A-68F359218A9B%40endpoint.com

--
With Regards,
Amit Kapila.



Re: Logical Replication - behavior of TRUNCATE ... CASCADE

From
Bharath Rupireddy
Date:
On Thu, May 20, 2021 at 5:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, May 7, 2021 at 6:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, May 3, 2021 at 6:08 PM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > >
> > > Having said that, isn't it good if we can provide a subscription
> > > (CREATE/ALTER) level option say "cascade"(similar to other options
> > > such as binary, synchronous_commit, stream)  default being false, when
> > > set to true, we send upstream CASCADE option to ExecuteTruncateGuts in
> > > apply_handle_truncate? It will be useful to truncate all the dependent
> > > tables in the subscriber. Users will have to use it with caution
> > > though.
> >
> > I think this could be a useful feature in some cases.  Suppose
> > subscriber has some table that is dependent on the subscribed table,
> > in such case if the main table gets truncated it will always error out
> > in subscriber, which is fine.  But if user doesn’t want error and he
> > is fine even if the dependent table gets truncated so I feel there
> > should be some option to set that.
> >
>
> Such a case is possible in theory but why would the user need it? We
> generally recommend having the same schema for relations between
> publishers and subscribers, so won't that mean that there is less
> chance of such cases? And after we have DDL replication, won't
> defining a different schema for replicated objects be difficult to
> maintain.

This proposal can be useful even after the DDL replication feature
gets in. I think having a use case like the following is quite
possible(in theory) without any differences in the DDLs in both the
publisher and subscriber tables: a subscriber table which is
subscribed to a publisher, can act as primary key table for other
subscriber tables called foreign key tables, refer to the use case I
specified at [1]. In that case, TRUNCATE ... CASCADE on the publisher
table, will not truncate the subscriber foreign key tables. The
proposal here is to pass on the CASCADE option to the subscriber so
that the subscriber foreign key tables too get truncated. And I agree
that this is a very narrow problem to solve. I came across the comment
before ExecuteTruncateGuts in logical/worker.c, so I thought it's
worth doing it.

[1] - https://www.postgresql.org/message-id/CALj2ACWDFU%3DqawNQkapKeWzVnN6x5zAUzH6hq_3Yybsn6Q89DA%40mail.gmail.com

> Having said that, I see a different use case of such an option which
> is related to the proposal [1] where the patch provides a truncate
> option to truncate tables before initial sync. The cascade option
> could be useful in that feature to resolve some of the PK-FK issues
> raised in that thread.

Thanks. I will respond in that thread.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication - behavior of TRUNCATE ... CASCADE

From
Bharath Rupireddy
Date:
On Fri, May 21, 2021 at 3:39 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> > Having said that, I see a different use case of such an option which
> > is related to the proposal [1] where the patch provides a truncate
> > option to truncate tables before initial sync. The cascade option
> > could be useful in that feature to resolve some of the PK-FK issues
> > raised in that thread.
>
> Thanks. I will respond in that thread.

Just for the records: I responded with my thoughts at [1].

[1] - https://www.postgresql.org/message-id/CALj2ACUOkYk9WBHLmqEp9JDM333jZXCvqNQnS%3D0ukRdnTOcftg%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication - behavior of TRUNCATE ... CASCADE

From
Dilip Kumar
Date:
On Thu, May 20, 2021 at 5:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, May 7, 2021 at 6:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, May 3, 2021 at 6:08 PM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > >
> > > Having said that, isn't it good if we can provide a subscription
> > > (CREATE/ALTER) level option say "cascade"(similar to other options
> > > such as binary, synchronous_commit, stream)  default being false, when
> > > set to true, we send upstream CASCADE option to ExecuteTruncateGuts in
> > > apply_handle_truncate? It will be useful to truncate all the dependent
> > > tables in the subscriber. Users will have to use it with caution
> > > though.
> >
> > I think this could be a useful feature in some cases.  Suppose
> > subscriber has some table that is dependent on the subscribed table,
> > in such case if the main table gets truncated it will always error out
> > in subscriber, which is fine.  But if user doesn’t want error and he
> > is fine even if the dependent table gets truncated so I feel there
> > should be some option to set that.
> >
>
> Such a case is possible in theory but why would the user need it? We
> generally recommend having the same schema for relations between
> publishers and subscribers, so won't that mean that there is less
> chance of such cases? And after we have DDL replication, won't
> defining a different schema for replicated objects be difficult to
> maintain.

I agree we suggest having the same schema but we still allow something
extra on the subscriber side, e.g if the publisher table has T(a,b)
then we allow the subscriber to have T(a,b,c) right?  Since this is
logical replication is always good that we don't enforce on what
schema/dependent table subscriber can have unless it is of utmost
necessity.  That's the reason we enforce that at least the table name
of the publisher should match on subscriber and at least the column
name of the publisher should be there on the subscriber, but there is
no restriction on the subscriber to have a few extra columns.  I think
if we allow subscribers to have some extra FK table on the subscribed
table then that will only improve the flexibility.  In the current
case since we don't want to truncate the subscriber's local table, we
are restricting the cascade but then we are restricting the subscriber
to have any FK table on the subscribed table which I think is a
restriction and it will be good to allow this based on some parameter.

Sad that, if we assume that the subscriber schema/dependent table
should always be the same as primary then shouldn't we just do the
CASCADE truncate if the publisher is doing that.  We are not doing
that to protect the truncation of some of the local tables on
subscribers.

I think DDL replication should not change this.  DDL replication will
replicate all object from publisher to subscriber but that doesn't
mean that subscriber can not create some extra tables which are
dependent on the replicated table.  I think such an extension is good
for logical replication.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication - behavior of TRUNCATE ... CASCADE

From
Amit Kapila
Date:
On Sat, May 22, 2021 at 10:33 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, May 20, 2021 at 5:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, May 7, 2021 at 6:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Mon, May 3, 2021 at 6:08 PM Bharath Rupireddy
> > > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > > >
> > > > Having said that, isn't it good if we can provide a subscription
> > > > (CREATE/ALTER) level option say "cascade"(similar to other options
> > > > such as binary, synchronous_commit, stream)  default being false, when
> > > > set to true, we send upstream CASCADE option to ExecuteTruncateGuts in
> > > > apply_handle_truncate? It will be useful to truncate all the dependent
> > > > tables in the subscriber. Users will have to use it with caution
> > > > though.
> > >
> > > I think this could be a useful feature in some cases.  Suppose
> > > subscriber has some table that is dependent on the subscribed table,
> > > in such case if the main table gets truncated it will always error out
> > > in subscriber, which is fine.  But if user doesn’t want error and he
> > > is fine even if the dependent table gets truncated so I feel there
> > > should be some option to set that.
> > >
> >
> > Such a case is possible in theory but why would the user need it? We
> > generally recommend having the same schema for relations between
> > publishers and subscribers, so won't that mean that there is less
> > chance of such cases? And after we have DDL replication, won't
> > defining a different schema for replicated objects be difficult to
> > maintain.
>
> I agree we suggest having the same schema but we still allow something
> extra on the subscriber side, e.g if the publisher table has T(a,b)
> then we allow the subscriber to have T(a,b,c) right?  Since this is
> logical replication is always good that we don't enforce on what
> schema/dependent table subscriber can have unless it is of utmost
> necessity.  That's the reason we enforce that at least the table name
> of the publisher should match on subscriber and at least the column
> name of the publisher should be there on the subscriber, but there is
> no restriction on the subscriber to have a few extra columns.  I think
> if we allow subscribers to have some extra FK table on the subscribed
> table then that will only improve the flexibility.  In the current
> case since we don't want to truncate the subscriber's local table, we
> are restricting the cascade but then we are restricting the subscriber
> to have any FK table on the subscribed table which I think is a
> restriction and it will be good to allow this based on some parameter.
>

I don't deny that this can allow some additional cases than we allow
today but was just not sure whether users really need it. If we want
to go with such an option then as mentioned earlier, we should
consider another proposal for subscriber-side truncate [1] because we
might need a cascade operation there as well but for a slightly
different purpose.

> Sad that, if we assume that the subscriber schema/dependent table
> should always be the same as primary then shouldn't we just do the
> CASCADE truncate if the publisher is doing that.  We are not doing
> that to protect the truncation of some of the local tables on
> subscribers.
>
> I think DDL replication should not change this.  DDL replication will
> replicate all object from publisher to subscriber but that doesn't
> mean that subscriber can not create some extra tables which are
> dependent on the replicated table.  I think such an extension is good
> for logical replication.
>

Fair enough. I mentioned DDL replication because once we have that
then probably lesser users want to manually perform schema alterations
for the replicated objects.

[1] - https://www.postgresql.org/message-id/CF3B6672-2A43-4204-A60A-68F359218A9B%40endpoint.com

--
With Regards,
Amit Kapila.



Re: Logical Replication - behavior of TRUNCATE ... CASCADE

From
Bharath Rupireddy
Date:
On Mon, May 24, 2021 at 11:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> I don't deny that this can allow some additional cases than we allow
> today but was just not sure whether users really need it. If we want
> to go with such an option then as mentioned earlier, we should
> consider another proposal for subscriber-side truncate [1] because we
> might need a cascade operation there as well but for a slightly
> different purpose.

I'm thinking how we can utilize the truncate option proposed at [1]
for the idea here. Because, currently the truncate option(proposed at
[1]) is boolean, (of course we can change this to take "cascade",
"restrict" options). But how can we differentiate the usage of the
truncate option at [1] for two purposes 1) for before copy
data/initial table sync operation and 2) for the replication of
TRUNCATE command as proposed here in this thread. Any thoughts?

[1] - https://www.postgresql.org/message-id/CALj2ACVWNBttS-YcMYuUPK25tsuqtbpuNjT-CBRQLUHXezWERw%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication - behavior of TRUNCATE ... CASCADE

From
Amit Kapila
Date:
On Mon, May 24, 2021 at 2:18 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Mon, May 24, 2021 at 11:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > I don't deny that this can allow some additional cases than we allow
> > today but was just not sure whether users really need it. If we want
> > to go with such an option then as mentioned earlier, we should
> > consider another proposal for subscriber-side truncate [1] because we
> > might need a cascade operation there as well but for a slightly
> > different purpose.
>
> I'm thinking how we can utilize the truncate option proposed at [1]
> for the idea here. Because, currently the truncate option(proposed at
> [1]) is boolean, (of course we can change this to take "cascade",
> "restrict" options). But how can we differentiate the usage of the
> truncate option at [1] for two purposes 1) for before copy
> data/initial table sync operation and 2) for the replication of
> TRUNCATE command as proposed here in this thread. Any thoughts?
>

I think we can do this as a separate patch. Let's not try to combine
both patches.

-- 
With Regards,
Amit Kapila.