Thread: Added schema level support for publication.

Added schema level support for publication.

From

vignesh C

Date:

07 January 2021, 16:33:39

Hi,

This feature adds schema option while creating publication. Users will
be able to specify one or more schemas while creating publication,
when the user specifies schema option, then the data changes for the
tables present in the schema specified by the user will be replicated
to the subscriber. Few examples have been listed below:

Create a publication that publishes all changes for all the tables
present in production schema:
CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;

Create a publication that publishes all changes for all the tables
present in marketing and sales schemas:
CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales;

Add some schemas to the publication:
ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june;

Drop some schema from the publication:
ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july;

Attached is a POC patch for the same. I felt this feature would be quite useful.
Thoughts?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

Attachment

v1-0001-Added-schema-level-support-for-publication.patch

Re: Added schema level support for publication.

From

Amit Kapila

Date:

08 January 2021, 11:05:08

On Thu, Jan 7, 2021 at 10:03 PM vignesh C <vignesh21@gmail.com> wrote:
>
> This feature adds schema option while creating publication. Users will
> be able to specify one or more schemas while creating publication,
> when the user specifies schema option, then the data changes for the
> tables present in the schema specified by the user will be replicated
> to the subscriber. Few examples have been listed below:
>
> Create a publication that publishes all changes for all the tables
> present in production schema:
> CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;
>
> Create a publication that publishes all changes for all the tables
> present in marketing and sales schemas:
> CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales;
>
> Add some schemas to the publication:
> ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june;
>
> Drop some schema from the publication:
> ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july;
>
> Attached is a POC patch for the same. I felt this feature would be quite useful.
>

What do we do if the user Drops the schema? Do we automatically remove
it from the publication?

I see some use of such a feature but you haven't described the use
case or how did you arrive at the conclusion that it would be quite
useful?

-- 
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

vignesh C

Date:

09 January 2021, 11:51:06

On Fri, Jan 8, 2021 at 4:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jan 7, 2021 at 10:03 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > This feature adds schema option while creating publication. Users will
> > be able to specify one or more schemas while creating publication,
> > when the user specifies schema option, then the data changes for the
> > tables present in the schema specified by the user will be replicated
> > to the subscriber. Few examples have been listed below:
> >
> > Create a publication that publishes all changes for all the tables
> > present in production schema:
> > CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;
> >
> > Create a publication that publishes all changes for all the tables
> > present in marketing and sales schemas:
> > CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales;
> >
> > Add some schemas to the publication:
> > ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june;
> >
> > Drop some schema from the publication:
> > ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july;
> >
> > Attached is a POC patch for the same. I felt this feature would be quite useful.
> >
>
> What do we do if the user Drops the schema? Do we automatically remove
> it from the publication?
>
I have not yet handled this scenario yet, I will handle this and
adding of tests in the next patch.

> I see some use of such a feature but you haven't described the use
> case or how did you arrive at the conclusion that it would be quite
> useful?
>
Currently there are a couple of options "FOR All TABLES" and "FOR
TABLE" when a user creates a publication, 1) either to subscribe to
the changes of all the tables or 2) subscribe to a few tables. There
is no option for users to subscribe to relations present in the
schemas. User has to manually identify the list of tables present in
the schema and specify the list of tables in that schema using the
"FOR TABLE" option. Similarly if a user wants to subscribe to n number
of schemas, the user has to do this for the required schemas, this is
a tedious process. This feature helps the user to take care of this
internally using schema option.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

Re: Added schema level support for publication.

From

Bharath Rupireddy

Date:

09 January 2021, 14:38:16

On Sat, Jan 9, 2021 at 5:21 PM vignesh C <vignesh21@gmail.com> wrote:
> > What do we do if the user Drops the schema? Do we automatically remove
> > it from the publication?
> >
> I have not yet handled this scenario yet, I will handle this and
> adding of tests in the next patch.
>
> > I see some use of such a feature but you haven't described the use
> > case or how did you arrive at the conclusion that it would be quite
> > useful?
> >
> Currently there are a couple of options "FOR All TABLES" and "FOR
> TABLE" when a user creates a publication, 1) either to subscribe to
> the changes of all the tables or 2) subscribe to a few tables. There
> is no option for users to subscribe to relations present in the
> schemas. User has to manually identify the list of tables present in
> the schema and specify the list of tables in that schema using the
> "FOR TABLE" option. Similarly if a user wants to subscribe to n number
> of schemas, the user has to do this for the required schemas, this is
> a tedious process. This feature helps the user to take care of this
> internally using schema option.

I think this feature can be useful, in case a user has a lot of tables
to publish inside a schema. Having said that, I wonder if this feature
mandates users to create the same schema with same
permissions/authorizations manually on the subscriber, because logical
replication doesn't propagate any ddl's so are the schema or schema
changes? Or is it that the list of tables from the publisher can go
into a different schema on the subscriber?

Since the schema can have other objects such as data types, functions,
operators, I'm sure with your feature, non-table objects will be
skipped.

As Amit pointed out earlier, the behaviour when schema dropped, I
think we should also consider when schema is altered, say altered to a
different name, maybe we should change that in the publication too.

In general, what happens if we have some temporary tables or foreign
tables inside the schema, will they be allowed to send the data to
subscribers?

And, with this feature, since there can be many huge tables inside a
schema, the initial table sync phase of the replication can take a
while.

Say a user has created a publication for a schema with hundreds of
tables in it, at some point later, can he stop replicating a single or
some tables from that schema?

IMO, it's better to have the syntax - CREATE PUBLICATION
production_publication FOR ALL TABLES IN SCHEMA production - just
added IN between for all tables and schema.

Say a user has a schema with 121 tables in it, and wants to replicate
only 120 or 199 or even lesser tables out of it, so can we have some
skip option to the new syntax, something like below?
CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA
production WITH skip = marketing, accounts, sales;  --> meaning is,
replicate all the tables in the schema production except marketing,
accounts, sales tables.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: Added schema level support for publication.

From

Bharath Rupireddy

Date:

09 January 2021, 14:43:52

On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> I think this feature can be useful, in case a user has a lot of tables
> to publish inside a schema. Having said that, I wonder if this feature
> mandates users to create the same schema with same
> permissions/authorizations manually on the subscriber, because logical
> replication doesn't propagate any ddl's so are the schema or schema
> changes? Or is it that the list of tables from the publisher can go
> into a different schema on the subscriber?
>
> Since the schema can have other objects such as data types, functions,
> operators, I'm sure with your feature, non-table objects will be
> skipped.
>
> As Amit pointed out earlier, the behaviour when schema dropped, I
> think we should also consider when schema is altered, say altered to a
> different name, maybe we should change that in the publication too.
>
> In general, what happens if we have some temporary tables or foreign
> tables inside the schema, will they be allowed to send the data to
> subscribers?
>
> And, with this feature, since there can be many huge tables inside a
> schema, the initial table sync phase of the replication can take a
> while.
>
> Say a user has created a publication for a schema with hundreds of
> tables in it, at some point later, can he stop replicating a single or
> some tables from that schema?
>
> IMO, it's better to have the syntax - CREATE PUBLICATION
> production_publication FOR ALL TABLES IN SCHEMA production - just
> added IN between for all tables and schema.
>
> Say a user has a schema with 121 tables in it, and wants to replicate
> only 120 or 199 or even lesser tables out of it, so can we have some
> skip option to the new syntax, something like below?
> CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA
> production WITH skip = marketing, accounts, sales;  --> meaning is,
> replicate all the tables in the schema production except marketing,
> accounts, sales tables.

One more point - if the publication is created for a schema with no or
some initial tables, will all the future tables that may get added to
the schema will be replicated too?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: Added schema level support for publication.

From

Dilip Kumar

Date:

10 January 2021, 09:58:58

On Sat, Jan 9, 2021 at 8:14 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> > I think this feature can be useful, in case a user has a lot of tables
> > to publish inside a schema. Having said that, I wonder if this feature
> > mandates users to create the same schema with same
> > permissions/authorizations manually on the subscriber, because logical
> > replication doesn't propagate any ddl's so are the schema or schema
> > changes? Or is it that the list of tables from the publisher can go
> > into a different schema on the subscriber?
> >
> > Since the schema can have other objects such as data types, functions,
> > operators, I'm sure with your feature, non-table objects will be
> > skipped.
> >
> > As Amit pointed out earlier, the behaviour when schema dropped, I
> > think we should also consider when schema is altered, say altered to a
> > different name, maybe we should change that in the publication too.
> >
> > In general, what happens if we have some temporary tables or foreign
> > tables inside the schema, will they be allowed to send the data to
> > subscribers?
> >
> > And, with this feature, since there can be many huge tables inside a
> > schema, the initial table sync phase of the replication can take a
> > while.
> >
> > Say a user has created a publication for a schema with hundreds of
> > tables in it, at some point later, can he stop replicating a single or
> > some tables from that schema?
> >
> > IMO, it's better to have the syntax - CREATE PUBLICATION
> > production_publication FOR ALL TABLES IN SCHEMA production - just
> > added IN between for all tables and schema.
> >
> > Say a user has a schema with 121 tables in it, and wants to replicate
> > only 120 or 199 or even lesser tables out of it, so can we have some
> > skip option to the new syntax, something like below?
> > CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA
> > production WITH skip = marketing, accounts, sales;  --> meaning is,
> > replicate all the tables in the schema production except marketing,
> > accounts, sales tables.
>
> One more point - if the publication is created for a schema with no or
> some initial tables, will all the future tables that may get added to
> the schema will be replicated too?
>

I expect this should be the behavior.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Added schema level support for publication.

From

vignesh C

Date:

10 January 2021, 17:51:16

Thanks for your comments Bharath, please find my opinion below.

On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> I think this feature can be useful, in case a user has a lot of tables
> to publish inside a schema. Having said that, I wonder if this feature
> mandates users to create the same schema with same
> permissions/authorizations manually on the subscriber, because logical
> replication doesn't propagate any ddl's so are the schema or schema
> changes? Or is it that the list of tables from the publisher can go
> into a different schema on the subscriber?
>

DDL's will not be propagated to the subscriber. Users have to create
the schema & tables in the subscriber. No change in
Permissions/authorizations handling, it will be the same as the
existing behavior for relations.

> Since the schema can have other objects such as data types, functions,
> operators, I'm sure with your feature, non-table objects will be
> skipped.
>

Yes, only table data will be sent to subscribers, non-table objects
will be skipped.

> As Amit pointed out earlier, the behaviour when schema dropped, I
> think we should also consider when schema is altered, say altered to a
> different name, maybe we should change that in the publication too.
>

I agree that when schema is altered the renamed schema should be
reflected in the publication.

> In general, what happens if we have some temporary tables or foreign
> tables inside the schema, will they be allowed to send the data to
> subscribers?
>

Temporary tables & foreign tables will not be added to the publications.

> And, with this feature, since there can be many huge tables inside a
> schema, the initial table sync phase of the replication can take a
> while.
>

Yes this is required.

> Say a user has created a publication for a schema with hundreds of
> tables in it, at some point later, can he stop replicating a single or
> some tables from that schema?
>

There is no provision for this currently.

> IMO, it's better to have the syntax - CREATE PUBLICATION
> production_publication FOR ALL TABLES IN SCHEMA production - just
> added IN between for all tables and schema.
>

I'm ok with the proposed syntax, I would like others' opinion too
before making the change.

> Say a user has a schema with 121 tables in it, and wants to replicate
> only 120 or 199 or even lesser tables out of it, so can we have some
> skip option to the new syntax, something like below?
> CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA
> production WITH skip = marketing, accounts, sales;  --> meaning is,
> replicate all the tables in the schema production except marketing,
> accounts, sales tables.
>

Yes this is a good use case, will include this change.

Thanks for the comments, I will handle the comments and post a patch for this.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

Re: Added schema level support for publication.

From

vignesh C

Date:

10 January 2021, 17:52:39

On Sat, Jan 9, 2021 at 8:14 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> One more point - if the publication is created for a schema with no or
> some initial tables, will all the future tables that may get added to
> the schema will be replicated too?
>

I agree on this, when a relation is added to the schema it should be
added to the publication.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

Re: Added schema level support for publication.

From

Bharath Rupireddy

Date:

11 January 2021, 06:15:47

On Sun, Jan 10, 2021 at 11:21 PM vignesh C <vignesh21@gmail.com> wrote:
> On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> > I think this feature can be useful, in case a user has a lot of tables
> > to publish inside a schema. Having said that, I wonder if this feature
> > mandates users to create the same schema with same
> > permissions/authorizations manually on the subscriber, because logical
> > replication doesn't propagate any ddl's so are the schema or schema
> > changes? Or is it that the list of tables from the publisher can go
> > into a different schema on the subscriber?
> >
>
> DDL's will not be propagated to the subscriber. Users have to create
> the schema & tables in the subscriber. No change in
> Permissions/authorizations handling, it will be the same as the
> existing behavior for relations.

Looks like the existing behaviour already requires users to create the
schema on the subscriber when publishing the tables from that schema.
Otherwise, an error is thrown on the subscriber [1].

[1] on publisher:
CREATE SCHEMA myschema;
CREATE TABLE myschema.t1(a1 int, b1 int);
INSERT INTO myschema.t1_myschema SELECT i, i+10 FROM generate_series(1,10) i;
CREATE PUBLICATION testpub FOR TABLE myschema.t1;

on subscriber:
postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost
dbname=postgres user=bharath port=5432' PUBLICATION testpub;
ERROR:  schema "myschema" does not exist
CREATE SCHEMA myschema;
CREATE TABLE myschema.t1(a1 int, b1 int);
postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost
dbname=postgres user=bharath port=5432' PUBLICATION testpub;
NOTICE:  created replication slot "testsub" on publisher
CREATE SUBSCRIPTION

> > Since the schema can have other objects such as data types, functions,
> > operators, I'm sure with your feature, non-table objects will be
> > skipped.
> >
>
> Yes, only table data will be sent to subscribers, non-table objects
> will be skipped.

Looks like the existing CREATE PUBLICATION FOR ALL TABLES, which is
for all the tables in the database, does this i.e. skips non-table
objects and temporary tables, foreign tables and so on. So, your
feature also can behave the same way, but within the scope of the
given schema/s.

> > As Amit pointed out earlier, the behaviour when schema dropped, I
> > think we should also consider when schema is altered, say altered to a
> > different name, maybe we should change that in the publication too.
> >
>
> I agree that when schema is altered the renamed schema should be
> reflected in the publication.

I think, it's not only making sure that the publisher side has the new
altered schema, but also the subscriber needs those alters. Having
said that, since these alters come under DDL changes and in logical
replication we don't publish the scheme changes to the subscriber, we
may not need to anything extra for informing the schema alters to the
subscriber from the publisher, the users might have to do the same
schema alter on the subscriber and then a ALTER SUBSCRIPTION testsub
REFRESH PUBLICATION;  should work for them? If this understanding is
correct, then we should document this.

> > In general, what happens if we have some temporary tables or foreign
> > tables inside the schema, will they be allowed to send the data to
> > subscribers?
> >
>
> Temporary tables & foreign tables will not be added to the publications.

Yes the existing logical replication framework doesn't allow
replication of temporary, unlogged, foreign tables and other non-table
relations such as materialized views, indexes etc [1]. The CREATE
PUBLICATION statement either fails in check_publication_add_relation
or before that.

CREATE PUBLICATION testpub FOR TABLE tab1, throwing the error if the
single table tab1 is any of the above restricted tables, seems fine.
But, if there's a list of tables with CREATE PUBLICATION testpub FOR
TABLE normal_tab1, temp_tab2, normal_tab3, foreign_tab4,
unlogged_tab5, normal_tab6, normal_tab7 ......; This query fails on
first encounter of the restricted table, say at temp_tab2. Whereas,
CREATE PUBLICATION testpub FOR ALL TABLES; would skip the restricted
tables and continue to add the accepted tables into the publication
within the database.

IMHO, if there's a list of tables specified with FOR TABLE, then
instead of throwing an error in case of any restricted table, we can
issue a warning and continue with the addition of accepted tables into
the publication. If done, this behaviour will be in sync with FOR ALL
TABLES;

Thoughts? If okay, I can work on a patch.

Related to error messages: when foreign table is specified in CREATE
PUBLICATION statement, then "ERROR:  "f1" is not a table", is thrown
[1], how about the error message "ERROR: foerign table "f1" cannot be
replicated". In general, it would be good if we could have the error
messages something like in [2] instead of the existing [1].

Thoughts? If okay, I can work on a patch.

[1]
t1 is a temporary table:
postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
ERROR:  table "t1" cannot be replicated
DETAIL:  Temporary and unlogged relations cannot be replicated.

t1 is an unlogged table:
postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
ERROR:  table "t1" cannot be replicated
DETAIL:  Temporary and unlogged relations cannot be replicated.

f1 is a foreign table:
postgres=# CREATE PUBLICATION testpub FOR TABLE f1;
ERROR:  "f1" is not a table
DETAIL:  Only tables can be added to publications.

mv1 is a mat view:
postgres=# CREATE PUBLICATION testpub FOR TABLE mv1;
ERROR:  "mv1" is not a table

idx1 is an index:
postgres=# CREATE PUBLICATION testpub FOR TABLE idx1;
ERROR:  "idx1" is an index

[2]
t1 is a temporary table:
postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
ERROR:  temporary table "t1" cannot be replicated
DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.

t1 is an unlogged table:
postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
ERROR:  unlogged table "t1" cannot be replicated
DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.

f1 is a foreign table:
postgres=# CREATE PUBLICATION testpub FOR TABLE f1;
ERROR:  foreign table "f1" cannot be replicated
DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.

> > Say a user has created a publication for a schema with hundreds of
> > tables in it, at some point later, can he stop replicating a single or
> > some tables from that schema?
> >
>
> There is no provision for this currently.

The documentation [1] says, we can ALTER PUBLICATION testpub DROP
TABLE t1; which removes the table from the list of published tables,
but looks like it requires ALTER SUBSCRIPTION testsub REFRESH
PUBLICATION; for the changes to become effective on the subscriber. I
have done some testing for this case:
1) created publication for table t1, see \d+ t1, the associated
publication is visible in the output
2) created subscription on the subscriber, initial available data from
the publisher for table t1 is received
3) insert into table t1 on the publisher
4) inserted data in (3) is received in the subscriber table t1
5) alter publication to drop the table t1 on the publisher, see \d+
t1, there will not be any associated publication in the output
6) execute alter subscription refresh publication on the subscriber,
with the expectation that it should not receive the data from the
publisher for the table t1 since it's dropped from the publication in
(5)
7) insert into table t1 on the publisher
8) still the newly inserted data in (7) from the publisher, will be
received into the table t1 in the subscriber

IIUC, the behaviour of ALTER PUBLICATION DROP TABLE from the docs and
the above use case, it looks like a bug to me. If I'm wrong, can
someone correct me?

Thoughts?

[1] - https://www.postgresql.org/docs/devel/sql-alterpublication.html

> > IMO, it's better to have the syntax - CREATE PUBLICATION
> > production_publication FOR ALL TABLES IN SCHEMA production - just
> > added IN between for all tables and schema.
> >
>
> I'm ok with the proposed syntax, I would like others' opinion too
> before making the change.

Thanks.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: Added schema level support for publication.

From

japin

Date:

11 January 2021, 07:59:49

On Mon, 11 Jan 2021 at 14:15, Bharath Rupireddy wrote:
> On Sun, Jan 10, 2021 at 11:21 PM vignesh C <vignesh21@gmail.com> wrote:
>> On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
>> <bharath.rupireddyforpostgres@gmail.com> wrote:
>> > I think this feature can be useful, in case a user has a lot of tables
>> > to publish inside a schema. Having said that, I wonder if this feature
>> > mandates users to create the same schema with same
>> > permissions/authorizations manually on the subscriber, because logical
>> > replication doesn't propagate any ddl's so are the schema or schema
>> > changes? Or is it that the list of tables from the publisher can go
>> > into a different schema on the subscriber?
>> >
>>
>> DDL's will not be propagated to the subscriber. Users have to create
>> the schema & tables in the subscriber. No change in
>> Permissions/authorizations handling, it will be the same as the
>> existing behavior for relations.
>
> Looks like the existing behaviour already requires users to create the
> schema on the subscriber when publishing the tables from that schema.
> Otherwise, an error is thrown on the subscriber [1].
>
> [1] on publisher:
> CREATE SCHEMA myschema;
> CREATE TABLE myschema.t1(a1 int, b1 int);
> INSERT INTO myschema.t1_myschema SELECT i, i+10 FROM generate_series(1,10) i;
> CREATE PUBLICATION testpub FOR TABLE myschema.t1;
>
> on subscriber:
> postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost
> dbname=postgres user=bharath port=5432' PUBLICATION testpub;
> ERROR:  schema "myschema" does not exist
> CREATE SCHEMA myschema;
> CREATE TABLE myschema.t1(a1 int, b1 int);
> postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost
> dbname=postgres user=bharath port=5432' PUBLICATION testpub;
> NOTICE:  created replication slot "testsub" on publisher
> CREATE SUBSCRIPTION
>
>> > Since the schema can have other objects such as data types, functions,
>> > operators, I'm sure with your feature, non-table objects will be
>> > skipped.
>> >
>>
>> Yes, only table data will be sent to subscribers, non-table objects
>> will be skipped.
>
> Looks like the existing CREATE PUBLICATION FOR ALL TABLES, which is
> for all the tables in the database, does this i.e. skips non-table
> objects and temporary tables, foreign tables and so on. So, your
> feature also can behave the same way, but within the scope of the
> given schema/s.
>
>> > As Amit pointed out earlier, the behaviour when schema dropped, I
>> > think we should also consider when schema is altered, say altered to a
>> > different name, maybe we should change that in the publication too.
>> >
>>
>> I agree that when schema is altered the renamed schema should be
>> reflected in the publication.
>
> I think, it's not only making sure that the publisher side has the new
> altered schema, but also the subscriber needs those alters. Having
> said that, since these alters come under DDL changes and in logical
> replication we don't publish the scheme changes to the subscriber, we
> may not need to anything extra for informing the schema alters to the
> subscriber from the publisher, the users might have to do the same
> schema alter on the subscriber and then a ALTER SUBSCRIPTION testsub
> REFRESH PUBLICATION;  should work for them? If this understanding is
> correct, then we should document this.
>
>> > In general, what happens if we have some temporary tables or foreign
>> > tables inside the schema, will they be allowed to send the data to
>> > subscribers?
>> >
>>
>> Temporary tables & foreign tables will not be added to the publications.
>
> Yes the existing logical replication framework doesn't allow
> replication of temporary, unlogged, foreign tables and other non-table
> relations such as materialized views, indexes etc [1]. The CREATE
> PUBLICATION statement either fails in check_publication_add_relation
> or before that.
>
> CREATE PUBLICATION testpub FOR TABLE tab1, throwing the error if the
> single table tab1 is any of the above restricted tables, seems fine.
> But, if there's a list of tables with CREATE PUBLICATION testpub FOR
> TABLE normal_tab1, temp_tab2, normal_tab3, foreign_tab4,
> unlogged_tab5, normal_tab6, normal_tab7 ......; This query fails on
> first encounter of the restricted table, say at temp_tab2. Whereas,
> CREATE PUBLICATION testpub FOR ALL TABLES; would skip the restricted
> tables and continue to add the accepted tables into the publication
> within the database.
>
> IMHO, if there's a list of tables specified with FOR TABLE, then
> instead of throwing an error in case of any restricted table, we can
> issue a warning and continue with the addition of accepted tables into
> the publication. If done, this behaviour will be in sync with FOR ALL
> TABLES;
>
> Thoughts? If okay, I can work on a patch.
>
> Related to error messages: when foreign table is specified in CREATE
> PUBLICATION statement, then "ERROR:  "f1" is not a table", is thrown
> [1], how about the error message "ERROR: foerign table "f1" cannot be
> replicated". In general, it would be good if we could have the error
> messages something like in [2] instead of the existing [1].
>
> Thoughts? If okay, I can work on a patch.
>
> [1]
> t1 is a temporary table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  table "t1" cannot be replicated
> DETAIL:  Temporary and unlogged relations cannot be replicated.
>
> t1 is an unlogged table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  table "t1" cannot be replicated
> DETAIL:  Temporary and unlogged relations cannot be replicated.
>
> f1 is a foreign table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE f1;
> ERROR:  "f1" is not a table
> DETAIL:  Only tables can be added to publications.
>
> mv1 is a mat view:
> postgres=# CREATE PUBLICATION testpub FOR TABLE mv1;
> ERROR:  "mv1" is not a table
>
> idx1 is an index:
> postgres=# CREATE PUBLICATION testpub FOR TABLE idx1;
> ERROR:  "idx1" is an index
>
> [2]
> t1 is a temporary table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  temporary table "t1" cannot be replicated
> DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.
>
> t1 is an unlogged table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  unlogged table "t1" cannot be replicated
> DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.
>
> f1 is a foreign table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE f1;
> ERROR:  foreign table "f1" cannot be replicated
> DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.
>
>> > Say a user has created a publication for a schema with hundreds of
>> > tables in it, at some point later, can he stop replicating a single or
>> > some tables from that schema?
>> >
>>
>> There is no provision for this currently.
>
> The documentation [1] says, we can ALTER PUBLICATION testpub DROP
> TABLE t1; which removes the table from the list of published tables,
> but looks like it requires ALTER SUBSCRIPTION testsub REFRESH
> PUBLICATION; for the changes to become effective on the subscriber. I
> have done some testing for this case:
> 1) created publication for table t1, see \d+ t1, the associated
> publication is visible in the output
> 2) created subscription on the subscriber, initial available data from
> the publisher for table t1 is received
> 3) insert into table t1 on the publisher
> 4) inserted data in (3) is received in the subscriber table t1
> 5) alter publication to drop the table t1 on the publisher, see \d+
> t1, there will not be any associated publication in the output
> 6) execute alter subscription refresh publication on the subscriber,
> with the expectation that it should not receive the data from the
> publisher for the table t1 since it's dropped from the publication in
> (5)
> 7) insert into table t1 on the publisher
> 8) still the newly inserted data in (7) from the publisher, will be
> received into the table t1 in the subscriber
>
> IIUC, the behaviour of ALTER PUBLICATION DROP TABLE from the docs and
> the above use case, it looks like a bug to me. If I'm wrong, can
> someone correct me?
>

Yes, if we modify the publication, we should refresh the subscription on
each subscriber.  It looks strange for me, especially for partitioned
tables [1].

> Thoughts?
>

Can we trace the different between publication and subscription, and
auto-refresh subscription on subscriber?

[1]
https://www.postgresql.org/message-id/flat/1D6DCFD2-0F44-4A18-BF67-17C2697B1631%40hotmail.com

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

Re: Added schema level support for publication.

From

Bharath Rupireddy

Date:

11 January 2021, 09:06:51

On Mon, Jan 11, 2021 at 1:29 PM japin <japinli@hotmail.com> wrote:
> >> > Say a user has created a publication for a schema with hundreds of
> >> > tables in it, at some point later, can he stop replicating a single or
> >> > some tables from that schema?
> >> >
> >>
> >> There is no provision for this currently.
> >
> > The documentation [1] says, we can ALTER PUBLICATION testpub DROP
> > TABLE t1; which removes the table from the list of published tables,
> > but looks like it requires ALTER SUBSCRIPTION testsub REFRESH
> > PUBLICATION; for the changes to become effective on the subscriber. I
> > have done some testing for this case:
> > 1) created publication for table t1, see \d+ t1, the associated
> > publication is visible in the output
> > 2) created subscription on the subscriber, initial available data from
> > the publisher for table t1 is received
> > 3) insert into table t1 on the publisher
> > 4) inserted data in (3) is received in the subscriber table t1
> > 5) alter publication to drop the table t1 on the publisher, see \d+
> > t1, there will not be any associated publication in the output
> > 6) execute alter subscription refresh publication on the subscriber,
> > with the expectation that it should not receive the data from the
> > publisher for the table t1 since it's dropped from the publication in
> > (5)
> > 7) insert into table t1 on the publisher
> > 8) still the newly inserted data in (7) from the publisher, will be
> > received into the table t1 in the subscriber
> >
> > IIUC, the behaviour of ALTER PUBLICATION DROP TABLE from the docs and
> > the above use case, it looks like a bug to me. If I'm wrong, can
> > someone correct me?
> >
>
> Yes, if we modify the publication, we should refresh the subscription on
> each subscriber.  It looks strange for me, especially for partitioned
> tables [1].
>
> > Thoughts?
> >
>
> Can we trace the different between publication and subscription, and
> auto-refresh subscription on subscriber?
>
> [1]
> https://www.postgresql.org/message-id/flat/1D6DCFD2-0F44-4A18-BF67-17C2697B1631%40hotmail.com

As Amit stated in your thread [1], DDLs like creation of the new
tables or partitions, schema changes etc. on the publisher can not be
replicated automatically by the logical replication framework to the
subscriber. Users have to perform those DDLs on the subscribers by
themselves.

If your point is to at least issue the ALTER SUBSCRIPTION testsub
REFRESH PUBLICATION; from the publication whenever the publication is
altered i.e. added or dropped tables, IMO, we cannot do this, because
running this command on the subscriber only makes sense, after user
runs the same DDLs (which were run on the publisher) also on the
subscriber. To illustrate this:
1) create a new table or partition on the publisher and add it to
publisher, note that the same table has not yet been created on the
subscriber
2) imagine the publisher issuing an auto refresh command to all the
subscribers, then, no point in that right, because the new table or
the partition is not yet created on all the subscribers.

So, IMO, we can not have an auto refresh mechanism, until we have the
feature to replicate the DDL changes to all the subscribers.

What I stated in my earlier mail [1] is that even though we drop the
table from the publication in the publisher and run a refresh
publication on the subscriber, still the data is being replicated from
the publisher to the subscriber table. I just wanted to know whether
this is the expected behaviour or what exactly means. a user running
ALTER PUBLICATION mypub DROP TABLE mytable;

[1] - https://www.postgresql.org/message-id/CALj2ACWAxO3vSToT0o5nXL%3Drz5cNx90zaV-at%3DcvM14Tag4%3DcQ%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: Added schema level support for publication.

From

Li Japin

Date:

11 January 2021, 10:59:43

On Jan 11, 2021, at 5:06 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:

On Mon, Jan 11, 2021 at 1:29 PM japin <japinli@hotmail.com> wrote:
Say a user has created a publication for a schema with hundreds of
tables in it, at some point later, can he stop replicating a single or
some tables from that schema?

There is no provision for this currently.

The documentation [1] says, we can ALTER PUBLICATION testpub DROP
TABLE t1; which removes the table from the list of published tables,
but looks like it requires ALTER SUBSCRIPTION testsub REFRESH
PUBLICATION; for the changes to become effective on the subscriber. I
have done some testing for this case:
1) created publication for table t1, see \d+ t1, the associated
publication is visible in the output
2) created subscription on the subscriber, initial available data from
the publisher for table t1 is received
3) insert into table t1 on the publisher
4) inserted data in (3) is received in the subscriber table t1
5) alter publication to drop the table t1 on the publisher, see \d+
t1, there will not be any associated publication in the output
6) execute alter subscription refresh publication on the subscriber,
with the expectation that it should not receive the data from the
publisher for the table t1 since it's dropped from the publication in
(5)
7) insert into table t1 on the publisher
8) still the newly inserted data in (7) from the publisher, will be
received into the table t1 in the subscriber

IIUC, the behaviour of ALTER PUBLICATION DROP TABLE from the docs and
the above use case, it looks like a bug to me. If I'm wrong, can
someone correct me?

Yes, if we modify the publication, we should refresh the subscription on
each subscriber. It looks strange for me, especially for partitioned
tables [1].

Thoughts?

Can we trace the different between publication and subscription, and
auto-refresh subscription on subscriber?

[1]
https://www.postgresql.org/message-id/flat/1D6DCFD2-0F44-4A18-BF67-17C2697B1631%40hotmail.com

As Amit stated in your thread [1], DDLs like creation of the new
tables or partitions, schema changes etc. on the publisher can not be
replicated automatically by the logical replication framework to the
subscriber. Users have to perform those DDLs on the subscribers by
themselves.

Yeah, DDLs is not supported now. On publisher, the partitions are added to the

publication automatically. However, even if we created the partitions on subscriber,

it will not sync the new partitions, because it likes normal table, we must execute

ALTER SUBSCRIPTION my_test REFRESH PUBLICATION;

I preferred it will automatically add to subscription when we create the new partitions

if the partitions is already in publication.

If your point is to at least issue the ALTER SUBSCRIPTION testsub
REFRESH PUBLICATION; from the publication whenever the publication is
altered i.e. added or dropped tables, IMO, we cannot do this, because
running this command on the subscriber only makes sense, after user
runs the same DDLs (which were run on the publisher) also on the
subscriber. To illustrate this:
1) create a new table or partition on the publisher and add it to
publisher, note that the same table has not yet been created on the
subscriber
2) imagine the publisher issuing an auto refresh command to all the
subscribers, then, no point in that right, because the new table or
the partition is not yet created on all the subscribers.

So, IMO, we can not have an auto refresh mechanism, until we have the
feature to replicate the DDL changes to all the subscribers.

Thanks for clarification.

What I stated in my earlier mail [1] is that even though we drop the
table from the publication in the publisher and run a refresh
publication on the subscriber, still the data is being replicated from
the publisher to the subscriber table. I just wanted to know whether
this is the expected behaviour or what exactly means. a user running
ALTER PUBLICATION mypub DROP TABLE mytable;

[1] - https://www.postgresql.org/message-id/CALj2ACWAxO3vSToT0o5nXL%3Drz5cNx90zaV-at%3DcvM14Tag4%3DcQ%40mail.gmail.com

Sorry, I misunderstood. After the test (ce6a71fa530). I found that if we do not insert data

between step (5) and (6), it will not ship the new records, however, if we insert

data between step (5) and (6), it will ship the new records.

(1) created publication for table t1, t2

postgres[8765]=# CREATE TABLE t1 (a int);

CREATE TABLE

postgres[8765]=# CREATE TABLE t2 (a int);

CREATE TABLE

postgres[8765]=# INSERT INTO t1 VALUES (1);

INSERT 0 1

postgres[8765]=# INSERT INTO t2 VALUES (1);

INSERT 0 1

postgres[8765]=# CREATE PUBLICATION mypub1 FOR TABLE t1;

CREATE PUBLICATION

postgres[8765]=# CREATE PUBLICATION mypub2 FOR TABLE t2;

CREATE PUBLICATION

(2) created subscription on the subscriber

postgres[9812]=# CREATE TABLE t1 (a int);

CREATE TABLE

postgres[9812]=# CREATE TABLE t2 (a int);

CREATE TABLE

postgres[9812]=# CREATE SUBSCRIPTION mysub1 CONNECTION 'host=localhost port=8765 dbname=postgres' PUBLICATION mypub1;

NOTICE: created replication slot "mysub1" on publisher

CREATE SUBSCRIPTION

postgres[9812]=# CREATE SUBSCRIPTION mysub2 CONNECTION 'host=localhost port=8765 dbname=postgres' PUBLICATION mypub2;

NOTICE: created replication slot "mysub2" on publisher

CREATE SUBSCRIPTION

postgres[9812]=# TABLE t1;

---

(1 row)

postgres[9812]=# TABLE t2;

---

(1 row)

(3) insert into table t1, t2 on the publisher

postgres[8765]=# INSERT INTO t1 VALUES (2);

INSERT 0 1

postgres[8765]=# INSERT INTO t2 VALUES (2);

INSERT 0 1

(4) inserted data in (3) is received in the subscriber table t1, t2

postgres[9812]=# TABLE t1;

---

(2 rows)

postgres[9812]=# TABLE t2;

---

(2 rows)

(5) alter publication to drop table, we insert a record into t1 on publisher

postgres[8765]=# ALTER PUBLICATION mypub1 DROP TABLE t1;

ALTER PUBLICATION

postgres[8765]=# ALTER PUBLICATION mypub2 DROP TABLE t2;

ALTER PUBLICATION

postgres[8765]=# INSERT INTO t1 VALUES (3);

INSERT 0 1

(6) check the data on subscriber

postgres[9812]=# TABLE t1;

---

(3 rows)

postgres[9812]=# TABLE t2;

---

(2 rows)

(7) refresh subscription on the subscriber

postgres[9812]=# ALTER SUBSCRIPTION mysub1 REFRESH PUBLICATION;

ALTER SUBSCRIPTION

postgres[9812]=# ALTER SUBSCRIPTION mysub2 REFRESH PUBLICATION;

ALTER SUBSCRIPTION

(8) insert into table t1, t2 on the publisher

postgres[8765]=# INSERT INTO t1 VALUES (4);

INSERT 0 1

postgres[8765]=# INSERT INTO t2 VALUES (4);

INSERT 0 1

(9) the newly inserted data in (5), (7) for table t1 shipped to subscriber, however

t2 doesn’t

postgres[9812]=# TABLE t1;

---

(4 rows)

postgres[9812]=# TABLE t2;

---

(2 rows)

It might be a bug.

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

Re: Added schema level support for publication.

From

Amit Kapila

Date:

11 January 2021, 11:57:22

On Mon, Jan 11, 2021 at 4:29 PM Li Japin <japinli@hotmail.com> wrote:
>
>
> On Jan 11, 2021, at 5:06 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Mon, Jan 11, 2021 at 1:29 PM japin <japinli@hotmail.com> wrote:
>
> Sorry, I misunderstood. After the test (ce6a71fa530). I found that if we do not insert data
> between step (5) and (6), it will not ship the new records, however, if we insert
> data between step (5) and (6), it will ship the new records.
>
>
..
> It might be a bug.
>

Can you check pg_publication_rel and pg_subscription_rel? Also, this
is not related to the feature proposed in this thread, so it is better
to start a new thread to conclude whether this is a bug or not.

-- 
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

Bharath Rupireddy

Date:

11 January 2021, 11:58:21

On Mon, Jan 11, 2021 at 5:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Jan 11, 2021 at 4:29 PM Li Japin <japinli@hotmail.com> wrote:
> >
> >
> > On Jan 11, 2021, at 5:06 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Mon, Jan 11, 2021 at 1:29 PM japin <japinli@hotmail.com> wrote:
> >
> > Sorry, I misunderstood. After the test (ce6a71fa530). I found that if we do not insert data
> > between step (5) and (6), it will not ship the new records, however, if we insert
> > data between step (5) and (6), it will ship the new records.
> >
> >
> ..
> > It might be a bug.
> >
>
> Can you check pg_publication_rel and pg_subscription_rel? Also, this
> is not related to the feature proposed in this thread, so it is better
> to start a new thread to conclude whether this is a bug or not.

Thanks Amit, sure I will verify and start a new thread.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: Added schema level support for publication.

From

Bharath Rupireddy

Date:

11 January 2021, 13:22:50

On Mon, Jan 11, 2021 at 5:28 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Mon, Jan 11, 2021 at 5:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Jan 11, 2021 at 4:29 PM Li Japin <japinli@hotmail.com> wrote:
> > >
> > >
> > > On Jan 11, 2021, at 5:06 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> > >
> > > On Mon, Jan 11, 2021 at 1:29 PM japin <japinli@hotmail.com> wrote:
> > >
> > > Sorry, I misunderstood. After the test (ce6a71fa530). I found that if we do not insert data
> > > between step (5) and (6), it will not ship the new records, however, if we insert
> > > data between step (5) and (6), it will ship the new records.
> > >
> > >
> > ..
> > > It might be a bug.
> > >
> >
> > Can you check pg_publication_rel and pg_subscription_rel? Also, this
> > is not related to the feature proposed in this thread, so it is better
> > to start a new thread to conclude whether this is a bug or not.
>
> Thanks Amit, sure I will verify and start a new thread.

I started a new thread [1] for this, please have a look.

[1] - https://www.postgresql.org/message-id/CALj2ACV%2B0UFpcZs5czYgBpujM9p0Hg1qdOZai_43OU7bqHU_xw%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: Added schema level support for publication.

From

Rahila Syed

Date:

12 January 2021, 11:46:02

Hi Vignesh,

I had a look at the patch, please consider following comments.

On Thu, Jan 7, 2021 at 10:03 PM vignesh C <vignesh21@gmail.com> wrote:

Hi,

This feature adds schema option while creating publication. Users will
be able to specify one or more schemas while creating publication,
when the user specifies schema option, then the data changes for the
tables present in the schema specified by the user will be replicated
to the subscriber. Few examples have been listed below:

Create a publication that publishes all changes for all the tables
present in production schema:
CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;

Should it be FOR TABLES IN SCHEMA instead of FOR ALL TABLES SCHEMA?

Create a publication that publishes all changes for all the tables
present in marketing and sales schemas:
CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales;

Add some schemas to the publication:
ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june;

As per current implementation this command fails even if one of the schemas does not

exist. I think this is counterintuitive, it should throw a warning and continue adding the rest.

Drop some schema from the publication:
ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july;

Same for drop schema, if one of these schemas does not exist in publication,

the entire DROP operation is aborted.

Thank you,

Rahila Syed

Re: Added schema level support for publication.

From

vignesh C

Date:

12 January 2021, 14:53:11

On Mon, Jan 11, 2021 at 11:45 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Sun, Jan 10, 2021 at 11:21 PM vignesh C <vignesh21@gmail.com> wrote:
> > On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > > I think this feature can be useful, in case a user has a lot of tables
> > > to publish inside a schema. Having said that, I wonder if this feature
> > > mandates users to create the same schema with same
> > > permissions/authorizations manually on the subscriber, because logical
> > > replication doesn't propagate any ddl's so are the schema or schema
> > > changes? Or is it that the list of tables from the publisher can go
> > > into a different schema on the subscriber?
> > >
> >
> > DDL's will not be propagated to the subscriber. Users have to create
> > the schema & tables in the subscriber. No change in
> > Permissions/authorizations handling, it will be the same as the
> > existing behavior for relations.
>
> Looks like the existing behaviour already requires users to create the
> schema on the subscriber when publishing the tables from that schema.
> Otherwise, an error is thrown on the subscriber [1].
>
> [1] on publisher:
> CREATE SCHEMA myschema;
> CREATE TABLE myschema.t1(a1 int, b1 int);
> INSERT INTO myschema.t1_myschema SELECT i, i+10 FROM generate_series(1,10) i;
> CREATE PUBLICATION testpub FOR TABLE myschema.t1;
>
> on subscriber:
> postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost
> dbname=postgres user=bharath port=5432' PUBLICATION testpub;
> ERROR:  schema "myschema" does not exist
> CREATE SCHEMA myschema;
> CREATE TABLE myschema.t1(a1 int, b1 int);
> postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost
> dbname=postgres user=bharath port=5432' PUBLICATION testpub;
> NOTICE:  created replication slot "testsub" on publisher
> CREATE SUBSCRIPTION
>

Yes this feature will also have the same behavior, DDL creation should
be taken care of by DBA similar to how it is handled may be using
pg_dump or use sql scripts/statements to update.

> > > Since the schema can have other objects such as data types, functions,
> > > operators, I'm sure with your feature, non-table objects will be
> > > skipped.
> > >
> >
> > Yes, only table data will be sent to subscribers, non-table objects
> > will be skipped.
>
> Looks like the existing CREATE PUBLICATION FOR ALL TABLES, which is
> for all the tables in the database, does this i.e. skips non-table
> objects and temporary tables, foreign tables and so on. So, your
> feature also can behave the same way, but within the scope of the
> given schema/s.
>

Yes, it will support only normal tables. Non table objects, foreign
tables & temporary tables will not be supported.

> > > As Amit pointed out earlier, the behaviour when schema dropped, I
> > > think we should also consider when schema is altered, say altered to a
> > > different name, maybe we should change that in the publication too.
> > >
> >
> > I agree that when schema is altered the renamed schema should be
> > reflected in the publication.
>
> I think, it's not only making sure that the publisher side has the new
> altered schema, but also the subscriber needs those alters. Having
> said that, since these alters come under DDL changes and in logical
> replication we don't publish the scheme changes to the subscriber, we
> may not need to anything extra for informing the schema alters to the
> subscriber from the publisher, the users might have to do the same
> schema alter on the subscriber and then a ALTER SUBSCRIPTION testsub
> REFRESH PUBLICATION;  should work for them? If this understanding is
> correct, then we should document this.
>

Yes, alter schema changes will be reflected in the publication, the
corresponding change needs to be done by the user on the subscriber
side. Once a user does ALTER SUBSCRIPTION testsub REFRESH PUBLICATION,
the new altered schema changes will be reflected in the subscriber. I
will update the documentation that user need to take care for
subscription refresh.

> > > In general, what happens if we have some temporary tables or foreign
> > > tables inside the schema, will they be allowed to send the data to
> > > subscribers?
> > >
> >
> > Temporary tables & foreign tables will not be added to the publications.
>
> Yes the existing logical replication framework doesn't allow
> replication of temporary, unlogged, foreign tables and other non-table
> relations such as materialized views, indexes etc [1]. The CREATE
> PUBLICATION statement either fails in check_publication_add_relation
> or before that.
>
> CREATE PUBLICATION testpub FOR TABLE tab1, throwing the error if the
> single table tab1 is any of the above restricted tables, seems fine.
> But, if there's a list of tables with CREATE PUBLICATION testpub FOR
> TABLE normal_tab1, temp_tab2, normal_tab3, foreign_tab4,
> unlogged_tab5, normal_tab6, normal_tab7 ......; This query fails on
> first encounter of the restricted table, say at temp_tab2. Whereas,
> CREATE PUBLICATION testpub FOR ALL TABLES; would skip the restricted
> tables and continue to add the accepted tables into the publication
> within the database.
>
> IMHO, if there's a list of tables specified with FOR TABLE, then
> instead of throwing an error in case of any restricted table, we can
> issue a warning and continue with the addition of accepted tables into
> the publication. If done, this behaviour will be in sync with FOR ALL
> TABLES;
>
> Thoughts? If okay, I can work on a patch.
>

I feel we can start a new thread for this to seek opinion if this base
change is required and reach consensus.

> Related to error messages: when foreign table is specified in CREATE
> PUBLICATION statement, then "ERROR:  "f1" is not a table", is thrown
> [1], how about the error message "ERROR: foerign table "f1" cannot be
> replicated". In general, it would be good if we could have the error
> messages something like in [2] instead of the existing [1].
>
> Thoughts? If okay, I can work on a patch.
>

I feel we can start a new thread for this to seek opinion and reach consensus.

> [1]
> t1 is a temporary table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  table "t1" cannot be replicated
> DETAIL:  Temporary and unlogged relations cannot be replicated.
>
> t1 is an unlogged table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  table "t1" cannot be replicated
> DETAIL:  Temporary and unlogged relations cannot be replicated.
>
> f1 is a foreign table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE f1;
> ERROR:  "f1" is not a table
> DETAIL:  Only tables can be added to publications.
>
> mv1 is a mat view:
> postgres=# CREATE PUBLICATION testpub FOR TABLE mv1;
> ERROR:  "mv1" is not a table
>
> idx1 is an index:
> postgres=# CREATE PUBLICATION testpub FOR TABLE idx1;
> ERROR:  "idx1" is an index
>
> [2]
> t1 is a temporary table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  temporary table "t1" cannot be replicated
> DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.
>
> t1 is an unlogged table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  unlogged table "t1" cannot be replicated
> DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.
>
> f1 is a foreign table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE f1;
> ERROR:  foreign table "f1" cannot be replicated
> DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.
>
> > > Say a user has created a publication for a schema with hundreds of
> > > tables in it, at some point later, can he stop replicating a single or
> > > some tables from that schema?
> > >
> >
> > There is no provision for this currently.
>
> The documentation [1] says, we can ALTER PUBLICATION testpub DROP
> TABLE t1; which removes the table from the list of published tables,
> but looks like it requires ALTER SUBSCRIPTION testsub REFRESH
> PUBLICATION; for the changes to become effective on the subscriber. I
> have done some testing for this case:
> 1) created publication for table t1, see \d+ t1, the associated
> publication is visible in the output
> 2) created subscription on the subscriber, initial available data from
> the publisher for table t1 is received
> 3) insert into table t1 on the publisher
> 4) inserted data in (3) is received in the subscriber table t1
> 5) alter publication to drop the table t1 on the publisher, see \d+
> t1, there will not be any associated publication in the output
> 6) execute alter subscription refresh publication on the subscriber,
> with the expectation that it should not receive the data from the
> publisher for the table t1 since it's dropped from the publication in
> (5)
> 7) insert into table t1 on the publisher
> 8) still the newly inserted data in (7) from the publisher, will be
> received into the table t1 in the subscriber
>
> IIUC, the behaviour of ALTER PUBLICATION DROP TABLE from the docs and
> the above use case, it looks like a bug to me. If I'm wrong, can
> someone correct me?
>
> Thoughts?

I think you started a new thread for this, let's conclude on this there.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

Re: Added schema level support for publication.

From

vignesh C

Date:

15 January 2021, 09:46:48

Thanks Rahila for your comments, please find my thoughts below.

On Tue, Jan 12, 2021 at 5:16 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
>
> Hi Vignesh,
>
> I had a look at the patch, please consider following comments.
>
> On Thu, Jan 7, 2021 at 10:03 PM vignesh C <vignesh21@gmail.com> wrote:
>>
>> Hi,
>>
>> This feature adds schema option while creating publication. Users will
>> be able to specify one or more schemas while creating publication,
>> when the user specifies schema option, then the data changes for the
>> tables present in the schema specified by the user will be replicated
>> to the subscriber. Few examples have been listed below:
>>
>> Create a publication that publishes all changes for all the tables
>> present in production schema:
>> CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;
>>
> Should it be FOR TABLES IN SCHEMA instead of FOR ALL TABLES SCHEMA?
>

For adding tables into publication we have syntax like:
CREATE PUBLICATION mypub FOR TABLE tbl1, tbl2;
For all tables we have syntax like:
CREATE PUBLICATION mypub FOR ALL TABLES;

Initial syntax that I proposed was:
CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;

I feel the below syntax is better, as it is consistent with others:
CREATE PUBLICATION mypub FOR SCHEMA sch1, sch2;

>>
>> Create a publication that publishes all changes for all the tables
>> present in marketing and sales schemas:
>> CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales;
>>
>> Add some schemas to the publication:
>> ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june;
>>
> As per current implementation this command fails even if one of the schemas does not
> exist. I think this is counterintuitive, it should throw a warning and continue adding the rest.
>

We have the similar behavior in case of adding non-existent table while creating a publication:
CREATE PUBLICATION mypub3 FOR TABLE non_existent_table;
ERROR: relation "non_existent_table" does not exist
I feel we can keep the behavior similarly to maintain the consistency.

>>
>> Drop some schema from the publication:
>> ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july;
>>
> Same for drop schema, if one of these schemas does not exist in publication,
> the entire DROP operation is aborted.

We have similar behavior in case of dropping non-existent table while altering publication
alter publication mypub5 drop table test1,testx;
ERROR: relation "testx" does not exist
I feel we can keep the behavior similarly to maintain the consistency.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

Re: Added schema level support for publication.

From

vignesh C

Date:

18 January 2021, 13:45:25

On Sat, Jan 9, 2021 at 5:21 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, Jan 8, 2021 at 4:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Jan 7, 2021 at 10:03 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > This feature adds schema option while creating publication. Users will
> > > be able to specify one or more schemas while creating publication,
> > > when the user specifies schema option, then the data changes for the
> > > tables present in the schema specified by the user will be replicated
> > > to the subscriber. Few examples have been listed below:
> > >
> > > Create a publication that publishes all changes for all the tables
> > > present in production schema:
> > > CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;
> > >
> > > Create a publication that publishes all changes for all the tables
> > > present in marketing and sales schemas:
> > > CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales;
> > >
> > > Add some schemas to the publication:
> > > ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june;
> > >
> > > Drop some schema from the publication:
> > > ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july;
> > >
> > > Attached is a POC patch for the same. I felt this feature would be quite useful.
> > >
> >
> > What do we do if the user Drops the schema? Do we automatically remove
> > it from the publication?
> >
> I have not yet handled this scenario yet, I will handle this and
> adding of tests in the next patch.
>

I have handled the above scenario(drop schema should automatically
remove the schema entry from publication schema relation) & addition
of tests in the new v2 patch attached.
Thoughts?


Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

Attachment

v2-0001-Added-schema-level-support-for-publication.patch

Re: Added schema level support for publication.

From

vignesh C

Date:

19 January 2021, 11:41:55

On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
>
> As Amit pointed out earlier, the behaviour when schema dropped, I
> think we should also consider when schema is altered, say altered to a
> different name, maybe we should change that in the publication too.
>

This scenario is handled now in the patch posted at [1].

>
> Say a user has a schema with 121 tables in it, and wants to replicate
> only 120 or 199 or even lesser tables out of it, so can we have some
> skip option to the new syntax, something like below?
> CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA
> production WITH skip = marketing, accounts, sales; --> meaning is,
> replicate all the tables in the schema production except marketing,
> accounts, sales tables.

I have not yet handled this, I'm working on this and will try post a patch for this in the next version.

[1] - https://www.postgresql.org/message-id/CALDaNm02%3Dk8K_ZSN7_dyVHyMTW4B5hOaeo2PzdWG%3Da7GtLH0oA%40mail.gmail.com

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

Re: Added schema level support for publication.

From

Rahila Syed

Date:

20 January 2021, 12:57:47

Hi Vignesh,

I have handled the above scenario(drop schema should automatically
remove the schema entry from publication schema relation) & addition
of tests in the new v2 patch attached.
Thoughts?

Please see some initial comments:

1. I think there should be more tests to show that the schema data is actually replicated

to the subscriber. Currently, I am not seeing the data being replicated when I use FOR SCHEMA.

2. How does replication behave when a table is added or removed from a subscribed schema

using ALTER TABLE SET SCHEMA?

3. Can we have a default schema like a public or current schema that gets replicated in case the user didn't

specify one, this can be handy to replicate current schema tables.

4. + The fourth, fifth and sixth variants change which schemas are part of the

+ publication. The <literal>SET TABLE</literal> clause will replace the list
+ of schemas in the publication with the specified one. The <literal>ADD

There is a typo above s/SET TABLE/SET SCHEMA

Thank you,

Rahila Syed

Re: Added schema level support for publication.

From

vignesh C

Date:

22 January 2021, 04:31:35

Thanks Rahila for your comments. Please find my thoughts below:

On Wed, Jan 20, 2021 at 6:27 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
>
> Hi Vignesh,
>
>>
>> I have handled the above scenario(drop schema should automatically
>> remove the schema entry from publication schema relation) & addition
>> of tests in the new v2 patch attached.
>> Thoughts?
>
>
> Please see some initial comments:
>
> 1. I think there should be more tests to show that the schema data is actually replicated
> to the subscriber.  Currently, I am not seeing the data being replicated when I use FOR SCHEMA.
>
I will fix this issue and include more tests in my next version of the patch.

> 2. How does replication behave when a table is added or removed from a subscribed schema
> using ALTER TABLE SET SCHEMA?
>
I would like to keep the behavior similar to the table behavior. I
will post more details for this along with my next version of the
patch.

> 3. Can we have a default schema like a public or current schema that gets replicated in case the user didn't
> specify one, this can be handy to replicate current schema tables.
>
It looks like a good use case, I will check on the feasibility of this
and try to implement this.

> 4. +   The fourth, fifth and sixth variants change which schemas are part of the
> +   publication.  The <literal>SET TABLE</literal> clause will replace the list
> +   of schemas in the publication with the specified one.  The <literal>ADD
>
> There is a typo above s/SET TABLE/SET SCHEMA
I will fix this in the next version of the patch.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

Re: Added schema level support for publication.

From

vignesh C

Date:

31 January 2021, 18:02:23

On Fri, Jan 22, 2021 at 10:01 AM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks Rahila for your comments. Please find my thoughts below:
>
> On Wed, Jan 20, 2021 at 6:27 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
> >
> > Hi Vignesh,
> >
> >>
> >> I have handled the above scenario(drop schema should automatically
> >> remove the schema entry from publication schema relation) & addition
> >> of tests in the new v2 patch attached.
> >> Thoughts?
> >
> >
> > Please see some initial comments:
> >
> > 1. I think there should be more tests to show that the schema data is actually replicated
> > to the subscriber.  Currently, I am not seeing the data being replicated when I use FOR SCHEMA.
> >
> I will fix this issue and include more tests in my next version of the patch.

Modified to handle this and also added a few more tests.

> > 2. How does replication behave when a table is added or removed from a subscribed schema
> > using ALTER TABLE SET SCHEMA?
> >
> I would like to keep the behavior similar to the table behavior. I
> will post more details for this along with my next version of the
> patch.
>

If a table is set to a different schema, after the schema change table
data will not be sent to the subscriber.
When a new table is added to the published schema, the table data will
be sent by the publisher, subscriber will not apply the changes. If
the change needs to be reflected, subscriber's publication should be
refreshed using "alter subscription mysub1 refresh publication". This
relation will be reflected in the subscriber relation when the
subscriber's publication is refreshed.
If a table is dropped, there is no impact on subscriber, This relation
will be present in pg_subscriber_rel after refreshing subscriber
publication.

> > 3. Can we have a default schema like a public or current schema that gets replicated in case the user didn't
> > specify one, this can be handy to replicate current schema tables.
> >
> It looks like a good use case, I will check on the feasibility of this
> and try to implement this.

This can be done, I will handle this later.

> > 4. +   The fourth, fifth and sixth variants change which schemas are part of the
> > +   publication.  The <literal>SET TABLE</literal> clause will replace the list
> > +   of schemas in the publication with the specified one.  The <literal>ADD
> >
> > There is a typo above s/SET TABLE/SET SCHEMA
> I will fix this in the next version of the patch.

Modified it.
I have separated the tests and documentation into a separate patch to
make review easier. Attached v3 patch with the fixes.
Thoughts?

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From

vignesh C

Date:

24 May 2021, 12:30:47

On Sun, Jan 31, 2021 at 11:32 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, Jan 22, 2021 at 10:01 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks Rahila for your comments. Please find my thoughts below:
> >
> > On Wed, Jan 20, 2021 at 6:27 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
> > >
> > > Hi Vignesh,
> > >
> > >>
> > >> I have handled the above scenario(drop schema should automatically
> > >> remove the schema entry from publication schema relation) & addition
> > >> of tests in the new v2 patch attached.
> > >> Thoughts?
> > >
> > >
> > > Please see some initial comments:
> > >
> > > 1. I think there should be more tests to show that the schema data is actually replicated
> > > to the subscriber.  Currently, I am not seeing the data being replicated when I use FOR SCHEMA.
> > >
> > I will fix this issue and include more tests in my next version of the patch.
>
> Modified to handle this and also added a few more tests.
>
> > > 2. How does replication behave when a table is added or removed from a subscribed schema
> > > using ALTER TABLE SET SCHEMA?
> > >
> > I would like to keep the behavior similar to the table behavior. I
> > will post more details for this along with my next version of the
> > patch.
> >
>
> If a table is set to a different schema, after the schema change table
> data will not be sent to the subscriber.
> When a new table is added to the published schema, the table data will
> be sent by the publisher, subscriber will not apply the changes. If
> the change needs to be reflected, subscriber's publication should be
> refreshed using "alter subscription mysub1 refresh publication". This
> relation will be reflected in the subscriber relation when the
> subscriber's publication is refreshed.
> If a table is dropped, there is no impact on subscriber, This relation
> will be present in pg_subscriber_rel after refreshing subscriber
> publication.
>
> > > 3. Can we have a default schema like a public or current schema that gets replicated in case the user didn't
> > > specify one, this can be handy to replicate current schema tables.
> > >
> > It looks like a good use case, I will check on the feasibility of this
> > and try to implement this.
>
> This can be done, I will handle this later.
>
> > > 4. +   The fourth, fifth and sixth variants change which schemas are part of the
> > > +   publication.  The <literal>SET TABLE</literal> clause will replace the list
> > > +   of schemas in the publication with the specified one.  The <literal>ADD
> > >
> > > There is a typo above s/SET TABLE/SET SCHEMA
> > I will fix this in the next version of the patch.
>
> Modified it.
> I have separated the tests and documentation into a separate patch to
> make review easier. Attached v3 patch with the fixes.
> Thoughts?

The earlier patch does not apply on the head. The v4 patch attached
has the following changes:
a) Rebased it on head. b) Removed pubschemas, pubtables columns and
replaced it with pubtype in pg_publication table. c) List the schemas
in describe publication. d) List the publication in list schemas. e)
Add support for "FOR SCHEMA CURRENT_SCHEMA". f) Tab completion for
"FOR SCHEMA" in create publication and alter publication. g) Included
the newly added structure type to typedefs.lst

Regards,
Vignesh

Attachment

RE: Added schema level support for publication.

From

"tanghy.fnst@fujitsu.com"

Date:

25 May 2021, 09:17:05

On Monday, May 24, 2021 at 8:31 PM vignesh C <vignesh21@gmail.com> wrote:
> The earlier patch does not apply on the head. The v4 patch attached
> has the following changes:
> a) Rebased it on head. b) Removed pubschemas, pubtables columns and
> replaced it with pubtype in pg_publication table. c) List the schemas
> in describe publication. d) List the publication in list schemas. e)
> Add support for "FOR SCHEMA CURRENT_SCHEMA". f) Tab completion for
> "FOR SCHEMA" in create publication and alter publication. g) Included
> the newly added structure type to typedefs.lst

Thanks for your patch.

I ran "make check-world" after applying your patch but it failed on my machine. I saw the following log:
--------------
parallel group (2 tests):  subscription publication
     publication                  ... FAILED      108 ms
     subscription                 ... ok           87 ms


diff -U3 /home/fnst/data/postgresql_schema/postgresql/src/test/regress/expected/publication.out
/home/fnst/data/postgresql_schema/postgresql/src/test/regress/results/publication.out
--- /home/fnst/data/postgresql_schema/postgresql/src/test/regress/expected/publication.out      2021-05-25
15:44:52.261683712+0800
 
+++ /home/fnst/data/postgresql_schema/postgresql/src/test/regress/results/publication.out       2021-05-25
15:48:41.393672595+0800
 
@@ -359,10 +359,10 @@
     "public"

 \dn public;
- List of schemas
-  Name  |  Owner
---------+---------
- public | vignesh
+List of schemas
+  Name  | Owner
+--------+-------
+ public | fnst
 Publications:
     "testpub3_forschema"
--------------

I think the owner of CURRENT_SCHEMA should not be written into publication.out because the result is related to the
user.
 
Maybe we can use "ALTER SCHEMA public OWNER TO owner" to change its default owner before this test case. Thoughts?

Regards
Tang

Re: Added schema level support for publication.

From

vignesh C

Date:

05 June 2021, 13:32:28

On Tue, May 25, 2021 at 2:47 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Monday, May 24, 2021 at 8:31 PM vignesh C <vignesh21@gmail.com> wrote:
> > The earlier patch does not apply on the head. The v4 patch attached
> > has the following changes:
> > a) Rebased it on head. b) Removed pubschemas, pubtables columns and
> > replaced it with pubtype in pg_publication table. c) List the schemas
> > in describe publication. d) List the publication in list schemas. e)
> > Add support for "FOR SCHEMA CURRENT_SCHEMA". f) Tab completion for
> > "FOR SCHEMA" in create publication and alter publication. g) Included
> > the newly added structure type to typedefs.lst
>
> Thanks for your patch.
>
> I ran "make check-world" after applying your patch but it failed on my machine. I saw the following log:
> --------------
> parallel group (2 tests):  subscription publication
>      publication                  ... FAILED      108 ms
>      subscription                 ... ok           87 ms
>
>
> diff -U3 /home/fnst/data/postgresql_schema/postgresql/src/test/regress/expected/publication.out
/home/fnst/data/postgresql_schema/postgresql/src/test/regress/results/publication.out
> --- /home/fnst/data/postgresql_schema/postgresql/src/test/regress/expected/publication.out      2021-05-25
15:44:52.261683712+0800
 
> +++ /home/fnst/data/postgresql_schema/postgresql/src/test/regress/results/publication.out       2021-05-25
15:48:41.393672595+0800
 
> @@ -359,10 +359,10 @@
>      "public"
>
>  \dn public;
> - List of schemas
> -  Name  |  Owner
> ---------+---------
> - public | vignesh
> +List of schemas
> +  Name  | Owner
> +--------+-------
> + public | fnst
>  Publications:
>      "testpub3_forschema"
> --------------
>
> I think the owner of CURRENT_SCHEMA should not be written into publication.out because the result is related to the
user.
> Maybe we can use "ALTER SCHEMA public OWNER TO owner" to change its default owner before this test case. Thoughts?

Thanks for identifying and reporting this issue. I have \dn with the
equivalent query to display only the publication name. The updated
patch has the fix for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From

Bharath Rupireddy

Date:

11 June 2021, 12:52:29

On Sat, Jun 5, 2021 at 7:02 PM vignesh C <vignesh21@gmail.com> wrote:
> Thanks for identifying and reporting this issue. I have \dn with the
> equivalent query to display only the publication name. The updated
> patch has the fix for the same.

Currently, FOR ALL TABLES is there to add all the tables(existing and
future) in the current database in which the publication is created. I
wonder before providing FOR SCHEMA capability, we better target FOR
DATABASE first, something like CREATE PUBLICATION ... FOR DATABASE
foo, bar, baz, qux; Of course users with the proper permissions on the
specified databases can add them to the publication. This can help to
add all the tables in other databases as well. Then, the CREATE
PUBLICATION ... FOR SCHEMA foo, bar, baz, qux; makes more sense.
Because, my understanding is that: database is a collection of tables,
schema is a collection of databases. I may be wrong here, but it's
just a thought. What do you think?

With Regards,
Bharath Rupireddy.

Re: Added schema level support for publication.

From

Bharath Rupireddy

Date:

11 June 2021, 13:13:44

On Fri, Jun 11, 2021, 6:22 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:

On Sat, Jun 5, 2021 at 7:02 PM vignesh C <vignesh21@gmail.com> wrote:
> Thanks for identifying and reporting this issue. I have \dn with the
> equivalent query to display only the publication name. The updated
> patch has the fix for the same.

Currently, FOR ALL TABLES is there to add all the tables(existing and
future) in the current database in which the publication is created. I
wonder before providing FOR SCHEMA capability, we better target FOR
DATABASE first, something like CREATE PUBLICATION ... FOR DATABASE
foo, bar, baz, qux; Of course users with the proper permissions on the
specified databases can add them to the publication. This can help to
add all the tables in other databases as well. Then, the CREATE
PUBLICATION ... FOR SCHEMA foo, bar, baz, qux; makes more sense.
Because, my understanding is that: database is a collection of tables,
schema is a collection of databases. I may be wrong here, but it's
just a thought. What do you think?

Please ignore above comment. I was confused about what a database and schema is in postgres. I'm sorry for the noise.

https://www.postgresql.org/docs/devel/ddl-schemas.html

Regards,

Bharath Rupireddy.

Re: Added schema level support for publication.

From

Amit Kapila

Date:

16 June 2021, 11:29:45

On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> And, with this feature, since there can be many huge tables inside a
> schema, the initial table sync phase of the replication can take a
> while.
>
> Say a user has created a publication for a schema with hundreds of
> tables in it, at some point later, can he stop replicating a single or
> some tables from that schema?
>

Isn't this applies to FOR ALL TABLES syntax as well where the user
might want to exclude one or a few tables? I am not sure if it is a
good idea to deal with this as part of this patch.


-- 
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

Ajin Cherian

Date:

16 June 2021, 11:42:22

On Sat, Jun 5, 2021 at 11:32 PM vignesh C <vignesh21@gmail.com> wrote:

> Thanks for identifying and reporting this issue. I have \dn with the
> equivalent query to display only the publication name. The updated
> patch has the fix for the same.
>

The patch no longer applies, I think a recent change to tab-complete
has broken it.

regards,
Ajin Cherian
Fujitsu Australia

Re: Added schema level support for publication.

From

vignesh C

Date:

16 June 2021, 14:41:08

On Wed, Jun 16, 2021 at 5:12 PM Ajin Cherian <itsajin@gmail.com> wrote:
>
> On Sat, Jun 5, 2021 at 11:32 PM vignesh C <vignesh21@gmail.com> wrote:
>
> > Thanks for identifying and reporting this issue. I have \dn with the
> > equivalent query to display only the publication name. The updated
> > patch has the fix for the same.
> >
>
> The patch no longer applies, I think a recent change to tab-complete
> has broken it.

Thanks for reporting it, the attached patch is a rebased version of
the patch with few review comment fixes, I will reply with the comment
fixes to the respective mails.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From

vignesh C

Date:

17 June 2021, 01:05:16

On Wed, Jun 16, 2021 at 4:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > And, with this feature, since there can be many huge tables inside a
> > schema, the initial table sync phase of the replication can take a
> > while.
> >
> > Say a user has created a publication for a schema with hundreds of
> > tables in it, at some point later, can he stop replicating a single or
> > some tables from that schema?
> >
>
> Isn't this applies to FOR ALL TABLES syntax as well where the user
> might want to exclude one or a few tables? I am not sure if it is a
> good idea to deal with this as part of this patch.

Yes, this applies to "FOR ALL TABLES" publications also, i had worked for this, the patch for the same is attached at [1]. I had created a separate patch for this so that it is easier for reviewing.

[1] - https://www.postgresql.org/message-id/CALDaNm10g2h29a-oFHsadk-Du6RDhnVQT_vfTGqR82DsjxQLqQ%40mail.gmail.com

Regards,

Vignesh

Re: Added schema level support for publication.

From

vignesh C

Date:

17 June 2021, 04:26:44

On Wed, Jan 20, 2021 at 6:27 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
>
> Hi Vignesh,
>
>>
>> I have handled the above scenario(drop schema should automatically
>> remove the schema entry from publication schema relation) & addition
>> of tests in the new v2 patch attached.
>> Thoughts?
>
>
> Please see some initial comments:
>
> 1. I think there should be more tests to show that the schema data is actually replicated
> to the subscriber. Currently, I am not seeing the data being replicated when I use FOR SCHEMA.
>
> 2. How does replication behave when a table is added or removed from a subscribed schema
> using ALTER TABLE SET SCHEMA?
>
> 3. Can we have a default schema like a public or current schema that gets replicated in case the user didn't
> specify one, this can be handy to replicate current schema tables.

I felt supporting a syntax like below will be useful for supporting current schema:
create publication testpub for schema schema_name
or
create publication testpub for schema CURRENT_SCHEMA

Let the user specify CURRENT_SCHEMA explicitly if required instead of not specifying anything.

I have implemented similar lines in the V6 patch at [1]. Thoughts?

[1] - https://www.postgresql.org/message-id/CALDaNm10g2h29a-oFHsadk-Du6RDhnVQT_vfTGqR82DsjxQLqQ%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From

Ajin Cherian

Date:

18 June 2021, 11:54:48

On Thu, Jun 17, 2021 at 12:41 AM vignesh C <vignesh21@gmail.com> wrote:

> Thanks for reporting it, the attached patch is a rebased version of
> the patch with few review comment fixes, I will reply with the comment
> fixes to the respective mails.
>

I've applied the patch, it applies cleand and ran "make check" and
tests run fine.

Some comments for patch 1:

In the commit message, some grammar mistakes:

"Changes was done in
pg_dump to handle pubtype updation in pg_publication table while the database
gets upgraded."

-------------- change to --

Changes were done in
pg_dump to handle pubtype updation in pg_publication table while the database
gets upgraded.

=============

Prototypes present in pg_publication.h was moved to publicationcmds.h so
that minimal datastructures ...

----------------- change to --

Prototypes present in pg_publication.h were moved to publicationcmds.h so
that minimal datastructures ..

========================

In patch 1:

In getObjectDescription()

+ if (!nspname)
+ {
+ pfree(pubname);
+ pfree(nspname);
+ ReleaseSysCache(tup);

Why free nspname if it is NULL?

Same comment in getObjectIdentityParts()
============================

In GetAllTablesPublicationRelations()

+ ScanKeyData key[2];
  TableScanDesc scan;
  HeapTuple tuple;
  List    *result = NIL;
+ int keycount = 0;

  classRel = table_open(RelationRelationId, AccessShareLock);

- ScanKeyInit(&key[0],
+ ScanKeyInit(&key[keycount++],

Here you have init key[1], but the code below in the same function
inits key[0]. I am not sure if this will affect that code below.

if (pubviaroot)
{
ScanKeyInit(&key[0],
Anum_pg_class_relkind,
BTEqualStrategyNumber, F_CHAREQ,
CharGetDatum(RELKIND_PARTITIONED_TABLE));
=================================

in UpdatePublicationTypeTupleValue():

+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+
+ /* Update the catalog. */
+ CatalogTupleUpdate(rel, &tup->t_self, tup);

Not sure if this tup needs to be freed or if the memory context will
take care of it.
=====================

regards,
Ajin Cherian
Fujitsu Australia

Re: Added schema level support for publication.

From

vignesh C

Date:

21 June 2021, 05:16:08

Thanks for the comments.

On Fri, Jun 18, 2021 at 5:25 PM Ajin Cherian <itsajin@gmail.com> wrote:
>
> On Thu, Jun 17, 2021 at 12:41 AM vignesh C <vignesh21@gmail.com> wrote:
>
> > Thanks for reporting it, the attached patch is a rebased version of
> > the patch with few review comment fixes, I will reply with the comment
> > fixes to the respective mails.
> >
>
> I've applied the patch, it applies cleand and ran "make check" and
> tests run fine.
>
> Some comments for patch 1:
>
> In the commit message, some grammar mistakes:
>
> "Changes was done in
> pg_dump to handle pubtype updation in pg_publication table while the database
> gets upgraded."
>
> -------------- change to --
>
> Changes were done in
> pg_dump to handle pubtype updation in pg_publication table while the database
> gets upgraded.
>

I will modify this.

> =============
>
> Prototypes present in pg_publication.h was moved to publicationcmds.h so
> that minimal datastructures ...
>
> ----------------- change to --
>
> Prototypes present in pg_publication.h were moved to publicationcmds.h so
> that minimal datastructures ..
>
> ========================

I will modify this.

>
> In patch 1:
>
> In getObjectDescription()
>
> + if (!nspname)
> + {
> + pfree(pubname);
> + pfree(nspname);
> + ReleaseSysCache(tup);
>
> Why free nspname if it is NULL?
>
> Same comment in getObjectIdentityParts()

I will modify this.

> ============================
>
> In GetAllTablesPublicationRelations()
>
> + ScanKeyData key[2];
>   TableScanDesc scan;
>   HeapTuple tuple;
>   List    *result = NIL;
> + int keycount = 0;
>
>   classRel = table_open(RelationRelationId, AccessShareLock);
>
> - ScanKeyInit(&key[0],
> + ScanKeyInit(&key[keycount++],
>
> Here you have init key[1], but the code below in the same function
> inits key[0]. I am not sure if this will affect that code below.
>
> if (pubviaroot)
> {
> ScanKeyInit(&key[0],
> Anum_pg_class_relkind,
> BTEqualStrategyNumber, F_CHAREQ,
> CharGetDatum(RELKIND_PARTITIONED_TABLE));

I felt this is ok as we specify the keycount to be 1, so only the
key[0] will be used. Thoughts?
ScanKeyInit(&key[0],
Anum_pg_class_relkind,
BTEqualStrategyNumber, F_CHAREQ,
CharGetDatum(RELKIND_PARTITIONED_TABLE));

scan = table_beginscan_catalog(classRel, 1, key);

> =================================
>
> in UpdatePublicationTypeTupleValue():
>
> + tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
> + replaces);
> +
> + /* Update the catalog. */
> + CatalogTupleUpdate(rel, &tup->t_self, tup);
>
> Not sure if this tup needs to be freed or if the memory context will
> take care of it.

I felt this is ok, as the cleanup is handled in the caller function
"AlterPublication", thoughts?
/* Cleanup. */
heap_freetuple(tup);
table_close(rel, RowExclusiveLock);

I will post an update patch for the fixes shortly.

Regards,
Vignesh

Re: Added schema level support for publication.

From

Ajin Cherian

Date:

22 June 2021, 01:27:21

On Mon, Jun 21, 2021 at 3:16 PM vignesh C <vignesh21@gmail.com> wrote:


> I felt this is ok as we specify the keycount to be 1, so only the
> key[0] will be used. Thoughts?
> ScanKeyInit(&key[0],
> Anum_pg_class_relkind,
> BTEqualStrategyNumber, F_CHAREQ,
> CharGetDatum(RELKIND_PARTITIONED_TABLE));
>
> scan = table_beginscan_catalog(classRel, 1, key);
>

It maybe fine, just doesn't look correct when you look at the function
as a whole.

> > =================================
> >
> > in UpdatePublicationTypeTupleValue():
> >
> > + tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
> > + replaces);
> > +
> > + /* Update the catalog. */
> > + CatalogTupleUpdate(rel, &tup->t_self, tup);
> >
> > Not sure if this tup needs to be freed or if the memory context will
> > take care of it.
>
> I felt this is ok, as the cleanup is handled in the caller function
> "AlterPublication", thoughts?
> /* Cleanup. */
> heap_freetuple(tup);
> table_close(rel, RowExclusiveLock);

that should be fine.

regards,
Ajin Cherian
Fujitsu Australia

Re: Added schema level support for publication.

From

vignesh C

Date:

22 June 2021, 04:15:04

On Tue, Jun 22, 2021 at 6:57 AM Ajin Cherian <itsajin@gmail.com> wrote:
>
> On Mon, Jun 21, 2021 at 3:16 PM vignesh C <vignesh21@gmail.com> wrote:
>
>
> > I felt this is ok as we specify the keycount to be 1, so only the
> > key[0] will be used. Thoughts?
> > ScanKeyInit(&key[0],
> > Anum_pg_class_relkind,
> > BTEqualStrategyNumber, F_CHAREQ,
> > CharGetDatum(RELKIND_PARTITIONED_TABLE));
> >
> > scan = table_beginscan_catalog(classRel, 1, key);
> >
>
> It maybe fine, just doesn't look correct when you look at the function
> as a whole.

I have added a local variable for this to avoid confusion.

Updated patch has the fix for this, this also includes the fixes for
the other comments you had given.
I have removed the skip table patches to keep the focus on the main
patch, once this patch gets into committable shape, I will focus on
the skip table patch.

Regards,
Vignesh

On Wed, Jun 23, 2021 at 2:10 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Tue, Jun 22, 2021 at 2:15 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> >
> > Updated patch has the fix for this, this also includes the fixes for
> > the other comments you had given.
> > I have removed the skip table patches to keep the focus on the main
> > patch, once this patch gets into committable shape, I will focus on
> > the skip table patch.
> >
>
> I have the following initial comments on the v7 patches:
>
> v7-0001
>
> (1) The patch comment is pretty rough and needs work.
>
> I suggest something like the following:
>
> This patch adds schema-level support for publication.
> A new schema option allows one or more schemas to be specified, whose tables
> are selected by the publisher for sending the data to the subscriber.
>
> pg_publication maintains information about the publication. Previously, the
> "puballtables" bool column was used to indicate if the publication was the
> "FOR ALL TABLES" type (if true) or the "FOR TABLE" type (if false). With the
> introduction of the "FOR SCHEMA" publication type, it is not easy to determine
> the publication type. Therefore, a new column "pubtype" has been added to the
> pg_publication relation to indicate the publication type.
> There was the possibility of avoiding addition of this new column, but that
> would require checking puballtables of pg_publication and checking
> pg_publication_rel for table type publication and then checking
> pg_publication_schema for schema type publication. Instead, I preferred to add
> the "pubtype" column, which makes things easier, and also will help support
> new options in the future.
> A new system table "pg_publication_schema" has been added, to maintain the
> schemas that the user wants to publish through the publication. The
> schema/publication/publication_schema dependency was created to handle the
> corresponding renaming/removal of schemas to the publication/publication_schema
> when the schema is renamed/dropped. The Decoder identifies if the relation is
> part of the publication and replicates it to the subscriber. Changes were made
> to pg_dump to handle pubtype updation in the pg_publication table when the
> database is upgraded.
>
> Prototypes present in pg_publication.h have been moved to publicationcmds.h so
> that minimal data structures are exported to pg_dump and psql clients, as the
> rest of the information need not be exported.
>
> CATALOG_VERSION_NO needs to be updated while committing, as this feature
> involves a catalog change.
>
> TODO: version checks for psql/pg_dump need to be changed from 140000 to 150000
> once the ongoing release is completed.
>

Modified.

> (2) src/backend/catalog/objectaddress.c
> getObjectIdentityParts(), case OCLASS_PUBLICATION_SCHEMA
>
> Shouldn't pubname/nspname be pfree()d, if objargs/objname are NULL?
>

Modified

> (3) src/backend/catalog/pg_publication.c
>
> (i)
> GetAllTablesPublications
>
>
> BEFORE:
> + * Gets list of relations published.
> AFTER:
> + * Gets the list of relations published.
>
> There are several other cases of newly-added "Gets list of ..." comments.
>

Modified

> (ii)
> GetAllTablesPublicationRelations
>
> BEFORE:
> + * Gets list of all relation published by FOR SCHEMA publication(s).
> AFTER:
> + * Gets the list of all relations published by FOR SCHEMA publication(s).
>

Modified

> (4) src/backend/commands/publicationcmds.c
>
> Missing function header for function "UpdatePublicationTypeTupleValue".
>

Included function header.

> (5) src/backend/parser/gram.y
>
> Violation of function header comment format:
>
> BEFORE:
> +/* makeSchemaSpec
> + * Create a SchemaSpec with the given type
> + */
>
> AFTER:
> +/*
> + * makeSchemaSpec
> + * Create a SchemaSpec with the given type
> + */
>

Modified it to:
/*
 * makeSchemaSpec - Create a SchemaSpec with the given type
 */

> (6) src/bin/pg_dump/pg_dump.c
>
> The following code is within a loop that processes schemas.
> I think that (in the comment) "Clean up and return" should instead say
> "Clean up and process the next schema"
> Also, should say "Schema is not a member".
>
> + /*
> + * Schema is not member of any publications. Clean up and return.
> + */
> + PQclear(res);
> + continue;
>

Modified.

> (7) src/bin/psql/describe.c
>
> Missing function header for function "addFooterToPublicationDesc".
>

Included it.

> v7-0002
>
> (1) doc/src/sgml/catalogs.sgml
> Typo
>
> BEFORE:
> +       respctively. The publication type cannot be changed in other cases.
> AFTER:
> +       respectively. The publication type cannot be changed in other cases.
>

Modified

> (2) doc/src/sgml/ref/create_publication.sgml
>
> BEFORE:
> +   Create a publication that publishes all changes for all the tables
> present in
> +production schema:
> AFTER:
> +   Create a publication that publishes all changes for all the tables
> present in
> +the schema "production":
>
>
> BEFORE:
> +   Create a publication that publishes all changes for all the tables
> present in
> +marketing and sales schemas:
> AFTER:
> +   Create a publication that publishes all changes for all the tables
> present in
> +the schemas "marketing" and "sales":
>

Modified.

> (3) src/test/regress/expected/publication.out
>
> BEFORE:
> +-- Drop schema that is not preset in the publication
> AFTER:
> +-- Drop schema that is not present in the publication
>
> BEFORE:
> +--- Check create publication on a object which is not schema
> AFTER:
> +--- Check create publication on an object which is not schema
>

Modified.

> (4) src/test/regress/sql/publication.sql
>
> BEFORE:
> +-- Drop schema that is not preset in the publication
> AFTER:
> +-- Drop schema that is not present in the publication
>

Modified.

Thanks for reviewing and providing the comments, Attached v8 patches
have the fixes for the comments.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From

vignesh C

Date:

24 June 2021, 18:17:44

On Thu, Jun 24, 2021 at 2:12 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Thu, Jun 24, 2021 at 4:41 PM tanghy.fnst@fujitsu.com
> <tanghy.fnst@fujitsu.com> wrote:
> >
> > I applied your V7* patch and complied it. The following warnings came out, please take a look.
> >
>
> I encountered the following warnings when building with the v7 patches applied:
> (looks like missing #include "catalog/objectaddress.h")
>
> pg_prewarm.c:109:29: warning: implicit declaration of function
> ‘get_relkind_objtype’; did you mean ‘get_element_type’?
> [-Wimplicit-function-declaration]
> aclcheck_error(aclresult,
> get_relkind_objtype(rel->rd_rel->relkind), get_rel_name(relOid));
> ^~~~~~~~~~~~~~~~~~~
> get_element_type
>
> heap_surgery.c:391:9: warning: implicit declaration of function
> ‘get_relkind_objtype’; did you mean ‘get_publication_type’?
> [-Wimplicit-function-declaration]
> get_relkind_objtype(rel->rd_rel->relkind),
> ^~~~~~~~~~~~~~~~~~~
> get_publication_type
>
> pgrowlocks.c:136:29: warning: implicit declaration of function
> ‘get_relkind_objtype’ [-Wimplicit-function-declaration]
> aclcheck_error(aclresult, get_relkind_objtype(rel->rd_rel->relkind),
>

Thanks for reporting these warnings, I have fixed this in the v8 patch attached at [1].

[1] - https://www.postgresql.org/message-id/CALDaNm044P_cds1OxZvFse5rE_qQfhbUg5MdtMgsa7t_bZGJdw%40mail.gmail.com

Regards,

Vignesh

Re: Added schema level support for publication.

From

vignesh C

Date:

24 June 2021, 18:25:02

On Thu, Jun 24, 2021 at 12:10 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> Hi
>
> I applied your V7* patch and complied it. The following warnings came out, please take a look.
>
> >pg_publication.c:688:22: warning: ‘tables’ may be used uninitialized in this function [-Wmaybe-uninitialized]
> >   funcctx->user_fctx = (void *) tables;
> >   ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
> >describe.c: In function ‘describePublications’:
> >describe.c:6479:35: warning: ‘pubtype’ may be used uninitialized in this function [-Wmaybe-uninitialized]
> >   else if (has_pubtype && pubtype == PUBTYPE_SCHEMA)
>
> For the warning in pg_publication.c, maybe we can replace the following 'else if' with 'else'.
> +               else if (publication->pubtype == PUBTYPE_SCHEMA)
>
> For the warning in describe.c, initialization of 'pubtype' is needed.

Thanks for reporting these warnings, I have fixed this in the v8 patch
attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm044P_cds1OxZvFse5rE_qQfhbUg5MdtMgsa7t_bZGJdw%40mail.gmail.com

Regards,
Vignesh

RE: Added schema level support for publication.

From

"tanghy.fnst@fujitsu.com"

Date:

29 June 2021, 03:25:18

On Friday, June 25, 2021 2:25 AM vignesh C <vignesh21@gmail.com>wrote:
> 
> Thanks for reporting these warnings, I have fixed this in the v8 patch
> attached at [1].
> [1] - https://www.postgresql.org/message-
> id/CALDaNm044P_cds1OxZvFse5rE_qQfhbUg5MdtMgsa7t_bZGJdw%40mail.
> gmail.com
> 

Thanks for your patch. The warnings are fixed.

But I found an issue while using your V8 patch, which is similar to [1]. The case is as below:
Drop a schema from publication and refresh publication at subscriber, then insert into publisher table,  the inserts
stillreplicated to subscriber. The expect result is that the data is no longer replicated.

For example:
------publisher------
create schema s1;
create table s1.t1 (a int primary key);
create publication pub for schema s1;

------subscriber------
create schema s1;
create table s1.t1 (a int primary key);
create subscription sub connection 'dbname=postgres port=5432' publication pub;

------publisher------
insert into s1.t1 values (1);

------subscriber------
postgres=# select * from s1.t1;
 a
---
 1
(1 row)

------publisher------
alter publication pub drop schema s1;
insert into s1.t1 values (2);

------subscriber------
postgres=# select * from s1.t1;
 a
---
 1
 2
(2 rows)

The similar issue [1] (related to "ALTER PUBLICATION .. DROP TABLE") was fixed by modifying in
rel_sync_cache_publication_cbcallback, which is related to PUBLICATIONRELMAP syscache.  In my case, I think it used
PUBLICATIONSCHEMAMAPsyscache, and no callback was registered for it. Should we register a callback for it or fix it in
otherways?

[1]https://www.postgresql.org/message-id/flat/CALj2ACV%2B0UFpcZs5czYgBpujM9p0Hg1qdOZai_43OU7bqHU_xw%40mail.gmail.com

Regards
Tang

RE: Added schema level support for publication.

From

"tanghy.fnst@fujitsu.com"

Date:

30 June 2021, 09:49:43

On Tuesday, June 29, 2021 11:25 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:
> 
> Thanks for your patch. The warnings are fixed.
> 
> But I found an issue while using your V8 patch, which is similar to [1]. The case
> is as below:
> Drop a schema from publication and refresh publication at subscriber, then
> insert into publisher table,  the inserts still replicated to subscriber. The expect
> result is that the data is no longer replicated.
> 

I also saw a problem while using "ALTER PUBLICATION ... ADD SCHEMA ...". The case is as below:
Add a schema to publication, then refresh publication at subscriber. Insert into publisher table,  the inserts couldn't
replicateto subscriber.

Steps to reproduce the case:
------publisher------
CREATE PUBLICATION testpub FOR SCHEMA public;

------subscriber------
CREATE SUBSCRIPTION testsub CONNECTION 'dbname=postgres port=5432' PUBLICATION testpub;

------publisher------
CREATE SCHEMA s1;
CREATE TABLE s1.t1 (a int PRIMARY KEY);
insert into s1.t1 values (1);
ALTER PUBLICATION testpub ADD SCHEMA s1;

------subscriber------
CREATE SCHEMA s1;
CREATE TABLE s1.t1 (a int PRIMARY KEY);
ALTER SUBSCRIPTION testsub REFRESH PUBLICATION;
postgres=# SELECT * FROM s1.t1;
 a
---
 1
(1 row)

------publisher------
insert into s1.t1 values (2);

------subscriber------
postgres=# SELECT * FROM s1.t1;
 a
---
 1
(1 row)

when I executed "ALTER PUBLICATION ... ADD TABLE ...",  rel_sync_cache_publication_cb callback function set
replicate_validto false, then it would validate the entry in get_rel_sync_entry function, and marked the pubactions to
true.so it worked ok.

In the case of  "ALTER PUBLICATION ... ADD SCHEMA ...", replicate_valid would not be set to false. Because of this, the
pubactionswere still false in get_rel_sync_entry function.

So I think the reason for it is similar to the one I reported before [1].

[1]
https://www.postgresql.org/message-id/OS0PR01MB61134B20314DE45795DD384CFB029%40OS0PR01MB6113.jpnprd01.prod.outlook.com

Regards
Tang

Re: Added schema level support for publication.

From

vignesh C

Date:

30 June 2021, 11:43:21

On Tue, Jun 29, 2021 at 8:55 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:
>
> On Friday, June 25, 2021 2:25 AM vignesh C <vignesh21@gmail.com>wrote:
> >
> > Thanks for reporting these warnings, I have fixed this in the v8 patch
> > attached at [1].
> > [1] - https://www.postgresql.org/message-
> > id/CALDaNm044P_cds1OxZvFse5rE_qQfhbUg5MdtMgsa7t_bZGJdw%40mail.
> > gmail.com
> >
>
> Thanks for your patch. The warnings are fixed.
>
> But I found an issue while using your V8 patch, which is similar to [1]. The case is as below:
> Drop a schema from publication and refresh publication at subscriber, then insert into publisher table, the inserts still replicated to subscriber. The expect result is that the data is no longer replicated.
>
> For example:
> ------publisher------
> create schema s1;
> create table s1.t1 (a int primary key);
> create publication pub for schema s1;
>
> ------subscriber------
> create schema s1;
> create table s1.t1 (a int primary key);
> create subscription sub connection 'dbname=postgres port=5432' publication pub;
>
> ------publisher------
> insert into s1.t1 values (1);
>
> ------subscriber------
> postgres=# select * from s1.t1;
> a
> ---
> 1
> (1 row)
>
> ------publisher------
> alter publication pub drop schema s1;
> insert into s1.t1 values (2);
>
> ------subscriber------
> postgres=# select * from s1.t1;
> a
> ---
> 1
> 2
> (2 rows)
>

Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you reported at [1].

[1] - https://www.postgresql.org/message-id/OS0PR01MB61138278359AE713EA24140DFB019%40OS0PR01MB6113.jpnprd01.prod.outlook.com

Regards,
Vignesh

Attachment

RE: Added schema level support for publication.

From

"tanghy.fnst@fujitsu.com"

Date:

01 July 2021, 01:25:23

On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> wrote:

> Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you reported at [1].

Thanks for your patch. I confirmed that the two issues I reported has been fixed.

Regards

Tang

Re: Added schema level support for publication.

From

Bharath Rupireddy

Date:

01 July 2021, 12:12:52

On Tue, Jun 22, 2021 at 10:11 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Jun 22, 2021 at 9:45 AM vignesh C <vignesh21@gmail.com> wrote:
> > I have removed the skip table patches to keep the focus on the main
> > patch, once this patch gets into committable shape, I will focus on
> > the skip table patch.
>
> IMO it's a good idea to start a new thread for the "skip table"
> feature so that we can discuss it separately. If required you can
> specify in that thread that the idea of the "skip table" can be
> applied to the "add schema level support" feature.

Hi Vignesh,

I will find sometime to review the v9 patch set. I'm curious to know
whether the latest v9 patch set has the changes for the "skip table"
feature? Or is it being discussed separately?

Regards,
Bharath Rupireddy.

Re: Added schema level support for publication.

From

vignesh C

Date:

01 July 2021, 13:30:55

On Thu, Jul 1, 2021 at 5:43 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Jun 22, 2021 at 10:11 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Tue, Jun 22, 2021 at 9:45 AM vignesh C <vignesh21@gmail.com> wrote:
> > > I have removed the skip table patches to keep the focus on the main
> > > patch, once this patch gets into committable shape, I will focus on
> > > the skip table patch.
> >
> > IMO it's a good idea to start a new thread for the "skip table"
> > feature so that we can discuss it separately. If required you can
> > specify in that thread that the idea of the "skip table" can be
> > applied to the "add schema level support" feature.
>
> Hi Vignesh,
>
> I will find sometime to review the v9 patch set. I'm curious to know
> whether the latest v9 patch set has the changes for the "skip table"
> feature? Or is it being discussed separately?

The v9 patch does not include the "Skip table" feature, I'm planning
to focus on that once the current patch is stabilized.

Regards,
Vignesh

RE: Added schema level support for publication.

From

"tanghy.fnst@fujitsu.com"

Date:

02 July 2021, 04:48:54

On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you reported
at[1].
 

A comment on v9:

src/bin/psql/describe.c

+                                              if (pset.sversion >= 15000)

I think it should be 150000.

Regards
Tang

Re: Added schema level support for publication.

From

vignesh C

Date:

02 July 2021, 09:40:02

On Fri, Jul 2, 2021 at 10:18 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you
reportedat [1].
 
>
> A comment on v9:
>
> src/bin/psql/describe.c
>
> +                                              if (pset.sversion >= 15000)
>
> I think it should be 150000.

Thanks for reporting this, I will fix this in the next version of the patch.

Regards,
Vignesh

RE: Added schema level support for publication.

From

"houzj.fnst@fujitsu.com"

Date:

08 July 2021, 03:46:42

On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> wrote:
> Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you reported
at[1].
 

Hi,

I had a look at the patch, please consider following comments.

(1)
-            if (pub->alltables)
+            if (pub->pubtype == PUBTYPE_ALLTABLES)
             {
                 publish = true;
                 if (pub->pubviaroot && am_partition)
                     publish_as_relid = llast_oid(get_partition_ancestors(relid));
             }
 
+            if (pub->pubtype == PUBTYPE_SCHEMA)
+            {
+                Oid            schemaId = get_rel_namespace(relid);
+                List       *pubschemas = GetPublicationSchemas(pub->oid);
+
+                if (list_member_oid(pubschemas, schemaId))
+                {

It might be better use "else if" for the second check here.
Like: else if (pub->pubtype == PUBTYPE_SCHEMA)

Besides, we already have the {schemaoid, pubid} set here, it might be
better to scan the cache PUBLICATIONSCHEMAMAP instead of invoking
GetPublicationSchemas() which will scan the whole table.


(2)
+        /* Identify which schemas should be dropped. */
+        foreach(oldlc, oldschemaids)
+        {
+            Oid            oldschemaid = lfirst_oid(oldlc);
+            ListCell   *newlc;
+            bool        found = false;
+
+            foreach(newlc, schemaoidlist)
+            {
+                Oid            newschemaid = lfirst_oid(newlc);
+
+                if (newschemaid == oldschemaid)
+                {
+                    found = true;
+                    break;
+                }
+            }

It seems we can use " if (list_member_oid(schemaoidlist, oldschemaid)) "
to replace the second foreach loop.

(3)
there are some testcases change in 0001 patch, it might be better move them
to 0002 patch.


(4)
+        case OBJECT_PUBLICATION_SCHEMA:
+            objnode = (Node *) list_make2(linitial(name), linitial(args));
+            break;
         case OBJECT_USER_MAPPING:
             objnode = (Node *) list_make2(linitial(name), linitial(args));
             break;

Does it looks better to merge these two switch cases ?
Like:
case OBJECT_PUBLICATION_SCHEMA:
case OBJECT_USER_MAPPING:
    objnode = (Node *) list_make2(linitial(name), linitial(args));
    break;

best regards,
houzj

RE: Added schema level support for publication.

From

"houzj.fnst@fujitsu.com"

Date:

09 July 2021, 03:28:30

On Thursday, July 8, 2021 11:47 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote
> On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com>
> wrote:
> > Thanks for reporting this issue, the attached v9 patch fixes this issue. This also
> fixes the other issue you reported at [1].
> 
> Hi,
> 
> I had a look at the patch, please consider following comments.

Some more commets.

Currently, postgres caches publication actions info in the
RelationData::rd_pubactions, but after applying the patch, it seems
rd_pubactions is not initialized when using schema level publication.

It cound result in some unexpected behaviour when checking if command can be
executed with current replica identity.

----
CheckCmdReplicaIdentity
...
    pubactions = GetRelationPublicationActions(rel);
    if (cmd == CMD_UPDATE && pubactions->pubupdate)
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("cannot update table \"%s\" because it does not have a replica identity and publishes
updates",
                        RelationGetRelationName(rel)),
                 errhint("To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.")));
----
Based on the above code, we access rd_pubactions from relcache in function
GetRelationPublicationActions(), if we don't have correct publication
information, we won't get the error in publication, instead we could lead to
the failure in subscription side.

To fix, I think (1) the patch can modify the function
GetRelationPublicationActions() to get the schema level publication related to
the relation, then merge the publication action to the rd_pubactions.

In addition, (2) it seems we also need to add the relcache invalidation code
about schema level publication when alter publication options.

I attached a rough code diff about (1) and (2), the diff is based on the v9 patchset.
I hope it can help fix the above issues.

Best regards,
Hou zhijie

Attachment

fix_patch

Re: Added schema level support for publication.

From

Greg Nancarrow

Date:

09 July 2021, 06:42:42

On Fri, Jul 9, 2021 at 1:28 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> Currently, postgres caches publication actions info in the
> RelationData::rd_pubactions, but after applying the patch, it seems
> rd_pubactions is not initialized when using schema level publication.
>
> It cound result in some unexpected behaviour when checking if command can be
> executed with current replica identity.
>

While testing this patch, I'm finding that for a FOR SCHEMA
publication, UPDATEs and DELETEs on tables belonging to that schema
are not getting replicated (but INSERTs and TRUNCATEs are).
Could this be related to the issues that Hou-san has identified?

Regards,
Greg Nancarrow
Fujitsu Australia

Re: Added schema level support for publication.

From

vignesh C

Date:

12 July 2021, 09:20:38

On Fri, Jul 9, 2021 at 8:58 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Thursday, July 8, 2021 11:47 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote
> > On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com>
> > wrote:
> > > Thanks for reporting this issue, the attached v9 patch fixes this issue. This also
> > fixes the other issue you reported at [1].
> >
> > Hi,
> >
> > I had a look at the patch, please consider following comments.
>
> Some more commets.
>
> Currently, postgres caches publication actions info in the
> RelationData::rd_pubactions, but after applying the patch, it seems
> rd_pubactions is not initialized when using schema level publication.
>
> It cound result in some unexpected behaviour when checking if command can be
> executed with current replica identity.
>
> ----
> CheckCmdReplicaIdentity
> ...
>         pubactions = GetRelationPublicationActions(rel);
>         if (cmd == CMD_UPDATE && pubactions->pubupdate)
>                 ereport(ERROR,
>                                 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>                                  errmsg("cannot update table \"%s\" because it does not have a replica identity and
publishesupdates",
 
>                                                 RelationGetRelationName(rel)),
>                                  errhint("To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.")));
> ----
> Based on the above code, we access rd_pubactions from relcache in function
> GetRelationPublicationActions(), if we don't have correct publication
> information, we won't get the error in publication, instead we could lead to
> the failure in subscription side.
>
> To fix, I think (1) the patch can modify the function
> GetRelationPublicationActions() to get the schema level publication related to
> the relation, then merge the publication action to the rd_pubactions.
>
> In addition, (2) it seems we also need to add the relcache invalidation code
> about schema level publication when alter publication options.
>
> I attached a rough code diff about (1) and (2), the diff is based on the v9 patchset.
> I hope it can help fix the above issues.

Thanks for identifying the issues and also providing the fix for it.
Those changes are required, I have taken your changes into my patch.
The Attached patch has the changes for the same.

Regards,
Vignesh

On Tue, Jul 13, 2021 at 12:06 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Monday, July 12, 2021 5:36 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for reporting this issue, this issue is fixed in the v10 patch
> > attached at [1].
> > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-
> > sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com
>
> Thanks for fixing it.
>
> By applying your V10 patch, I saw three problems, please have a look.
>
> 1. An issue about pg_dump.
> When public schema was published, the publication was created in the output file, but public schema was not added to
it.(Other schemas could be added as expected.) 
>
> I looked into it and found that selectDumpableNamespace function marks DUMP_COMPONENT_DEFINITION as needless when the
schemais public, leading to schema public is ignored in getPublicationSchemas. So we'd better check whether schemas
shouldbe dumped in another way. 
>
> I tried to fix it with the following change, please have a look. (Maybe we also need to add some comments for it.)
>
> diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
> index f6b4f12648..a327d2568b 100644
> --- a/src/bin/pg_dump/pg_dump.c
> +++ b/src/bin/pg_dump/pg_dump.c
> @@ -4206,7 +4206,8 @@ getPublicationSchemas(Archive *fout, NamespaceInfo nspinfo[], int numSchemas)
>                  * Ignore publication membership of schemas whose definitions are not
>                  * to be dumped.
>                  */
> -               if (!(nsinfo->dobj.dump & DUMP_COMPONENT_DEFINITION))
> +               if (!((nsinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
> +                       || (strcmp(nsinfo->dobj.name, "public") == 0 && nsinfo->dobj.dump != DUMP_COMPONENT_NONE)))
>                         continue;
>
>                 pg_log_info("reading publication membership for schema \"%s\"",

I felt it is intentionally done like that as the pubic schema is
created by default, hence it is not required to dump else we will get
errors while restoring. Thougths?

> 2. improper behavior for system schema
> I found I could create publication for system schema, such as pg_catalog. I think it's better to report an error
messagehere, just like creating publication for system table is unallowed. 

Modified.

> 3. fix for dumpPublicationSchema
> Should we add a declaration for it and add const decorations to the it's second parameter? Like other similar
functions.

Modified to include const, declaration is not required as the function
definition is before function call, so not making this change.

Thanks for your comments, the attached v11 patch fixes the issues.

Regards,
Vignesh

On Wed, Jul 14, 2021 at 6:25 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> Wednesday, July 14, 2021 6:17 PM vignesh C <vignesh21@gmail.com> wrote:
> > On Tue, Jul 13, 2021 at 12:06 PM tanghy.fnst@fujitsu.com
> > <tanghy.fnst@fujitsu.com> wrote:
> > >
> > > On Monday, July 12, 2021 5:36 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > Thanks for reporting this issue, this issue is fixed in the v10
> > > > patch attached at [1].
> > > > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-
> > > > sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com
> > >
> > > Thanks for fixing it.
> > >
> > > By applying your V10 patch, I saw three problems, please have a look.
> > >
> > > 1. An issue about pg_dump.
> > > When public schema was published, the publication was created in the
> > > output file, but public schema was not added to it. (Other schemas
> > > could be added as expected.)
> > >
> > > I looked into it and found that selectDumpableNamespace function marks
> > DUMP_COMPONENT_DEFINITION as needless when the schema is public,
> > leading to schema public is ignored in getPublicationSchemas. So we'd better
> > check whether schemas should be dumped in another way.
> > >
> > > I tried to fix it with the following change, please have a look.
> > > (Maybe we also need to add some comments for it.)
> > >
> > > diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
> > > index f6b4f12648..a327d2568b 100644
> > > --- a/src/bin/pg_dump/pg_dump.c
> > > +++ b/src/bin/pg_dump/pg_dump.c
> > > @@ -4206,7 +4206,8 @@ getPublicationSchemas(Archive *fout,
> > NamespaceInfo nspinfo[], int numSchemas)
> > >                  * Ignore publication membership of schemas whose
> > definitions are not
> > >                  * to be dumped.
> > >                  */
> > > -               if (!(nsinfo->dobj.dump &
> > DUMP_COMPONENT_DEFINITION))
> > > +               if (!((nsinfo->dobj.dump &
> > DUMP_COMPONENT_DEFINITION)
> > > +                       || (strcmp(nsinfo->dobj.name, "public") == 0
> > > + && nsinfo->dobj.dump != DUMP_COMPONENT_NONE)))
> > >                         continue;
> > >
> > >                 pg_log_info("reading publication membership for schema
> > > \"%s\"",
> >
> > I felt it is intentionally done like that as the pubic schema is created by default,
> > hence it is not required to dump else we will get errors while restoring.
> > Thougths?
>
> Thanks for the new patches and I also looked at this issue.
>
> For user defined schema and publication:
> --------------------------
> create schema s1;
> create publication pub2 for SCHEMA s1;
> --------------------------
>
> pg_dump will only generate the following SQLs:
>
> ------pg_dump result------
> CREATE PUBLICATION pub2 WITH (publish = 'insert, update, delete, truncate');
> ALTER PUBLICATION pub2 ADD SCHEMA s1;
> --------------------------
>
> But for the public schema:
> --------------------------
> create publication pub for SCHEMA public;
> --------------------------
>
> pg_dump will only generate the following SQL:
>
> ------pg_dump result------
> CREATE PUBLICATION pub WITH (publish = 'insert, update, delete, truncate');
> --------------------------
>
> It didn't generate SQL like "ALTER PUBLICATION pub ADD SCHEMA public;" which
> means the public schema won't be published after restoring. So, I think we'd
> better let the pg_dump generate the ADD SCHEMA public SQL. Thoughts ?

Thanks for reporting this issue, this issue is fixed in the v12 patch attached.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From

vignesh C

Date:

16 July 2021, 10:13:11

On Fri, Jul 16, 2021 at 9:25 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Wed, Jul 14, 2021 at 8:17 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for your comments, the attached v11 patch fixes the issues.
> >
>
> Thanks for your work on this.
>
> I have some minor review comments on the documentation:
>
> (1) wrong link (uses altersubscription instead of alterpublication)
> doc/src/sgml/catalogs.sgml
>
> BEFORE:
> + created as an empty publication type. When a table or schema is added to
> + the publication using <link linkend="sql-altersubscription">
> + <command>ALTER PUBLICATION</command></link> then the publication type
>
> AFTER:
> + created as an empty publication type. When a table or schema is added to
> + the publication using <link linkend="sql-alterpublication">
> + <command>ALTER PUBLICATION</command></link> then the publication type

Modified.

> (2) Improve wording and suggest "or" instead of "and"
> doc/src/sgml/catalogs.sgml
>
> BEFORE:
> + If a publication is created without specifying any of
> + <literal>FOR ALL TABLES</literal>, <literal>FOR TABLE</literal> and
> + <literal>FOR SCHEMA</literal> option, then the publication will be
>
> AFTER:
> + If a publication is created without specifying any of the
> + <literal>FOR ALL TABLES</literal>, <literal>FOR TABLE</literal> or
> + <literal>FOR SCHEMA</literal> options, then the publication will be

Modified.

> (3) space at start of literal
> doc/src/sgml/catalogs.sgml
>
> + and <literal> FOR SCHEMA</literal>, so for such publications there will be a

Modified.

> (4) Should say "variants of this command change ..." ?
>
> + The fourth, fifth and sixth variants change which schemas are part of the

Modified.

>
> Also, there seems to be an issue with ALTER PUBLICATION ... SET SCHEMA ...
> (PubType is getting set to 'e' instead of 's'
>
> test_pub=# create publication pub1;
> CREATE PUBLICATION
> test_pub=# create table myschema.test(i int);
> CREATE TABLE
> test_pub=# alter publication pub1 set schema myschema;
> ALTER PUBLICATION
> test_pub=# \dRp pub1
> List of publications
> Name | Owner | All tables | Inserts | Updates | Deletes | Truncates |
> Via root | PubType
> ------+-------+------------+---------+---------+---------+-----------+----------+---------
> pub1 | gregn | f | t | t | t | t |
> f | e
> (1 row)
>
> test_pub=# alter publication pub1 add table test;
> ALTER PUBLICATION
> test_pub=# \dRp pub1
> List of publications
> Name | Owner | All tables | Inserts | Updates | Deletes | Truncates |
> Via root | PubType
> ------+-------+------------+---------+---------+---------+-----------+----------+---------
> pub1 | gregn | f | t | t | t | t |
> f | t
> (1 row)
>
>
> When I use "ADD SCHEMA" instead of "SET SCHEMA" on an empty
> publication, it seems OK.

Modified.

Thanks for the comments, these issues are fixed as part of the v12 patch posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com

Regards,
Vignesh

RE: Added schema level support for publication.

From

"tanghy.fnst@fujitsu.com"

Date:

19 July 2021, 03:13:14

On Friday, July 16, 2021 6:13 PM vignesh C <vignesh21@gmail.com> wrote:

> On Fri, Jul 16, 2021 at 9:25 AM Greg Nancarrow <mailto:gregn4422@gmail.com> wrote:

> >

> > Also, there seems to be an issue with ALTER PUBLICATION ... SET SCHEMA ...

> > (PubType is getting set to 'e' instead of 's'

> >

> > test_pub=# create publication pub1;

> > CREATE PUBLICATION

> > test_pub=# create table myschema.test(i int);

> > CREATE TABLE

> > test_pub=# alter publication pub1 set schema myschema;

> > ALTER PUBLICATION

> > test_pub=# \dRp pub1

> > List of publications

> > Via root | PubType

> > ------+-------+------------+---------+---------+---------+-----------+----------+---------

> > pub1 | gregn | f | t | t | t | t |

> > f | e

> > (1 row)

> >

> > test_pub=# alter publication pub1 add table test;

> > ALTER PUBLICATION

> > test_pub=# \dRp pub1

> > List of publications

> > Via root | PubType

> > ------+-------+------------+---------+---------+---------+-----------+----------+---------

> > pub1 | gregn | f | t | t | t | t |

> > f | t

> > (1 row)

> >

> > When I use "ADD SCHEMA" instead of "SET SCHEMA" on an empty

> > publication, it seems OK.

> Modified.

Thanks for your patch. But there is a problem about "ALTER PUBLICATION … SET TABLE …", which is similar to the issue Greg reported at [1].

For example:

postgres=# CREATE TABLE public.t1 (a int);

CREATE TABLE

postgres=# CREATE PUBLICATION pub1;

CREATE PUBLICATION

postgres=# ALTER PUBLICATION pub1 SET TABLE public.t1;

ALTER PUBLICATION

postgres=# \dRp

List of publications

------+----------+------------+---------+---------+---------+-----------+----------+---------

pub1 | postgres | f | t | t | t | t | f | e

(1 row)

I think PubType in this case should be 't' instead of 'e'. Please have a look.

[1] - https://www.postgresql.org/message-id/CAJcOf-ddXvY%3DOFC54CshdMa1bswzFjG9qokjC0aFeiS%3D6CNRzw%40mail.gmail.com

Regards,

Tang

RE: Added schema level support for publication.

From

"tanghy.fnst@fujitsu.com"

Date:

19 July 2021, 04:02:04

On Friday, July 16, 2021 6:10 PM vignesh C <vignesh21@gmail.com>
> On Wed, Jul 14, 2021 at 6:25 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > Wednesday, July 14, 2021 6:17 PM vignesh C <vignesh21@gmail.com> wrote:
> > > On Tue, Jul 13, 2021 at 12:06 PM tanghy.fnst@fujitsu.com
> > > <tanghy.fnst@fujitsu.com> wrote:
> > > >
> > > > On Monday, July 12, 2021 5:36 PM vignesh C <vignesh21@gmail.com>
> wrote:
> > > > >
> > > > > Thanks for reporting this issue, this issue is fixed in the v10
> > > > > patch attached at [1].
> > > > > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-
> > > > > sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com
> > > >
> > > > Thanks for fixing it.
> > > >
> > > > By applying your V10 patch, I saw three problems, please have a look.
> > > >
> > > > 1. An issue about pg_dump.
> > > > When public schema was published, the publication was created in the
> > > > output file, but public schema was not added to it. (Other schemas
> > > > could be added as expected.)
> > > >
> > > > I looked into it and found that selectDumpableNamespace function marks
> > > DUMP_COMPONENT_DEFINITION as needless when the schema is public,
> > > leading to schema public is ignored in getPublicationSchemas. So we'd better
> > > check whether schemas should be dumped in another way.
> > > >
> > > > I tried to fix it with the following change, please have a look.
> > > > (Maybe we also need to add some comments for it.)
> > > >
> > > > diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
> > > > index f6b4f12648..a327d2568b 100644
> > > > --- a/src/bin/pg_dump/pg_dump.c
> > > > +++ b/src/bin/pg_dump/pg_dump.c
> > > > @@ -4206,7 +4206,8 @@ getPublicationSchemas(Archive *fout,
> > > NamespaceInfo nspinfo[], int numSchemas)
> > > >                  * Ignore publication membership of schemas whose
> > > definitions are not
> > > >                  * to be dumped.
> > > >                  */
> > > > -               if (!(nsinfo->dobj.dump &
> > > DUMP_COMPONENT_DEFINITION))
> > > > +               if (!((nsinfo->dobj.dump &
> > > DUMP_COMPONENT_DEFINITION)
> > > > +                       || (strcmp(nsinfo->dobj.name, "public") == 0
> > > > + && nsinfo->dobj.dump != DUMP_COMPONENT_NONE)))
> > > >                         continue;
> > > >
> > > >                 pg_log_info("reading publication membership for schema
> > > > \"%s\"",
> > >
> > > I felt it is intentionally done like that as the pubic schema is created by default,
> > > hence it is not required to dump else we will get errors while restoring.
> > > Thougths?
> >
> > Thanks for the new patches and I also looked at this issue.
> >
> > For user defined schema and publication:
> > --------------------------
> > create schema s1;
> > create publication pub2 for SCHEMA s1;
> > --------------------------
> >
> > pg_dump will only generate the following SQLs:
> >
> > ------pg_dump result------
> > CREATE PUBLICATION pub2 WITH (publish = 'insert, update, delete, truncate');
> > ALTER PUBLICATION pub2 ADD SCHEMA s1;
> > --------------------------
> >
> > But for the public schema:
> > --------------------------
> > create publication pub for SCHEMA public;
> > --------------------------
> >
> > pg_dump will only generate the following SQL:
> >
> > ------pg_dump result------
> > CREATE PUBLICATION pub WITH (publish = 'insert, update, delete, truncate');
> > --------------------------
> >
> > It didn't generate SQL like "ALTER PUBLICATION pub ADD SCHEMA public;" which
> > means the public schema won't be published after restoring. So, I think we'd
> > better let the pg_dump generate the ADD SCHEMA public SQL. Thoughts ?
> 
> Thanks for reporting this issue, this issue is fixed in the v12 patch attached.
> 

I tested your v12 patch and found a problem in the following case.

Step 1:
postgres=# create schema s1;
CREATE SCHEMA
postgres=# create table s1.t1 (a int);
CREATE TABLE
postgres=# create publication pub_t for table s1.t1;
CREATE PUBLICATION
postgres=# create publication pub_s for schema s1;
CREATE PUBLICATION

Step 2:
pg_dump -N s1

I dumped and excluded schema s1, pg_dump generated the following SQL:
-------------------------------
ALTER PUBLICATION pub_s ADD SCHEMA s1;

I think it was not expected because SQL like "ALTER PUBLICATION pub_t ADD TABLE s1.t1" was not generated in my case.
Thoughts?

Regards
Tang

Re: Added schema level support for publication.

From

Greg Nancarrow

Date:

19 July 2021, 09:10:57

On Fri, Jul 16, 2021 at 8:13 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Modified.
>
> Thanks for the comments, these issues are fixed as part of the v12 patch posted at [1].
> [1]  -
https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com
>

There seems to be a problem with ALTER PUBLICATION ... SET TABLE ...
After that command, it still regards it as an empty (e) publication,
so I can then ALTER PUBLICATION ... ADD SCHEMA ...

e.g.

test_pub=# create schema myschema;
CREATE SCHEMA
test_pub=# CREATE TABLE myschema.test (key int, value text, data jsonb);
CREATE TABLE
test_pub=# create publication pub1;
CREATE PUBLICATION
test_pub=# \dRp+ pub1
                                 Publication pub1
 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via
root | Pubtype
-------+------------+---------+---------+---------+-----------+----------+---------
 gregn | f          | t       | t       | t       | t         | f        | e
(1 row)

test_pub=# alter publication pub1 set table myschema.test;
ALTER PUBLICATION
test_pub=# \dRp+ pub1
                                 Publication pub1
 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via
root | Pubtype
-------+------------+---------+---------+---------+-----------+----------+---------
 gregn | f          | t       | t       | t       | t         | f        | e
(1 row)

test_pub=# alter publication pub1 add schema myschema;
ALTER PUBLICATION
test_pub=# \dRp+ pub1
                                 Publication pub1
 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via
root | Pubtype
-------+------------+---------+---------+---------+-----------+----------+---------
 gregn | f          | t       | t       | t       | t         | f        | s
Schemas:
    "myschema"


Regards,
Greg Nancarrow
Fujitsu Australia

Re: Added schema level support for publication.

From

Rahila Syed

Date:

21 July 2021, 09:44:23

On Mon, Jul 19, 2021 at 2:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote:

On Fri, Jul 16, 2021 at 8:13 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Modified.
>
> Thanks for the comments, these issues are fixed as part of the v12 patch posted at [1].
> [1] - https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com
>

There seems to be a problem with ALTER PUBLICATION ... SET TABLE ...
After that command, it still regards it as an empty (e) publication,
so I can then ALTER PUBLICATION ... ADD SCHEMA ...

One issue here is that the code to update publication type is missing
in AlterPublicationTables for SET TABLE command.

More broadly, I am not clear about the behaviour of the patch when a
publication is created to publish only certain tables, and is later altered to publish
a whole schema. I think such behaviour is legitimate. However,
AFAIU as per current code we can't update the publication type
from PUBTYPE_TABLE to PUBTYPE_SCHEMA.

I have some review comments as follows:
1.
In ConvertSchemaSpecListToOidList(List *schemas) function:
+ search_path = fetch_search_path(false);
+ nspname = get_namespace_name(linitial_oid(search_path));
+ if (nspname == NULL) /* recently-deleted namespace? */
+ ereport(ERROR,
+ errcode(ERRCODE_UNDEFINED_SCHEMA),
+ errmsg("no schema has been selected"));
+
+ schemoid = get_namespace_oid(nspname, false);
+ break;

The call get_namespace_oid() is perhaps not needed as fetch_search_path already fetches oids and simply
doing Schema oid = liinital_oid(search_path)); should be enough.

2. In the same function should there be an if else condition block instead of a switch case as
there are only two cases.

Thank you,

Rahila Syed

Re: Added schema level support for publication.

From

vignesh C

Date:

21 July 2021, 17:29:44

On Mon, Jul 19, 2021 at 8:43 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Friday, July 16, 2021 6:13 PM vignesh C <vignesh21@gmail.com> wrote:
>
> >
>
> > On Fri, Jul 16, 2021 at 9:25 AM Greg Nancarrow <mailto:gregn4422@gmail.com> wrote:
>
> > >
>
> > > Also, there seems to be an issue with ALTER PUBLICATION ... SET SCHEMA ...
>
> > > (PubType is getting set to 'e' instead of 's'
>
> > >
>
> > > test_pub=# create publication pub1;
>
> > > CREATE PUBLICATION
>
> > > test_pub=# create table myschema.test(i int);
>
> > > CREATE TABLE
>
> > > test_pub=# alter publication pub1 set schema myschema;
>
> > > ALTER PUBLICATION
>
> > > test_pub=# \dRp pub1
>
> > >                                    List of publications
>
> > >  Name | Owner | All tables | Inserts | Updates | Deletes | Truncates |
>
> > > Via root | PubType
>
> > > ------+-------+------------+---------+---------+---------+-----------+----------+---------
>
> > >  pub1 | gregn | f          | t       | t       | t       | t         |
>
> > > f        | e
>
> > > (1 row)
>
> > >
>
> > > test_pub=# alter publication pub1 add table test;
>
> > > ALTER PUBLICATION
>
> > > test_pub=# \dRp pub1
>
> > >                                    List of publications
>
> > >  Name | Owner | All tables | Inserts | Updates | Deletes | Truncates |
>
> > > Via root | PubType
>
> > > ------+-------+------------+---------+---------+---------+-----------+----------+---------
>
> > >  pub1 | gregn | f          | t       | t       | t       | t         |
>
> > > f        | t
>
> > > (1 row)
>
> > >
>
> > >
>
> > > When I use "ADD SCHEMA" instead of "SET SCHEMA" on an empty
>
> > > publication, it seems OK.
>
> >
>
> > Modified.
>
> >
>
>
>
> Thanks for your patch. But there is a problem about "ALTER PUBLICATION … SET TABLE …", which is similar to the issue
Gregreported at [1]. 
>
>
>
> For example:
>
> postgres=# CREATE TABLE public.t1 (a int);
>
> CREATE TABLE
>
> postgres=# CREATE PUBLICATION pub1;
>
> CREATE PUBLICATION
>
> postgres=# ALTER PUBLICATION pub1 SET TABLE public.t1;
>
> ALTER PUBLICATION
>
> postgres=# \dRp
>
>                                     List of publications
>
> Name |  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root | PubType
>
> ------+----------+------------+---------+---------+---------+-----------+----------+---------
>
> pub1 | postgres | f          | t       | t       | t       | t         | f        | e
>
> (1 row)
>
>
>
> I think PubType in this case should be 't' instead of 'e'. Please have a look.

Thanks for reporting this issue, this issue is fixed in the attached v13 patch.
I have changed relation name pg_publication_schema to
pg_publication_sch so that the names are in similar lines with
pg_publication_rel relation and similar changes were done for variable
names too.

Regards,
Vignesh

On Thu, Jul 22, 2021 at 9:12 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On July 22, 2021 1:30 AM vignesh C <vignesh21@gmail.com> wrote
> > > I think PubType in this case should be 't' instead of 'e'. Please have a look.
> >
> > Thanks for reporting this issue, this issue is fixed in the attached v13 patch.
> > I have changed relation name pg_publication_schema to pg_publication_sch
> > so that the names are in similar lines with pg_publication_rel relation and similar
> > changes were done for variable names too.
>
> Hi,
>
> Thanks for the new version patches.
> I had a few comments.
>
> 1)
> +
> +       appendPQExpBuffer(query, "ALTER PUBLICATION %s ", fmtId(pubrinfo->pubname));
> +       appendPQExpBuffer(query, "ADD SCHEMA %s;\n", fmtId(schemainfo->dobj.name));
> +
>
> It seems we can combine these two function call.
> like appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD SCHEMA %s;\n",
>                                            fmtId(pubrinfo->pubname),
>                                            fmtId(schemainfo->dobj.name));
>

Modified.

> 2)
> +                       footers[0] = pstrdup("Publications:");
> +
>
> This word seems need to be translated.
> footers[0] = pstrdup(_("Publications:"));

Modified

> 3)
> I think it might be better to add a testcase to cover the issue
> reported before [1].
>
> [1] https://www.postgresql.org/message-id/CAJcOf-dsKOYKmdrU5nwWeFoHvhiACbmw_KU%3DJQMEeDp6WwijqA%40mail.gmail.com

Added

> 4)
> Personally, the new name pg_publication_sch is not very easy to understand.
> (Maybe it's because I am not a native english speaker. If others feel ok,
> please ignore this comment)

I have changed it to pg_publication_schema as earlier, as Greg also
felt the other name was better.

Thanks for the comments, the attached v14 patch has the fixes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From

vignesh C

Date:

23 July 2021, 12:16:30

On Thu, Jul 22, 2021 at 9:38 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Thu, Jul 22, 2021 at 1:42 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > Personally, the new name pg_publication_sch is not very easy to understand.
> > (Maybe it's because I am not a native english speaker. If others feel ok,
> > please ignore this comment)
> >
>
> I was actually thinking the same thing.
> I prefer the full SCHEMA/schema, even for all the internal
> variables/definitions which have been changed since the last patch
> version.
> I think Vignesh was trying to be consistent with pg_publication_rel
> and pg_subscription_rel, but maybe "rel" is better understood to be an
> abbreviation for "relation" than "sch" for "schema"?
> Thoughts from others?

I have changed it to pg_pubication_schema as earlier as both you and Houzj San felt pg_publication_schema was better. This is fixed in the v14 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.gmail.com

Regards,

Vignesh

Re: Added schema level support for publication.

From

vignesh C

Date:

23 July 2021, 12:17:51

On Fri, Jul 23, 2021 at 6:26 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Thursday, July 22, 2021 1:30 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for reporting this issue, this issue is fixed in the attached v13 patch.
> > I have changed relation name pg_publication_schema to
> > pg_publication_sch so that the names are in similar lines with
> > pg_publication_rel relation and similar changes were done for variable
> > names too.
>
> Thanks for your fixing. The issue is fixed as you said.
>
> After applying your V13 patch. I noticed that if I specify duplicate schema names when using "ALTER PUBLICATION ...
SETSCHEMA ...", I would get the following error message:
 
>
> postgres=# ALTER PUBLICATION pub1 SET SCHEMA s1,s1;
> ERROR:  duplicate key value violates unique constraint "pg_publication_sch_psnspcid_pspubid_index"
> DETAIL:  Key (psnspcid, pspubid)=(16406, 16405) already exists.
>
> I think the error message is pretty hard to understand. Maybe we can do sth to improve this scenario.
>
> Here is two proposal:
> 1. Don't report error message, just add some code to make the above command to be executed successfully,
>    just like "ALTER PUBLICATION ... SET TABLE ..." as follolws:
>
> postgres=# ALTER PUBLICATION pub2 SET TABLE t1,t1;
> ALTER PUBLICATION
> postgres=# \dRp+ pub2
>                                    Publication pub2
>   Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root | Pubtype
> ----------+------------+---------+---------+---------+-----------+----------+---------
>  postgres | f          | t       | t       | t       | t         | f        | t
> Tables:
>     "public.t1"
>
> 2. Report a easily understandable message like: Schema s1 specified more than once
>

I have changed it to not report any error, this issue is fixed in the
v14 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.gmail.com

Regards,
Vignesh

RE: Added schema level support for publication.

From

"tanghy.fnst@fujitsu.com"

Date:

26 July 2021, 02:11:29

On Friday, July 23, 2021 8:18 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> I have changed it to not report any error, this issue is fixed in the
> v14 patch attached at [1].
> [1] - https://www.postgresql.org/message-
> id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.g
> mail.com
> 

Thanks for your new patch. But there's a conflict when apply patch v14 on the latest HEAD (it seems caused by commit
#678f5448c2d869),please rebase it.

Besides, I tested your patch on old HEAD and confirmed that the issue I reported was fixed.

Regards
Tang

Re: Added schema level support for publication.

From

Greg Nancarrow

Date:

26 July 2021, 02:47:50

On Fri, Jul 23, 2021 at 10:16 PM vignesh C <vignesh21@gmail.com> wrote:
>
> I have changed it to pg_pubication_schema as earlier as both you and Houzj San felt pg_publication_schema was better.
Thisis fixed in the v14 patch attached at [1].

> [1] - https://www.postgresql.org/message-id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.gmail.com
>

Thanks, I think it looks better.

On another issue, there seems to be problems with pg_dump support for
FOR SCHEMA publications.

For example:

   CREATE PUBLICATION p FOR SCHEMA myschema;

pg_dump is dumping:

   CREATE PUBLICATION p WITH (publish = 'insert, update, delete, truncate');
   ...
   ALTER PUBLICATION p ADD SCHEMA p;

Obviously that last line should instead be "ALTER PUBLICATION p ADD
SCHEMA myschema;"

I think the bug is caused by the following:

+ appendPQExpBuffer(query,
+   "ALTER PUBLICATION %s ADD SCHEMA %s;\n",
+   fmtId(pubsinfo->pubname), fmtId(schemainfo->dobj.name));

The comment for fmtId() says:

    *  Note that the returned string must be used before calling fmtId again,
    *  since we re-use the same return buffer each time.

So I think there was a reason why the ALTER PUBLICATION and ADD SCHEMA
were previously formatted separately (and so should NOT have been
combined).
It should be restored to how it was in the previous patch version.

Regards,
Greg Nancarrow
Fujitsu Australia

Re: Added schema level support for publication.

From

vignesh C

Date:

26 July 2021, 05:21:38

On Mon, Jul 26, 2021 at 8:17 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Jul 23, 2021 at 10:16 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have changed it to pg_pubication_schema as earlier as both you and Houzj San felt pg_publication_schema was
better.This is fixed in the v14 patch attached at [1].
 
> > [1] - https://www.postgresql.org/message-id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.gmail.com
> >
>
> Thanks, I think it looks better.
>
> On another issue, there seems to be problems with pg_dump support for
> FOR SCHEMA publications.
>
> For example:
>
>    CREATE PUBLICATION p FOR SCHEMA myschema;
>
> pg_dump is dumping:
>
>    CREATE PUBLICATION p WITH (publish = 'insert, update, delete, truncate');
>    ...
>    ALTER PUBLICATION p ADD SCHEMA p;
>
> Obviously that last line should instead be "ALTER PUBLICATION p ADD
> SCHEMA myschema;"
>
> I think the bug is caused by the following:
>
> + appendPQExpBuffer(query,
> +   "ALTER PUBLICATION %s ADD SCHEMA %s;\n",
> +   fmtId(pubsinfo->pubname), fmtId(schemainfo->dobj.name));
>
> The comment for fmtId() says:
>
>     *  Note that the returned string must be used before calling fmtId again,
>     *  since we re-use the same return buffer each time.
>
> So I think there was a reason why the ALTER PUBLICATION and ADD SCHEMA
> were previously formatted separately (and so should NOT have been
> combined).
> It should be restored to how it was in the previous patch version.

Thanks for the comment, this is modified in the v15 patch attached.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From

vignesh C

Date:

26 July 2021, 05:25:06

On Mon, Jul 26, 2021 at 7:41 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:
>
> On Friday, July 23, 2021 8:18 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have changed it to not report any error, this issue is fixed in the
> > v14 patch attached at [1].
> > [1] - https://www.postgresql.org/message-
> > id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.g
> > mail.com
> >
>
> Thanks for your new patch. But there's a conflict when apply patch v14 on the latest HEAD (it seems caused by commit #678f5448c2d869), please rebase it.
>

Thanks for reporting it, it is rebased at the v15 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm27LRWF9ney%3DcVeD-0jc2%2BJ5Y0wNQhighZB%3DAat4VbNBA%40mail.gmail.com

Regards,

Vignesh

Re: Added schema level support for publication.

From

Greg Nancarrow

Date:

26 July 2021, 23:41:46

On Mon, Jul 26, 2021 at 3:21 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for the comment, this is modified in the v15 patch attached.
>

I have several minor review comments.

(1) src/backend/catalog/objectaddress.c
Should start comment sentences with an uppercase letter, for consistency.

+ /* fetch publication name and schema oid from input list */

I also notice that some 1-sentence comments end with "." (full-stop)
and others don't. It seems to alternate all over the place, and so is
quite noticeable.
Unfortunately, it already seems to be like this in much of the code
that this patch patches.
Ideally (at least my personal preference is) 1-sentence comments
should not end with a ".".

(2) src/backend/catalog/pg_publication.c
errdetail message

I think the following should say "Temporary schemas ..." (since the
existing error message for tables says "System tables cannot be added
to publications.").

+   errdetail("Temporary schema cannot be added to publications.")));

(3) src/backend/commands/publicationcmds.c
PublicationAddTables

I think that the Assert below is not correct (i.e. not restrictive enough).
Although the condition passes, it is allowing, for example,
stmt->for_all_tables==true if stmt->schemas==NIL, and that doesn't
seem to be correct.
I suggest the following change:

BEFORE:
+ Assert(!stmt || !stmt->for_all_tables || !stmt->schemas);
AFTER:
+ Assert(!stmt || (!stmt->for_all_tables && !stmt->schemas));

(4) src/backend/commands/publicationcmds.c
PublicationAddSchemas

Similarly, I think that the Assert below is not restrictive enough,
and think it should be changed:

BEFORE:
+ Assert(!stmt || !stmt->for_all_tables || !stmt->tables);
AFTER:
+ Assert(!stmt || (!stmt->for_all_tables && !stmt->tables));

(5) src/bin/pg_dump/common.c

Spelling mistake.

BEFORE:
+ pg_log_info("reading publciation schemas");
AFTER:
+ pg_log_info("reading publication schemas");

Regards,
Greg Nancarrow
Fujitsu Australia

RE: Added schema level support for publication.

From

"tanghy.fnst@fujitsu.com"

Date:

27 July 2021, 06:51:22

On Monday, July 26, 2021 1:25 PM vignesh C <vignesh21@gmail.com> wrote:

> On Mon, Jul 26, 2021 at 7:41 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:

> >

> > On Friday, July 23, 2021 8:18 PM vignesh C <vignesh21@gmail.com> wrote:

> > >

> > > I have changed it to not report any error, this issue is fixed in the

> > > v14 patch attached at [1].

> > > [1] - https://www.postgresql.org/message-

> > > id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.g

> > > mail.com

> > >

> >

> > Thanks for your new patch. But there's a conflict when apply patch v14 on the latest HEAD (it seems caused by commit #678f5448c2d869), please rebase it.

> >

> Thanks for reporting it, it is rebased at the v15 patch attached at [1].

> [1] - https://www.postgresql.org/message-id/CALDaNm27LRWF9ney%3DcVeD-0jc2%2BJ5Y0wNQhighZB%3DAat4VbNBA%40mail.gmail.com

Thanks for your new patch. I applied your patch and it succeeded.

Here are some comments on the tests in your patch. The attached file included changes about the comments, please have a look.

1. src/test/regress/sql/publication.sql

There are some existing tests to verify that we can't add table to ‘for all tables publication', should we add some tests about adding schema to ‘for all tables publication’ / ‘for table publication’?

I added some cases in the attached file.

Besides, the following existing comment seems not suitable. It uses 'SET' but the comment says 'add'. And since we can add table or schema, I think we should point out what we add is table, thoughts?

-- fail - can't add to for all tables publication

ALTER PUBLICATION testpub_foralltables SET TABLE pub_test.testpub_nopk;

2. src/test/subscription/t/001_rep_changes.pl

2.1

+# Insert some data into few tables and verify that inserted data is replicated.

+$node_publisher->safe_psql('postgres', "INSERT INTO sch1.tab1 VALUES(generate_series(11,20))");

+$node_publisher->safe_psql('postgres', "INSERT INTO sch2.tab1 VALUES(generate_series(11,20))");

+$node_publisher->wait_for_catchup('tap_sub_schema');

There are two spaces between "INTO" and "sch1.tab1".

2.2

Most of publication names are lowercase, and some of them are uppercase. I think it will be better if all of them are lowercase.

I modified them in the attached file.

2.3

+# Verify that the subscription relation list is updated after refresh.

+$result = $node_subscriber->safe_psql('postgres',

+ "SELECT count(*) FROM pg_subscription_rel WHERE srsubid IN (SELECT oid FROM pg_subscription WHERE subname = 'tap_sub_schema')");

+is($result, qq(5),

+ 'check subscription relation status was dropped on subscriber');

Should it be 'check subscription relation status is not yet dropped on subscriber' here?

2.4

+# Drop publications as we don't need them anymore

+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_schema");

There is only one publication, so the comment here should be 'Drop publication as we don't need it anymore'.

There are some existing test cases about publication for table and publication for all tables in 002_pg_dump.pl, so I think we could add some test cases about publication for schema.

I tried to add some cases in the attached file.

Regards

Tang

Attachment

test_patch

Re: Added schema level support for publication.

From

vignesh C

Date:

28 July 2021, 10:44:23

On Tue, Jul 27, 2021 at 12:21 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote
>
> Thanks for your new patch. I applied your patch and it succeeded.
>
> Here are some comments on the tests in your patch. The attached file included changes about the comments, please have
alook. 
>
>
>
> 1.  src/test/regress/sql/publication.sql
>
> There are some existing tests to verify that we can't add table to ‘for all tables publication', should we add some
testsabout adding schema to ‘for all tables publication’ / ‘for table publication’? 
>
> I added some cases in the attached file.

I have taken the changes.

>
> Besides, the following existing comment seems not suitable. It uses 'SET' but the comment says 'add'. And since we
canadd table or schema, I think we should point out what we add is table, thoughts? 
>
> -- fail - can't add to for all tables publication
>
> ALTER PUBLICATION testpub_foralltables SET TABLE pub_test.testpub_nopk;
>

Since this is in base code, you might want to post a patch on a separate thread.

> 2. src/test/subscription/t/001_rep_changes.pl
>
> 2.1
>
> +# Insert some data into few tables and verify that inserted data is replicated.
>
> +$node_publisher->safe_psql('postgres', "INSERT INTO  sch1.tab1 VALUES(generate_series(11,20))");
>
> +$node_publisher->safe_psql('postgres', "INSERT INTO  sch2.tab1 VALUES(generate_series(11,20))");
>
> +
>
> +$node_publisher->wait_for_catchup('tap_sub_schema');
>
>
>
> There are two spaces between "INTO" and "sch1.tab1".

I have taken the changes.

> 2.2
>
> Most of publication names are lowercase, and some of them are uppercase. I think it will be better if all of them are
lowercase.
>
> I modified them in the attached file.

I have taken the changes.

> 2.3
>
> +# Verify that the subscription relation list is updated after refresh.
>
> +$result = $node_subscriber->safe_psql('postgres',
>
> +        "SELECT count(*) FROM pg_subscription_rel WHERE srsubid IN (SELECT oid FROM pg_subscription WHERE subname =
'tap_sub_schema')");
>
> +is($result, qq(5),
>
> +        'check subscription relation status was dropped on subscriber');
>
>
>
> Should it be 'check subscription relation status is not yet dropped on subscriber' here?

I have taken the changes.

> 2.4
>
> +# Drop publications as we don't need them anymore
>
> +$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_schema");
>
>
>
> There is only one publication, so the comment here should be 'Drop publication as we don't need it anymore'.

I have taken the changes.

> 3.
>
> There are some existing test cases about publication for table and publication for all tables in 002_pg_dump.pl, so I
thinkwe could add some test cases about publication for schema. 
>
> I tried to add some cases in the attached file.

Thanks for the patch, I have merged the changes. Attached v16 patch
has the fixes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From

vignesh C

Date:

28 July 2021, 10:52:45

On Tue, Jul 27, 2021 at 5:11 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Jul 26, 2021 at 3:21 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comment, this is modified in the v15 patch attached.
> >
>
> I have several minor review comments.
>
> (1) src/backend/catalog/objectaddress.c
> Should start comment sentences with an uppercase letter, for consistency.
>
> + /* fetch publication name and schema oid from input list */
>
> I also notice that some 1-sentence comments end with "." (full-stop)
> and others don't. It seems to alternate all over the place, and so is
> quite noticeable.
> Unfortunately, it already seems to be like this in much of the code
> that this patch patches.
> Ideally (at least my personal preference is) 1-sentence comments
> should not end with a ".".

Modified.

> (2) src/backend/catalog/pg_publication.c
> errdetail message
>
> I think the following should say "Temporary schemas ..." (since the
> existing error message for tables says "System tables cannot be added
> to publications.").
>
> + errdetail("Temporary schema cannot be added to publications.")));
>

Modified.

> (3) src/backend/commands/publicationcmds.c
> PublicationAddTables
>
> I think that the Assert below is not correct (i.e. not restrictive enough).
> Although the condition passes, it is allowing, for example,
> stmt->for_all_tables==true if stmt->schemas==NIL, and that doesn't
> seem to be correct.
> I suggest the following change:
>
> BEFORE:
> + Assert(!stmt || !stmt->for_all_tables || !stmt->schemas);
> AFTER:
> + Assert(!stmt || (!stmt->for_all_tables && !stmt->schemas));
>

Modified.

> (4) src/backend/commands/publicationcmds.c
> PublicationAddSchemas
>
> Similarly, I think that the Assert below is not restrictive enough,
> and think it should be changed:
>
> BEFORE:
> + Assert(!stmt || !stmt->for_all_tables || !stmt->tables);
> AFTER:
> + Assert(!stmt || (!stmt->for_all_tables && !stmt->tables));
>

Modified.

> (5) src/bin/pg_dump/common.c
>
> Spelling mistake.
>
> BEFORE:
> + pg_log_info("reading publciation schemas");
> AFTER:
> + pg_log_info("reading publication schemas");

Modified.

Thanks for the comments, the comments are fixed in the v16 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm2LgV5XcLF80rJ60NwnjKpZj%3D%3DLxJpO4W2TG2G5XmUtDA%40mail.gmail.com

Regards,

Vignesh

RE: Added schema level support for publication.

From

"houzj.fnst@fujitsu.com"

Date:

30 July 2021, 06:42:43

On July 28, 2021 6:44 PM vignesh C <vignesh21@gmail.com> wrote:
> Thanks for the patch, I have merged the changes. Attached v16 patch has the
> fixes for the same.

Thanks for the new version patches.
Here are a few comments:

1)

+       /* Identify which schemas should be dropped */
+       foreach(oldlc, oldschemaids)
+       {
+           Oid         oldschemaid = lfirst_oid(oldlc);
+
+           if (!list_member_oid(schemaoidlist, oldschemaid))
+               delschemas = lappend_oid(delschemas, oldschemaid);
+       }
+

We can use list_difference here to simplify the code.

2)

+
+       /* Filter out duplicates if user specifies "sch1, sch1" */
+       if (list_member_oid(schemaoidlist, schemaoid))
+           continue;
+
+       schemaoidlist = lappend_oid(schemaoidlist, schemaoid);

It might be more concise to use list_append_unique_oid() here.

3)

+  <para>
+   Create a publication that publishes all changes for all the tables present in
+the schema "production":
+<programlisting>

The second line seems not aligned.
After:
+  <para>
+   Create a publication that publishes all changes for all the tables present in
+   the schema "production":
+<programlisting>

4)

+       resetPQExpBuffer(query);
+
+       /* Get the publication membership for the schema */
+       appendPQExpBuffer(query,
+                         "SELECT ps.psnspcid, ps.oid, p.pubname, p.oid AS pubid "
+                         "FROM pg_publication_schema ps, pg_publication p "
+                         "WHERE ps.psnspcid = '%u' "
+                         "AND p.oid = ps.pspubid",
+                         nsinfo->dobj.catId.oid);

It seems we can use printfPQExpBuffer() here which is a convenience routine
that does the same thing as resetPQExpBuffer() followed by appendPQExpBuffer().

Best regards,
Houzj

Re: Added schema level support for publication.

From

vignesh C

Date:

02 August 2021, 15:39:53

On Fri, Jul 30, 2021 at 12:12 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On July 28, 2021 6:44 PM vignesh C <vignesh21@gmail.com> wrote:
> > Thanks for the patch, I have merged the changes. Attached v16 patch has the
> > fixes for the same.
>
> Thanks for the new version patches.
> Here are a few comments:
>
> 1)
>
> +       /* Identify which schemas should be dropped */
> +       foreach(oldlc, oldschemaids)
> +       {
> +           Oid         oldschemaid = lfirst_oid(oldlc);
> +
> +           if (!list_member_oid(schemaoidlist, oldschemaid))
> +               delschemas = lappend_oid(delschemas, oldschemaid);
> +       }
> +
>
> We can use list_difference here to simplify the code.

Modified.

> 2)
>
> +
> +       /* Filter out duplicates if user specifies "sch1, sch1" */
> +       if (list_member_oid(schemaoidlist, schemaoid))
> +           continue;
> +
> +       schemaoidlist = lappend_oid(schemaoidlist, schemaoid);
>
> It might be more concise to use list_append_unique_oid() here.

Modified.

> 3)
>
> +  <para>
> +   Create a publication that publishes all changes for all the tables present in
> +the schema "production":
> +<programlisting>
>
> The second line seems not aligned.
> After:
> +  <para>
> +   Create a publication that publishes all changes for all the tables present in
> +   the schema "production":
> +<programlisting>

Modified.

> 4)
>
> +       resetPQExpBuffer(query);
> +
> +       /* Get the publication membership for the schema */
> +       appendPQExpBuffer(query,
> +                         "SELECT ps.psnspcid, ps.oid, p.pubname, p.oid AS pubid "
> +                         "FROM pg_publication_schema ps, pg_publication p "
> +                         "WHERE ps.psnspcid = '%u' "
> +                         "AND p.oid = ps.pspubid",
> +                         nsinfo->dobj.catId.oid);
>
> It seems we can use printfPQExpBuffer() here which is a convenience routine
> that does the same thing as resetPQExpBuffer() followed by appendPQExpBuffer().

Modified.

Thanks for the comments, attached v17 patches has the fixes for the same.

Regards,
Vignesh

On Tue, Aug 3, 2021 at 12:00 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Monday, August 2, 2021 11:40 PM vignesh C <vignesh21@gmail.com>wrote:
> >
> > Thanks for the comments, attached v17 patches has the fixes for the same.
>
> Thanks for your new patch.
>
> I saw the following warning when compiling. It seems we don't need this variable any more.
>
> publicationcmds.c: In function ‘AlterPublicationSchemas’:
> publicationcmds.c:592:15: warning: unused variable ‘oldlc’ [-Wunused-variable]
>    ListCell   *oldlc;
>                ^~~~~

Thanks for reporting this, this is fixed in the v18 patch attached.

Regards,
Vignesh

On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for reporting this, this is fixed in the v18 patch attached.
> >
>
> I have started looking into this patch and below are some initial comments.
>
> 1.
> + /* Fetch publication name and schema oid from input list */
> + schemaname = strVal(linitial(object));
> + pubname = strVal(lsecond(object));
>
> I think the comment should be: "Fetch schema name and publication name
> from input list"
>

Modified.

> 2.
> @@ -3902,6 +3958,46 @@ getObjectDescription(const ObjectAddress
> *object, bool missing_ok)
>   break;
>   }
>
> + case OCLASS_PUBLICATION_SCHEMA:
> + {
> + HeapTuple tup;
> + char    *pubname;
> + Form_pg_publication_schema psform;
> + char    *nspname;
> +
> + tup = SearchSysCache1(PUBLICATIONSCHEMA,
> +   ObjectIdGetDatum(object->objectId));
> + if (!HeapTupleIsValid(tup))
> + {
> + if (!missing_ok)
> + elog(ERROR, "cache lookup failed for publication schema %u",
> + object->objectId);
> + break;
> + }
> +
> + psform = (Form_pg_publication_schema) GETSTRUCT(tup);
> + pubname = get_publication_name(psform->pspubid, false);
> + nspname = get_namespace_name(psform->psnspcid);
> + if (!nspname)
> + {
> + Oid psnspcid = psform->psnspcid;
> +
> + pfree(pubname);
> + ReleaseSysCache(tup);
> + if (!missing_ok)
> + elog(ERROR, "cache lookup failed for schema %u",
> + psnspcid);
> + break;
> + }
>
> The above code in getObjectDescription looks quite similar to what you
> have in getObjectIdentityParts(). Can we extract the common code into
> a separate function?

Modified.

> 3. Can we use column name pubkind (similar to relkind in pg_class)
> instead of pubtype? If so, please change PUBTYPE_ALLTABLES and similar
> other defines to PUBKIND_*.
>

Modified.

> 4.
> @@ -3632,6 +3650,7 @@ typedef struct CreatePublicationStmt
>   List    *options; /* List of DefElem nodes */
>   List    *tables; /* Optional list of tables to add */
>   bool for_all_tables; /* Special publication for all tables in db */
> + List    *schemas; /* Optional list of schemas */
>  } CreatePublicationStmt;
>
> Isn't it better to keep a schemas list after tables?

Modified.

> 5.
> @@ -1163,12 +1168,27 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
>   Publication *pub = lfirst(lc);
>   bool publish = false;
>
> - if (pub->alltables)
> + if (pub->pubtype == PUBTYPE_ALLTABLES)
>   {
>   publish = true;
>   if (pub->pubviaroot && am_partition)
>   publish_as_relid = llast_oid(get_partition_ancestors(relid));
>   }
> + else if (pub->pubtype == PUBTYPE_SCHEMA)
> + {
> + Oid schemaId = get_rel_namespace(relid);
> + Oid psid = GetSysCacheOid2(PUBLICATIONSCHEMAMAP,
> +    Anum_pg_publication_schema_oid,
> +    ObjectIdGetDatum(schemaId),
> +    ObjectIdGetDatum(pub->oid));
> +
> + if (OidIsValid(psid))
> + {
> + publish = true;
> + if (pub->pubviaroot && am_partition)
> + publish_as_relid = llast_oid(get_partition_ancestors(relid));
> + }
> + }
>
> Isn't it better to get schema publications once as we get relation
> publications via GetRelationPublications and then decide whether to
> publish or not? I think that will save repeated cache searches for
> each publication requested by the subscriber?

Modified.

> 6.
> + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */
> + PublicationSchemaPsnspcidPspubidIndexId,
> + 2,
> + {
> + Anum_pg_publication_schema_psnspcid,
> + Anum_pg_publication_schema_pspubid,
> + 0,
> + 0
> + },
>
> Why don't we keep pubid as the first column in this index?

I wanted to keep it similar to PUBLICATIONRELMAP, should we keep it as
it is, thoughts?

> 7.
> getPublicationSchemas()
> {
> ..
> + /* Get the publication membership for the schema */
> + printfPQExpBuffer(query,
> +   "SELECT ps.psnspcid, ps.oid, p.pubname, p.oid AS pubid "
> +   "FROM pg_publication_schema ps, pg_publication p "
> +   "WHERE ps.psnspcid = '%u' "
> +   "AND p.oid = ps.pspubid",
> +   nsinfo->dobj.catId.oid);
> ..
> }
>
> Why do you need to use join here? Why the query and another handling
> in this function be similar to what we have in getPublicationTables?
> Also, there is another function GetPublicationSchemas() in the patch,
> can we name one of these differently for the purpose of easy
> grepability?

Modified it similar to getPublicationTables without joins. The
function is renamed to getPublicationNamespaces.

Thanks for the comments, the attached v19 patch has the fixes for the comments.

Regards,
Vignesh

On Fri, Aug 6, 2021 at 2:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Aug 4, 2021 at 12:08 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, Aug 3, 2021 at 12:00 PM tanghy.fnst@fujitsu.com
> > <tanghy.fnst@fujitsu.com> wrote:
> > >
> > > On Monday, August 2, 2021 11:40 PM vignesh C <vignesh21@gmail.com>wrote:
> > > >
> > > > Thanks for the comments, attached v17 patches has the fixes for the same.
> > >
> > > Thanks for your new patch.
> > >
> > > I saw the following warning when compiling. It seems we don't need this variable any more.
> > >
> > > publicationcmds.c: In function ‘AlterPublicationSchemas’:
> > > publicationcmds.c:592:15: warning: unused variable ‘oldlc’ [-Wunused-variable]
> > >    ListCell   *oldlc;
> > >                ^~~~~
> >
> > Thanks for reporting this, this is fixed in the v18 patch attached.
>
> I've also started reviewing this patch. I've not looked at the patch
> yet but here are initial comments/questions based on using this
> feature:
>
> pg_publication catalog still has puballtables column but it's still
> necessary? IIUC since pubtype = 'a' means publishing all tables in the
> database puballtables seems no longer necessary.

Removed puballtables.

> ---
> Suppose that a parent table and its child table are defined in
> different schemas, there is a publication for the schema where only
> the parent table is defined, and the subscriber subscribes to the
> publication, should changes for its child table be replicated to the
> subscriber?
>
> In FOR TABLE cases, i.g., where the subscriber subscribes to the
> publication that is only for the parent table, changes for its child
> table are replicated to the subscriber.

Modified it to keep the behavior similar to FOR Table publication.

> As far as I tested v18 patch, changes for the child table are not
> replicated in FOR SCHEMA cases. Here is the test script:
>
> On publisher and subscriber:
> create schema p_schema;
> create schema c_schema;
> create table p_schema.p (a int) partition by list (a);
> create table c_schema.c partition of p_schema.p for values in (1);
>
> On publisher:
> create publication pub_p_schema for schema p_schema;
>
> On subscriber:
> create subscription pub connection 'dbname=postgres' publication pub_p_schema;
>
> On publisher:
> insert into p_schema.p values (1);
> select * from p_schema.p;
>  a
> ---
>  1
> (1 row)
>
> On subscriber:
> select * from p_schema.p;
>  a
> ---
>
> (0 rows)

Modified to handle this.
Thanks for the comments, the attached v20 patch handles the above issues.

Regards,
Vignesh

Re: Added schema level support for publication.

From

vignesh C

Date:

14 August 2021, 12:21:50

On Mon, Aug 9, 2021 at 10:23 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Aug 8, 2021 at 2:52 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Fri, Aug 6, 2021 at 4:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Aug 6, 2021 at 2:02 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > > 6.
> > > > > + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */
> > > > > + PublicationSchemaPsnspcidPspubidIndexId,
> > > > > + 2,
> > > > > + {
> > > > > + Anum_pg_publication_schema_psnspcid,
> > > > > + Anum_pg_publication_schema_pspubid,
> > > > > + 0,
> > > > > + 0
> > > > > + },
> > > > >
> > > > > Why don't we keep pubid as the first column in this index?
> > > >
> > > > I wanted to keep it similar to PUBLICATIONRELMAP, should we keep it as
> > > > it is, thoughts?
> > > >
> > >
> > > Okay, I see your point. I think for PUBLICATIONRELMAP, we need it
> > > because it is searched using the only relid in
> > > GetRelationPublications, so, similarly, in the patch, you are using
> > > schema_oid in GetSchemaPublications, so probably that will help. I was
> > > wondering why you haven't directly used the cache in
> > > GetSchemaPublications similar to GetRelationPublications?
> >
> > Both of the approaches work, I was not sure which one is better, If
> > the approach in GetRelationPublications is better, I will change it to
> > something similar to GetRelationPublications. Thoughts?
> >
>
> I think it is better to use the cache as if we don't find the entry in
> the cache, then we will anyway search the required entry via sys table
> scan, see SearchCatCacheList.

Modified. This is handled in the v20 patch posted at [1].

I think the point I wanted to ensure was
> that a concurrent session won't blow up the entry while we are looking
> at it. How is that ensured?

The concurrency points occur at two places, Walsender session and user session:
For Walsender process when we query the data from the cache we will get the results based on historic snapshot. I also debugged and verified that we get the results based on historic snapshot, if we check the cache during our operation (insert is just before drop) we will be able to get the dropped record from the cache as this drop occurred after our insert. And if we query the cache after the drop, we will not get the dropped information from the cache. So I feel our existing code is good enough which handles the concurrency through the historic snapshot.
For user sessions, user session checks for replica identity for update/delete operations. To prevent concurrency issues, when schema is added to the publication, the rel cache invalidation happens in publication_add_schema by calling InvalidatePublicationRels, similarly when a schema is dropped from the publication, the rel cache invalidation is handled in RemovePublicationSchemaById by calling InvalidatePublicationRels. Once the invalidation happens it will check the system tables again before deciding. I felt this rel cache invalidation will prevent concurrency issues.

[1] - https://www.postgresql.org/message-id/CALDaNm00X9SQBokUTy1OxN1Sa2DFsK8rg8j_wLgc-7ZuKcuh0Q%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From

vignesh C

Date:

14 August 2021, 12:24:48

On Fri, Aug 6, 2021 at 4:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 6, 2021 at 2:16 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Thu, Aug 5, 2021 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Few more comments:
> > > ===================
> > > 1. Do we need the previous column 'puballtables' after adding pubtype
> > > or pubkind in pg_publication?

Removed puballtables, this is handled as part of v20 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm00X9SQBokUTy1OxN1Sa2DFsK8rg8j_wLgc-7ZuKcuh0Q%40mail.gmail.com

Regards,

Vignesh

Re: Added schema level support for publication.

From

Tom Lane

Date:

14 August 2021, 15:23:00

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> I think the strict separation between publication-for-tables vs. 
> publication-for-schemas is a mistake.  Why can't I have a publication 
> that publishes tables t1, t2, t3, *and* schemas s1, s2, s3.  Also note 
> that we have a pending patch to add sequences support to logical 
> replication.  So eventually, a publication will be able to contain a 
> bunch of different objects of different kinds.

This seems like it's going to create a mess, because the meaning of
"include schema S" will change over time as we add more features.
That is, with the present patch (I suppose, haven't read it) we have
"schema S" meaning "publish all tables in schema S".  When that other
patch lands, presumably that same publication definition would start
meaning "publish all tables and sequences in schema S".  And a few years
down the road it might start meaning something else again.  That sounds
like the sort of potentially app-breaking change that we don't like
to make.

We could avoid that bug-in-waiting if the syntax were more like
"FOR ALL TABLES IN SCHEMA s", which could later extend to
"FOR ALL SEQUENCES IN SCHEMA s", etc.  This is then a very clean
intermediate step between publishing one table and "FOR ALL TABLES"
without a schema limitation.

I tend to agree that a single publication should be able to incorporate
any of these options.

            regards, tom lane

Re: Added schema level support for publication.

From

Masahiko Sawada

Date:

15 August 2021, 20:34:44

On Fri, Aug 13, 2021 at 11:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Aug 12, 2021 at 5:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Aug 6, 2021 at 5:33 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> >
> > ---
> > Even if we drop all tables added to the publication from it, 'pubkind'
> > doesn't go back to 'empty'. Is that intentional behavior? If we do
> > that, we can save the lookup of pg_publication_rel and
> > pg_publication_schema in some cases, and we can switch the publication
> > that was created as FOR SCHEMA to FOR TABLE and vice versa.
> >
>
> Do we really want to allow users to change a publication that is FOR
> SCHEMA to FOR TABLE? I see that we don't allow to do that FOR TABLES.
> postgres=# Alter Publication pub add table tbl1;
> ERROR:  publication "pub" is defined as FOR ALL TABLES
> DETAIL:  Tables cannot be added to or dropped from FOR ALL TABLES publications.

When it comes to FOR ALL TABLES, we can neither add/drop tables
to/from the publication. So it makes sense to me that it never changes
to 'empty'. I'm not sure there are use cases in practice where to
change FOR SCHEMA to FOR TABLE. But it seems a bit weird to me that
pubkind doesn't change to "empty" even if the publication actually
gets empty. It might be just that "empty" is misleading, though.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Added schema level support for publication.

From

Masahiko Sawada

Date:

15 August 2021, 21:04:39

On Sun, Aug 15, 2021 at 12:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> > I think the strict separation between publication-for-tables vs.
> > publication-for-schemas is a mistake.  Why can't I have a publication
> > that publishes tables t1, t2, t3, *and* schemas s1, s2, s3.  Also note
> > that we have a pending patch to add sequences support to logical
> > replication.  So eventually, a publication will be able to contain a
> > bunch of different objects of different kinds.
>
> This seems like it's going to create a mess, because the meaning of
> "include schema S" will change over time as we add more features.
> That is, with the present patch (I suppose, haven't read it) we have
> "schema S" meaning "publish all tables in schema S".  When that other
> patch lands, presumably that same publication definition would start
> meaning "publish all tables and sequences in schema S".  And a few years
> down the road it might start meaning something else again.  That sounds
> like the sort of potentially app-breaking change that we don't like
> to make.
>
> We could avoid that bug-in-waiting if the syntax were more like
> "FOR ALL TABLES IN SCHEMA s", which could later extend to
> "FOR ALL SEQUENCES IN SCHEMA s", etc.  This is then a very clean
> intermediate step between publishing one table and "FOR ALL TABLES"
> without a schema limitation.

+1

> I tend to agree that a single publication should be able to incorporate
> any of these options.

I personally prefer that a single publication can include both all
tables and all sequences in a database or a schema. It would be a more
convenient way to specify replicating all objects (tables and
sequences) in a database or a schema.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Added schema level support for publication.

From

Peter Smith

Date:

16 August 2021, 06:00:24

On Sun, Aug 15, 2021 at 1:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> > I think the strict separation between publication-for-tables vs.
> > publication-for-schemas is a mistake.  Why can't I have a publication
> > that publishes tables t1, t2, t3, *and* schemas s1, s2, s3.  Also note
> > that we have a pending patch to add sequences support to logical
> > replication.  So eventually, a publication will be able to contain a
> > bunch of different objects of different kinds.
>
> This seems like it's going to create a mess, because the meaning of
> "include schema S" will change over time as we add more features.
> That is, with the present patch (I suppose, haven't read it) we have
> "schema S" meaning "publish all tables in schema S".  When that other
> patch lands, presumably that same publication definition would start
> meaning "publish all tables and sequences in schema S".  And a few years
> down the road it might start meaning something else again.  That sounds
> like the sort of potentially app-breaking change that we don't like
> to make.
>
> We could avoid that bug-in-waiting if the syntax were more like
> "FOR ALL TABLES IN SCHEMA s", which could later extend to
> "FOR ALL SEQUENCES IN SCHEMA s", etc.  This is then a very clean
> intermediate step between publishing one table and "FOR ALL TABLES"
> without a schema limitation.
>
> I tend to agree that a single publication should be able to incorporate
> any of these options.
>

How about if the improved syntax from Tom Lane [1] also allowed an
"AND" keyword for combining whatever you wish?

Then the question from Peter E. [2] "Why can't I have a publication
that publishes tables t1, t2, t3, *and* schemas s1, s2, s3." would
have an intuitive solution like:

CREATE PUBLICATION pub1
FOR TABLE t1,t2,t3 AND
FOR ALL TABLES IN SCHEMA s1,s2,s3;

------
[1] https://www.postgresql.org/message-id/155565.1628954580%40sss.pgh.pa.us
[2] https://www.postgresql.org/message-id/4fb39707-dca9-1563-4482-b7a8315c36ca%40enterprisedb.com

Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Added schema level support for publication.

From

Tom Lane

Date:

16 August 2021, 13:31:18

Peter Smith <smithpb2250@gmail.com> writes:
> Then the question from Peter E. [2] "Why can't I have a publication
> that publishes tables t1, t2, t3, *and* schemas s1, s2, s3." would
> have an intuitive solution like:

> CREATE PUBLICATION pub1
> FOR TABLE t1,t2,t3 AND
> FOR ALL TABLES IN SCHEMA s1,s2,s3;

That seems a bit awkward, since the existing precedent is
to use commas.  We shouldn't need more than one FOR noise-word,
either.  So I was imagining syntax more like, say,

    CREATE PUBLICATION pub1 FOR
      TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2,
          SEQUENCE seq1,seq2, ALL SEQUENCES IN SCHEMA s3,s4;

Abstractly it'd be

createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]

cpitem := ALL TABLES |
          TABLE name |
          ALL TABLES IN SCHEMA name |
          ALL SEQUENCES |
          SEQUENCE name |
          ALL SEQUENCES IN SCHEMA name |
          name

The grammar output would need some post-analysis to attribute the
right type to bare "name" items, but that doesn't seem difficult.

            regards, tom lane

Re: Added schema level support for publication.

From

Peter Smith

Date:

17 August 2021, 01:09:54

On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Peter Smith <smithpb2250@gmail.com> writes:
> > Then the question from Peter E. [2] "Why can't I have a publication
> > that publishes tables t1, t2, t3, *and* schemas s1, s2, s3." would
> > have an intuitive solution like:
>
> > CREATE PUBLICATION pub1
> > FOR TABLE t1,t2,t3 AND
> > FOR ALL TABLES IN SCHEMA s1,s2,s3;
>
> That seems a bit awkward, since the existing precedent is
> to use commas.  We shouldn't need more than one FOR noise-word,
> either.  So I was imagining syntax more like, say,

When I wrote that "AND" suggestion I had in mind that commas may get
weird if there were objects with keyword names. e.g. if there was a
schema called SEQUENCE and a SEQUENCE called  SEQUENCE then this will
be allowed.

CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA SEQUENCE, SEQUENCE SEQUENCE;

But probably I was just overthinking it.

>
>         CREATE PUBLICATION pub1 FOR
>           TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2,
>           SEQUENCE seq1,seq2, ALL SEQUENCES IN SCHEMA s3,s4;
>
> Abstractly it'd be
>
> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]
>
> cpitem := ALL TABLES |
>           TABLE name |
>           ALL TABLES IN SCHEMA name |
>           ALL SEQUENCES |
>           SEQUENCE name |
>           ALL SEQUENCES IN SCHEMA name |
>           name
>
> The grammar output would need some post-analysis to attribute the
> right type to bare "name" items, but that doesn't seem difficult.

That last bare "name" cpitem. looks like it would permit the following syntax:

CREATE PUBLICATION pub FOR a,b,c;

Was that intentional?

------
Kind Regards,
Peter Smith.
Fujitsu Australia.

Re: Added schema level support for publication.

From

Amit Kapila

Date:

17 August 2021, 10:08:44

On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> >
> >         CREATE PUBLICATION pub1 FOR
> >           TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2,
> >           SEQUENCE seq1,seq2, ALL SEQUENCES IN SCHEMA s3,s4;
> >
> > Abstractly it'd be
> >
> > createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]
> >
> > cpitem := ALL TABLES |
> >           TABLE name |
> >           ALL TABLES IN SCHEMA name |
> >           ALL SEQUENCES |
> >           SEQUENCE name |
> >           ALL SEQUENCES IN SCHEMA name |
> >           name
> >
> > The grammar output would need some post-analysis to attribute the
> > right type to bare "name" items, but that doesn't seem difficult.
>
> That last bare "name" cpitem. looks like it would permit the following syntax:
>
> CREATE PUBLICATION pub FOR a,b,c;
>
> Was that intentional?
>

I think so. IIUC, the idea is that after parsing we find out whether
the given name is table, sequence, or any other object. Here, I think
the name could be either of table or sequence because, for schema, we
won't be knowing whether to include tables, sequences, or both in the
schema. Also, we can have the same name for schema and table so it
might be tricky to distinguish among those unless we give priority to
one of those.

-- 
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

Tom Lane

Date:

17 August 2021, 13:25:30

Amit Kapila <amit.kapila16@gmail.com> writes:
> On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
>> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Abstractly it'd be
>>>
>>> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]
>>>
>>> cpitem := ALL TABLES |
>>>     TABLE name |
>>>     ALL TABLES IN SCHEMA name |
>>>     ALL SEQUENCES |
>>>     SEQUENCE name |
>>>     ALL SEQUENCES IN SCHEMA name |
>>>     name
>>>
>>> The grammar output would need some post-analysis to attribute the
>>> right type to bare "name" items, but that doesn't seem difficult.

>> That last bare "name" cpitem. looks like it would permit the following syntax:
>> CREATE PUBLICATION pub FOR a,b,c;
>> Was that intentional?

> I think so.

I had supposed that we could throw an error at the post-processing stage,
or alternatively default to assuming that such names are tables.

Now you could instead make the grammar work like

cpitem := ALL TABLES |
      TABLE name [, ...] |
      ALL TABLES IN SCHEMA name [, ...] |
      ALL SEQUENCES |
      SEQUENCE name [, ...] |
      ALL SEQUENCES IN SCHEMA name [, ...]

which would result in a two-level-list data structure.  I'm not sure
that this is better, as any sort of mistake would result in a very
uninformative generic "syntax error" from Bison.  Errors out of a
post-processing stage could be more specific than that.

(Perhaps, though, we should *document* it like the latter way,
even if the actual implementation is more like the first way.)

            regards, tom lane

Re: Added schema level support for publication.

From

vignesh C

Date:

23 August 2021, 14:16:41

On Tue, Aug 17, 2021 at 6:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> Abstractly it'd be
> >>>
> >>> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]
> >>>
> >>> cpitem := ALL TABLES |
> >>> TABLE name |
> >>> ALL TABLES IN SCHEMA name |
> >>> ALL SEQUENCES |
> >>> SEQUENCE name |
> >>> ALL SEQUENCES IN SCHEMA name |
> >>> name
> >>>
> >>> The grammar output would need some post-analysis to attribute the
> >>> right type to bare "name" items, but that doesn't seem difficult.
>
> >> That last bare "name" cpitem. looks like it would permit the following syntax:
> >> CREATE PUBLICATION pub FOR a,b,c;
> >> Was that intentional?
>
> > I think so.
>
> I had supposed that we could throw an error at the post-processing stage,
> or alternatively default to assuming that such names are tables.
>
> Now you could instead make the grammar work like
>
> cpitem := ALL TABLES |
> TABLE name [, ...] |
> ALL TABLES IN SCHEMA name [, ...] |
> ALL SEQUENCES |
> SEQUENCE name [, ...] |
> ALL SEQUENCES IN SCHEMA name [, ...]
>
> which would result in a two-level-list data structure. I'm not sure
> that this is better, as any sort of mistake would result in a very
> uninformative generic "syntax error" from Bison. Errors out of a
> post-processing stage could be more specific than that.

I preferred the implementation in the lines Tom Lane had proposed at [1]. Is it ok if the implementation is something like below:
CreatePublicationStmt:
CREATE PUBLICATION name FOR pub_obj_list opt_definition
{
CreatePublicationStmt *n = makeNode(CreatePublicationStmt);
n->pubname = $3;
n->options = $6;
n->pubobjects = (List *)$5;
$$ = (Node *)n;
}
;
pub_obj_list: PublicationObjSpec
{ $$ = list_make1($1); }
| pub_obj_list ',' PublicationObjSpec
{ $$ = lappend($1, $3); }
;
/* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */
PublicationObjSpec: TABLE pubobj_expr
{ ....}
| ALL TABLES IN_P SCHEMA pubobj_expr
{ ....}
| pubobj_expr
{ ....}
;
pubobj_expr:
any_name
{ ....}
| any_name '*'
{ ....}
| ONLY any_name
{ ....}
| ONLY '(' any_name ')'
{ ....}
| CURRENT_SCHEMA
{ ....}
;

I needed pubobj_expr to support the existing syntaxes supported by relation_expr and also to handle CURRENT_SCHEMA support in case of the "FOR ALL TABLES IN SCHEMA" feature. I changed the name to any_name to support objects like "sch1.t1".
I felt if a user specified "FOR ALL TABLES", the user should not be allowed to combine it with "FOR TABLE" and "FOR ALL TABLES IN SCHEMA" as "FOR ALL TABLES" anyway will include all the tables.
Should we support the similar syntax in case of alter publication, like "ALTER PUBLICATION pub1 ADD TABLE t1,t2, ALL TABLES IN SCHEMA sch1, sch2" or shall we keep these separate like "ALTER PUBLICATION pub1 ADD TABLE t1, t2" and "ALTER PUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1, sch2". I preferred to keep it separate as we have kept ADD/DROP separately which cannot be combined currently. I have not mentioned SEQUENCE handling separately, the sequence will be extended in similar lines. I thought of throwing an error at post processing, the first option that Tom Lane had suggested if the user specified syntax like: create publication pub1 for t1,t2;
Thoughts?

[1] - https://www.postgresql.org/message-id/877603.1629120678%40sss.pgh.pa.us

Regards,

Vignesh

Re: Added schema level support for publication.

From

Masahiko Sawada

Date:

25 August 2021, 07:48:59

On Mon, Aug 23, 2021 at 11:16 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, Aug 17, 2021 at 6:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > >> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >>> Abstractly it'd be
> > >>>
> > >>> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]
> > >>>
> > >>> cpitem := ALL TABLES |
> > >>>     TABLE name |
> > >>>     ALL TABLES IN SCHEMA name |
> > >>>     ALL SEQUENCES |
> > >>>     SEQUENCE name |
> > >>>     ALL SEQUENCES IN SCHEMA name |
> > >>>     name
> > >>>
> > >>> The grammar output would need some post-analysis to attribute the
> > >>> right type to bare "name" items, but that doesn't seem difficult.
> >
> > >> That last bare "name" cpitem. looks like it would permit the following syntax:
> > >> CREATE PUBLICATION pub FOR a,b,c;
> > >> Was that intentional?
> >
> > > I think so.
> >
> > I had supposed that we could throw an error at the post-processing stage,
> > or alternatively default to assuming that such names are tables.
> >
> > Now you could instead make the grammar work like
> >
> > cpitem := ALL TABLES |
> >           TABLE name [, ...] |
> >           ALL TABLES IN SCHEMA name [, ...] |
> >           ALL SEQUENCES |
> >           SEQUENCE name [, ...] |
> >           ALL SEQUENCES IN SCHEMA name [, ...]
> >
> > which would result in a two-level-list data structure.  I'm not sure
> > that this is better, as any sort of mistake would result in a very
> > uninformative generic "syntax error" from Bison.  Errors out of a
> > post-processing stage could be more specific than that.
>
> I preferred the implementation in the lines Tom Lane had proposed at [1]. Is it ok if the implementation is something
likebelow: 
> CreatePublicationStmt:
> CREATE PUBLICATION name FOR pub_obj_list opt_definition
> {
> CreatePublicationStmt *n = makeNode(CreatePublicationStmt);
> n->pubname = $3;
> n->options = $6;
> n->pubobjects = (List *)$5;
> $$ = (Node *)n;
> }
> ;
> pub_obj_list: PublicationObjSpec
> { $$ = list_make1($1); }
> | pub_obj_list ',' PublicationObjSpec
> { $$ = lappend($1, $3); }
> ;
> /* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */
> PublicationObjSpec: TABLE pubobj_expr
> { ....}
> | ALL TABLES IN_P SCHEMA pubobj_expr
> { ....}
> | pubobj_expr
> { ....}
> ;
> pubobj_expr:
> any_name
> { ....}
> | any_name '*'
> { ....}
> | ONLY any_name
> { ....}
> | ONLY '(' any_name ')'
> { ....}
> | CURRENT_SCHEMA
> { ....}
> ;

"FOR ALL TABLES” (that includes all tables in the database) is missing
in this syntax?

>
> I needed pubobj_expr to support the existing syntaxes supported by relation_expr and also to handle CURRENT_SCHEMA
supportin case of the "FOR ALL TABLES IN SCHEMA" feature. I changed the name to any_name to support objects like
"sch1.t1".

I think that relation_expr also accepts objects like "sch1.t1", no?

> I felt if a user specified "FOR ALL TABLES", the user should not be allowed to combine it with "FOR TABLE" and "FOR
ALLTABLES IN SCHEMA" as "FOR ALL TABLES" anyway will include all the tables. 

I think so too.

> Should we support the similar syntax in case of alter publication, like "ALTER PUBLICATION pub1 ADD TABLE t1,t2, ALL
TABLESIN SCHEMA sch1, sch2" or shall we keep these separate like "ALTER PUBLICATION pub1 ADD TABLE t1, t2"  and "ALTER
PUBLICATIONpub1 ADD ALL TABLES IN SCHEMA sch1, sch2". I preferred to keep it separate as we have kept ADD/DROP
separatelywhich cannot be combined currently. 

If we support the former syntax, the latter two syntaxes are also
supported. Why do we want to support only the latter separate two
syntaxes?

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Added schema level support for publication.

From

vignesh C

Date:

25 August 2021, 09:37:25

On Wed, Aug 25, 2021 at 1:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Aug 23, 2021 at 11:16 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, Aug 17, 2021 at 6:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >
> > > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > > On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > > >> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > >>> Abstractly it'd be
> > > >>>
> > > >>> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]
> > > >>>
> > > >>> cpitem := ALL TABLES |
> > > >>>     TABLE name |
> > > >>>     ALL TABLES IN SCHEMA name |
> > > >>>     ALL SEQUENCES |
> > > >>>     SEQUENCE name |
> > > >>>     ALL SEQUENCES IN SCHEMA name |
> > > >>>     name
> > > >>>
> > > >>> The grammar output would need some post-analysis to attribute the
> > > >>> right type to bare "name" items, but that doesn't seem difficult.
> > >
> > > >> That last bare "name" cpitem. looks like it would permit the following syntax:
> > > >> CREATE PUBLICATION pub FOR a,b,c;
> > > >> Was that intentional?
> > >
> > > > I think so.
> > >
> > > I had supposed that we could throw an error at the post-processing stage,
> > > or alternatively default to assuming that such names are tables.
> > >
> > > Now you could instead make the grammar work like
> > >
> > > cpitem := ALL TABLES |
> > >           TABLE name [, ...] |
> > >           ALL TABLES IN SCHEMA name [, ...] |
> > >           ALL SEQUENCES |
> > >           SEQUENCE name [, ...] |
> > >           ALL SEQUENCES IN SCHEMA name [, ...]
> > >
> > > which would result in a two-level-list data structure.  I'm not sure
> > > that this is better, as any sort of mistake would result in a very
> > > uninformative generic "syntax error" from Bison.  Errors out of a
> > > post-processing stage could be more specific than that.
> >
> > I preferred the implementation in the lines Tom Lane had proposed at [1]. Is it ok if the implementation is
somethinglike below: 
> > CreatePublicationStmt:
> > CREATE PUBLICATION name FOR pub_obj_list opt_definition
> > {
> > CreatePublicationStmt *n = makeNode(CreatePublicationStmt);
> > n->pubname = $3;
> > n->options = $6;
> > n->pubobjects = (List *)$5;
> > $$ = (Node *)n;
> > }
> > ;
> > pub_obj_list: PublicationObjSpec
> > { $$ = list_make1($1); }
> > | pub_obj_list ',' PublicationObjSpec
> > { $$ = lappend($1, $3); }
> > ;
> > /* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */
> > PublicationObjSpec: TABLE pubobj_expr
> > { ....}
> > | ALL TABLES IN_P SCHEMA pubobj_expr
> > { ....}
> > | pubobj_expr
> > { ....}
> > ;
> > pubobj_expr:
> > any_name
> > { ....}
> > | any_name '*'
> > { ....}
> > | ONLY any_name
> > { ....}
> > | ONLY '(' any_name ')'
> > { ....}
> > | CURRENT_SCHEMA
> > { ....}
> > ;
>
> "FOR ALL TABLES” (that includes all tables in the database) is missing
> in this syntax?

"FOR ALL TABLES" is present in CreatePublicationStmt rule, sorry for
not including all of CreatePublicationStmt rule in the previous mail,
I thought of keeping the contents shorter:

CreatePublicationStmt:
CREATE PUBLICATION name opt_definition
{....}
| CREATE PUBLICATION name FOR ALL TABLES opt_definition
{....}
| CREATE PUBLICATION name FOR pub_obj_list opt_definition
{....}
;

It is not in  pub_obj_list as the user will be able to specify either
of "FOR ALL TABLES" or "FOR TABLE/ FOR ALL TABLES IN SCHEMA" along
with create publication.

> >
> > I needed pubobj_expr to support the existing syntaxes supported by relation_expr and also to handle CURRENT_SCHEMA
supportin case of the "FOR ALL TABLES IN SCHEMA" feature. I changed the name to any_name to support objects like
"sch1.t1".
>
> I think that relation_expr also accepts objects like "sch1.t1", no?

Earlier syntax only supported relations, the relations were parsed
into RangeVar datatype. The new feature supports schema for which only
the schema name is required. To keep the parsing rule common, I used
any_name which will store the dotted name into a list for both
relation and schema. I will later convert it into rangevar for
relation and schema oid for schema names during the processing and
create the publication. I felt relation_expr was able to handle dotted
names because of qualified_name having "ColId indirection", here
indirection rule takes care of handling the dotted names.

> > I felt if a user specified "FOR ALL TABLES", the user should not be allowed to combine it with "FOR TABLE" and "FOR
ALLTABLES IN SCHEMA" as "FOR ALL TABLES" anyway will include all the tables. 
>
> I think so too.
>
> > Should we support the similar syntax in case of alter publication, like "ALTER PUBLICATION pub1 ADD TABLE t1,t2,
ALLTABLES IN SCHEMA sch1, sch2" or shall we keep these separate like "ALTER PUBLICATION pub1 ADD TABLE t1, t2"  and
"ALTERPUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1, sch2". I preferred to keep it separate as we have kept ADD/DROP
separatelywhich cannot be combined currently. 
>
> If we support the former syntax, the latter two syntaxes are also
> supported. Why do we want to support only the latter separate two
> syntaxes?

We can support either syntax, I was not sure which one is better. If
alter also should support similar syntax I can do it as a separate
patch so as to not increase the main patch size. Thoughts?

Attached v21 patch has the changes based on the new syntax and fixes
few of the other review comments provided by reviewers.

Regards,
Vignesh

On Wed, Aug 25, 2021 at 3:07 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, Aug 25, 2021 at 1:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Aug 23, 2021 at 11:16 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Tue, Aug 17, 2021 at 6:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > >
> > > > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > > > On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > > > >> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > > >>> Abstractly it'd be
> > > > >>>
> > > > >>> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]
> > > > >>>
> > > > >>> cpitem := ALL TABLES |
> > > > >>>     TABLE name |
> > > > >>>     ALL TABLES IN SCHEMA name |
> > > > >>>     ALL SEQUENCES |
> > > > >>>     SEQUENCE name |
> > > > >>>     ALL SEQUENCES IN SCHEMA name |
> > > > >>>     name
> > > > >>>
> > > > >>> The grammar output would need some post-analysis to attribute the
> > > > >>> right type to bare "name" items, but that doesn't seem difficult.
> > > >
> > > > >> That last bare "name" cpitem. looks like it would permit the following syntax:
> > > > >> CREATE PUBLICATION pub FOR a,b,c;
> > > > >> Was that intentional?
> > > >
> > > > > I think so.
> > > >
> > > > I had supposed that we could throw an error at the post-processing stage,
> > > > or alternatively default to assuming that such names are tables.
> > > >
> > > > Now you could instead make the grammar work like
> > > >
> > > > cpitem := ALL TABLES |
> > > >           TABLE name [, ...] |
> > > >           ALL TABLES IN SCHEMA name [, ...] |
> > > >           ALL SEQUENCES |
> > > >           SEQUENCE name [, ...] |
> > > >           ALL SEQUENCES IN SCHEMA name [, ...]
> > > >
> > > > which would result in a two-level-list data structure.  I'm not sure
> > > > that this is better, as any sort of mistake would result in a very
> > > > uninformative generic "syntax error" from Bison.  Errors out of a
> > > > post-processing stage could be more specific than that.
> > >
> > > I preferred the implementation in the lines Tom Lane had proposed at [1]. Is it ok if the implementation is
somethinglike below: 
> > > CreatePublicationStmt:
> > > CREATE PUBLICATION name FOR pub_obj_list opt_definition
> > > {
> > > CreatePublicationStmt *n = makeNode(CreatePublicationStmt);
> > > n->pubname = $3;
> > > n->options = $6;
> > > n->pubobjects = (List *)$5;
> > > $$ = (Node *)n;
> > > }
> > > ;
> > > pub_obj_list: PublicationObjSpec
> > > { $$ = list_make1($1); }
> > > | pub_obj_list ',' PublicationObjSpec
> > > { $$ = lappend($1, $3); }
> > > ;
> > > /* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */
> > > PublicationObjSpec: TABLE pubobj_expr
> > > { ....}
> > > | ALL TABLES IN_P SCHEMA pubobj_expr
> > > { ....}
> > > | pubobj_expr
> > > { ....}
> > > ;
> > > pubobj_expr:
> > > any_name
> > > { ....}
> > > | any_name '*'
> > > { ....}
> > > | ONLY any_name
> > > { ....}
> > > | ONLY '(' any_name ')'
> > > { ....}
> > > | CURRENT_SCHEMA
> > > { ....}
> > > ;
> >
> > "FOR ALL TABLES” (that includes all tables in the database) is missing
> > in this syntax?
>
> "FOR ALL TABLES" is present in CreatePublicationStmt rule, sorry for
> not including all of CreatePublicationStmt rule in the previous mail,
> I thought of keeping the contents shorter:
>
> CreatePublicationStmt:
> CREATE PUBLICATION name opt_definition
> {....}
> | CREATE PUBLICATION name FOR ALL TABLES opt_definition
> {....}
> | CREATE PUBLICATION name FOR pub_obj_list opt_definition
> {....}
> ;
>
> It is not in  pub_obj_list as the user will be able to specify either
> of "FOR ALL TABLES" or "FOR TABLE/ FOR ALL TABLES IN SCHEMA" along
> with create publication.
>
> > >
> > > I needed pubobj_expr to support the existing syntaxes supported by relation_expr and also to handle
CURRENT_SCHEMAsupport in case of the "FOR ALL TABLES IN SCHEMA" feature. I changed the name to any_name to support
objectslike "sch1.t1". 
> >
> > I think that relation_expr also accepts objects like "sch1.t1", no?
>
> Earlier syntax only supported relations, the relations were parsed
> into RangeVar datatype. The new feature supports schema for which only
> the schema name is required. To keep the parsing rule common, I used
> any_name which will store the dotted name into a list for both
> relation and schema. I will later convert it into rangevar for
> relation and schema oid for schema names during the processing and
> create the publication. I felt relation_expr was able to handle dotted
> names because of qualified_name having "ColId indirection", here
> indirection rule takes care of handling the dotted names.
>
> > > I felt if a user specified "FOR ALL TABLES", the user should not be allowed to combine it with "FOR TABLE" and
"FORALL TABLES IN SCHEMA" as "FOR ALL TABLES" anyway will include all the tables. 
> >
> > I think so too.
> >
> > > Should we support the similar syntax in case of alter publication, like "ALTER PUBLICATION pub1 ADD TABLE t1,t2,
ALLTABLES IN SCHEMA sch1, sch2" or shall we keep these separate like "ALTER PUBLICATION pub1 ADD TABLE t1, t2"  and
"ALTERPUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1, sch2". I preferred to keep it separate as we have kept ADD/DROP
separatelywhich cannot be combined currently. 
> >
> > If we support the former syntax, the latter two syntaxes are also
> > supported. Why do we want to support only the latter separate two
> > syntaxes?
>
> We can support either syntax, I was not sure which one is better. If
> alter also should support similar syntax I can do it as a separate
> patch so as to not increase the main patch size. Thoughts?

I have implemented this in the 0003 patch, I have kept it separate to
reduce the testing effort and also it will be easier if someone
disagrees with the syntax. I will merge it to the main patch later
based on the feedback. Attached v22 patch has the changes for the
same.
Thoughts?

Regards,
Vignesh

On Fri, Aug 27, 2021 at 4:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 27, 2021 at 11:43 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, Aug 25, 2021 at 3:07 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have implemented this in the 0003 patch, I have kept it separate to
> > reduce the testing effort and also it will be easier if someone
> > disagrees with the syntax. I will merge it to the main patch later
> > based on the feedback. Attached v22 patch has the changes for the
> > same.
> >
>
> Few comments on v22-0001-Added-schema-level-support-for-publication:
> ========================================================
> 1. Why in publication_add_schema(), you are registering invalidation
> for all schema relations? It seems this is to allow rebuilding the
> publication info for decoding sessions. But that is not required as
> you are registering rel_sync_cache_publication_cb for
> publication_schema relation. In rel_sync_cache_publication_cb, we are
> marking replicate_valid as false for each entry which will allow
> publication info to be rebuilt in get_rel_sync_entry.
>
> I see that it is done for a single relation in the current code in
> function publication_add_relation but I guess that is also not
> required. You can once test this. If you still think it is required,
> can you please explain me why and then we can accordingly add some
> comments in the patch.

Added a comment for this.

> 2.
> + * Publication object type
> + */
> +typedef enum PublicationObjSpecType
> +{
> + PUBLICATIONOBJ_TABLE, /* Table type */
> + PUBLICATIONOBJ_SCHEMA, /* Schema type */
> + PUBLICATIONOBJ_SEQUENCE, /* Sequence type */
>
> Why add anything related to the sequence in this patch?

This is removed.

> 3.
> +get_object_address_publication_schema(List *object, bool missing_ok)
> +{
> + ObjectAddress address;
> + char    *pubname;
> + Publication *pub;
> + char    *schemaname;
> + Oid schemaoid;
> +
> + ObjectAddressSet(address, PublicationSchemaRelationId, InvalidOid);
> +
> + /* Fetch schema name and publication name from input list */
> + schemaname = strVal(linitial(object));
> + pubname = strVal(lsecond(object));
> +
> + schemaoid = get_namespace_oid(schemaname, false);
> +
> + /* Now look up the pg_publication tuple */
> + pub = GetPublicationByName(pubname, missing_ok);
> + if (!pub)
> + return address;
>
> Why the schema name handling is different from publication name? Why
> can't we pass missing_ok for schema api and handle it similar
> publication api?

Modified to handle it similarly.

> 4. In getPublicationSchemaInfo(), why the missing_ok flag is not used
> in get_publication_name() whereas it is used for all other syscache
> searches in that function?

Modified it to use missing_ok.

> 5. Don't we need to expose a view for publication schemas similar to
> pg_publication_tables?

I have not handled in this version, I will do it in the next version.

> 6.
> publication_add_schema()
> {
> ..
> + /* Can't be system namespace */
> + if (IsCatalogNamespace(schemaoid) || IsToastNamespace(schemaoid))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("\"%s\" is a system schema",
> + get_namespace_name(schemaoid)),
> + errdetail("System schemas cannot be added to publications.")));
> +
> + /* Can't be temporary namespace */
> + if (isAnyTempNamespace(schemaoid))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("\"%s\" is a temporary schema",
> + get_namespace_name(schemaoid)),
> + errdetail("Temporary schemas cannot be added to publications.")));
> ..
> }
>
> Can we change the first detail message as: "This operation is not
> supported for system schemas." and the second detail message as:
> "Temporary schemas cannot be replicated."? This is to make these
> messages similar to corresponding messages for relations in function
> check_publication_add_relation(). Can we move these checks to a
> separate function?

Modified.

Attached v23 patch has the fixes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From

vignesh C

Date:

30 August 2021, 15:19:53

On Mon, Aug 30, 2021 at 8:23 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Aug 27, 2021 at 4:13 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have implemented this in the 0003 patch, I have kept it separate to
> > reduce the testing effort and also it will be easier if someone
> > disagrees with the syntax. I will merge it to the main patch later
> > based on the feedback. Attached v22 patch has the changes for the
> > same.
>
> Just experimenting with the new syntax so far, and seeing some new
> messages and docs, I have the following suggestions for improvements:
>
> src/backend/commands/publicationcmds.c
>
> (1)
> BEFORE:
> for table/for all tables in schema should be specified before the object
> AFTER:
> FOR TABLE / FOR ALL TABLES IN SCHEMA should be specified before the
> table/schema name(s)

Modified it to "FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before the table/schema name(s)", I felt we need not have space before and after "/", I had seen it is specified similarly in few other places.

> (2)
> BEFORE:
> Tables cannot be added, dropped or set on FOR ALL TABLES publications.
> AFTER:
> Tables cannot be added to, dropped from, or set on FOR ALL TABLES publications.

Modified

> (3)
> BEFORE:
> Schemas cannot be added, dropped or set on FOR ALL TABLES publications.
> AFTER:
> Schemas cannot be added to, dropped from, or set on FOR ALL TABLES publications.

Modified

> v22-0002
>
> doc/src/sgml/ref/create_publication.sgml
>
> (1)
> BEFORE:
> + Create a publication that publishes all changes for users and departments
> + table and that publishes all changes for all the tables present in the
> AFTER:
> + Create a publication that publishes all changes for tables "users" and
> + "departments" and that publishes all changes for all the tables
> present in the

Modified.

I have fixed these comments as part of v23 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw%40mail.gmail.com

Regards,

Vignesh

Re: Added schema level support for publication.

From

vignesh C

Date:

30 August 2021, 15:26:29

On Mon, Aug 30, 2021 at 9:10 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Friday, August 27, 2021 2:13 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have implemented this in the 0003 patch, I have kept it separate to reduce the
> > testing effort and also it will be easier if someone disagrees with the syntax. I
> > will merge it to the main patch later based on the feedback. Attached v22 patch
> > has the changes for the same.
> > Thoughts?
>
> Hi,
>
> Here are some comments for the new version patches.
>
> About  0001
> 1)
> +                       rel->relpersistence = RELPERSISTENCE_PERMANENT;
>
> It seems we don't need to set this since makeRangeVarFromNameList()
> already set it.

Modified

> 2)
> +                       if (!relids || !schemarelids)
> +                               tables = list_concat(relids, schemarelids);
> +                       else
> +                               tables = list_concat_unique_oid(relids, schemarelids);
> +               }
>
> It seems we can simplify the above code like the following:
> tables = list_concat_unique_oid(relids, schemarelids);

Modified

> 3)
> +               relids = GetPublicationRelations(pubform->oid,
> +                                                                                PUBLICATION_PART_ALL);
> +               schemarelids = GetAllSchemasPublicationRelations(pubform->oid,
> +
PUBLICATION_PART_ALL);
> +               relids = list_concat(relids, schemarelids);
>
> should we invoke list_concat_unique_oid here ?

Modified

> 4)
>
> +                               search_path = fetch_search_path(false);
> +                               if (search_path == NIL) /* nothing valid in search_path? */
>
> It might be better to list_free(search_path) when not used.

Modified

> 5)
> +                       if (list_length(pubobj->name) == 1 &&
> +                               (strcmp(relname, "CURRENT_SCHEMA") == 0))
> +                               ereport(ERROR,
> +                                               errcode(ERRCODE_SYNTAX_ERROR),
> +                                               errmsg("invalid relation name at or near"),
> +                                               parser_errposition(pstate, pubobj->location));
>
> Maybe we don't need this check, because it will report an error in
> OpenTableList() anyway, "relation "CURRENT_SCHEMA" does not exist" , and that
> message seems readable to me.

Allowing CURRENT_SCHEMA is required to support current schema for
schema publications, currently I'm allowing this syntax during parsing
and this error is thrown for relations later, this is done to keep the
similar error as earlier before this feature support. I felt we can
keep it like this to maintain the similar error. Thoughts?

> About  0002
> 6)
>
> diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
> index 0c84d87873..0a479dfe36 100644
> --- a/src/test/subscription/t/001_rep_changes.pl
> +++ b/src/test/subscription/t/001_rep_changes.pl
> @@ -6,7 +6,7 @@ use strict;
>  use warnings;
>  use PostgresNode;
>  use TestLib;
> -use Test::More tests => 32;
> +use Test::More tests => 46;
>
> I think it might be better to move these testcases create a separate perl file.

Modified, added to 025_rep_changes_for_schema.pl

> About  0003
> 7)
> The v22-0003 seems simple and can remove lots of code in patch v22-0001, so
> maybe we can merge 0001 and 0003 into one patch ?

I agree that the code becomes simpler, it reduces a lot of code. I had
kept it like that as the testing effort might be more and also I was
waiting if there was no objection for that syntax from anyone else. I
will wait for a few more reviews and merge it to 0001 if there are no
objections.

I have fixed these comments as part of v23 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From

vignesh C

Date:

30 August 2021, 15:28:02

On Mon, Aug 30, 2021 at 1:31 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Friday, August 27, 2021 2:13 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have implemented this in the 0003 patch, I have kept it separate to
> > reduce the testing effort and also it will be easier if someone
> > disagrees with the syntax. I will merge it to the main patch later
> > based on the feedback. Attached v22 patch has the changes for the
> > same.
> > Thoughts?
> >
>
> Thanks for your new patch. Here are some suggestions:
>
> 1.
> If a publication published a table and the schema where the table belonged to, the
> publication name would show twice when using '\d+' for the table.
> Maybe we should add some check to avoid the duplication. Thought?
>
> For example:
>
> create schema sch1;
> create table sch1.tbl(a int);
> create publication pub for table sch1.tbl, all tables in schema sch1;
>
> postgres=# \d+ sch1.tbl
>                                             Table "sch1.tbl"
>  Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
> --------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
>  a      | integer |           |          |         | plain   |             |              |
> Publications:
>     "pub"
>     "pub"
> Access method: heap

Modified

> 2. doc/src/sgml/catalogs.sgml
> @@ -6169,6 +6174,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
>         publication instead of its own.
>        </para></entry>
>       </row>
> +
>      </tbody>
>     </tgroup>
>    </table>
>
> It seems that we don't need this change.

Modified

> 3. src/bin/psql/tab-complete.c
>
> +       /* Complete "CREATE PUBLICATION <name> FOR SCHEMA <schema>, ..." */
> +       else if (HeadMatches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES", "IN", "SCHEMA"))
> +               COMPLETE_WITH_QUERY(Query_for_list_of_schemas
> +                                                       " UNION SELECT 'CURRENT_SCHEMA' "
> +                                                       "UNION SELECT 'WITH ('");
>
> The comment should be updated to "FOR ALL TABLES IN SCHEMA".

Modified.

I have fixed these comments as part of v23 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From

vignesh C

Date:

31 August 2021, 03:40:52

On Mon, Aug 30, 2021 at 2:14 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Aug 27, 2021 at 4:13 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have implemented this in the 0003 patch, I have kept it separate to
> > reduce the testing effort and also it will be easier if someone
> > disagrees with the syntax. I will merge it to the main patch later
> > based on the feedback. Attached v22 patch has the changes for the
> > same.
>
> I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> TABLE sc1.test;"  maintains the table separately and results in the
> following in the \dRp+ output:
>
> Tables:
>     "sc1.test"
> Schemas:
>     "sc1"
>
> and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;"
> still leaves the "sc1.test" table in the publication.

I had intentionally implemented this way, the reason being it gives
the flexibility to modify the publications based on the way the
publication is created. My idea was that if a user specified a
table/schema of the same schema while creating the publication, the
user should be allowed to drop any of them at any time. In the above
case if we don't maintain the results separately, users will not be
able to drop the table from the publication at a later point of time.
Thoughts?

> Is there a reason why we don't/can't support "ALTER SUBSCRIPTION ...
> SET ALL TABLES;"?
> (I know it wasn't supported before, but now "ALTER SUBSCRIPTION ...
> SET ALL TABLES IN SCHEMA ..." is being supported)

I agree this can be implemented with the current design. I felt I can
work on getting the current patches into a committable shape and then
work on this after the current patch series is finished. Thoughts?

> I notice that the v22-0003 documentation updates for ALTER
> SUBSCRIPTION are missing - but you're probably waiting on all feedback
> before proceeding with that.

I have fixed this as part of v23 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From

vignesh C

Date:

31 August 2021, 03:45:39

On Mon, Aug 30, 2021 at 12:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>
> Okay, I got it but let's add few comments in the code related to it.
> Also, I noticed that the code in InvalidatePublicationRels() already
> exists in AlterPublicationOptions(). You can try to refactor the
> existing code as a separate initial patch.

I have made these changes at the v23 patch attached at [1].

> BTW, I noticed that "for all tables", we don't register invalidations
> in the above scenario, and then later that causes conflict on the
> subscriber. I think that is a bug in the current code and we can deal
> with that separately.

I agree that the cache invalidation has been missed in case of "for all tables" publication, I have fixed these and posted a patch for the same at [2].

[1] - https://www.postgresql.org/message-id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CALDaNm0zkQznFrxzHBoWZUGsf%3DnKSxhEZZhZ1eTDWLpFok6zZw%40mail.gmail.com

Regards,
Vigneshhas

Re: Added schema level support for publication.

From

Greg Nancarrow

Date:

31 August 2021, 05:20:31

On Tue, Aug 31, 2021 at 1:41 PM vignesh C <vignesh21@gmail.com> wrote:
>
> >
> > I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> > TABLE sc1.test;"  maintains the table separately and results in the
> > following in the \dRp+ output:
> >
> > Tables:
> >     "sc1.test"
> > Schemas:
> >     "sc1"
> >
> > and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;"
> > still leaves the "sc1.test" table in the publication.
>
> I had intentionally implemented this way, the reason being it gives
> the flexibility to modify the publications based on the way the
> publication is created. My idea was that if a user specified a
> table/schema of the same schema while creating the publication, the
> user should be allowed to drop any of them at any time. In the above
> case if we don't maintain the results separately, users will not be
> able to drop the table from the publication at a later point of time.
> Thoughts?
>

Hmmm. I'm not sure it should work like that (but maybe I'm wrong -
what do others think???).
I thought that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
TABLE sc1.test;" should silently just ignore the "TABLE sc1.test"
part, as that is a table in schema sc1, so it's effectively a
duplicate.

Also, I noticed the following:

postgres=# CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
postgres-# TABLE sc1.test;
CREATE PUBLICATION
postgres=# \dRp+
                            Publication pub1
 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
-------+------------+---------+---------+---------+-----------+----------
 gregn | f          | t       | t       | t       | t         | f
Tables:
    "sc1.test"
Schemas:
    "sc1"

postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;
ALTER PUBLICATION
postgres=# \dRp+
                            Publication pub1
 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
-------+------------+---------+---------+---------+-----------+----------
 gregn | f          | t       | t       | t       | t         | f
Tables:
    "sc1.test"

postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;
ERROR:  schema "sc1" is not part of the publication

I think it seems odd to not drop table "sc1.test" from the publication
after "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;".
Also, after running that command again, it seems odd to report that
schema sc1 is not part of the publication, when there remains one
table from that schema in the publication.
And shouldn't it say "tables from schema ... are not part of the
publication" rather than "schema ... is not part of the publication"?
I think the former is better and more accurate. Schemas can contain
database objects other than tables.
Similarly, I'm also thinking that in the "\dRp+" output, it should say
"Tables from schemas:" instead of "Schemas:".

Regards,
Greg Nancarrow
Fujitsu Australia

Re: Added schema level support for publication.

From

Amit Kapila

Date:

31 August 2021, 10:57:16

On Tue, Aug 31, 2021 at 10:50 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Tue, Aug 31, 2021 at 1:41 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > >
> > > I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> > > TABLE sc1.test;"  maintains the table separately and results in the
> > > following in the \dRp+ output:
> > >
> > > Tables:
> > >     "sc1.test"
> > > Schemas:
> > >     "sc1"
> > >
> > > and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;"
> > > still leaves the "sc1.test" table in the publication.
> >
> > I had intentionally implemented this way, the reason being it gives
> > the flexibility to modify the publications based on the way the
> > publication is created. My idea was that if a user specified a
> > table/schema of the same schema while creating the publication, the
> > user should be allowed to drop any of them at any time. In the above
> > case if we don't maintain the results separately, users will not be
> > able to drop the table from the publication at a later point of time.
> > Thoughts?
> >
>
> Hmmm. I'm not sure it should work like that (but maybe I'm wrong -
> what do others think???).
> I thought that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> TABLE sc1.test;" should silently just ignore the "TABLE sc1.test"
> part, as that is a table in schema sc1, so it's effectively a
> duplicate.
>

I find the way it is implemented to be more intuitive as that gives
users more flexibility to retain certain tables from the schema and
appears to be exactly what users intended by the command. I don't
think finding duplicates among different object lists (schema, table)
is a good idea because tomorrow for some other objects the same thing
can happen. It might be better to get some other opinions on this
matter though.

> Also, I noticed the following:
>
> postgres=# CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> postgres-# TABLE sc1.test;
> CREATE PUBLICATION
> postgres=# \dRp+
>                             Publication pub1
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> -------+------------+---------+---------+---------+-----------+----------
>  gregn | f          | t       | t       | t       | t         | f
> Tables:
>     "sc1.test"
> Schemas:
>     "sc1"
>
> postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;
> ALTER PUBLICATION
> postgres=# \dRp+
>                             Publication pub1
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> -------+------------+---------+---------+---------+-----------+----------
>  gregn | f          | t       | t       | t       | t         | f
> Tables:
>     "sc1.test"
>
> postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;
> ERROR:  schema "sc1" is not part of the publication
>

What will happen if you second time run the command as ALTER
PUBLICATION pub1 DROP Table sc1.test? If that works, I think the
behavior should be fine.

-- 
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

Amit Kapila

Date:

31 August 2021, 12:58:17

On Tue, Aug 31, 2021 at 9:15 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Aug 30, 2021 at 12:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Okay, I got it but let's add few comments in the code related to it.
> > Also, I noticed that the code in InvalidatePublicationRels() already
> > exists in AlterPublicationOptions(). You can try to refactor the
> > existing code as a separate initial patch.
>
> I have made these changes at the v23 patch attached at [1].
>

*
+/*
+ * Invalidate the relations.
+ */
+static inline void
+InvalidatePublicationRels(List *relids)

I don't see the need to make this an inline function.

* On similar lines, the code in function
GetPubPartitionOptionRelations seems to be extracted from existing
function GetPublicationRelations(). Can't we move it into the 0001
patch?

-- 
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

vignesh C

Date:

31 August 2021, 14:05:44

On Tue, Aug 31, 2021 at 4:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Aug 31, 2021 at 10:50 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Tue, Aug 31, 2021 at 1:41 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > >
> > > > I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> > > > TABLE sc1.test;"  maintains the table separately and results in the
> > > > following in the \dRp+ output:
> > > >
> > > > Tables:
> > > >     "sc1.test"
> > > > Schemas:
> > > >     "sc1"
> > > >
> > > > and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;"
> > > > still leaves the "sc1.test" table in the publication.
> > >
> > > I had intentionally implemented this way, the reason being it gives
> > > the flexibility to modify the publications based on the way the
> > > publication is created. My idea was that if a user specified a
> > > table/schema of the same schema while creating the publication, the
> > > user should be allowed to drop any of them at any time. In the above
> > > case if we don't maintain the results separately, users will not be
> > > able to drop the table from the publication at a later point of time.
> > > Thoughts?
> > >
> >
> > Hmmm. I'm not sure it should work like that (but maybe I'm wrong -
> > what do others think???).
> > I thought that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> > TABLE sc1.test;" should silently just ignore the "TABLE sc1.test"
> > part, as that is a table in schema sc1, so it's effectively a
> > duplicate.
> >
>
> I find the way it is implemented to be more intuitive as that gives
> users more flexibility to retain certain tables from the schema and
> appears to be exactly what users intended by the command. I don't
> think finding duplicates among different object lists (schema, table)
> is a good idea because tomorrow for some other objects the same thing
> can happen. It might be better to get some other opinions on this
> matter though.
>
> > Also, I noticed the following:
> >
> > postgres=# CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> > postgres-# TABLE sc1.test;
> > CREATE PUBLICATION
> > postgres=# \dRp+
> >                             Publication pub1
> >  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> > -------+------------+---------+---------+---------+-----------+----------
> >  gregn | f          | t       | t       | t       | t         | f
> > Tables:
> >     "sc1.test"
> > Schemas:
> >     "sc1"
> >
> > postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;
> > ALTER PUBLICATION
> > postgres=# \dRp+
> >                             Publication pub1
> >  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> > -------+------------+---------+---------+---------+-----------+----------
> >  gregn | f          | t       | t       | t       | t         | f
> > Tables:
> >     "sc1.test"
> >
> > postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;
> > ERROR:  schema "sc1" is not part of the publication
> >
>
> What will happen if you second time run the command as ALTER
> PUBLICATION pub1 DROP Table sc1.test? If that works, I think the
> behavior should be fine.

Alter publication drop table works fine:
postgres=# ALTER PUBLICATION pub1 DROP table sc1.test ;
ALTER PUBLICATION
postgres=# \dRp+
                             Publication pub1
  Owner  | All tables | Inserts | Updates | Deletes | Truncates | Via root
---------+------------+---------+---------+---------+-----------+----------
 vignesh | f          | t       | t       | t       | t         | f
(1 row)

Regards,
Vignesh

RE: Added schema level support for publication.

From

"houzj.fnst@fujitsu.com"

Date:

01 September 2021, 01:28:38

From Mon, Aug 30, 2021 11:26 PM vignesh C <vignesh21@gmail.com> wrote:
> On Mon, Aug 30, 2021 at 9:10 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> >
> > 5)
> > +                       if (list_length(pubobj->name) == 1 &&
> > +                               (strcmp(relname, "CURRENT_SCHEMA") ==
> 0))
> > +                               ereport(ERROR,
> > +
> errcode(ERRCODE_SYNTAX_ERROR),
> > +                                               errmsg("invalid relation
> name at or near"),
> > +
> > + parser_errposition(pstate, pubobj->location));
> >
> > Maybe we don't need this check, because it will report an error in
> > OpenTableList() anyway, "relation "CURRENT_SCHEMA" does not exist" ,
> > and that message seems readable to me.
> 
> Allowing CURRENT_SCHEMA is required to support current schema for schema
> publications, currently I'm allowing this syntax during parsing and this error is
> thrown for relations later, this is done to keep the similar error as earlier before
> this feature support. I felt we can keep it like this to maintain the similar error.
> Thoughts?

Thanks for the explanation, I got the point.

Here are some other comments for v23-000x patches.

1)

@@ -6225,6 +6342,9 @@ describePublications(const char *pattern)
     bool        has_pubtruncate;
     bool        has_pubviaroot;
 
+    PQExpBufferData title;
+    printTableContent cont;
+
     if (pset.sversion < 100000)
     {
        ...
        PQExpBufferData title;
         printTableOpt myopt = pset.popt.topt;
         printTableContent cont;

Should we delete the inner declaration of 'title' and 'cont' ?

2)
-    /* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
+    /* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE/SCHEMA */

SCHEMA => ALL TABLES IN SCHEMA

3)

+                              .description = "PUBLICATION SCHEMA",
+                              .section = SECTION_POST_DATA,
+                              .createStmt = query->data));

Is it better to use something like 'PUBLICATION TABLES IN SCHEMA' to describe
the schema level table publication ? Because there could be some other type
publication such as 'ALL SEQUENCES IN SCHEMA' in the future, it will be better
to make it clear that we only publish table in schema in this patch.

Best regards,
Hou zj

Re: Added schema level support for publication.

From

Greg Nancarrow

Date:

01 September 2021, 03:22:22

On Tue, Aug 31, 2021 at 8:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I find the way it is implemented to be more intuitive as that gives
> users more flexibility to retain certain tables from the schema and
> appears to be exactly what users intended by the command. I don't
> think finding duplicates among different object lists (schema, table)
> is a good idea because tomorrow for some other objects the same thing
> can happen. It might be better to get some other opinions on this
> matter though.
>

I think that such functionality needs to be clearly documented (but
currently the documentation doesn't sufficiently explain it).
Yes, I would definitely like to hear other opinions on this.

Note also that currently parts of the documentation are still
referring to "ADD SCHEMA/DROP SCHEMA/SET SCHEMA" instead of the new
syntax "ADD ALL TABLES IN SCHEMA/DROP ALL TABLES IN SCHEMA/SET ALL
TABLES IN SCHEMA":

e.g.
v23-0003:
doc/src/sgml/ref/alter_publication.sgml
+   The fourth, fifth and sixth variants of this command change which schemas
+   are part of the publication.  The <literal>SET SCHEMA</literal> clause will
+   replace the list of schemas in the publication with the specified one.
+   The <literal>ADD SCHEMA</literal> and <literal>DROP SCHEMA</literal> clauses
+   will add and remove one or more schemas from the publication.  Note that
+   adding schemas to a publication that is already subscribed to will require
+   a <literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on
+   the subscribing side in order to become effective.

With the new syntax changes (SCHEMA -> ALL TABLES IN SCHEMA), it seems
less intuitive that there are separate schema and table operations on
the publication.

I'd expect a lot of users to naturally think that "ALTER PUBLICATION
pub1 DROP ALL TABLES IN SCHEMA sc1;" would drop from the publication
all tables that are in schema "sc1", which is not what it is currently
doing.
Since the syntax was changed to specifically refer to FOR ALL TABLES
IN SCHEMA rather than FOR SCHEMA, then now it's clear we're referring
to tables only, when specifying "... FOR ALL TABLES in sc1, TABLE
sc1.test", so IMHO it's reasonable to remove duplicates here, rather
than treating these as somehow separate ways of referencing the same
table.

One thing the current scheme doesn't allow is to publish all tables in
a schema except for certain table(s) - and you can't achieve that by
adding all tables from a schema to the publication and then
selectively dropping some of those tables. I thought that this would
be a more common pattern than adding separate tables from schemas that
are already included as part of the publication, in order to "retain"
them if "all tables from schema ..." are later dropped.

postgres=# create schema sc1;
CREATE SCHEMA
postgres=# create table sc1.test(i int);
CREATE TABLE
postgres=# create publication pub1 for all tables in schema sc1;
CREATE PUBLICATION
postgres=# \dRp+

                            Publication pub1
 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
-------+------------+---------+---------+---------+-----------+----------
 gregn | f          | t       | t       | t       | t         | f
Schemas:
    "sc1"

postgres=# alter publication pub1 drop table sc1.test;
ERROR:  relation "test" is not part of the publication

The above error message seems slightly misleading (as that table IS
published by that publication) and also note the relation is not
schema-qualified.

Regards,
Greg Nancarrow
Fujitsu Australia

RE: Added schema level support for publication.

From

"tanghy.fnst@fujitsu.com"

Date:

01 September 2021, 05:44:45

> On Monday, August 30, 2021 11:28 PM vignesh C <vignesh21@gmail.com> wrote:
>
> I have fixed these comments as part of v23 patch attached at [1].
> [1] - https://www.postgresql.org/message-
> id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw
> %40mail.gmail.com
> 

Thanks for your new patch. Here are some comments on v23 patch. 

1. doc/src/sgml/ref/alter_publication.sgml
+  <para>
+   Add some schemas to the publication:
+<programlisting>
+ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing_june, sales_june;
+</programlisting>
+  </para>

This change seems to be added twice, both 0003 and 0004 patch have this change.

2. src/sgml/ref/create_publication.sgml
There is the following description about "FOR TABLE" parameter:
      Only persistent base tables and partitioned tables can be part of a
      publication.  Temporary tables, unlogged tables, foreign tables,
      materialized views, and regular views cannot be part of a publication.

"FOR ALL TABLES IN SCHEMA" parameter also have restrictions, should we add
some doc description for it?

3. When using '\dn+', I noticed that the list of publications only contains the
publications for "SCHEMA", "FOR ALL TABLES" publications are not shown. Is it designed on purpose?
(The result of '\d+' lists the publications of "SCHEAME" and "FOR ALL TABLES").

For example:
create schema sch1;
create table sch1.tbl(a int);
create publication pub_schema for all tables in schema sch1;
create publication pub_all_tables for all tables;

postgres=# \d+ sch1.tbl
                                            Table "sch1.tbl"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | integer |           |          |         | plain   |             |              |
Publications:
    "pub_all_tables"
    "pub_schema"
Access method: heap

postgres=# \dn+ sch1
                  List of schemas
 Name |  Owner   | Access privileges | Description
------+----------+-------------------+-------------
 sch1 | postgres |                   |
Publications:
    "pub_schema"

Regards
Tang

Re: Added schema level support for publication.

From

Amit Kapila

Date:

01 September 2021, 06:35:23

On Wed, Sep 1, 2021 at 8:52 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Tue, Aug 31, 2021 at 8:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > I find the way it is implemented to be more intuitive as that gives
> > users more flexibility to retain certain tables from the schema and
> > appears to be exactly what users intended by the command. I don't
> > think finding duplicates among different object lists (schema, table)
> > is a good idea because tomorrow for some other objects the same thing
> > can happen. It might be better to get some other opinions on this
> > matter though.
> >
>
> I think that such functionality needs to be clearly documented (but
> currently the documentation doesn't sufficiently explain it).
>

I agree that we should document it clearly if we decide to go in this direction.

> Yes, I would definitely like to hear other opinions on this.
>
> Note also that currently parts of the documentation are still
> referring to "ADD SCHEMA/DROP SCHEMA/SET SCHEMA" instead of the new
> syntax "ADD ALL TABLES IN SCHEMA/DROP ALL TABLES IN SCHEMA/SET ALL
> TABLES IN SCHEMA":
>
> e.g.
> v23-0003:
> doc/src/sgml/ref/alter_publication.sgml
> +   The fourth, fifth and sixth variants of this command change which schemas
> +   are part of the publication.  The <literal>SET SCHEMA</literal> clause will
> +   replace the list of schemas in the publication with the specified one.
> +   The <literal>ADD SCHEMA</literal> and <literal>DROP SCHEMA</literal> clauses
> +   will add and remove one or more schemas from the publication.  Note that
> +   adding schemas to a publication that is already subscribed to will require
> +   a <literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on
> +   the subscribing side in order to become effective.
>
> With the new syntax changes (SCHEMA -> ALL TABLES IN SCHEMA), it seems
> less intuitive that there are separate schema and table operations on
> the publication.
>

I think the documentation could be improved w.r.t above points because
it is quite clear that we support schema and tables separately.


> I'd expect a lot of users to naturally think that "ALTER PUBLICATION
> pub1 DROP ALL TABLES IN SCHEMA sc1;" would drop from the publication
> all tables that are in schema "sc1", which is not what it is currently
> doing.
> Since the syntax was changed to specifically refer to FOR ALL TABLES
> IN SCHEMA rather than FOR SCHEMA, then now it's clear we're referring
> to tables only, when specifying "... FOR ALL TABLES in sc1, TABLE
> sc1.test", so IMHO it's reasonable to remove duplicates here, rather
> than treating these as somehow separate ways of referencing the same
> table.
>

I see your point and if we decide to go this path then it is better to
give an error than silently removing duplicates.

> One thing the current scheme doesn't allow is to publish all tables in
> a schema except for certain table(s)
>

True, we can always support that as a separate feature. Note that same
is true for existing FOR ALL TABLES syntax where users could expect to
leave few tables with syntax like FOR ALL TABLES EXCEPT t1,t2,... but
we don't have that yet.

> - and you can't achieve that by
> adding all tables from a schema to the publication and then
> selectively dropping some of those tables. I thought that this would
> be a more common pattern than adding separate tables from schemas that
> are already included as part of the publication, in order to "retain"
> them if "all tables from schema ..." are later dropped.
>
> postgres=# create schema sc1;
> CREATE SCHEMA
> postgres=# create table sc1.test(i int);
> CREATE TABLE
> postgres=# create publication pub1 for all tables in schema sc1;
> CREATE PUBLICATION
> postgres=# \dRp+
>
>                             Publication pub1
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> -------+------------+---------+---------+---------+-----------+----------
>  gregn | f          | t       | t       | t       | t         | f
> Schemas:
>     "sc1"
>
> postgres=# alter publication pub1 drop table sc1.test;
> ERROR:  relation "test" is not part of the publication
>
> The above error message seems slightly misleading (as that table IS
> published by that publication) and also note the relation is not
> schema-qualified.
>

I think we should try to improve this message, maybe we can give
detail message similar to what we give For ALL Tables case (DETAIL:
Tables that are part of SCHEMA cannot be dropped independently).

--
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

Peter Smith

Date:

01 September 2021, 06:36:03

On Tue, Aug 31, 2021 at 1:41 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Aug 30, 2021 at 2:14 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Fri, Aug 27, 2021 at 4:13 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > I have implemented this in the 0003 patch, I have kept it separate to
> > > reduce the testing effort and also it will be easier if someone
> > > disagrees with the syntax. I will merge it to the main patch later
> > > based on the feedback. Attached v22 patch has the changes for the
> > > same.
> >
> > I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> > TABLE sc1.test;"  maintains the table separately and results in the
> > following in the \dRp+ output:
> >
> > Tables:
> >     "sc1.test"
> > Schemas:
> >     "sc1"
> >
> > and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;"
> > still leaves the "sc1.test" table in the publication.
>
> I had intentionally implemented this way, the reason being it gives
> the flexibility to modify the publications based on the way the
> publication is created. My idea was that if a user specified a
> table/schema of the same schema while creating the publication, the
> user should be allowed to drop any of them at any time. In the above
> case if we don't maintain the results separately, users will not be
> able to drop the table from the publication at a later point of time.
> Thoughts?

I think ALL should mean ALL.

When you say CREATE PUBLICATION pub1 FOR ALL TABLES; you know it means
ALL tables;

When you say CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1; you
know it means ALL tables in the schema sc1;

Similarly, when you say ALTER PUBLICATION pub1 DROP ALL TABLES IN
SCHEMA sc1; I would expect it means to drop ALL tables in sc1 - not
all of them sometimes but not all of them at other times or even none
of them.

~~

I thought a motivation for this patch was to make it easy for the user
to specify tables en-masse. e.g, saying FOR ALL TABLES IN SCHEMA sc1
is a convenience instead of having to specify every schema table
explicitly.

e.g. What if there are 100s of tables explicitly in a publication.
1. CREATE PUBLICATION pub FOR TABLE sc1.t1,sc1,t2,sc1.t3,....,sc1.t999;
2. ALTER PUBLICATION pub DROP ALL TABLES IN SCHEMA sc1;

IIUC the current patch behaviour for step 2 will do nothing at all.
That doesn't seem right to me. Where is the user-convenience factor
for dropping tables here?

~~

If the rule was simply "ALL means ALL" that hardly even needs
documentation to describe it. OTOH, current patch behaviour is quirky
wrt how the publication was created and would need to be carefully
documented.

It is easy to imagine a user unfamiliar with how the publication was
originally created will be confused when ALTER PUBLICATION DROP ALL
TABLES IN SCHEMA sc1 still leaves some sc1 tables lurking.

~~

Schema objects are not part of the publication. Current only TABLES
are in publications, so I thought that \dRp+ output would just be the
of "Tables" in the publication. Schemas would not even be displayed at
all (except in the table name).

Consider that everything is just going to get messier when SEQUENCES
are added. If you list Schemas like is done now then what's that going
to look like later? I think you'd need to display many lists like -
"Tables" and "Tables for Schemas" and "Sequences" and "Sequences for
Schema"...

IMO it's all beginning to sound like it would become overly complex

~~

For all the above reasons I think ALL should mean ALL, and also \dRp+
should display tables only. (and later table and sequences only)

YMMV.

------
Kind Regards,
Peter Smith
Fujitsu Australia

Re: Added schema level support for publication.

From

Amit Kapila

Date:

02 September 2021, 05:33:47

On Wed, Sep 1, 2021 at 12:06 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Similarly, when you say ALTER PUBLICATION pub1 DROP ALL TABLES IN
> SCHEMA sc1; I would expect it means to drop ALL tables in sc1 - not
> all of them sometimes but not all of them at other times or even none
> of them.
>
> ~~
>
> I thought a motivation for this patch was to make it easy for the user
> to specify tables en-masse. e.g, saying FOR ALL TABLES IN SCHEMA sc1
> is a convenience instead of having to specify every schema table
> explicitly.
>
> e.g. What if there are 100s of tables explicitly in a publication.
> 1. CREATE PUBLICATION pub FOR TABLE sc1.t1,sc1,t2,sc1.t3,....,sc1.t999;
> 2. ALTER PUBLICATION pub DROP ALL TABLES IN SCHEMA sc1;
>
> IIUC the current patch behaviour for step 2 will do nothing at all.
> That doesn't seem right to me. Where is the user-convenience factor
> for dropping tables here?
>

It will give an error (ERROR:  schema "sc1" is not part of the
publication), we can probably add DETAIL and HINT message (like try
dropping using DROP TABLE syntax) indicating the reason and what user
need to do in this regard. I think if the user has specified tables
individually instead of ALL TABLES IN SCHEMA, she should drop them
individually rather than expecting them to be dropped via SCHEMA
command. I think in a similar context you can also argue that we
should have DROP ALL TABLES syntax to drop all the tables that were
individually specified by the user with FOR TABLE command.

I think this should be documented as well to avoid any confusion.

> ~~
>
> If the rule was simply "ALL means ALL" that hardly even needs
> documentation to describe it. OTOH, current patch behaviour is quirky
> wrt how the publication was created and would need to be carefully
> documented.
>
> It is easy to imagine a user unfamiliar with how the publication was
> originally created will be confused when ALTER PUBLICATION DROP ALL
> TABLES IN SCHEMA sc1 still leaves some sc1 tables lurking.
>
> ~~
>
> Schema objects are not part of the publication. Current only TABLES
> are in publications, so I thought that \dRp+ output would just be the
> of "Tables" in the publication. Schemas would not even be displayed at
> all (except in the table name).
>

Hmm, I think this will lead to a lot of table names in output. I think
displaying schema names separately is a better approach here.

> Consider that everything is just going to get messier when SEQUENCES
> are added. If you list Schemas like is done now then what's that going
> to look like later? I think you'd need to display many lists like -
> "Tables" and "Tables for Schemas" and "Sequences" and "Sequences for
> Schema"...
>

I think that would be better than displaying all the tables and
sequences as that will be very difficult for users to view/read.

-- 
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

vignesh C

Date:

02 September 2021, 06:28:16

On Tue, Aug 31, 2021 at 6:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> >
> > I have made these changes at the v23 patch attached at [1].
> >
>
> *
> +/*
> + * Invalidate the relations.
> + */
> +static inline void
> +InvalidatePublicationRels(List *relids)
>
> I don't see the need to make this an inline function.

Modified

> * On similar lines, the code in function
> GetPubPartitionOptionRelations seems to be extracted from existing
> function GetPublicationRelations(). Can't we move it into the 0001
> patch?

Modified

Thanks for the comments, the attached v24 patch has the changes for the same.

Regards,
Vignesh

On Thu, Sep 2, 2021 at 5:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 2, 2021 at 11:58 AM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> Below are few comments on v23. If you have already addressed anything
> in v24, then ignore it.
>
> 1. The commit message says: A new system table "pg_publication_schema"
> has been added, to maintain the schemas that the user wants to publish
> through the publication.". The alternative name for this system table
> could be "pg_publication_namespace". The reason why this alternative
> comes to my mind is that the current system table to store schema
> information is named pg_namespace. So shouldn't we be consistent here?
> What do others think about this?

Modified

> 2. In function check_publication_add_schema(), the primary error
> message should be "cannot add schema \"%s\" to publication". See
> check_publication_add_relation() for similar error messages.

Modified

> 3.
> +ObjectAddress
> +publication_add_schema(Oid pubid, Oid schemaoid, bool if_not_exists)
>
> Isn't it better to use 'schemaid' so that it is consistent with 'pubid'?

Modified

> 4.
> ConvertPubObjSpecListToOidList()
> {
> ..
> + schemaoid = linitial_oid(search_path);
> + nspname = get_namespace_name(schemaoid);
> + if (nspname == NULL) /* recently-deleted namespace? */
> + ereport(ERROR,
> + errcode(ERRCODE_UNDEFINED_SCHEMA),
> + errmsg("no schema has been selected"));
> +
> + list_free(search_path);
> ..
> }
>
> You can free the memory for 'nspname' as that is not used afterward.

I have removed get_namespace_name, no need to validate the schemaid,
fetch_search_path will return valid schemaids.


> 5.
> + schemaRels = GetSchemaPublicationRelations(schemaoid, PUBLICATION_PART_ALL);
> +
> + /* Invalidate relcache so that publication info is rebuilt. */
> + InvalidatePublicationRels(schemaRels);
>
> It is better to write this comment above
> GetSchemaPublicationRelations, something like below:
>
> + /* Invalidate relcache so that publication info is rebuilt. */
> + schemaRels = GetSchemaPublicationRelations(schemaoid, PUBLICATION_PART_ALL);
> + InvalidatePublicationRels(schemaRels);

Modified

> 6.
> +static List *
> +GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
> +    Oid relid)
>
> I think it is better to name this function as
> GetPublicationPartOptRelations as that way it will be more consistent
> with existing functions and structure name PublicationPartOpt.

Modified

> 7. All the callers of PublicationAddSchemas() have a superuser check,
> then why there is again a check of owner/superuser in that function?

Modified to remove the check

> 8.
> +/*
> + * Gets the list of FOR ALL TABLES IN SCHEMA publication oids associated with a
> + * specified schema oid
> + */
> +List *
> +GetSchemaPublications(Oid schemaid)
>
> I find it a bit difficult to read this comment. Can we omit "FOR ALL
> TABLES IN SCHEMA" from this comment?

Modified.

> 9. In the doc patch
> (v23-0003-Tests-and-documentation-for-schema-level-support), I see
> below line:
>    <para>
> -   To add a table to a publication, the invoking user must have ownership
> -   rights on the table.  The <command>FOR ALL TABLES</command> clause requires
> -   the invoking user to be a superuser.
> +   To add a table/schema to a publication, the invoking user must have
> +   ownership rights on the table/schema.  The <command>FOR ALL TABLES</command>
> +   and <command>FOR ALL TABLES IN SCHEMA</command> clause requires the invoking
> +   user to be a superuser.
>
> Is it correct to specify the schema in the first line? AFAIU, all
> forms of schema addition requires superuser privilege.

That is not required, modified.

Attached v25 patch has the changes for the same.
Currently I have used a different way to parse which does not require
"Add PublicationTable and PublicationRelInfo structs" committed
changes, I have currently removed the changes in this patch. I will
analyze further and use whichever parsing is better. I will handle
this in the next version.

Regards,
Vignesh

On Tue, Sep 7, 2021 at 5:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 7, 2021 at 12:45 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Fri, Sep 3, 2021 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > 5.
> > > If I modify the search path to remove public schema then I get the
> > > below error message:
> > > postgres=# Create publication mypub for all tables in schema current_schema;
> > > ERROR:  no schema has been selected
> > >
> > > I think this message is not very clear. How about changing to
> > > something like "current_schema doesn't contain any valid schema"? This
> > > message is used in more than one place, so let's keep it the same at
> > > all the places if you agree to change it.
> >
> > I would prefer to use the existing messages as we have used this in a
> > few other places similarly. Thoughts?
> >
>
> Yeah, I also see the same message in code but I think here usage is a
> bit different. If you see a similar SQL statement that causes the same
> error message then can you please give an example?

Changed it to "no schema has been selected for CURRENT_SCHEMA"

> Few comments on latest patch
> (v25-0002-Added-schema-level-support-for-publication):
> =====================================================================
> 1.
> getPublicationSchemaInfo()
> ..
> + *nspname = get_namespace_name(pnform->pnnspcid);
> + if (!(*nspname))
> + {
> + Oid schemaid = pnform->pnpubid;
> +
> + pfree(*pubname);
> + ReleaseSysCache(tup);
> + if (!missing_ok)
> + elog(ERROR, "cache lookup failed for schema %u",
> + schemaid);
>
> Why are you using pnform->pnpubid to get schemaid? I think you need
> pnform->pnnspcid here and it was like that in the previous version,
> not sure, why you changed it?

Modified

> 2.
> @@ -369,15 +531,20 @@ AlterPublicationTables(AlterPublicationStmt
> *stmt, Relation rel,
>   (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>   errmsg("publication \"%s\" is defined as FOR ALL TABLES",
>   NameStr(pubform->pubname)),
> - errdetail("Tables cannot be added to or dropped from FOR ALL TABLES
> publications.")));
> + errdetail("Tables cannot be added to, dropped from, or set on FOR
> ALL TABLES publications.")));
>
> Why is this message changed? Have we changed anything related to this
> as part of this patch?

This change is not required in this patch, reverted

> 3.
> + Oid pnnspcid BKI_LOOKUP(pg_namespace); /* Oid of the schema */
> +} FormData_pg_publication_namespace;
> +
> ...
> ...
> +DECLARE_UNIQUE_INDEX(pg_publication_namespace_pnnspcid_pnpubid_index,
> 8903, PublicationNamespacePnnspcidPnpubidIndexId, on
> pg_publication_namespace using btree(pnnspcid oid_ops, pnpubid
> oid_ops));
>
> Let's use nspid instead nspcid at all places as before we refer that
> way in the code. The way you have done is also okay but it is better
> to be consistent with existing code.

Modified

> 4. Need to think of comments in GetSchemaPublicationRelations.
> + /* get all the ordinary tables present in schema schemaid */
> ..
>
> Let's change the above comment to something like: "get all the
> relations present in the given schema"
>
> +
> + /*
> + * Get all relations that inherit from the partition table, directly or
> + * indirectly.
> + */
> + scan = table_beginscan_catalog(classRel, keycount, key);
>
>
> Let's change the above comment to something like: "It is quite
> possible that some of the partitions are in a different schema than
> the parent table, so we need to get such partitions separately."

Modified

> 5.
> + if (list_member_oid(schemaidlist, relSchemaId))
> + ereport(ERROR,
> + errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("cannot add relation \"%s.%s\" to publication",
> +    get_namespace_name(relSchemaId),
> +    RelationGetRelationName(rel)),
> + errdetail("Table's schema is already included as part of ALL TABLES
> IN SCHEMA option."));
>
> I think the better errdetail would be: "Table's schema \"%s\" is
> already part of the publication."
>
> + if (list_member_oid(schemaidlist, relSchemaId))
> + ereport(ERROR,
> + errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("cannot add schema \"%s\" to publication",
> +    get_namespace_name(get_rel_namespace(tableOid))),
> + errdetail("Table \"%s.%s\" is part of publication, adding same
> schema \"%s\" is not supported",
> +   get_namespace_name(get_rel_namespace(tableOid)),
> +   get_rel_name(tableOid),
> +   get_namespace_name(get_rel_namespace(tableOid))));
>
> I think this errdetail message can also be improved. One idea could
> be: "Table \"%s\" in schema \"%s\" is already part of the publication,
> adding the same schema is not supported.". Do you or anyone else have
> better ideas?

Modified

> 6. I think instead of two different functions
> CheckRelschemaInSchemaList and CheckSchemaInRels, let's have a single
> function RelSchemaIsMemberOfSchemaList and have a boolean variable to
> distinguish the two cases.

Modified
Thanks for the comments, the attached v26 patch has the changes for the same.

Regards,
VIgnesh

From Wed, Sept 8, 2021 7:44 PM vignesh C <vignesh21@gmail.com> wrote:
> Modified
> Thanks for the comments, the attached v26 patch has the changes for the same.

Hi,

Thanks for updating the patch, I have a suggestion for the gram.y.

Currently, we have the following two members in PublicationObjSpec to distinguish
between names of different objects for the post-analysis phase.

>bool        inh;
>bool        spl_rel_type_syn;

I was thinking do we have another way to distinguish that which can make code
smaller. I tried serval approaches and found a possible better approach.

First, I refer to the design of Grant/Revoke syntax, it use two members
'targtype' and 'objtype' to mark different type. 'targtype' can be
ACL_TARGET_OBJECT(single target) or ACL_TARGET_ALL_IN_SCHEMA(schema level)
.'objtype' is the actual type which can be OBJECT_SEQUENCE or OBJECT_TABLE or
... . I think if we follow this way, the code could be cleaner.

Second, we can move the special relation expression into a separate rule
'speical_relation_expr' like the following, this can remove duplicate code
used by pubobj_expr.
------
relation_expr:
            qualified_name
                {}
            | speical_relation_expr
                {
                    $$ = $1;
                }
        ;
speical_relation_expr:
            qualified_name '*'
                {}
        | ONLY qualified_name
        {}
        | ONLY '(' qualified_name ')'
        {}
    ;
------

Finally, the gram.y will look like the following.
Personally, the code looks cleaner in this approach.

-----
pubobj_expr:
            any_name
                {
                    /* inheritance query, implicitly */
                    $$ = makeNode(PublicationObjSpec);
                    $$->targettype = TARGETOBJ_UNKNOWN;
                    $$->object = $1;
                }
            | special_relation_expr
                {
                    $$ = makeNode(PublicationObjSpec);
                    $$->targettype = TARGETOBJ_TABLE;
                    $$->object = $1;
                }
            | CURRENT_SCHEMA
                {
                    $$ = makeNode(PublicationObjSpec);
                    $$->object = list_make1(makeString("CURRENT_SCHEMA"));
                    $$->targettype = TARGETOBJ_SCHEMA;
                }
        ;

/* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */
PublicationObjSpec:    TABLE pubobj_expr
                    {
                        $$ = $2;
                        $$->pubobjtype = PUBLICATIONOBJ_TABLE_LIST;
                        $$->location = @1;
                    }

            | ALL TABLES IN_P SCHEMA pubobj_expr
                    {
                        $$ = $5;
                        $$->pubobjtype = PUBLICATIONOBJ_SCHEMA_LIST;
                        $$->location = @1;
                    }
            | pubobj_expr
                    {
                        $$ = $1;
                        $$->pubobjtype = PUBLICATIONOBJ_UNKNOWN;
                        $$->location = @1;
                    }
        ;

Attach a diff patch based on v26-0002 patch, which change the corresponding
code based on the design above. Please have a look.

Best regards,
Hou zj

Attachment

0001-refactor_patch

RE: Added schema level support for publication.

From

"houzj.fnst@fujitsu.com"

Date:

10 September 2021, 03:23:56

From Friday, September 10, 2021 10:33 AM Hou Zhijie<houzj.fnst@fujitsu.com> wrote:
> From Wed, Sept 8, 2021 7:44 PM vignesh C <vignesh21@gmail.com> wrote:
> > Modified
> > Thanks for the comments, the attached v26 patch has the changes for the
> same.
> 
> Hi,
> 
> Thanks for updating the patch, I have a suggestion for the gram.y.
> 
> Currently, we have the following two members in PublicationObjSpec to
> distinguish
> between names of different objects for the post-analysis phase.
> 
> >bool        inh;
> >bool        spl_rel_type_syn;
> 
> I was thinking do we have another way to distinguish that which can make code
> smaller. I tried serval approaches and found a possible better approach.
> 
> First, I refer to the design of Grant/Revoke syntax, it use two members
> 'targtype' and 'objtype' to mark different type. 'targtype' can be
> ACL_TARGET_OBJECT(single target) or ACL_TARGET_ALL_IN_SCHEMA(schema
> level)
> .'objtype' is the actual type which can be OBJECT_SEQUENCE or OBJECT_TABLE
> or
> ... . I think if we follow this way, the code could be cleaner.
> 
> Second, we can move the special relation expression into a separate rule
> 'speical_relation_expr' like the following, this can remove duplicate code
> used by pubobj_expr.
> ------
> relation_expr:
>             qualified_name
>                 {}
>             | speical_relation_expr
>                 {
>                     $$ = $1;
>                 }
>         ;
> speical_relation_expr:
>             qualified_name '*'
>                 {}
>         | ONLY qualified_name
>         {}
>         | ONLY '(' qualified_name ')'
>         {}
>     ;
> ------
> 
> Finally, the gram.y will look like the following.
> Personally, the code looks cleaner in this approach.

Besides, If we don't want to use a new flag to distinguish tablename and schemaname,
We can only check the NodeTag to distinguish the difference.

Attach two diff patches based on the latest schema patch
which change the code with a flag and without a flag.

Best regards,
Hou zj

Attachment

Re: Added schema level support for publication.

From

Amit Kapila

Date:

10 September 2021, 05:10:22

On Fri, Sep 10, 2021 at 8:54 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> From Friday, September 10, 2021 10:33 AM Hou Zhijie<houzj.fnst@fujitsu.com> wrote:
>
> Besides, If we don't want to use a new flag to distinguish tablename and schemaname,
> We can only check the NodeTag to distinguish the difference.
>
> Attach two diff patches based on the latest schema patch
> which change the code with a flag and without a flag.
>

I would prefer a version without additional flags unless you think it
is difficult to extend it in the future for other objects like
sequences which as far as I can see shouldn't be the case. Is there a
reason to define pubobj_name similar to any_name? If so, then please
do add the comments. One reason I could think of is that any_name is
not used for schema names currently which might have motivated you to
define a separate naming convention for publication.

-- 
With Regards,
Amit Kapila.

RE: Added schema level support for publication.

From

"houzj.fnst@fujitsu.com"

Date:

10 September 2021, 05:51:18

From Friday, September 10, 2021 1:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Sep 10, 2021 at 8:54 AM Hou zhijie <houzj.fnst@fujitsu.com> wrote:
> >
> > From Friday, September 10, 2021 10:33 AM Hou
> Zhijie<houzj.fnst@fujitsu.com> wrote:
> >
> > Besides, If we don't want to use a new flag to distinguish tablename and
> schemaname,
> > We can only check the NodeTag to distinguish the difference.
> >
> > Attach two diff patches based on the latest schema patch
> > which change the code with a flag and without a flag.
> >
> 
> I would prefer a version without additional flags unless you think it
> is difficult to extend it in the future for other objects like
> sequences which as far as I can see shouldn't be the case.

Ok, I agreed.

> Is there a
> reason to define pubobj_name similar to any_name? If so, then please
> do add the comments. One reason I could think of is that any_name is
> not used for schema names currently which might have motivated you to
> define a separate naming convention for publication.

When I used any_name, Bison reported that the dot('.') in rule attr
would have a shift/reduce conflict with the dot('.') in rule indirection_el
which also used in pubobj_expr. So, I declared a new rule which will directly
use indirection_el to resolve the conflicts.

Attach the without-flag version and add comments about the pubobj_name.

Best regards,
Hou zj

Attachment

refactor-without-flag_patch

Re: Added schema level support for publication.

From

vignesh C

Date:

12 September 2021, 15:13:12

On Fri, Sep 10, 2021 at 11:21 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> From Friday, September 10, 2021 1:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Fri, Sep 10, 2021 at 8:54 AM Hou zhijie <houzj.fnst@fujitsu.com> wrote:
> > >
> > > From Friday, September 10, 2021 10:33 AM Hou
> > Zhijie<houzj.fnst@fujitsu.com> wrote:
> > >
> > > Besides, If we don't want to use a new flag to distinguish tablename and
> > schemaname,
> > > We can only check the NodeTag to distinguish the difference.
> > >
> > > Attach two diff patches based on the latest schema patch
> > > which change the code with a flag and without a flag.
> > >
> >
> > I would prefer a version without additional flags unless you think it
> > is difficult to extend it in the future for other objects like
> > sequences which as far as I can see shouldn't be the case.
>
> Ok, I agreed.
>
> > Is there a
> > reason to define pubobj_name similar to any_name? If so, then please
> > do add the comments. One reason I could think of is that any_name is
> > not used for schema names currently which might have motivated you to
> > define a separate naming convention for publication.
>
> When I used any_name, Bison reported that the dot('.') in rule attr
> would have a shift/reduce conflict with the dot('.') in rule indirection_el
> which also used in pubobj_expr. So, I declared a new rule which will directly
> use indirection_el to resolve the conflicts.
>
> Attach the without-flag version and add comments about the pubobj_name.

Thanks for the changes, the suggested changes make the parsing code
simpler. I have merged the changes to the main patch. Attached v27
patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From

Amit Kapila

Date:

13 September 2021, 11:41:41

On Sun, Sep 12, 2021 at 8:43 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for the changes, the suggested changes make the parsing code
> simpler. I have merged the changes to the main patch. Attached v27
> patch has the changes for the same.
>

+pubobj_name: ColId { $$ = list_make1(makeString($1)); }
+ | ColId indirection { $$ = lcons(makeString($1), $2); }
  ;

I think "ColId indirection" should handle catalog and schema name as
we are doing in qualified_name. See attached (this can be applied atop
v27-0002). The one other improvement we can do here is to extract the
common code from qualified_name (especially for "ColId indirection")
and pubobj_name. What do you think?

-- 
With Regards,
Amit Kapila.

Attachment

v28-0003-Include-catalog-name-and-schema-name-in-rangevar.patch

RE: Added schema level support for publication.

From

"tanghy.fnst@fujitsu.com"

Date:

13 September 2021, 13:36:03

On Sunday, September 12, 2021 11:13 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> Thanks for the changes, the suggested changes make the parsing code
> simpler. I have merged the changes to the main patch. Attached v27
> patch has the changes for the same.
> 

Thanks for your new patch. I had a look at the changes related to document and tried
the patch. Here are several comments.

1.
   <para>
    You must own the publication to use <command>ALTER PUBLICATION</command>.
    Adding a table to a publication additionally requires owning that table.
+   The <literal>ADD SCHEMA</literal> and <literal>SET SCHEMA</literal> to a
+   publication requires the invoking user to be a superuser.
    To alter the owner, you must also be a direct or indirect member of the new
    owning role. The new owner must have <literal>CREATE</literal> privilege on
    the database.  Also, the new owner of a <literal>FOR ALL TABLES</literal>
    publication must be a superuser.  However, a superuser can change the
    ownership of a publication regardless of these restrictions.
   </para>

ADD SCHEMA
->
ADD ALL TABLES IN SCHEMA

SET SCHEMA
->
SET ALL TABLES IN SCHEMA

2.
+  <para>
+   ADD some tables and schemas to the publication:
+<programlisting>
+ALTER PUBLICATION production_publication ADD TABLE users, departments, ALL TABLES IN SCHEMA production;
+</programlisting>
+  </para>

ADD some tables and schemas to the publication:
->
Add some tables and schemas to the publication:

3.
+  <para>
+   Drop some schema from the publication:
+<programlisting>
+ALTER PUBLICATION production_quarterly_publication DROP ALL TABLES IN SCHEMA production_july;
+</programlisting>
+  </para>

Drop some schema from the publication:
->
Drop some schemas from the publication:

4.
+   The catalog <structname>pg_publication_namespace</structname> contains the
+   mapping between schemas and publications in the database.  This is a
+   many-to-many mapping.  

There are two Spaces at the end of the paragraph.

5.
Adding a table and the schema where the table belonged to is not supported. But
it didn't report error message when I try to add them in the same statement by
using 'ALTER PUBLICATION'.

For example:
postgres=# create publication pub;
CREATE PUBLICATION
postgres=# alter publication pub add  all tables in schema s1, table s1.tbl;
ALTER PUBLICATION
postgres=# \dRp+
                              Publication pub
  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+----------
 postgres | f          | t       | t       | t       | t         | f
Tables:
    "s1.tbl"
Tables from schemas:
    "s1"

It didn't check if table 's1.tbl' is member of schema 's1'.


6.
I think if I use 'ALTER PUBLICATION ... SET', both the list of tables and the
list of all tables in schemas should be reset. The publication should only
contain the tables and all tables in schemas which user specified. If user only
specified all tables in schema, and didn't specify tables, the tables which used
to be part of the publication should be dropped, too. But currently, if I didn't
specify tables, the list of tables wouldn't be set to empty. Thoughts?

For example:
postgres=# create publication pub for table s1.tbl;
CREATE PUBLICATION
postgres=# alter publication pub set all tables in schema s2;
ALTER PUBLICATION
postgres=# \dRp+
                              Publication pub
  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+----------
 postgres | f          | t       | t       | t       | t         | f
Tables:
    "s1.tbl"
Tables from schemas:
    "s2"

Regards
Tang

RE: Added schema level support for publication.

From

"houzj.fnst@fujitsu.com"

Date:

14 September 2021, 01:01:46

From Sun, Sept 12, 2021 11:13 PM vignesh C <vignesh21@gmail.com> wrote:
> On Fri, Sep 10, 2021 at 11:21 AM Hou Zhijie <houzj.fnst@fujitsu.com> wrote:
> > Attach the without-flag version and add comments about the pubobj_name.
> 
> Thanks for the changes, the suggested changes make the parsing code simpler.
> I have merged the changes to the main patch. Attached v27 patch has the
> changes for the same.

Hi,

I have some suggestions for the documents and comments of the new syntax.

IMO, the document could be clearer in the following format.
------
Synopsis
CREATE PUBLICATION name
    [ FOR ALL TABLES 
      | FOR publication object [, ... ] ]
    [ WITH ( publication_parameter [= value] [, ... ] ) ]

where publication object is one of:

    TABLE [ ONLY ] table_name [ * ] [, ... ]
    ALL TABLES IN SCHEMA { schema_name | CURRENT_SCHEMA } [, ... ]
------

Attach a diff(based on v27-*) which change the doc and comments like the
following.

Best regards,
Hou zj

Attachment

0001-doc-diff.patch

Re: Added schema level support for publication.

From

Amit Kapila

Date:

14 September 2021, 03:53:10

On Mon, Sep 13, 2021 at 7:06 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> 6.
> I think if I use 'ALTER PUBLICATION ... SET', both the list of tables and the
> list of all tables in schemas should be reset. The publication should only
> contain the tables and all tables in schemas which user specified. If user only
> specified all tables in schema, and didn't specify tables, the tables which used
> to be part of the publication should be dropped, too. But currently, if I didn't
> specify tables, the list of tables wouldn't be set to empty. Thoughts?
>

I think we can go either way here but it seems like we should drop the
tables in the case you mentioned. The idea is that the SET variant in
ALTER PUBLICATION should replace the set of tables and schemas for the
publication which seems to be in line with the current behavior where
we replace the set of tables.

Anyone else wants to weigh in on this?

-- 
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

Tom Lane

Date:

14 September 2021, 04:02:49

Amit Kapila <amit.kapila16@gmail.com> writes:
> On Mon, Sep 13, 2021 at 7:06 PM tanghy.fnst@fujitsu.com
> <tanghy.fnst@fujitsu.com> wrote:
>> I think if I use 'ALTER PUBLICATION ... SET', both the list of tables and the
>> list of all tables in schemas should be reset. The publication should only
>> contain the tables and all tables in schemas which user specified. If user only
>> specified all tables in schema, and didn't specify tables, the tables which used
>> to be part of the publication should be dropped, too. But currently, if I didn't
>> specify tables, the list of tables wouldn't be set to empty. Thoughts?

> I think we can go either way here but it seems like we should drop the
> tables in the case you mentioned. The idea is that the SET variant in
> ALTER PUBLICATION should replace the set of tables and schemas for the
> publication which seems to be in line with the current behavior where
> we replace the set of tables.

Yeah, I think it's sensible to define that there is just one SET variant
that replaces both the list-of-tables and the list-of-schemas.  (Of
course, the syntax for it has to permit both lists to be written.)
You could imagine having two independent SET commands for the two lists,
but that seems fairly confusing.

            regards, tom lane

Re: Added schema level support for publication.

From

Amit Kapila

Date:

14 September 2021, 04:28:01

On Mon, Sep 13, 2021 at 5:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Sep 12, 2021 at 8:43 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the changes, the suggested changes make the parsing code
> > simpler. I have merged the changes to the main patch. Attached v27
> > patch has the changes for the same.
> >
>
> +pubobj_name: ColId { $$ = list_make1(makeString($1)); }
> + | ColId indirection { $$ = lcons(makeString($1), $2); }
>   ;
>
> I think "ColId indirection" should handle catalog and schema name as
> we are doing in qualified_name. See attached (this can be applied atop
> v27-0002). The one other improvement we can do here is to extract the
> common code from qualified_name (especially for "ColId indirection")
> and pubobj_name. What do you think?
>

One more suggestion for changes in gram.y:
@@ -12430,7 +12476,14 @@ relation_expr:
  $$->inh = true;
  $$->alias = NULL;
  }
- | qualified_name '*'
+ | special_relation_expr
+ {
+ $$ = $1;
+ }
+ ;
+
+special_relation_expr:
+ qualified_name '*'

I am not sure if naming the above rule as special_relation_expr is a
good idea. Can we name it as extended_relation_expr?

-- 
With Regards,
Amit Kapila.

RE: Added schema level support for publication.

From

"houzj.fnst@fujitsu.com"

Date:

14 September 2021, 05:08:40

From Tue, Sept 14, 2021 11:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Mon, Sep 13, 2021 at 7:06 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:
> >
> > 6.
> > I think if I use 'ALTER PUBLICATION ... SET', both the list of tables
> > and the list of all tables in schemas should be reset. The publication
> > should only contain the tables and all tables in schemas which user
> > specified. If user only specified all tables in schema, and didn't
> > specify tables, the tables which used to be part of the publication
> > should be dropped, too. But currently, if I didn't specify tables, the list of
> tables wouldn't be set to empty. Thoughts?
> >
> 
> I think we can go either way here but it seems like we should drop the tables in
> the case you mentioned. The idea is that the SET variant in ALTER PUBLICATION
> should replace the set of tables and schemas for the publication which seems
> to be in line with the current behavior where we replace the set of tables.
> 
> Anyone else wants to weigh in on this?

I agree that the one SET variant should replaces both the list-of-tables and the list-of-schemas.

Best regards,
Hou zj

Re: Added schema level support for publication.

From

vignesh C

Date:

14 September 2021, 08:38:30

On Mon, Sep 13, 2021 at 5:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Sep 12, 2021 at 8:43 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the changes, the suggested changes make the parsing code
> > simpler. I have merged the changes to the main patch. Attached v27
> > patch has the changes for the same.
> >
>
> +pubobj_name: ColId { $$ = list_make1(makeString($1)); }
> + | ColId indirection { $$ = lcons(makeString($1), $2); }
>   ;
>
> I think "ColId indirection" should handle catalog and schema name as
> we are doing in qualified_name. See attached (this can be applied atop
> v27-0002). The one other improvement we can do here is to extract the
> common code from qualified_name (especially for "ColId indirection")
> and pubobj_name. What do you think?

I have handled this in the patch attached.

Regards,
Vignesh

On Wed, Sep 15, 2021 at 12:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 14, 2021 at 2:08 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have handled this in the patch attached.
> >
>
> Few comments:
> =============
> 1.
> + * CREATE PUBLICATION FOR pub_obj [, pub_obj2] [WITH options]
> + *
> + * pub_obj is one of:
> + *
> + * TABLE table [, table2]
> ..
> ..
> - * ALTER PUBLICATION name ADD TABLE table [, table2]
> + * ALTER PUBLICATION name ADD pub_obj [, pub_obj ...]
>   *
> - * ALTER PUBLICATION name DROP TABLE table [, table2]
> + * ALTER PUBLICATION name DROP pub_obj [, pub_obj ...]
>   *
> - * ALTER PUBLICATION name SET TABLE table [, table2]
> + * ALTER PUBLICATION name SET pub_obj [, pub_obj ...]
>
> In all the above places, the object names mentioned in square brackets
> are not consistent. I suggest using [, ...] everywhere as that is what
> we are using in docs as well.

Modified

> 2.
> +/*
> + * Check if the relation schema is member of the schema list.
> + */
> +static void
> +RelSchemaIsMemberOfSchemaList(List *rels, List *schemaidlist, bool schemacheck)
>
> Can we change the above comment as: "Check if any of the given
> relation's schema is a member of the given schema list."?

Modified

> 3.
> + errmsg("cannot add relation \"%s.%s\" to publication",
> + get_namespace_name(relSchemaId),
> + RelationGetRelationName(rel)),
> + errdetail("Table's schema \"%s\" is already part of the publication.",
> + get_namespace_name(relSchemaId)));
>
> This and other parts of error messages in
> +RelSchemaIsMemberOfSchemaList are not aligned. I think you can now
> run pgindent on your patches that will solve the indentation issues in
> the patch.

Changed by running pgindent.

> 4.
> AlterPublicationSchemas()
> {
> ..
> + /*
> + * If the table option was not specified remove the existing tables
> + * from the publication.
> + */
> + if (!tables)
> + {
> + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
> + PublicationDropTables(pubform->oid, rels, false, true);
> + }
> +
> + /* Identify which schemas should be dropped */
> + delschemas = list_difference_oid(oldschemaids, schemaidlist);
> +
> + /* And drop them */
> + PublicationDropSchemas(pubform->oid, delschemas, true);
>
> Here, you have neither locked tables to be dropped nor schemas. I
> think both need to be locked as we do for tables in similar code in
> AlterPublicationTables(). Can you please test via debugger what
> happens if we try to drop without taking lock here and concurrently
> try to drop the actual object? It should give some error. If we decide
> to lock here then we should be able to pass the list of relations to
> PublicationDropTables() instead of Oids which would then obviate the
> need for any change to that function.
>
> Similarly don't we need to lock schemas before dropping them in
> AlterPublicationTables()?

we will get the following error, if concurrently dropped from another
session during debugging:
postgres=# alter publication pub1 set all tables in schema sch2;
ERROR:  cache lookup failed for publication table 16418
Modified to add locking

> 5.
> +/*
> + * Find the ObjectAddress for a publication tables in schema.  The first
> + * element of the object parameter is the schema name, the second is the
> + * publication name.
> + */
> +static ObjectAddress
> +get_object_address_publication_schema(List *object, bool missing_ok)
>
> The first part of the above comment is not clear. Can we write it as:
> "Find the ObjectAddress for a publication schema. .."?

Modified

> 6.
> +List *
> +GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt)
> +{
> + Relation classRel;
> + ScanKeyData key[3];
> + TableScanDesc scan;
> + HeapTuple tuple;
> + List    *result = NIL;
> + int keycount = 0;
> +
> + Assert(schemaid != InvalidOid);
>
> Isn't it better to use OidIsValid() in the above assert?

Modified

> 7.
> @@ -974,6 +974,7 @@ EventTriggerSupportsObjectType(ObjectType obtype)
>   case OBJECT_PROCEDURE:
>   case OBJECT_PUBLICATION:
>   case OBJECT_PUBLICATION_REL:
> + case OBJECT_PUBLICATION_REL_IN_NAMESPACE:
>   case OBJECT_ROUTINE:
>   case OBJECT_RULE:
>   case OBJECT_SCHEMA:
> @@ -1050,6 +1051,7 @@ EventTriggerSupportsObjectClass(ObjectClass objclass)
>   case OCLASS_EXTENSION:
>   case OCLASS_POLICY:
>   case OCLASS_PUBLICATION:
> + case OCLASS_PUBLICATION_NAMESPACE:
>   case OCLASS_PUBLICATION_REL:
>   case OCLASS_SUBSCRIPTION:
>   case OCLASS_TRANSFORM:
> @@ -2127,6 +2129,7 @@ stringify_grant_objtype(ObjectType objtype)
>   case OBJECT_POLICY:
>   case OBJECT_PUBLICATION:
>   case OBJECT_PUBLICATION_REL:
> + case OBJECT_PUBLICATION_REL_IN_NAMESPACE:
>   case OBJECT_ROLE:
>   case OBJECT_RULE:
>   case OBJECT_STATISTIC_EXT:
> @@ -2209,6 +2212,7 @@ stringify_adefprivs_objtype(ObjectType objtype)
>   case OBJECT_POLICY:
>   case OBJECT_PUBLICATION:
>   case OBJECT_PUBLICATION_REL:
> + case OBJECT_PUBLICATION_REL_IN_NAMESPACE:
>
> What is the reason for using different names for object_class and
> object_type? Normally, we use the same. Is it making things clear in
> any place?

I thought it might be easier to review, I have changed it to the
standard way of naming object_class and object_type. Renamed it to
OBJECT_PUBLICATION_NAMESPACE.

> 8.
> + if (stmt->action == DEFELEM_ADD)
> + {
> + List *pubschemas = GetPublicationSchemas(pubid);
> +
> + /* check if the relation is member of the schema list specified */
> + RelSchemaIsMemberOfSchemaList(rels, schemaidlist, false);
> +
> + /*
> + * Check if the relation is member of the existing schema in the
> + * publication.
> + */
> + RelSchemaIsMemberOfSchemaList(rels, pubschemas, false);
>
> Isn't it better to concat the list of schemas and then check the
> membership of relations once?

Modified.

Attached v29 patch has the fixes for the same.

Regards,
Vignesh

On Mon, Sep 20, 2021 at 3:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Sep 17, 2021 at 5:39 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, Sep 15, 2021 at 12:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > 4.
> > > AlterPublicationSchemas()
> > > {
> > > ..
> > > + /*
> > > + * If the table option was not specified remove the existing tables
> > > + * from the publication.
> > > + */
> > > + if (!tables)
> > > + {
> > > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
> > > + PublicationDropTables(pubform->oid, rels, false, true);
> > > + }
> > > +
> > > + /* Identify which schemas should be dropped */
> > > + delschemas = list_difference_oid(oldschemaids, schemaidlist);
> > > +
> > > + /* And drop them */
> > > + PublicationDropSchemas(pubform->oid, delschemas, true);
> > >
> > > Here, you have neither locked tables to be dropped nor schemas. I
> > > think both need to be locked as we do for tables in similar code in
> > > AlterPublicationTables(). Can you please test via debugger what
> > > happens if we try to drop without taking lock here and concurrently
> > > try to drop the actual object? It should give some error. If we decide
> > > to lock here then we should be able to pass the list of relations to
> > > PublicationDropTables() instead of Oids which would then obviate the
> > > need for any change to that function.
> > >
> > > Similarly don't we need to lock schemas before dropping them in
> > > AlterPublicationTables()?
> >
> > we will get the following error, if concurrently dropped from another
> > session during debugging:
> > postgres=# alter publication pub1 set all tables in schema sch2;
> > ERROR:  cache lookup failed for publication table 16418
> > Modified to add locking
> >
>
> But you haven't followed my other suggestion related to
> PublicationDropTables(). I don't think after doing this, you need to
> pass 'true' as the last parameter to PublicationDropTables. In fact,
> you can remove that parameter altogether or in other words, we don't
> need any change in PublicationDropTables for this patch. Is there a
> reason why we shouldn't make this change?

Modified.

> Few other comments:
> ===================
> 1. The ordering of lock acquisition for schema and relation in
> AlterPublicationSchemas() and AlterPublicationTables() is opposite
> which would generally lead to deadlock but it won't here because we
> acquire share lock on the schema. But, I think it may still be better
> to keep the locking order the same and it might help us to keep schema
> and relation code separate

Modified

> 2. One more thing, I think one can concurrently add-relation for a
> particular schema and that particular schema. To protect that
> AlterPublication should acquire an exclusive lock similar to how we do
> in AlterSubscription.

Modified

> 3.
> + /*
> + * If the table option was not specified remove the existing tables
> + * from the publication.
> + */
> + if (!relsSpecified)
> + {
> + List    *relations = NIL;
> + List    *tables = NIL;
> +
> + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
> + tables = RelationOidsToRangevars(rels);
> + relations = OpenTableList(tables);
>
> One problem with using OpenTableList here is that it might try to lock
> inherited children twice. Also, you don't need to first convert to
> rangevar for locking relations, you can directly use table_open here.

Modified

> 4.
> + | extended_relation_expr
> + {
> + $$ = makeNode(PublicationObjSpec);
> + $$->object = (Node *)$1;
> + }
> + | CURRENT_SCHEMA
> + {
> + $$ = makeNode(PublicationObjSpec);
> + $$->object = (Node *)makeString("CURRENT_SCHEMA");
> + }
>   ;
>
> -publication_for_tables:
> - FOR TABLE publication_table_list
> +/* This can be either a schema or relation name. */
> +pubobj_name:
> + ColId
>   {
> - $$ = (Node *) $3;
> + $$ = (Node *) makeString($1);
>   }
> - | FOR ALL TABLES
> + | ColId indirection
>   {
> - $$ = (Node *) makeInteger(true);
> + $$ = (Node *) makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
>   }
>
> In some places, you have given space after (Node *) and at other
> places, there is no space. Isn't it better to be consistent?

I have changed it to "(Node *)" without space in *.y, I did not change
in *.c as I noticed it is used with space like "(Node *) " in other
places of *.c files.

> 5.
> +/* This can be either a schema or relation name. */
> +pubobj_name:
>
> Here, we can modify the comment as "This can be either a schema or
> relation name. For relations, the inheritance will be implicit."

Modified

 And
> then remove the inheritance related comment from code below:
>
> + /* inheritance query, implicitly */
> + $$ = makeNode(PublicationObjSpec);
> + $$->object = $1;

Modified.

Attached v30 patch has the fixes for the same.

Regards,
Vignesh

On Wed, Sep 22, 2021 1:29 PMAmit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Sep 22, 2021 at 8:02 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Wednesday, September 22, 2021 2:02 AM vignesh C
> <vignesh21@gmail.com> wrote:
> > > Attached v30 patch has the fixes for the same.
> >
> > Thanks for updating the patches.
> >
> > I have one comment.
> > @@ -474,7 +707,75 @@ AlterPublication(ParseState *pstate,
> > AlterPublicationStmt *stmt) ...
> > +               if (list_length(relations))
> > +               {
> > ...
> > +                       /* remove the existing schemas from the publication
> */
> > +                       PublicationDropSchemas(pubform->oid,
> > + delschemas, false);
> > ...
> > +               }
> >
> > After more thoughts on it, I still don't think drop all the schemas
> > under " if (list_length(relations))" is a good idea. I think 1) we'd
> > better keep schema and relation code separate.
> >
> 
> How do you suggest changing it?

Personally, I think we'd better move the code about changing publication's
tablelist into AlterPublicationTables and the code about changing publication's
schemalist into AlterPublicationSchemas. It's similar to what the v29-patchset
did, the difference is the SET action, I suggest we drop all the tables in
function AlterPublicationTables when user only set schemas and drop all the
schema in AlterPublicationSchemas when user only set tables. In this approach,
we can keep schema and relation code separate and don't need to worry
about the locking order.

Attach a top-up patch which refactor the code like above.
Thoughts ?

Best regards,
Hou zj

Attachment

0001-schema-refactor_patch

Re: Added schema level support for publication.

From

Greg Nancarrow

Date:

23 September 2021, 03:05:59

On Wed, Sep 22, 2021 at 9:33 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> >
> > How do you suggest changing it?
>
> Personally, I think we'd better move the code about changing publication's
> tablelist into AlterPublicationTables and the code about changing publication's
> schemalist into AlterPublicationSchemas. It's similar to what the v29-patchset
> did, the difference is the SET action, I suggest we drop all the tables in
> function AlterPublicationTables when user only set schemas and drop all the
> schema in AlterPublicationSchemas when user only set tables. In this approach,
> we can keep schema and relation code separate and don't need to worry
> about the locking order.
>
> Attach a top-up patch which refactor the code like above.
> Thoughts ?
>

Sounds like a good idea.
Is it possible to incorporate the existing
CheckPublicationAlterTables() and CheckPublicationAlterSchemas()
functions into your suggested update?
I think it might tidy up the error-checking a bit.

Regards,
Greg Nancarrow
Fujitsu Australia

Re: Added schema level support for publication.

From

Amit Kapila

Date:

23 September 2021, 04:09:17

On Wed, Sep 22, 2021 at 5:03 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> Personally, I think we'd better move the code about changing publication's
> tablelist into AlterPublicationTables and the code about changing publication's
> schemalist into AlterPublicationSchemas. It's similar to what the v29-patchset
> did, the difference is the SET action, I suggest we drop all the tables in
> function AlterPublicationTables when user only set schemas and drop all the
> schema in AlterPublicationSchemas when user only set tables. In this approach,
> we can keep schema and relation code separate and don't need to worry
> about the locking order.
>
> Attach a top-up patch which refactor the code like above.
>

Good suggestion. I think it would still be better if we can move the
checks related to superuser and puballtables into a separate function
that gets called before taking a lock on publication.

-- 
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

Amit Kapila

Date:

23 September 2021, 04:56:11

On Wed, Sep 22, 2021 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Sep 22, 2021 at 3:02 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> ---
> This patch introduces some new static functions to publicationcmds.c
> but some have function prototypes but some don't. As far as I checked,
>
> ObjectsInPublicationToOids()
> CheckObjSchemaNotAlreadyInPublication()
> GetAlterPublicationDelRelations()
> AlterPublicationSchemas()
> CheckPublicationAlterTables()
> CheckPublicationAlterSchemas()
> LockSchemaList()
> OpenReliIdList()
> PublicationAddSchemas()
> PublicationDropSchemas()
>
> are newly introduced but only four functions:
>
> OpenReliIdList()
> LockSchemaList()
> PublicationAddSchemas()
> PublicationDropSchemas()
>
> have function prototypes. Actually, there already are functions that
> don't have their prototype in publicationcmds.c. But it seems better
> to clarify what kind of functions don't need to have a prototype at
> least in this file.
>

I think if the function is defined after its use then we declare it at
the top. Do you prefer to declare all static functions to allow ease
of usage? Do you have something else in mind?

>
> ---
> +       if ((action == DEFELEM_ADD || action == DEFELEM_SET) && !superuser())
> +               ereport(ERROR,
> +                               (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +                                errmsg("must be superuser to add or
> set schemas")));
>
> Why do we require the superuser privilege only for ADD and SET but not for DROP?
>

For Add/Set of for all tables of Schema is similar to all tables
publication requirement. For Drop, I don't think it is mandatory to
allow only to superuser. The same applies to Alter Publication ...
Drop table case where you don't need to be table owner whereas, for
Add, you need to be. We had a discussion on these points in this
thread. See [1] and some emails prior to it.

[1] - https://www.postgresql.org/message-id/CAA4eK1KqhUBHbcpT92VMPvUUDgGvyOK0ekXOwjNR6L%3DY_bcsGw%40mail.gmail.com

-- 
With Regards,
Amit Kapila.

RE: Added schema level support for publication.

From

"houzj.fnst@fujitsu.com"

Date:

23 September 2021, 06:51:53

From Thurs, Sep 23, 2021 12:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Sep 22, 2021 at 5:03 PM Hou Zhijie <houzj.fnst@fujitsu.com> wrote:
> >
> > Personally, I think we'd better move the code about changing publication's
> > tablelist into AlterPublicationTables and the code about changing
> publication's
> > schemalist into AlterPublicationSchemas. It's similar to what the
> v29-patchset
> > did, the difference is the SET action, I suggest we drop all the tables in
> > function AlterPublicationTables when user only set schemas and drop all the
> > schema in AlterPublicationSchemas when user only set tables. In this
> approach,
> > we can keep schema and relation code separate and don't need to worry
> > about the locking order.
> >
> > Attach a top-up patch which refactor the code like above.
> >
> 
> Good suggestion. I think it would still be better if we can move the
> checks related to superuser and puballtables into a separate function
> that gets called before taking a lock on publication.

I agreed.

I noticed v30-0001 has been committed with some minor changes, and the V30-0002
patchset need to be rebased accordingly. Attach a rebased version patch set to
make cfbot happy. Also Attach the two top-up patches which refactor the code as
suggested. (top-up patch 1 is to keep schema and table code separate, top-up
patch 2 is to move some cheap check into a function and invoke it before
locking.)

Best regards,
Hou zj

On Wed, Sep 22, 2021 at 6:57 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Wed, Sep 22, 2021 at 4:02 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v30 patch has the fixes for the same.
> >
>
> Thanks for all the patch updates.
> I have some suggested updates to the v30-0005 documentation patch:
>
> doc/src/sgml/ref/alter_publication.sgml
> (1)
> I'm thinking it might be better to simplify the description, because
> it's a bit wordy and difficult to read with the "all tables in schema"
> bits.
> Suggested update is below (thoughts?):
>
> BEFORE:
> +   The first three variants change which tables and/or all tables in schema are
> +   part of the publication.  The <literal>SET</literal> clause will replace
> +   the list of tables and/or all tables in schema in the publication with the
> +   specified one, the existing tables and all tables in schema that were
> +   present in the publication will be removed.  The <literal>ADD</literal>
> +   clause will add one or more tables and/or all tables in schema to the
> +   publication. The <literal>DROP</literal> clauses will remove one or more
> +   tables and/or all tables in schema from the publication.  Note that adding
> +   tables and/or all tables in schema to a publication that is already
> +   subscribed to will require a <literal>ALTER SUBSCRIPTION ...
> REFRESH PUBLICATION</literal>
> +   action on the subscribing side in order to become effective.
> AFTER:
> +   The first three variants change which tables/schemas are
> +   part of the publication.  The <literal>SET</literal> clause will replace
> +   the list of tables/schemas in the publication with the
> +   specified list; the existing tables/schemas that were
> +   present in the publication will be removed.  The <literal>ADD</literal>
> +   clause will add one or more tables/schemas to the
> +   publication. The <literal>DROP</literal> clauses will remove one or more
> +   tables/schemas from the publication.  Note that adding
> +   tables/schemas to a publication that is already
> +   subscribed to will require a <literal>ALTER SUBSCRIPTION ...
> REFRESH PUBLICATION</literal>
> +   action on the subscribing side in order to become effective.

Modified

>
> doc/src/sgml/ref/create_publication.sgml
> (2)
> I suggest an update similar to the following:
>
> BEFORE:
> +      Specifying a table that is part of schema specified in
> +      <literal>FOR ALL TABLES IN SCHEMA</literal> option is not supported.
> AFTER:
> +      Specifying a table that is part of a schema already included in
> the publication is not supported.

This doc content is for the following example:
create publication pub1 for all tables in schema sch1, table sch1.t1;
It is  about the schema specified in all tables in schema option along
with table option. I think the existing content is better.

> (3)
> I find the following description a little unclear:
>
> +     <para>
> +      Specifying a schema along with schema's table specified as part of
> +      <literal>FOR TABLE</literal> option is not supported.
> +     </para>
>
> Perhaps the following would be better:
>
> +     <para>
> +      Specifying a schema that contains a table already included in the
> +      publication is not supported.
> +     </para>

Similar to above

> (4)
> Minor fix:
> BEFORE:
> +   rights on the table.  The <command>FOR ALL TABLES</command> and
> +   <command>FOR ALL TABLES IN SCHEMA</command> clause requires the invoking
> +   user to be a superuser.
> AFTER:
> +   rights on the table.  The <command>FOR ALL TABLES</command> and
> +   <command>FOR ALL TABLES IN SCHEMA</command> clauses require the invoking
> +   user to be a superuser.

Modified

Attached v32 patch has the fixes for the same.

Regards,
Vignesh

On Fri, Sep 24, 2021 at 11:46 AM vignesh C <vignesh21@gmail.com> wrote:
>

Attached v33 patch has the preprocess_pubobj_list review comment fix
suggested by Alvaro at [1]. The
v33-0006-Alternate-grammar-for-ALL-TABLES-IN-SCHEMA.patch patch has
the grammar changes as suggested by Alvaro at [1]. If we agree this is
better, I will merge this into the 0001 patch.
[1] - https://www.postgresql.org/message-id/202109241325.eag5g6mpvoup%40alvherre.pgsql

Regards,
Vignesh

On Mon, Sep 27, 2021 at 12:15 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Monday, September 27, 2021 12:32 PM vignesh C <vignesh21@gmail.com> wrote:
> > On Fri, Sep 24, 2021 at 11:46 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> >
> > Attached v33 patch has the preprocess_pubobj_list review comment fix
> > suggested by Alvaro at [1]. The
> > v33-0006-Alternate-grammar-for-ALL-TABLES-IN-SCHEMA.patch patch has
> > the grammar changes as suggested by Alvaro at [1]. If we agree this is better, I
> > will merge this into the 0001 patch.
> > [1] - https://www.postgresql.org/message-id/202109241325.eag5g6mpvoup%40alvherre.pgsql
>
> Hi,
>
> The grammar change basically looks good to me. Only one suggestion is that it
> will be better to add some more comments in gram.y to describe the rule
> PublicationObjSpec. Because it's a new style syntax in postgresql, people might
> wonder how the code work and why we choose this design when they first time
> see this rule in gram.y.
>
> Maybe something like the following:
>
> +/*
> + * FOR TABLE and FOR ALL TABLES IN SCHEMA specifications
> + *
> + * This rule parses publication object with and without keyword prefix.
> + *
> + * The actual type of the object without keyword prefix depends on the previous
> + * one with keyword prefix. It will be preprocessed in preprocess_pubobj_list().
> + *
> + * For the object without keyword prefix, we cannot just use relation_expr here,
> + * because some extended expression in relation_expr cannot be used as a
> + * schemaname and we cannot differentiate it. So, we extract the rules from
> + * relation_expr here.
> + */
> PublicationObjSpec:
>                         TABLE relation_expr
>                         ...
>
> My words might not be good, but I think it will be better to add some comments
> to explain a bit about the code in gram.y. Thoughts ?

Modified.

Attached v34 patch has the changes for the same.

Regards,
Vignesh

On Wed, Sep 29, 2021 at 8:47 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Monday, Tuesday, September 28, 2021 10:49 PM, vignesh C <vignesh21@gmail.com> wrote:
> >
> > Yes this should not be supported, we should throw an error in this case.
> > This is handled in the v34 patch attached at [1].
> > [1] - https://www.postgresql.org/message-
> > id/CALDaNm2Z9TfuoCf09YGKfwy7F1NwC4iCXJGTaZS%3DchH6VHtadQ%40mail.g
> > mail.com
> >
>
> Thanks for fixing it. I confirmed the error can be output as expected.
>
> Here is a problem related to publish_via_partition_root option when using this
> patch. With this option on, I think pg_get_publication_tables function gave an
> unexcepted result and the subscriber would get dual data during table sync.
>
>
> For example:
> (I used pg_publication_tables view to make it looks clearer)
>
> create schema sch1;
> create table sch1.tbl1 (a int) partition by range ( a );
> create table sch1.tbl1_part1 partition of sch1.tbl1 for values from (1) to (10);
> create table sch1.tbl1_part2 partition of sch1.tbl1 for values from (10) to (20);
> create table sch1.tbl1_part3 partition of sch1.tbl1 for values from (20) to (30);
> create publication pub for all tables in schema sch1 with(publish_via_partition_root=1);
>
> postgres=# select * from pg_publication_tables where pubname='pub';
>  pubname | schemaname | tablename
> ---------+------------+------------
>  pub     | sch1       | tbl1_part1
>  pub     | sch1       | tbl1_part2
>  pub     | sch1       | tbl1_part3
>  pub     | sch1       | tbl1
> (4 rows)
>
>
> It shows both the partitioned table and its leaf partitions. But the result of
> FOR ALL TABLES publication couldn't show the leaf partitions.
>
>
> postgres=# create publication pub_all for all tables with(publish_via_partition_root=1);
> CREATE PUBLICATION
> postgres=# select * from pg_publication_tables where pubname='pub_all';
>  pubname | schemaname | tablename
> ---------+------------+-----------
>  pub_all | sch1       | tbl1
> (1 row)
>
>
> How about make the following change to avoid it? I tried it and it also fixed dual
> data issue during table sync.
>
>
> diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
> index 04e785b192..4e8ccdabc6 100644
> --- a/src/backend/catalog/pg_publication.c
> +++ b/src/backend/catalog/pg_publication.c
> @@ -632,7 +632,8 @@ GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt)
>                 Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
>                 Oid                     relid = relForm->oid;
>
> -               if (is_publishable_class(relid, relForm))
> +               if (is_publishable_class(relid, relForm) &&
> +                       !(relForm->relispartition && pub_partopt == PUBLICATION_PART_ROOT))
>                         result = lappend_oid(result, relid);
>         }

The suggested change works, I have modified it in the attached patch.

Regards,
Vignesh

On Thu, Sep 30, 2021 at 3:39 PM vignesh C <vignesh21@gmail.com> wrote:
>
> The suggested change works, I have modified it in the attached patch.
>

I have reviewed the latest version and made a number of changes to the
0001 patch. The changes are in v1-0001-Changes-by-Amit. It includes
(a) Changed preprocess_pubobj_list() to make the code easy to
understand, (b) the handling of few variables was missing in equal
function, (c) the ordering of functions, and a few parameters were not
matching .c and .h files, (d) added/edited multiple comments and other
cosmetic changes.

Apart from that, I have few other comments:
1. It seems you have started using unique list variants in
GetPubPartitionOptionRelations because one of its caller
GetSchemaPublicationRelations need it. I think the unique variants are
costlier, so isn't it better to use it where it is required? I think
it would be good to use in GetPubPartitionOptionRelations, if most of
its caller requires the same.

2. In GetSchemaPublicationRelations(), I think we need to perform a
second scan using RELKIND_PARTITIONED_TABLE only if we
publish_via_root (aka pub_partopt is PUBLICATION_PART_ROOT). This is
what we are doing in GetAllTablesPublicationRelations. Is there a
reason to be different here?

3.
@@ -538,7 +788,7 @@ RemovePublicationById(Oid pubid)
  if (!HeapTupleIsValid(tup))
  elog(ERROR, "cache lookup failed for publication %u", pubid);

- pubform = (Form_pg_publication)GETSTRUCT(tup);
+ pubform = (Form_pg_publication) GETSTRUCT(tup);

We don't need the above change for this patch. I think this may be due
pgindent but we can do this separately rather than as part of this
patch.

-- 
With Regards,
Amit Kapila.

Attachment

Re: Added schema level support for publication.

From

vignesh C

Date:

03 October 2021, 17:55:27

On Sat, Oct 2, 2021 at 1:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 30, 2021 at 3:39 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > The suggested change works, I have modified it in the attached patch.
> >
>
> I have reviewed the latest version and made a number of changes to the
> 0001 patch. The changes are in v1-0001-Changes-by-Amit. It includes
> (a) Changed preprocess_pubobj_list() to make the code easy to
> understand, (b) the handling of few variables was missing in equal
> function, (c) the ordering of functions, and a few parameters were not
> matching .c and .h files, (d) added/edited multiple comments and other
> cosmetic changes.

I have merged these changes into the main patch.

> Apart from that, I have few other comments:
> 1. It seems you have started using unique list variants in
> GetPubPartitionOptionRelations because one of its caller
> GetSchemaPublicationRelations need it. I think the unique variants are
> costlier, so isn't it better to use it where it is required? I think
> it would be good to use in GetPubPartitionOptionRelations, if most of
> its caller requires the same.

I have removed unique list changes from GetPubPartitionOptionRelations
and handled it in GetSchemaPublicationRelations.

> 2. In GetSchemaPublicationRelations(), I think we need to perform a
> second scan using RELKIND_PARTITIONED_TABLE only if we
> publish_via_root (aka pub_partopt is PUBLICATION_PART_ROOT). This is
> what we are doing in GetAllTablesPublicationRelations. Is there a
> reason to be different here?

In the first table scan we are getting all the ordinary tables present
in the schema. In the second table scan we will get all the
partitioned table present in the schema and the relations will be
added based on pub_partopt. I felt if we have the check we will not
get the relations in the following case:
create schema sch1;
create schema sch2;
create table sch1.p (a int) partition by list (a);
create table sch2.c1 partition of sch1.p for values in (1);

> 3.
> @@ -538,7 +788,7 @@ RemovePublicationById(Oid pubid)
>   if (!HeapTupleIsValid(tup))
>   elog(ERROR, "cache lookup failed for publication %u", pubid);
>
> - pubform = (Form_pg_publication)GETSTRUCT(tup);
> + pubform = (Form_pg_publication) GETSTRUCT(tup);
>
> We don't need the above change for this patch. I think this may be due
> pgindent but we can do this separately rather than as part of this
> patch.

Removed this change.

Attached v36 patch has the changes for the same.

Regards,
Vignesh

On Mon, Oct 4, 2021 at 5:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Oct 3, 2021 at 11:25 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Sat, Oct 2, 2021 at 1:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > 2. In GetSchemaPublicationRelations(), I think we need to perform a
> > > second scan using RELKIND_PARTITIONED_TABLE only if we
> > > publish_via_root (aka pub_partopt is PUBLICATION_PART_ROOT). This is
> > > what we are doing in GetAllTablesPublicationRelations. Is there a
> > > reason to be different here?
> >
> > In the first table scan we are getting all the ordinary tables present
> > in the schema. In the second table scan we will get all the
> > partitioned table present in the schema and the relations will be
> > added based on pub_partopt. I felt if we have the check we will not
> > get the relations in the following case:
> > create schema sch1;
> > create schema sch2;
> > create table sch1.p (a int) partition by list (a);
> > create table sch2.c1 partition of sch1.p for values in (1);
> >
>
> But we don't need to get the partitioned tables for the invalidations,
> see the corresponding case for tables. So, I am not sure why you have
> used two scans to the system table for such scenarios?

The second loop is required to get the 'p' relkind relations like in
the below case:
create schema sch1;
create schema sch2;
create table sch1.p1 (a int) partition by list (a);
create table sch2.c1 partition of sch1.p1 for values in (1);
create table sch2.p2(a int) partition by list (a);
create table sch1.c2 partition of sch2.p2 for values in (1);
create publication pub1 for all tables in schema sch2;
The first loop will give us sch2.c1 relation and the second loop will
get sch2.p2, sch1.c2, this is required for schema publication as the
schema can have both r relkind and p relkind tables and all of them
need to be invalidated. This is not required in case of table
publication as we can get the required relations from that particular
table.

> Few additional comments on
> v36-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN:
> =========================================================================
> 1.
> @@ -3961,21 +3965,25 @@ getPublications(Archive *fout, int *numPublications)
>   appendPQExpBuffer(query,
>     "SELECT p.tableoid, p.oid, p.pubname, "
>     "(%s p.pubowner) AS rolname, "
> -   "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete,
> p.pubtruncate, p.pubviaroot "
> +   "p.puballtables, p.pubinsert, p.pubupdate, "
> +   "p.pubdelete, p.pubtruncate, p.pubviaroot "
>     "FROM pg_publication p",
>     username_subquery);
>   else if (fout->remoteVersion >= 110000)
>   appendPQExpBuffer(query,
>     "SELECT p.tableoid, p.oid, p.pubname, "
>     "(%s p.pubowner) AS rolname, "
> -   "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete,
> p.pubtruncate, false AS pubviaroot "
> +   "p.puballtables, p.pubinsert, p.pubupdate, "
> +   "p.pubdelete, p.pubtruncate, false AS pubviaroot "
>     "FROM pg_publication p",
>     username_subquery);
>   else
>   appendPQExpBuffer(query,
>     "SELECT p.tableoid, p.oid, p.pubname, "
>     "(%s p.pubowner) AS rolname, "
> -   "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, false AS
> pubtruncate, false AS pubviaroot "
> +   "p.puballtables, p.pubinsert, p.pubupdate, "
> +   "p.pubdelete, false AS pubtruncate, "
> +   "false AS pubviaroot "
>     "FROM pg_publication p",
>     username_subquery);
>
> Is there a reason to change this part of the code?

It is not required, I have removed it.

> 2.
> @@ -257,6 +257,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
>   pg_log_info("reading publication membership");
>   getPublicationTables(fout, tblinfo, numTables);
>
> + pg_log_info("reading publication tables in schemas");
> + getPublicationNamespaces(fout, nspinfo, numNamespaces);
>
> I think for the above change, the first should be changed to "reading
> publication membership of tables" and the second one should be changed
> to "reading publication membership of schemas".

Modified

> 3. The function names getPublicationNamespaces and
> dumpPublicationSchema are not in sync. Let's name the second one as
> dumpPublicationNamespace.

Modified

> 4. It is not clear to me why the patch has introduced a new component
> type object DUMP_COMPONENT_PUBSCHEMA. In particular, in the below code
> if we are already setting DUMP_COMPONENT_ALL, how the additional
> setting of DUMP_COMPONENT_PUBSCHEMA helps?
>
> @@ -1631,9 +1631,13 @@ selectDumpableNamespace(NamespaceInfo *nsinfo,
> Archive *fout)
>   if (nsinfo->nspowner == ROLE_PG_DATABASE_OWNER)
>   nsinfo->dobj.dump &= ~DUMP_COMPONENT_DEFINITION;
>   nsinfo->dobj.dump_contains = DUMP_COMPONENT_ALL;
> + nsinfo->dobj.dump |= DUMP_COMPONENT_PUBSCHEMA;
>   }
>   else
> + {
>   nsinfo->dobj.dump_contains = nsinfo->dobj.dump = DUMP_COMPONENT_ALL;
> + nsinfo->dobj.dump |= DUMP_COMPONENT_PUBSCHEMA;
> + }

I have removed DUMP_COMPONENT_PUBSCHEMA and used DUMP_COMPONENT_NONE
to identify the publications that should be dumped.

Attached v37 patch has the changes for the same.

Regards,
Vignesh

On Thu, Oct 7, 2021 at 12:51 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Wed, Oct 6, 2021 at 4:42 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v37 patch has the changes for the same.
> >
>
> A small issue I noticed is that using "\dS" in PSQL shows UNLOGGED
> tables as belonging to a publication, if the table belongs to a schema
> that was added to the publication using ALL TABLES IN SCHEMA (yet
> doesn't show as part of an ALL TABLES publication).
> Since publication of UNLOGGED tables is silently skipped in the case
> of ALL TABLES and ALL TABLES IN SCHEMA, it shouldn't show as belonging
> to the publication, right?
>
> test=# \dRp+ pub2
>                             Publication pub2
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> -------+------------+---------+---------+---------+-----------+----------
>  gregn | t          | t       | t       | t       | t         | f
> (1 row)
>
> test=# \dRp+ pub
>                              Publication pub
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> -------+------------+---------+---------+---------+-----------+----------
>  gregn | f          | t       | t       | t       | t         | t
> Tables from schemas:
>     "sch1"
>
> test=# \dS sch1.test2
>             Unlogged table "sch1.test2"
>  Column |  Type   | Collation | Nullable | Default
> --------+---------+-----------+----------+---------
>  i      | integer |           |          |
> Publications:
>     "pub"

Thanks for reporting the issue. The attached patch has the fix for the issue.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From

vignesh C

Date:

11 October 2021, 06:39:05

On Thu, Oct 7, 2021 at 5:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Oct 6, 2021 at 11:12 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v37 patch has the changes for the same.
> >
>
> Few comments:
> ==============
> v37-0001-Added-schema-level-support-for-publication
> 1.
> + *
> + * The first scan will get all the 'r' relkind tables for the specified schema,
> + * iterate the 'r' relkind tables and prepare a list of:
> + * 1) non partition table if pub_partopt is PUBLICATION_PART_ROOT
> + * 2) partition table and non partition table if pub_partopt is
> + *    PUBLICATION_PART_LEAF.
> + *
> + * The second scan will get all the 'p'' relkind tables for the  specified
> + * schema, iterate the 'p' relkind tables and prepare a list of:
> + * 1) partition table's child relations if pub_partopt is PUBLICATION_PART_LEAF
> + * 2) partition table if pub_partopt is PUBLICATION_PART_ROOT.
>
> I think these comments are redundant and not sure if they are
> completely correct. We don't need these as the actual code explains
> these conditions better. The earlier part of these comments is
> sufficient.

Removed it.

> v37-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN
> 2.
> + * selectDumpablePublicationObject: policy-setting subroutine
> + * Mark a publication as to be dumped or not
>   *
> - * Publication tables have schemas, but those are ignored in decision making,
> + * Publications have schemas, but those are ignored in decision making,
>   * because publications are only dumped when we are dumping everything.
>   */
>
> Change the above comment lines:
> a. "Mark a publication as to be dumped or not" to "Mark a publication
> object as to be dumped or not".
>
> b. "Publications have schemas, but those are ignored in decision
> making, .." to "A publication can have schemas and tables which have
> schemas, but those are ignored in decision making, .."

Modified

> 3.
> +/*
> + * dumpPublicationNamespace
> + *   dump the definition of the given publication tables in schema mapping
> + */
>
> Can we change the comment to: "dump the definition of the given
> publication schema mapping"? IT is easier to read and understand.
>
> 4.
> +/*
> + * The PublicationSchemaInfo struct is used to represent publication tables
> + * in schema mapping.
> + */
> +typedef struct _PublicationSchemaInfo
> +{
> + DumpableObject dobj;
> + NamespaceInfo *pubschema;
> + PublicationInfo *publication;
> +} PublicationSchemaInfo;
>
> Can we change the comment similar to the comment change in point 3?
> Also, let's keep PublicationInfo * before NamespaceInfo * just to be
> consistent with the existing structure PublicationRelInfo?
>
> 5.
> + printfPQExpBuffer(&buf,
> +   "SELECT p.pubname\n"
> +   "FROM pg_catalog.pg_publication p\n"
> +   " JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
> +   " JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid AND
> pc.oid = '%s'\n"
>
> I think this part of the query can be improved in multiple ways: (a)
> pc.oid = '%s' should be part of WHERE clause not join condition, (b)
> for pubname, no need to use alias name, it can be directly referred as
> pubname, (c) you can check if the relation is publishable. So, the
> formed query would look like:
>
> SELECT p.pubname FROM pg_catalog.pg_publication p JOIN
> pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid JOIN
> pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid  WHERE pc.oid =
> '%s' and pg_catalog.pg_relation_is_publishable('%s')

Modified

> 6.
> listSchemas()
> {
> ..
> + if (pub_schema_tuples > 0)
> + {
> + /*
> + * Allocate memory for footers. Size of footers will be 1 (for
> + * storing "Publications:" string) + Schema count +  1 (for
> + * storing NULL).
> + */
> + footers = (char **) palloc((1 + pub_schema_tuples + 1) * sizeof(char *));
> + footers[0] = pstrdup(_("Publications:"));
> +
> + /* Might be an empty set - that's ok */
> + for (i = 0; i < pub_schema_tuples; i++)
> + {
> + printfPQExpBuffer(&buf, "    \"%s\"",
> +   PQgetvalue(result, i, 0));
> +
> + footers[i + 1] = pstrdup(buf.data);
> + }
> +
> + footers[i + 1] = NULL;
> + myopt.footers = footers;
> + }
> ..
> }
>
> Is there a reason of not using printTableAddFooter() here similar to
> how we use it to print Publications in describeOneTableDetails()?

There are 2 ways to print table in psql:
1) call printTableInit, printTableAddHeader, printTableAddCell,
printTableAddFooter, printTable & printTableCleanup to print the table
2) prepare the table contents and call printQuery to print the
table(which will take care of handling all of the above)
describeOneTableDetails uses 1st method
listSchemas uses 2nd method, in case of this method since table is not
initialized we cannot use printTableAddFooter. we have to prepare the
footers and set the footers.

> 7.
> describePublications()
> {
> ..
> + /* Get the schemas for the specified publication */
> + printfPQExpBuffer(&buf,
> +   "SELECT n.nspname\n"
> +   "FROM pg_catalog.pg_namespace n,\n"
> +   "     pg_catalog.pg_publication_namespace pn\n"
> +   "WHERE n.oid = pn.pnnspid\n"
> +   "  AND pn.pnpubid = '%s'\n"
> +   "ORDER BY 1", pubid);
> + if (!addFooterToPublicationDesc(&buf, "Tables from schemas:",
> + true, &cont))
> + goto error_return;
> ..
> }
>
> Shouldn't we try to get schemas only when pset.sversion >= 150000?

Modified

> 8.
> +addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
> +    bool singlecol, printTableContent *cont)
> +{
> ..
> + termPQExpBuffer(buf);
> ..
> }
>
> It seems this buffer is freed at the caller's site, if so, no need to
> free it here.
>

Modified

These comments are fixed in the v38 patch attached.

Regards,
Vignesh

On Mon, Oct 11, 2021 at 12:50 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Monday, October 11, 2021 2:39 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > These comments are fixed in the v38 patch attached.
>
> Thanks for updating the patches.
> Here are a few comments on the v38-0004-Doc patch.
>
> 1.
> +  <para>
> +   Adding/Setting a table that is part of schema specified in
> +   <literal>ALL TABLES IN SCHEMA</literal>, adding/setting a schema to a
> +   publication along with same schema's table specified with
> +   <literal>TABLE</literal>, adding/setting a schema to a publication that
> +   already has a table that is part of specified schema or adding/setting a
> +   table to a publication that already has a table's schema as part of
> +   specified schema is not supported.
>
> ISTM we can remove the description "adding/setting a schema to a publication
> along with same schema's table specified with <literal>TABLE</literal>",
> because it seems the same as the first mentioned case "Adding/Setting a table
> that is part of schema specified in <literal>ALL TABLES IN SCHEMA</literal>"

Modified

> 2.
>
> +</programlisting></para>
> +
> +  <para>
> +   Add some schemas to the publication:
> +<programlisting>
> +ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing_june, sales_june;
> +</programlisting>
> +  </para>
> +
> +  <para>
> +   Add some tables and schemas to the publication:
> ...
> +
> +  <para>
> +   Drop some schemas from the publication:
> ...
> +  <para>
> +   Set some schemas to the publication:
> +<programlisting>
> +ALTER PUBLICATION production_publication SET ALL TABLES IN SCHEMA production_september, production_october;
>
> Personally, I think we don't need the example about DROP and SET here.
> The example of ADD seems sufficient.

Modified

> 3.
> +</programlisting>
> +  </para>
> +
> +  <para>
> +   Create a publication that publishes all changes for all the tables present in
> +   the schema "production":
> +<programlisting>
> +CREATE PUBLICATION production_publication FOR ALL TABLES IN SCHEMA production;
> +</programlisting>
> +  </para>
> ...
> +  <para>
> +   Create a publication that publishes all changes for all the tables present in
> +   the schemas "marketing" and "sales":
> +<programlisting>
> +CREATE PUBLICATION sales_publication FOR ALL TABLES IN SCHEMA marketing, sales;
>
> I think the example for publishing all the tables in schemas "marketing" and
> "sales" is sufficient, the example for pulishing signal schema seems can be
> removed.

Modified

The attached v39 patch has the fixes for the above issues.

Regards,
Vignesh

Attachment

RE: Added schema level support for publication.

From

"houzj.fnst@fujitsu.com"

Date:

12 October 2021, 06:54:06

On Monday, October 11, 2021 11:02 PM vignesh C <vignesh21@gmail.com> wrote:
> The attached v39 patch has the fixes for the above issues.

Thanks for the updates.
I have a few minor suggestions about the testcases in the v39-0003-Test patch.

1)
+-- alter publication drop CURRENT_SCHEMA
+ALTER PUBLICATION testpub1_forschema DROP ALL TABLES IN SCHEMA CURRENT_SCHEMA;
+\dRp+ testpub1_forschema

Since we already tested CURRENT_SCHEMA in various CREATE PUBLICATION cases, maybe
we don't need to test it again in SET/DROP/ADD cases.

2)
+-- alter publication set schema
+ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1;
+\dRp+ testpub1_forschema
+
+-- alter publication set multiple schema
+ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1, pub_test2;
+\dRp+ testpub1_forschema
+

I think the multiple schemas testcase is sufficient, maybe we can remove the
single schema case.


3)
+
+-- alter publication set it with the same schema
+ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1, pub_test2;
+\dRp+ testpub1_forschema

ISTM, we didn't have some special code path for this case, maybe we can remove
this testcase.


Best regards,
Hou zj

Re: Added schema level support for publication.

From

vignesh C

Date:

12 October 2021, 13:15:02

On Tue, Oct 12, 2021 at 12:24 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Monday, October 11, 2021 11:02 PM vignesh C <vignesh21@gmail.com> wrote:
> > The attached v39 patch has the fixes for the above issues.
>
> Thanks for the updates.
> I have a few minor suggestions about the testcases in the v39-0003-Test patch.
>
> 1)
> +-- alter publication drop CURRENT_SCHEMA
> +ALTER PUBLICATION testpub1_forschema DROP ALL TABLES IN SCHEMA CURRENT_SCHEMA;
> +\dRp+ testpub1_forschema
>
> Since we already tested CURRENT_SCHEMA in various CREATE PUBLICATION cases, maybe
> we don't need to test it again in SET/DROP/ADD cases.

Modified

> 2)
> +-- alter publication set schema
> +ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1;
> +\dRp+ testpub1_forschema
> +
> +-- alter publication set multiple schema
> +ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1, pub_test2;
> +\dRp+ testpub1_forschema
> +
>
> I think the multiple schemas testcase is sufficient, maybe we can remove the
> single schema case.

Modified

> 3)
> +
> +-- alter publication set it with the same schema
> +ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1, pub_test2;
> +\dRp+ testpub1_forschema
>
> ISTM, we didn't have some special code path for this case, maybe we can remove
> this testcase.

Modified

Attached v40 patch has the fix for the above comments.

Regards,
Vignesh

On Wednesday, October 13, 2021 10:49 AM Hou, Zhijie wrote:
> On Tuesday, October 12, 2021 9:15 PM vignesh C <vignesh21@gmail.com>
> > Attached v40 patch has the fix for the above comments.
> 
> Thanks for the update, I have some minor issues about partition related
> behavior.
> 
> 1)
> 
> Tang tested and discussed this issue with me.
> The testcase is:
> We publish a schema and there is a partition in the published schema. If
> publish_via_partition_root is on and the partition's parent table is not in the
> published schema, neither the change on the partition nor the parent table will
> not be published.
> 
> But if we publish by FOR TABLE partition and set publish_via_partition_root to
> on, the change on the partition will be published. So, I think it'd be better to
> publish the change on partition for FOR ALL TABLES IN SCHEMA case if its parent
> table
> is not published in the same publication.
> 
> It seems we should pass publication oid to the GetSchemaPublicationRelations()
> and add some check like the following:
> 2)
> I think a partitioned table could also be a partition which should not be
> appended to the list. I think we should also filter these cases here by same
> check in 1).

After some offline discussion with Vignesh and Amit.
I found my proposed fix can be improved because it brings some overhead to
functions which could be invoked many times for the same publication. Now, I
think it'd be better to add this check in GetAllSchemaPublicationRelations and
GetPublicationRelations.

Besides, I found we misunderstood the flag PUBLICATION_PART_ROOT it means:
"ROOT: only the table explicitly mentioned in the publication" We cannot use it
as a flag to judge whether do the partition filtering, I think we need to pass
the actual pubviaroot flag.

Based on the V40 patchset, attaching the Top-up patch which try to fix the
partition issue in a cleaner way.

Best regards,
Hou zj

On Saturday, October 16, 2021 1:57 PM houzj.fnst@fujitsu.com wrote:
> Based on the V40 patchset, attaching the Top-up patch which try to fix the
> partition issue in a cleaner way.

Attach the new version patch set which merge the partition fix into it.
Besides, instead of introducing new function and parameter, just add the
partition filter in pg_get_publication_tables which makes the code cleaner.

Only 0001 and 0003 was changed.

Best regards,
Hou zj

On Mon, Oct 18, 2021 at 2:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Hi,
>
> On Mon, Oct 18, 2021 at 3:14 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Saturday, October 16, 2021 1:57 PM houzj.fnst@fujitsu.com wrote:
> > > Based on the V40 patchset, attaching the Top-up patch which try to fix the
> > > partition issue in a cleaner way.
> >
> > Attach the new version patch set which merge the partition fix into it.
> > Besides, instead of introducing new function and parameter, just add the
> > partition filter in pg_get_publication_tables which makes the code cleaner.
> >
> > Only 0001 and 0003 was changed.
>
> I've reviewed 0001 and 0002 patch and here are comments:
>
> 0001 patch:
>
> +/*
> + * Get the list of publishable relation oids for a specified schema.
> + *
> + * Schema will be having both ordinary('r') relkind tables and partitioned('p')
> + * relkind tables, so two rounds of scan are required.
> + */
> +List *
> +GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt)
> +{
> +        Relation       classRel;
> +        ScanKeyData key[3];
> +        TableScanDesc scan;
>
> I think it's enough to have key[2], not key[3].

Modified

> BTW, this function does the table scan on pg_class twice in order to
> get OIDs of both normal tables and partitioned tables. But can't we do
> that by the single table scan? I think we can set a scan key for
> relnamespace, and check relkind inside a scan loop.

Modified

> ---
> +                ObjectsInPublicationToOids(stmt->pubobjects, pstate,
> &relations,
> +
> &schemaidlist);
> +
> +                if (list_length(relations) > 0)
> +                {
> +                        List      *rels;
> +
> +                        rels = OpenTableList(relations);
> +                        CheckObjSchemaNotAlreadyInPublication(rels,
> schemaidlist,
> +
> PUBLICATIONOBJ_TABLE);
> +                        PublicationAddTables(puboid, rels, true, NULL);
> +                        CloseTableList(rels);
> +                }
> +
> +                if (list_length(schemaidlist) > 0)
> +                {
> +                        /* FOR ALL TABLES IN SCHEMA requires superuser */
> +                        if (!superuser())
> +                                ereport(ERROR,
> +
> errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +                                                errmsg("must be
> superuser to create FOR ALL TABLES IN SCHEMA publication"));
> +
>
> Perhaps we can do a superuser check before handling "relations"? If
> the user doesn't have the permission, we don't need to do anything for
> relations.

Modified

> 0002 patch:
>
> postgres(1:13619)=# create publication pub for all TABLES in schema
> CURRENT_SCHEMA      pg_catalog          public              s2
> information_schema  pg_toast            s1
>
> Since pg_catalog and pg_toast cannot be added to the schema
> publication can we exclude them from the completion list?

Modified

Thanks for the comments, the attached v42 patch has the fixes for the same.

Regards,
Vignesh

On Tuesday, October 19, 2021 12:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> On Tue, Oct 19, 2021 at 9:15 AM tanghy.fnst@fujitsu.com
> <tanghy.fnst@fujitsu.com> wrote:
> >
> > On Monday, October 18, 2021 8:23 PM vignesh C <vignesh21@gmail.com>
> wrote:
> > >
> > > Thanks for the comments, the attached v42 patch has the fixes for the same.
> >
> > Thanks for your new patch.
> >
> > I tried your patch and found that the permission check for superuser didn't work.
> >
> > For example:
> > postgres=# create role r1;
> > CREATE ROLE
> > postgres=# grant all privileges on database postgres to r1;
> > GRANT
> > postgres=# set role r1;
> > SET
> > postgres=> create schema s1;
> > CREATE SCHEMA
> > postgres=> create publication pub for all tables in schema s1;
> > CREATE PUBLICATION
> >
> > Role r1 is not superuser, but this role could create publication for all tables in
> schema
> > successfully, I think it is related the following change. List schemaidlist was
> > not assigned yet. I think we should check it later.
> >
> 
> It seems this got broken in yesterday's patch version. Do you think it
> is a good idea to add a test for this case?
> 

Agreed. Thanks for your suggestion.

I tried to add this test to publication.sql, a patch diff file for this test is attached.

Regards
Tang

Attachment

Topup-permissions-test_diff

Re: Added schema level support for publication.

From

Amit Kapila

Date:

19 October 2021, 11:28:49

On Mon, Oct 18, 2021 at 5:53 PM vignesh C <vignesh21@gmail.com> wrote:
>

Few comments on latest set of patches:
===============================
1.
+/*
+ * Filter out the partitions whose parent tables was also specified in
+ * the publication.
+ */
+static List *
+filter_out_partitions(List *relids)

Can we name this function as filter_partitions()?

2.
+ /*
+ * If the publication publishes partition changes via their
+ * respective root partitioned tables, we must exclude partitions
+ * in favor of including the root partitioned tables. Otherwise,
+ * the function could return both the child and parent tables which
+ * could cause the data of child table double-published in
+ * subscriber side.
+ */

Let's slightly change the last part of the line in the above comment
as: "... which could cause data of the child table to be
double-published on the subscriber side."

3.
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
..
..
@@ -38,7 +40,6 @@
 #include "utils/builtins.h"
 #include "utils/catcache.h"
 #include "utils/fmgroids.h"
-#include "utils/inval.h"
 #include "utils/lsyscache.h"

Does this change belong to this patch? If not, maybe you can submit a
separate patch for this. A similar change is present in
publicationcmds.c as well, not sure if that is required as well.

4.
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
...
...
+#include "nodes/makefuncs.h"

Do we need to include this file? I am able to compile without
including this file.

v42-0003-Add-tests-for-the-schema-publication-feature-of-
5.
+-- pg_publication_tables
+SET client_min_messages = 'ERROR';
+CREATE SCHEMA sch1;
+CREATE SCHEMA sch2;
+CREATE TABLE sch1.tbl1 (a int) PARTITION BY RANGE(a);
+CREATE TABLE sch2.tbl1_part1 PARTITION OF sch1.tbl1 FOR VALUES FROM
(1) to (10);
+CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch2 WITH
(PUBLISH_VIA_PARTITION_ROOT=1);
+SELECT * FROM pg_publication_tables;
+
+DROP PUBLICATION pub;
+CREATE PUBLICATION pub FOR TABLE sch2.tbl1_part1 WITH
(PUBLISH_VIA_PARTITION_ROOT=1);
+SELECT * FROM pg_publication_tables;

Can we expand the above comment on the lines of: "Test the list of
partitions published"?

v42-0004-Add-documentation-for-the-schema-publication-fea
6.
+     <row>
+      <entry><link
linkend="catalog-pg-publication-namespace"><structname>pg_publication_namespace</structname></link></entry>
+      <entry>schema to publication mapping</entry>
+     </row>
+
      <row>
       <entry><link
linkend="catalog-pg-publication-rel"><structname>pg_publication_rel</structname></link></entry>
       <entry>relation to publication mapping</entry>
@@ -6238,6 +6243,67 @@ SCRAM-SHA-256$<replaceable><iteration
count></replaceable>:<replaceable>&l
   </table>
  </sect1>

+ <sect1 id="catalog-pg-publication-namespace">
+  <title><structname>pg_publication_namespace</structname></title>

At one place, the new catalog is placed after pg_publication_rel and
at another place, it is before it. Shouldn't it be before in both
places as we have a place as per naming order?

7.
The
+   <literal>ADD</literal> clause will add one or more tables/schemas to the
+   publication. The <literal>DROP</literal> clauses will remove one or more
+   tables/schemas from the publication.

Isn't it better to write the above as one line: "The
<literal>ADD</literal> and <literal>DROP</literal> clauses will add
and remove one or more tables/schemas from the publication."?

8.
+  <para>
+   Add some schemas to the publication:
+<programlisting>
+ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA
marketing_june, sales_june;
+</programlisting>
+  </para>

Can we change schema names to just marketing and sales? Also, let's
change the description as:"Add schemas
<structname>marketing</structname> and <structname>sales</structname>
to the publication <structname>sales_publication</structname>?

9.
+    [ FOR ALL TABLES
+      | FOR <replaceable class="parameter">publication
object</replaceable> [, ... ] ]
     [ WITH ( <replaceable
class="parameter">publication_parameter</replaceable> [= <replaceable
class="parameter">value</replaceable>] [, ... ] ) ]
+
+<phrase>where <replaceable class="parameter">publication
object</replaceable> is one of:</phrase>

Similar to Alter Publication, here also we should use
publication_object instead of publication object.

10.
+  <para>
+   Create a publication that publishes all changes for tables "users" and
+   "departments" and that publishes all changes for all the tables present in
+   the schema "production":
+<programlisting>
+CREATE PUBLICATION production_publication FOR TABLE users,
departments, ALL TABLES IN SCHEMA production;
+</programlisting>
+  </para>
+
+  <para>
+   Create a publication that publishes all changes for all the tables
present in
+   the schemas "marketing" and "sales":

It is better to use <structname> before and </structname> after schema
names in above descriptions.

-- 
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

vignesh C

Date:

19 October 2021, 15:42:02

On Tue, Oct 19, 2021 at 9:15 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Monday, October 18, 2021 8:23 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comments, the attached v42 patch has the fixes for the same.
>
> Thanks for your new patch.
>
> I tried your patch and found that the permission check for superuser didn't work.
>
> For example:
> postgres=# create role r1;
> CREATE ROLE
> postgres=# grant all privileges on database postgres to r1;
> GRANT
> postgres=# set role r1;
> SET
> postgres=> create schema s1;
> CREATE SCHEMA
> postgres=> create publication pub for all tables in schema s1;
> CREATE PUBLICATION
>
> Role r1 is not superuser, but this role could create publication for all tables in schema
> successfully, I think it is related the following change. List schemaidlist was
> not assigned yet. I think we should check it later.
>
> @@ -165,6 +265,12 @@ CreatePublication(ParseState *pstate, CreatePublicationStmt *stmt)
>                                 (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
>                                  errmsg("must be superuser to create FOR ALL TABLES publication")));
>
> +       /* FOR ALL TABLES IN SCHEMA requires superuser */
> +       if (list_length(schemaidlist) > 0 && !superuser())
> +               ereport(ERROR,
> +                               errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +                               errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication"));
> +
>         rel = table_open(PublicationRelationId, RowExclusiveLock);
>
>         /* Check if name is used */

This issue got induced in the v42 version, attached v43 patch has the
fixes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From

vignesh C

Date:

19 October 2021, 15:44:16

On Tue, Oct 19, 2021 at 11:23 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Tuesday, October 19, 2021 12:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Oct 19, 2021 at 9:15 AM tanghy.fnst@fujitsu.com
> > <tanghy.fnst@fujitsu.com> wrote:
> > >
> > > On Monday, October 18, 2021 8:23 PM vignesh C <vignesh21@gmail.com>
> > wrote:
> > > >
> > > > Thanks for the comments, the attached v42 patch has the fixes for the same.
> > >
> > > Thanks for your new patch.
> > >
> > > I tried your patch and found that the permission check for superuser didn't work.
> > >
> > > For example:
> > > postgres=# create role r1;
> > > CREATE ROLE
> > > postgres=# grant all privileges on database postgres to r1;
> > > GRANT
> > > postgres=# set role r1;
> > > SET
> > > postgres=> create schema s1;
> > > CREATE SCHEMA
> > > postgres=> create publication pub for all tables in schema s1;
> > > CREATE PUBLICATION
> > >
> > > Role r1 is not superuser, but this role could create publication for all tables in
> > schema
> > > successfully, I think it is related the following change. List schemaidlist was
> > > not assigned yet. I think we should check it later.
> > >
> >
> > It seems this got broken in yesterday's patch version. Do you think it
> > is a good idea to add a test for this case?
> >
>
> Agreed. Thanks for your suggestion.
>
> I tried to add this test to publication.sql, a patch diff file for this test is attached.

Thanks for the test case, I have merged it to the v43 version patch
attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm2pJ49wAv%3DgEZrAP5%3D_apAzv_rgK3zjX-wfwCY%2BWWfT9w%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From

vignesh C

Date:

19 October 2021, 16:11:45

On Tue, Oct 19, 2021 at 4:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Oct 18, 2021 at 5:53 PM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> Few comments on latest set of patches:
> ===============================
> 1.
> +/*
> + * Filter out the partitions whose parent tables was also specified in
> + * the publication.
> + */
> +static List *
> +filter_out_partitions(List *relids)
>
> Can we name this function as filter_partitions()?

Modified

> 2.
> + /*
> + * If the publication publishes partition changes via their
> + * respective root partitioned tables, we must exclude partitions
> + * in favor of including the root partitioned tables. Otherwise,
> + * the function could return both the child and parent tables which
> + * could cause the data of child table double-published in
> + * subscriber side.
> + */
>
> Let's slightly change the last part of the line in the above comment
> as: "... which could cause data of the child table to be
> double-published on the subscriber side."

Modified

> 3.
> --- a/src/backend/catalog/pg_publication.c
> +++ b/src/backend/catalog/pg_publication.c
> ..
> ..
> @@ -38,7 +40,6 @@
>  #include "utils/builtins.h"
>  #include "utils/catcache.h"
>  #include "utils/fmgroids.h"
> -#include "utils/inval.h"
>  #include "utils/lsyscache.h"
>
> Does this change belong to this patch? If not, maybe you can submit a
> separate patch for this. A similar change is present in
> publicationcmds.c as well, not sure if that is required as well.

I have removed these changes from this patch, I will post a patch for
this separately later.

> 4.
> --- a/src/backend/commands/publicationcmds.c
> +++ b/src/backend/commands/publicationcmds.c
> ...
> ...
> +#include "nodes/makefuncs.h"
>
> Do we need to include this file? I am able to compile without
> including this file.

Modified

> v42-0003-Add-tests-for-the-schema-publication-feature-of-
> 5.
> +-- pg_publication_tables
> +SET client_min_messages = 'ERROR';
> +CREATE SCHEMA sch1;
> +CREATE SCHEMA sch2;
> +CREATE TABLE sch1.tbl1 (a int) PARTITION BY RANGE(a);
> +CREATE TABLE sch2.tbl1_part1 PARTITION OF sch1.tbl1 FOR VALUES FROM
> (1) to (10);
> +CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch2 WITH
> (PUBLISH_VIA_PARTITION_ROOT=1);
> +SELECT * FROM pg_publication_tables;
> +
> +DROP PUBLICATION pub;
> +CREATE PUBLICATION pub FOR TABLE sch2.tbl1_part1 WITH
> (PUBLISH_VIA_PARTITION_ROOT=1);
> +SELECT * FROM pg_publication_tables;
>
> Can we expand the above comment on the lines of: "Test the list of
> partitions published"?

Modified

> v42-0004-Add-documentation-for-the-schema-publication-fea
> 6.
> +     <row>
> +      <entry><link
> linkend="catalog-pg-publication-namespace"><structname>pg_publication_namespace</structname></link></entry>
> +      <entry>schema to publication mapping</entry>
> +     </row>
> +
>       <row>
>        <entry><link
> linkend="catalog-pg-publication-rel"><structname>pg_publication_rel</structname></link></entry>
>        <entry>relation to publication mapping</entry>
> @@ -6238,6 +6243,67 @@ SCRAM-SHA-256$<replaceable><iteration
> count></replaceable>:<replaceable>&l
>    </table>
>   </sect1>
>
> + <sect1 id="catalog-pg-publication-namespace">
> +  <title><structname>pg_publication_namespace</structname></title>
>
> At one place, the new catalog is placed after pg_publication_rel and
> at another place, it is before it. Shouldn't it be before in both
> places as we have a place as per naming order?

Modified

> 7.
> The
> +   <literal>ADD</literal> clause will add one or more tables/schemas to the
> +   publication. The <literal>DROP</literal> clauses will remove one or more
> +   tables/schemas from the publication.
>
> Isn't it better to write the above as one line: "The
> <literal>ADD</literal> and <literal>DROP</literal> clauses will add
> and remove one or more tables/schemas from the publication."?

Modified

> 8.
> +  <para>
> +   Add some schemas to the publication:
> +<programlisting>
> +ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA
> marketing_june, sales_june;
> +</programlisting>
> +  </para>
>
> Can we change schema names to just marketing and sales? Also, let's
> change the description as:"Add schemas
> <structname>marketing</structname> and <structname>sales</structname>
> to the publication <structname>sales_publication</structname>?

Modified

> 9.
> +    [ FOR ALL TABLES
> +      | FOR <replaceable class="parameter">publication
> object</replaceable> [, ... ] ]
>      [ WITH ( <replaceable
> class="parameter">publication_parameter</replaceable> [= <replaceable
> class="parameter">value</replaceable>] [, ... ] ) ]
> +
> +<phrase>where <replaceable class="parameter">publication
> object</replaceable> is one of:</phrase>
>
> Similar to Alter Publication, here also we should use
> publication_object instead of publication object.

Modified

> 10.
> +  <para>
> +   Create a publication that publishes all changes for tables "users" and
> +   "departments" and that publishes all changes for all the tables present in
> +   the schema "production":
> +<programlisting>
> +CREATE PUBLICATION production_publication FOR TABLE users,
> departments, ALL TABLES IN SCHEMA production;
> +</programlisting>
> +  </para>
> +
> +  <para>
> +   Create a publication that publishes all changes for all the tables
> present in
> +   the schemas "marketing" and "sales":
>
> It is better to use <structname> before and </structname> after schema
> names in above descriptions.

Modified

Attached v43 patch has the fixes for the same.

Regards,
Vignesh

Attachment

RE: Added schema level support for publication.

From

"tanghy.fnst@fujitsu.com"

Date:

20 October 2021, 06:04:16

On Tuesday, October 19, 2021 11:42 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> This issue got induced in the v42 version, attached v43 patch has the
> fixes for the same.
> 

Thanks for your new patch. I confirmed that this issue has be fixed.

All regression tests passed. 
I also tested V43 in some other scenarios and found no issue.
So the v43 patch LGTM.

Regards
Tang

Re: Added schema level support for publication.

From

Amit Kapila

Date:

20 October 2021, 07:02:43

On Tue, Oct 19, 2021 at 9:42 PM vignesh C <vignesh21@gmail.com> wrote:
>

Thanks for the modified patch. I have a few more comments and suggestions:

As the thread [1] is still not concluded, I suggest we fix the
duplicate data case only when schemas are involved by slightly
tweaking the code as per attached. This is just to give you an idea
about what I have in mind, if you find a better solution then feel
free to let me know.

Few additional minor comments:
1.
+-- Test the list of partitions published

Shall we change the above comment as: "Test the list of partitions
published with or without 'PUBLISH_VIA_PARTITION_ROOT' parameter"?

2. psql documentation for \dRp[+] needs to be modified.

\dRp
Before:
" .. If + is appended to the command name, the tables associated with
each publication are shown as well."
After:
" .. If + is appended to the command name, the tables and schemas
associated with each publication are shown as well.

Apart from the above, I think we should merge the first four patches
as there doesn't seem to be any big problems pending. We can keep
still keep tests added by 025_rep_changes_for_schema.pl as a separate
patch as there might be some timing-dependent tests in that file.

[1] -
https://www.postgresql.org/message-id/OS0PR01MB57167F45D481F78CDC5986F794B99%40OS0PR01MB5716.jpnprd01.prod.outlook.com

-- 
With Regards,
Amit Kapila.

Attachment

change_partitions_schema_interaction_1.patch

Re: Added schema level support for publication.

From

vignesh C

Date:

20 October 2021, 16:24:50

On Wed, Oct 20, 2021 at 12:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 19, 2021 at 9:42 PM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> Thanks for the modified patch. I have a few more comments and suggestions:
>
> As the thread [1] is still not concluded, I suggest we fix the
> duplicate data case only when schemas are involved by slightly
> tweaking the code as per attached. This is just to give you an idea
> about what I have in mind, if you find a better solution then feel
> free to let me know.
>
> Few additional minor comments:
> 1.
> +-- Test the list of partitions published
>
> Shall we change the above comment as: "Test the list of partitions
> published with or without 'PUBLISH_VIA_PARTITION_ROOT' parameter"?

Modified

> 2. psql documentation for \dRp[+] needs to be modified.
>
> \dRp
> Before:
> " .. If + is appended to the command name, the tables associated with
> each publication are shown as well."
> After:
> " .. If + is appended to the command name, the tables and schemas
> associated with each publication are shown as well.

Modified

> Apart from the above, I think we should merge the first four patches
> as there doesn't seem to be any big problems pending. We can keep
> still keep tests added by 025_rep_changes_for_schema.pl as a separate
> patch as there might be some timing-dependent tests in that file.

This version of patch retains the changes related to
PublicationRelInfo, I will handle the merging of the patches in the
next version so that this version of patch change related to
PublicationRelInfo can be reviewed easily.

Attached v44 patch as the fixes for the same.

Regards,
Vignesh

On Thu, Oct 21, 2021 at 4:41 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Thu, Oct 21, 2021 at 3:25 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v44 patch as the fixes for the same.
> >
>
> Regarding the documentation, I think some minor updates are needed in:
>  doc/src/sgml/logical-replication.sgml.
> For example, currently it says:
>    Publications may currently only contain tables. Objects must be
> added explicitly, except when a publication is created for
> <literal>ALL TABLES</literal>.
> There is also some security-related information in this file which may
> need updating for ALL TABLES IN SCHEMA.

Modified

> Also, I'm not sure the documentation updates in the patches clearly
> define how partitions relate to ALL TABLES IN SCHEMA.
> For example, if a partitioned table belongs to a different schema to
> that of a child partition that belongs to a schema specified for ALL
> TABLES IN SCHEMA, is it implicitly included in the publication or not
> included?

Added

Thanks for the comments, the attached v45 patch has the fix for the same.

Regards,
Vignesh

On Fri, Oct 22, 2021 at 10:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Oct 21, 2021 at 6:47 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> >
> > Thanks for the comments, the attached v45 patch has the fix for the same.
> >
>
> The first patch is mostly looking good to me apart from the below
> minor comments:
>
> 1.
> +  <para>
> +   The catalog <structname>pg_publication_namespace</structname> contains the
> +   mapping between schemas and publications in the database.  This is a
> +   many-to-many mapping.
>
> There are extra spaces after mapping at the end which are not required.

Modified

> 2.
> +   <literal>CREATE</literal> privilege on the database.  Also, the new owner
> +   of a <literal>FOR ALL TABLES</literal> publication must be a superuser.
>
> I think we can modify the second line as: "Also, the new owner of a
> <literal>FOR ALL TABLES</literal> or <literal>FOR ALL TABLES IN
> SCHEMA</literal> publication must be a superuser.

Modified

> 3.
> /table's schema as part of specified schema is not supported./table's
> schema as part of the specified schema is not supported.

Modified

> 4.
> +  <para>
> +   Create a publication that publishes all changes for tables
> +   <structname>users</structname>, <structname>departments</structname> and
> +   that publishes all changes for all the tables present in the schema
> +   <structname>production</structname>:
>
> I don't think '...that publishes...' is required twice in the above sentence.

Modified

> 5.
> +static List *OpenReliIdList(List *relids);
>  static List *OpenTableList(List *tables);
>  static void CloseTableList(List *rels);
>  static void PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
>   AlterPublicationStmt *stmt);
>  static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
> +static void LockSchemaList(List *schemalist);
> +static void PublicationAddSchemas(Oid pubid, List *schemas, bool if_not_exists,
> +   AlterPublicationStmt *stmt);
> +static void PublicationDropSchemas(Oid pubid, List *schemas, bool missing_ok);
>
> Keep the later definitions also in this order. I suggest move
> LockSchemaList() just after CloseTableList() both in declaration and
> definition.

Modified

> 6.
> +/*
> + * Convert the PublicationObjSpecType list into schema oid list and rangevar
> + * list.
> + */
>
> I think you need to say publication table instead of rangevar in the
> above comment.

Modified

> 7.
> + /*
> + * It is quite possible that for the SET case user has not specified any
> + * schema in which case we need to remove all the existing schemas.
> + */
>
> /schema/schemas

Modified

> 8.
> +/*
> + * Open relations specified by a RangeVar list.
>
> /RangeVar/PublicationTable

Modified

> 9.
> +static bool
> +_equalPublicationObject(const PublicationObjSpec *a,
> + const PublicationObjSpec *b)
> +{
> + COMPARE_SCALAR_FIELD(pubobjtype);
> + COMPARE_STRING_FIELD(name);
> + COMPARE_NODE_FIELD(pubtable);
> + COMPARE_LOCATION_FIELD(location);
> +
> + return true;
> +}
> +
>
> Let's define this immediately before _equalPublicationTable as all
> publication functions are defined there. Also, make the handling of
> T_PublicationObjSpec before T_PublicationTable in equal() function as
> that is the way nodes are defined.

Modified

Thanks for the comments, attached v46 patch has the fix for the same.

Regards,
Vignesh

On Mon, Oct 25, 2021 at 10:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Oct 22, 2021 at 8:56 PM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> I am getting a compilation error in the latest patch on HEAD. I think
> was relying on some variable removed by a recent commit
> 92316a4582a5714d4e494aaf90360860e7fec37a. While looking at that
> compilation error, I observed that we don't need the second and third
> parameters in pg_dump.c/getPublicationNamespaces() as those are not
> getting used.

I have fixed this in the v47 version attached.

Regards,
Vignesh

On Thu, Oct 28, 2021 at 8:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Oct 25, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Oct 25, 2021 at 1:11 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > I have fixed this in the v47 version attached.
> > >
> >
> > Thanks, the first patch in the series "Allow publishing the tables of
> > schema." looks good to me. Unless there are more
> > comments/bugs/objections, I am planning to commit it in a day or so.
> >
>
> Yesterday, I have pushed the first patch. Feel free to submit the
> remaining patches.

Thanks for committing the patch, please find the remaining patches attached.
Thanks Hou Zhijie and Greg Nancarrow for sharing a few comments
offline, I have fixed those in the attached patch.

Regards,
Vignesh

On Wed, Nov 3, 2021 at 8:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Nov 2, 2021 at 8:13 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
> >
> >
> > >
> > > Yeah, that is also true. So maybe at this, we can just rename the few
> > > types as suggested by you and we can look at it later if we anytime
> > > have more number of objects to add.
> > >
> >
> > +1
> >
>
> Apart from what you have pointed above, we are using
> "DO_PUBLICATION_REL_IN_SCHEMA" in pg_dump. I think we should replace
> that as well with "DO_PUBLICATION_TABLE_IN_SCHEMA".

Thanks for the comments, the attached patch has the changes for the same.

Regards,
Vignesh

Attachment

0001-Renamed-enums-that-included-REL-to-TABLE-in-publish-.patch

RE: Added schema level support for publication.

From

"houzj.fnst@fujitsu.com"

Date:

03 November 2021, 05:37:48

On Wed, Nov 3, 2021 12:25 PM vignesh C <vignesh21@gmail.com> wrote:
> On Wed, Nov 3, 2021 at 8:30 AM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> >
> > On Tue, Nov 2, 2021 at 8:13 PM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> > >
> > >
> > > >
> > > > Yeah, that is also true. So maybe at this, we can just rename the
> > > > few types as suggested by you and we can look at it later if we
> > > > anytime have more number of objects to add.
> > > >
> > >
> > > +1
> > >
> >
> > Apart from what you have pointed above, we are using
> > "DO_PUBLICATION_REL_IN_SCHEMA" in pg_dump. I think we should replace
> > that as well with "DO_PUBLICATION_TABLE_IN_SCHEMA".
> 
> Thanks for the comments, the attached patch has the changes for the same.
> 

Thanks for the patch.
I have only one minor comment:

+    PUBLICATIONOBJ_TABLE_IN_SCHEMA, /* Relations in schema type */

I think the ' Relations' in the comments also need to be changed to 'tables'.

The other part of the patch looks good to me.

Best regards,
Hou zj

Re: Added schema level support for publication.

From

Peter Smith

Date:

03 November 2021, 06:07:38

While you are changing these, maybe also change:

Before: PUBLICATIONOBJ..._CURRSCHEMA
After: PUBLICATIONOBJ..._CUR_SCHEMA

Which seems more readable to me.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Added schema level support for publication.

From

vignesh C

Date:

03 November 2021, 06:25:19

On Wed, Nov 3, 2021 at 11:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> While you are changing these, maybe also change:
>
> Before: PUBLICATIONOBJ..._CURRSCHEMA
> After: PUBLICATIONOBJ..._CUR_SCHEMA
>
> Which seems more readable to me.

Thanks for the comment, the attached patch has the changes for the same.

Regards,
Vignesh

Attachment

v2-0001-Renamed-enums-that-was-specified-as-REL-to-TABLE-.patch

Re: Added schema level support for publication.

From

vignesh C

Date:

03 November 2021, 06:27:25

On Wed, Nov 3, 2021 at 11:07 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Wed, Nov 3, 2021 12:25 PM vignesh C <vignesh21@gmail.com> wrote:
> > On Wed, Nov 3, 2021 at 8:30 AM Amit Kapila <amit.kapila16@gmail.com>
> > wrote:
> > >
> > > On Tue, Nov 2, 2021 at 8:13 PM Tomas Vondra
> > > <tomas.vondra@enterprisedb.com> wrote:
> > > >
> > > >
> > > > >
> > > > > Yeah, that is also true. So maybe at this, we can just rename the
> > > > > few types as suggested by you and we can look at it later if we
> > > > > anytime have more number of objects to add.
> > > > >
> > > >
> > > > +1
> > > >
> > >
> > > Apart from what you have pointed above, we are using
> > > "DO_PUBLICATION_REL_IN_SCHEMA" in pg_dump. I think we should replace
> > > that as well with "DO_PUBLICATION_TABLE_IN_SCHEMA".
> >
> > Thanks for the comments, the attached patch has the changes for the same.
> >
>
> Thanks for the patch.
> I have only one minor comment:
>
> +       PUBLICATIONOBJ_TABLE_IN_SCHEMA, /* Relations in schema type */
>
> I think the ' Relations' in the comments also need to be changed to 'tables'.
>
> The other part of the patch looks good to me.

Thanks for the comment, the patch at [1] has the changes for the same.
[1] -
https://www.postgresql.org/message-id/CALDaNm3g4ZaJ8h%3D16_A%2BytbyPUdJPaAz94YzQLqkD%3DyPu%2BVPwA%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From

Amit Kapila

Date:

03 November 2021, 08:57:38

On Wed, Nov 3, 2021 at 11:55 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, Nov 3, 2021 at 11:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > While you are changing these, maybe also change:
> >
> > Before: PUBLICATIONOBJ..._CURRSCHEMA
> > After: PUBLICATIONOBJ..._CUR_SCHEMA
> >
> > Which seems more readable to me.
>
> Thanks for the comment, the attached patch has the changes for the same.
>

LGTM. I'll push this tomorrow unless there are more comments or suggestions.

-- 
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

Peter Smith

Date:

04 November 2021, 00:11:42

FYI - I found a small problem with one of the new PublicationObjSpec
parser error messages that was introduced by the recent schema
publication commit [1].

The error message text is assuming that the error originates from
CREATE PUBLICATION, but actually that same error can also come from
ALTER PUBLICATION.

For example,

e.g.1) Here the error came from CREATE PUBLICATION, so the message
text looks OK.

test_pub=# CREATE PUBLICATION p1 FOR t1;
2021-11-04 10:50:17.208 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
SCHEMA should be specified before the table/schema name(s) at
character 27
2021-11-04 10:50:17.208 AEDT [738] STATEMENT:  CREATE PUBLICATION p1 FOR t1;
ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
the table/schema name(s)
LINE 1: CREATE PUBLICATION p1 FOR t1;
                                  ^
~~

e.g.2) Here the error came from ALTER PUBLICATION, so the message text
is not OK because the ALTER syntax [2] does not even have a FOR
keyword.

test_pub=# ALTER PUBLICATION p1 SET t1;
2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
SCHEMA should be specified before the table/schema name(s) at
character 26
2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1 SET t1;
ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
the table/schema name(s)
LINE 1: ALTER PUBLICATION p1 SET t1;
                                 ^
------
[1] https://github.com/postgres/postgres/commit/5a2832465fd8984d089e8c44c094e6900d987fcd
[2] https://www.postgresql.org/docs/devel/sql-alterpublication.html

Kind Regards,
Peter Smith.
Fujitsu Australia

RE: Added schema level support for publication.

From

"houzj.fnst@fujitsu.com"

Date:

04 November 2021, 00:55:42

On Thurs, Nov 4, 2021 8:12 AM Peter Smith <smithpb2250@gmail.com> wrote:
> FYI - I found a small problem with one of the new PublicationObjSpec parser
> error messages that was introduced by the recent schema publication commit
> [1].
> 
> The error message text is assuming that the error originates from CREATE
> PUBLICATION, but actually that same error can also come from ALTER
> PUBLICATION.
> e.g.2) Here the error came from ALTER PUBLICATION, so the message text is
> not OK because the ALTER syntax [2] does not even have a FOR keyword.
> 
> test_pub=# ALTER PUBLICATION p1 SET t1;
> 2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> SCHEMA should be specified before the table/schema name(s) at character 26
> 2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1 SET
> t1;
> ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> the table/schema name(s) LINE 1: ALTER PUBLICATION p1 SET t1;

I think it might be better to report " TABLE/ALL TABLES IN SCHEMA should be specified before ...".

Best regards,
Hou zj

Re: Added schema level support for publication.

From

Peter Smith

Date:

04 November 2021, 01:39:10

On Thu, Nov 4, 2021 at 11:55 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Thurs, Nov 4, 2021 8:12 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > FYI - I found a small problem with one of the new PublicationObjSpec parser
> > error messages that was introduced by the recent schema publication commit
> > [1].
> >
> > The error message text is assuming that the error originates from CREATE
> > PUBLICATION, but actually that same error can also come from ALTER
> > PUBLICATION.
> > e.g.2) Here the error came from ALTER PUBLICATION, so the message text is
> > not OK because the ALTER syntax [2] does not even have a FOR keyword.
> >
> > test_pub=# ALTER PUBLICATION p1 SET t1;
> > 2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> > SCHEMA should be specified before the table/schema name(s) at character 26
> > 2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1 SET
> > t1;
> > ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> > the table/schema name(s) LINE 1: ALTER PUBLICATION p1 SET t1;
>
> I think it might be better to report " TABLE/ALL TABLES IN SCHEMA should be specified before ...".
>

+1 - Yes, I also thought the fix should just be some simple/generic
change of the wording like your suggestion (rather than a different
message for both cases).

------
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Added schema level support for publication.

From

vignesh C

Date:

04 November 2021, 04:23:56

On Thu, Nov 4, 2021 at 5:41 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> FYI - I found a small problem with one of the new PublicationObjSpec
> parser error messages that was introduced by the recent schema
> publication commit [1].
>
> The error message text is assuming that the error originates from
> CREATE PUBLICATION, but actually that same error can also come from
> ALTER PUBLICATION.
>
> For example,
>
> e.g.1) Here the error came from CREATE PUBLICATION, so the message
> text looks OK.
>
> test_pub=# CREATE PUBLICATION p1 FOR t1;
> 2021-11-04 10:50:17.208 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> SCHEMA should be specified before the table/schema name(s) at
> character 27
> 2021-11-04 10:50:17.208 AEDT [738] STATEMENT:  CREATE PUBLICATION p1 FOR t1;
> ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> the table/schema name(s)
> LINE 1: CREATE PUBLICATION p1 FOR t1;
>                                   ^
> ~~
>
> e.g.2) Here the error came from ALTER PUBLICATION, so the message text
> is not OK because the ALTER syntax [2] does not even have a FOR
> keyword.
>
> test_pub=# ALTER PUBLICATION p1 SET t1;
> 2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> SCHEMA should be specified before the table/schema name(s) at
> character 26
> 2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1 SET t1;
> ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> the table/schema name(s)
> LINE 1: ALTER PUBLICATION p1 SET t1;
>                                  ^

Thanks for the comment, I changed the error message to remove the FOR
keyword. The attached patch has the changes for the same.

Regards,
Vignesh

Attachment

v3-0001-Renamed-few-enums-and-changed-error-message-in-pu.patch

Re: Added schema level support for publication.

From

Peter Smith

Date:

04 November 2021, 05:35:50

On Thu, Nov 4, 2021 at 3:24 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, Nov 4, 2021 at 5:41 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > FYI - I found a small problem with one of the new PublicationObjSpec
> > parser error messages that was introduced by the recent schema
> > publication commit [1].
> >
> > The error message text is assuming that the error originates from
> > CREATE PUBLICATION, but actually that same error can also come from
> > ALTER PUBLICATION.
> >
> > For example,
> >
> > e.g.1) Here the error came from CREATE PUBLICATION, so the message
> > text looks OK.
> >
> > test_pub=# CREATE PUBLICATION p1 FOR t1;
> > 2021-11-04 10:50:17.208 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> > SCHEMA should be specified before the table/schema name(s) at
> > character 27
> > 2021-11-04 10:50:17.208 AEDT [738] STATEMENT:  CREATE PUBLICATION p1 FOR t1;
> > ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> > the table/schema name(s)
> > LINE 1: CREATE PUBLICATION p1 FOR t1;
> >                                   ^
> > ~~
> >
> > e.g.2) Here the error came from ALTER PUBLICATION, so the message text
> > is not OK because the ALTER syntax [2] does not even have a FOR
> > keyword.
> >
> > test_pub=# ALTER PUBLICATION p1 SET t1;
> > 2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> > SCHEMA should be specified before the table/schema name(s) at
> > character 26
> > 2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1 SET t1;
> > ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> > the table/schema name(s)
> > LINE 1: ALTER PUBLICATION p1 SET t1;
> >                                  ^
>
> Thanks for the comment, I changed the error message to remove the FOR
> keyword. The attached patch has the changes for the same.
>

LGTM.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

RE: Added schema level support for publication.

From

"houzj.fnst@fujitsu.com"

Date:

05 November 2021, 07:44:56

> On Thu, Nov 4, 2021 at 3:24 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Thu, Nov 4, 2021 at 5:41 AM Peter Smith <smithpb2250@gmail.com>
> wrote:
> > >
> > > FYI - I found a small problem with one of the new PublicationObjSpec
> > > parser error messages that was introduced by the recent schema
> > > publication commit [1].
> > >
> > > The error message text is assuming that the error originates from
> > > CREATE PUBLICATION, but actually that same error can also come from
> > > ALTER PUBLICATION.
> > >
> > > For example,
> > >
> > > e.g.1) Here the error came from CREATE PUBLICATION, so the message
> > > text looks OK.
> > >
> > > test_pub=# CREATE PUBLICATION p1 FOR t1;
> > > 2021-11-04 10:50:17.208 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES
> > > IN SCHEMA should be specified before the table/schema name(s) at
> > > character 27
> > > 2021-11-04 10:50:17.208 AEDT [738] STATEMENT:  CREATE PUBLICATION p1
> > > FOR t1;
> > > ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified
> > > before the table/schema name(s) LINE 1: CREATE PUBLICATION p1 FOR
> > > t1;
> > >                                   ^
> > > ~~
> > >
> > > e.g.2) Here the error came from ALTER PUBLICATION, so the message
> > > text is not OK because the ALTER syntax [2] does not even have a FOR
> > > keyword.
> > >
> > > test_pub=# ALTER PUBLICATION p1 SET t1;
> > > 2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES
> > > IN SCHEMA should be specified before the table/schema name(s) at
> > > character 26
> > > 2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1
> > > SET t1;
> > > ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified
> > > before the table/schema name(s) LINE 1: ALTER PUBLICATION p1 SET t1;
> > >                                  ^
> >
> > Thanks for the comment, I changed the error message to remove the FOR
> > keyword. The attached patch has the changes for the same.
> >
> 
> LGTM.

+1

Best regards,
Hou zj

Re: Added schema level support for publication.

From

Peter Smith

Date:

09 November 2021, 01:49:50

FYI -  I spotted a trivial SQL mistake (?) of the schema publication patch [1].

See the file describe.c, function describeOneTableDetails.
The new SQL has a 3rd UNION that looks like:

...
"UNION\n"
"SELECT pubname\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
oid, oid, oid, oid);

Notice that there is a table alias "p" but it is never used. It seems
to me like it is just an accidental omission. I think it should be
written like -

BEFORE:
"WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
AFTER:
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"

Doing this will make it consistent with the SQL of the nearby "else"
case which uses the same alias as expected.

------
[1] https://github.com/postgres/postgres/commit/5a2832465fd8984d089e8c44c094e6900d987fcd

Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Added schema level support for publication.

From

Amit Kapila

Date:

09 November 2021, 09:21:12

On Tue, Nov 9, 2021 at 7:20 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> FYI -  I spotted a trivial SQL mistake (?) of the schema publication patch [1].
>
> See the file describe.c, function describeOneTableDetails.
> The new SQL has a 3rd UNION that looks like:
>
> ...
> "UNION\n"
> "SELECT pubname\n"
> "FROM pg_catalog.pg_publication p\n"
> "WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> "ORDER BY 1;",
> oid, oid, oid, oid);
>
> Notice that there is a table alias "p" but it is never used. It seems
> to me like it is just an accidental omission. I think it should be
> written like -
>
> BEFORE:
> "WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> AFTER:
> "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
>
> Doing this will make it consistent with the SQL of the nearby "else"
> case which uses the same alias as expected.
>

The above makes sense to me. So, pushed a fix for this along with
Vignesh's patch to fix other comments related to this work.

-- 
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

vignesh C

Date:

16 November 2021, 04:44:26

On Tue, Nov 9, 2021 at 2:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Nov 9, 2021 at 7:20 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > FYI -  I spotted a trivial SQL mistake (?) of the schema publication patch [1].
> >
> > See the file describe.c, function describeOneTableDetails.
> > The new SQL has a 3rd UNION that looks like:
> >
> > ...
> > "UNION\n"
> > "SELECT pubname\n"
> > "FROM pg_catalog.pg_publication p\n"
> > "WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> > "ORDER BY 1;",
> > oid, oid, oid, oid);
> >
> > Notice that there is a table alias "p" but it is never used. It seems
> > to me like it is just an accidental omission. I think it should be
> > written like -
> >
> > BEFORE:
> > "WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> > AFTER:
> > "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> >
> > Doing this will make it consistent with the SQL of the nearby "else"
> > case which uses the same alias as expected.
> >
>
> The above makes sense to me. So, pushed a fix for this along with
> Vignesh's patch to fix other comments related to this work.

Thanks for committing the patch. I have changed the status of the
Commitfest entry for this patch to Committed. Adding of the view can
be handled once the Sequence patch is committed.

Regards,
Vignesh

Re: Added schema level support for publication.

From

Alvaro Herrera

Date:

10 December 2021, 00:54:18

I just noticed that this (commit 5a2832465fd8) added a separate catalog
to store schemas which are part of a publication, side-by-side with the
catalog to store relations which are part of a publication.  This seems
a strange way to represent publication membership: in order to find out
what objects are members of a publication, you have to scan both
pg_publication_rel and pg_publication_namespace.  Wouldn't it make more
sense to have a single catalog for both things, maybe something like

pg_publication_object
oid        OID    -- unique key (for pg_depend)
prpubid        OID    -- of pg_publication
prrelid        OID    -- OID of relation, or 0 if not a relation
prnspid        OID    -- OID of namespace, or 0 if not a namespace

which seems more natural to me, and pollutes the system less with weird
syscaches, etc.

What do you think?

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/

Re: Added schema level support for publication.

From

Amit Kapila

Date:

10 December 2021, 03:08:50

On Fri, Dec 10, 2021 at 6:24 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> I just noticed that this (commit 5a2832465fd8) added a separate catalog
> to store schemas which are part of a publication, side-by-side with the
> catalog to store relations which are part of a publication.  This seems
> a strange way to represent publication membership: in order to find out
> what objects are members of a publication, you have to scan both
> pg_publication_rel and pg_publication_namespace.  Wouldn't it make more
> sense to have a single catalog for both things, maybe something like
>
> pg_publication_object
> oid             OID     -- unique key (for pg_depend)
> prpubid         OID     -- of pg_publication
> prrelid         OID     -- OID of relation, or 0 if not a relation
> prnspid         OID     -- OID of namespace, or 0 if not a namespace
>
> which seems more natural to me, and pollutes the system less with weird
> syscaches, etc.
>

It will unnecessarily increase the size per row for FOR TABLE
publication both because of the additional column and additional index
(schema_id, pub_id) and vice versa. I think future projects (like
row_filter, column_filter, etc) will make this impact much bigger as
new columns for those won't be required for schema publications.
Basically, as soon as we start to store additional properties for
different objects, storing different objects together would start
becoming more and more worse. This will also in turn increase the scan
cost where we need only schema or rel publication access like where
ever we are using PUBLICATIONNAMESPACEMAP. I think it will increase
the cost of scanning table publications as its corresponding index
will be larger.

-- 
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

vignesh C

Date:

10 December 2021, 14:02:00

On Fri, Dec 10, 2021 at 6:24 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> I just noticed that this (commit 5a2832465fd8) added a separate catalog
> to store schemas which are part of a publication, side-by-side with the
> catalog to store relations which are part of a publication.  This seems
> a strange way to represent publication membership: in order to find out
> what objects are members of a publication, you have to scan both
> pg_publication_rel and pg_publication_namespace.  Wouldn't it make more
> sense to have a single catalog for both things, maybe something like
>
> pg_publication_object
> oid             OID     -- unique key (for pg_depend)
> prpubid         OID     -- of pg_publication
> prrelid         OID     -- OID of relation, or 0 if not a relation
> prnspid         OID     -- OID of namespace, or 0 if not a namespace
>
> which seems more natural to me, and pollutes the system less with weird
> syscaches, etc.
>
> What do you think?

I felt the existing tables are better normalized than the proposed one.

Regards,
Vignesh

Re: Added schema level support for publication.

From

Tom Lane

Date:

14 December 2024, 02:57:52

[ reviving one aspect of an old thread ]

vignesh C <vignesh21@gmail.com> writes:
> On Mon, Jul 19, 2021 at 9:32 AM tanghy.fnst@fujitsu.com <
> tanghy.fnst@fujitsu.com> wrote:
>> I tested your v12 patch and found a problem in the following case.
>>
>> Step 1:
>> postgres=# create schema s1;
>> CREATE SCHEMA
>> postgres=# create table s1.t1 (a int);
>> CREATE TABLE
>> postgres=# create publication pub_t for table s1.t1;
>> CREATE PUBLICATION
>> postgres=# create publication pub_s for schema s1;
>> CREATE PUBLICATION
>>
>> Step 2:
>> pg_dump -N s1
>>
>> I dumped and excluded schema s1, pg_dump generated the following SQL:
>> -------------------------------
>> ALTER PUBLICATION pub_s ADD SCHEMA s1;
>>
>> I think it was not expected because SQL like "ALTER PUBLICATION pub_t ADD
> TABLE s1.t1" was not generated in my case. Thoughts?

> Thanks for reporting this issue, this issue is fixed in the v13 patch

I suppose this exchange is what led to this bit in
getPublicationNamespaces:

        /*
         * We always dump publication namespaces unless the corresponding
         * namespace is excluded from the dump.
         */
        if (nspinfo->dobj.dump == DUMP_COMPONENT_NONE)
            continue;

I'd like to push back against this on three separate grounds:

1. The behavior this produces is extremely non-obvious and not
adequately explained by the comment, which makes one wonder how
much of it was intended.  For example:

* The public schema will be included if listed in FOR ALL TABLES IN,
even though it's not dumped explicitly in the dump, because its dump
mask includes other bits besides DUMP_COMPONENT_DEFINITION.  OK, that
was probably intentional, but you wouldn't know it from the comment.

* Schemas created within extensions will be included if listed in FOR
ALL TABLES IN, even though they're not dumped explicitly in the dump.
This seems like a quite accidental by-product of the fact that
checkExtensionMembership will set DUMP_COMPONENT_ACL on extension
member objects, thus making their dump mask not NONE.  If this
behavior was intentional, it needs a less-fragile implementation.

* The information_schema will NOT be included, even if it was listed in
FOR ALL TABLES IN.  Admittedly, information_schema doesn't normally
contain any tables that'd be useful to publish.  But still, this seems
like randomly ignoring the user's intent.

2. The complaint was that if a schema is excluded from the dump
by --exclude-schema, then it should not get included in the
publication either.  I think this is at best highly debatable:
arguably it amounts to breaking the publication.  It seems
analogous to deciding that if a function is excluded from the
dump, while a view using the function is included, we should
silently adjust the view by removing the output columns or
WHERE clauses that use the function.  I'm pretty sure that
nobody would think that was sane.  Perhaps there's a case for
excluding the view as a whole, but we don't do that.  Besides, the
corresponding behavior would be to exclude the whole publication,
not silently modify its definition.

3. The corresponding test for individual tables listed in
a publication is coded differently:

        /*
         * Ignore publication membership of tables whose definitions are not
         * to be dumped.
         */
        if (!(tbinfo->dobj.dump & DUMP_COMPONENT_DEFINITION))
            continue;

This is considerably easier to understand the effects of than a test
on the whole dump mask: it will list the table if we intend to emit
CREATE TABLE, and not otherwise, regardless of side issues like ACLs.
But why is it different from the code for schemas?

So I think that this was just wrongly thought through.  My
preference would be to either delete the above-quoted bit in
getPublicationNamespaces entirely, or make it look like the
test in getPublicationTables.  Or maybe we should delete
both of these tests, on the grounds that redefining the
contents of the publication is far outside pg_dump's charter.

BTW, the discussion that caused me to notice this is at [1].
I'd come to the conclusion that doing something on the basis of
"dobj->dump == DUMP_COMPONENT_NONE" is probably an anti-pattern.

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/CAKNkYnwXFBf136%3Du9UqUxFUVagevLQJ%3DzGd5BsLhCsatDvQsKQ%40mail.gmail.com

Re: Added schema level support for publication.

From

Amit Kapila

Date:

16 December 2024, 08:57:34

On Sat, Dec 14, 2024 at 5:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I suppose this exchange is what led to this bit in
> getPublicationNamespaces:
>
>         /*
>          * We always dump publication namespaces unless the corresponding
>          * namespace is excluded from the dump.
>          */
>         if (nspinfo->dobj.dump == DUMP_COMPONENT_NONE)
>             continue;
>
> I'd like to push back against this on three separate grounds:
>
>
> 1. The behavior this produces is extremely non-obvious and not
> adequately explained by the comment, which makes one wonder how
> much of it was intended.  For example:
>
> * The public schema will be included if listed in FOR ALL TABLES IN,
> even though it's not dumped explicitly in the dump, because its dump
> mask includes other bits besides DUMP_COMPONENT_DEFINITION.  OK, that
> was probably intentional, but you wouldn't know it from the comment.
>
> * Schemas created within extensions will be included if listed in FOR
> ALL TABLES IN, even though they're not dumped explicitly in the dump.
> This seems like a quite accidental by-product of the fact that
> checkExtensionMembership will set DUMP_COMPONENT_ACL on extension
> member objects, thus making their dump mask not NONE.  If this
> behavior was intentional, it needs a less-fragile implementation.
>
> * The information_schema will NOT be included, even if it was listed in
> FOR ALL TABLES IN.  Admittedly, information_schema doesn't normally
> contain any tables that'd be useful to publish.  But still, this seems
> like randomly ignoring the user's intent.
>
>
> 2. The complaint was that if a schema is excluded from the dump
> by --exclude-schema, then it should not get included in the
> publication either.  I think this is at best highly debatable:
> arguably it amounts to breaking the publication.  It seems
> analogous to deciding that if a function is excluded from the
> dump, while a view using the function is included, we should
> silently adjust the view by removing the output columns or
> WHERE clauses that use the function.  I'm pretty sure that
> nobody would think that was sane.  Perhaps there's a case for
> excluding the view as a whole, but we don't do that.  Besides, the
> corresponding behavior would be to exclude the whole publication,
> not silently modify its definition.
>
>
> 3. The corresponding test for individual tables listed in
> a publication is coded differently:
>
>         /*
>          * Ignore publication membership of tables whose definitions are not
>          * to be dumped.
>          */
>         if (!(tbinfo->dobj.dump & DUMP_COMPONENT_DEFINITION))
>             continue;
>
> This is considerably easier to understand the effects of than a test
> on the whole dump mask: it will list the table if we intend to emit
> CREATE TABLE, and not otherwise, regardless of side issues like ACLs.
> But why is it different from the code for schemas?
>
>
> So I think that this was just wrongly thought through.  My
> preference would be to either delete the above-quoted bit in
> getPublicationNamespaces entirely, or make it look like the
> test in getPublicationTables.  Or maybe we should delete
> both of these tests, on the grounds that redefining the
> contents of the publication is far outside pg_dump's charter.
>

I see a merit in your second suggestion which is to delete these tests
in getPublicationTables() and getPublicationNamespaces() because we
follow similar behavior in the somewhat related subscription case as
well. When a subscription points to a set of publications and we use
'--no-publications' option in pg_dump, it still dumps the
subscription. I tried it with the following test:

Publisher:
postgres=# create schema s1;
CREATE SCHEMA
postgres=# create table t1(c1 int);
CREATE TABLE
postgres=# create publication pub1 for table t1;
CREATE PUBLICATION
postgres=# create publication pub2 for tables in schema s1;
CREATE PUBLICATION

Subscriber:
postgres=# create table t1 (c1 int);
CREATE TABLE
postgres=# create publication pub_on_sub_1 for table t1;
CREATE PUBLICATION
postgres=# create subscription sub1 connection 'dbname = postgres'
publication pub1, pub2;
NOTICE:  created replication slot "sub1" on publisher
CREATE SUBSCRIPTION

Now when I performed the dump with '--no-publications' option on the
subscriber node, it didn't include publications which is expected but
did include a subscription definition pointing to the publications as
defined originally.

--
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

vignesh C

Date:

16 December 2024, 14:04:45

On Sat, 14 Dec 2024 at 05:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> [ reviving one aspect of an old thread ]
>
> vignesh C <vignesh21@gmail.com> writes:
> > On Mon, Jul 19, 2021 at 9:32 AM tanghy.fnst@fujitsu.com <
> > tanghy.fnst@fujitsu.com> wrote:
> >> I tested your v12 patch and found a problem in the following case.
> >>
> >> Step 1:
> >> postgres=# create schema s1;
> >> CREATE SCHEMA
> >> postgres=# create table s1.t1 (a int);
> >> CREATE TABLE
> >> postgres=# create publication pub_t for table s1.t1;
> >> CREATE PUBLICATION
> >> postgres=# create publication pub_s for schema s1;
> >> CREATE PUBLICATION
> >>
> >> Step 2:
> >> pg_dump -N s1
> >>
> >> I dumped and excluded schema s1, pg_dump generated the following SQL:
> >> -------------------------------
> >> ALTER PUBLICATION pub_s ADD SCHEMA s1;
> >>
> >> I think it was not expected because SQL like "ALTER PUBLICATION pub_t ADD
> > TABLE s1.t1" was not generated in my case. Thoughts?
>
> > Thanks for reporting this issue, this issue is fixed in the v13 patch
>
> I suppose this exchange is what led to this bit in
> getPublicationNamespaces:
>
>         /*
>          * We always dump publication namespaces unless the corresponding
>          * namespace is excluded from the dump.
>          */
>         if (nspinfo->dobj.dump == DUMP_COMPONENT_NONE)
>             continue;
>
> I'd like to push back against this on three separate grounds:
>
>
> 1. The behavior this produces is extremely non-obvious and not
> adequately explained by the comment, which makes one wonder how
> much of it was intended.  For example:
>
> * The public schema will be included if listed in FOR ALL TABLES IN,
> even though it's not dumped explicitly in the dump, because its dump
> mask includes other bits besides DUMP_COMPONENT_DEFINITION.  OK, that
> was probably intentional, but you wouldn't know it from the comment.
>
> * Schemas created within extensions will be included if listed in FOR
> ALL TABLES IN, even though they're not dumped explicitly in the dump.
> This seems like a quite accidental by-product of the fact that
> checkExtensionMembership will set DUMP_COMPONENT_ACL on extension
> member objects, thus making their dump mask not NONE.  If this
> behavior was intentional, it needs a less-fragile implementation.
>
> * The information_schema will NOT be included, even if it was listed in
> FOR ALL TABLES IN.  Admittedly, information_schema doesn't normally
> contain any tables that'd be useful to publish.  But still, this seems
> like randomly ignoring the user's intent.
>
>
> 2. The complaint was that if a schema is excluded from the dump
> by --exclude-schema, then it should not get included in the
> publication either.  I think this is at best highly debatable:
> arguably it amounts to breaking the publication.  It seems
> analogous to deciding that if a function is excluded from the
> dump, while a view using the function is included, we should
> silently adjust the view by removing the output columns or
> WHERE clauses that use the function.  I'm pretty sure that
> nobody would think that was sane.  Perhaps there's a case for
> excluding the view as a whole, but we don't do that.  Besides, the
> corresponding behavior would be to exclude the whole publication,
> not silently modify its definition.
>
>
> 3. The corresponding test for individual tables listed in
> a publication is coded differently:
>
>         /*
>          * Ignore publication membership of tables whose definitions are not
>          * to be dumped.
>          */
>         if (!(tbinfo->dobj.dump & DUMP_COMPONENT_DEFINITION))
>             continue;
>
> This is considerably easier to understand the effects of than a test
> on the whole dump mask: it will list the table if we intend to emit
> CREATE TABLE, and not otherwise, regardless of side issues like ACLs.
> But why is it different from the code for schemas?
>
>
> So I think that this was just wrongly thought through.  My
> preference would be to either delete the above-quoted bit in
> getPublicationNamespaces entirely, or make it look like the
> test in getPublicationTables.  Or maybe we should delete
> both of these tests, on the grounds that redefining the
> contents of the publication is far outside pg_dump's charter.

We cannot keep the code identical for getPublicationNamespaces and
getPublicationTables because selectDumpableNamespace performs special
handling for the public schema. Specifically, it unsets
DUMP_COMPONENT_DEFINITION for the public namespace, which prevents the
inclusion of 'TABLES IN SCHEMA public' in the publication. That is the
reason we did not keep the code similar to getPublicationTables.
I prefer the other approach to remove both the checks in
getPublicationTables() and getPublicationNamespaces() which also makes
it consistent with the other case that Amit mentioned at [1].

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BZYanA51c9NzKM31AqJSw-j0-edGz91%2BVh-nsoKdzKfQ%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From

Artur Zakirov

Date:

16 December 2024, 14:51:20

On Mon, 16 Dec 2024 at 12:05, vignesh C <vignesh21@gmail.com> wrote:
> I prefer the other approach to remove both the checks in
> getPublicationTables() and getPublicationNamespaces() which also makes
> it consistent with the other case that Amit mentioned at [1].

If I understand your suggestion correctly I think this will break the
"--exclude-schema" option of pg_dump. That change will dump all
mappings between publications and schemas for publications which are
dumped.

That solves the issue with special schemas, but restore will fail if
some schemas were explicitly excluded. pg_dump will include in the
dump ALTER PUBLICATION <pub> ADD TABLES IN SCHEMA <schema> even for
those schemas which are not created during restore.

I think getPublicationNamespaces() should handle both special schemas
and excluded schemas, which can make the code a bit more complex
though.

-- 
Kind regards,
Artur

Re: Added schema level support for publication.

From

vignesh C

Date:

17 December 2024, 12:43:08

On Mon, 16 Dec 2024 at 17:21, Artur Zakirov <zaartur@gmail.com> wrote:
>
> On Mon, 16 Dec 2024 at 12:05, vignesh C <vignesh21@gmail.com> wrote:
> > I prefer the other approach to remove both the checks in
> > getPublicationTables() and getPublicationNamespaces() which also makes
> > it consistent with the other case that Amit mentioned at [1].
>
> If I understand your suggestion correctly I think this will break the
> "--exclude-schema" option of pg_dump. That change will dump all
> mappings between publications and schemas for publications which are
> dumped.
>
> That solves the issue with special schemas, but restore will fail if
> some schemas were explicitly excluded. pg_dump will include in the
> dump ALTER PUBLICATION <pub> ADD TABLES IN SCHEMA <schema> even for
> those schemas which are not created during restore.

This is already the case in the existing implementation, so users
should not be surprised by the proposed change.
This can be reproduced with the following steps:
-- Create schema and user defined function in schema sch2
create schema sch2;
CREATE FUNCTION sch2.add1(integer, integer)
RETURNS integer
LANGUAGE sql IMMUTABLE STRICT
AS $_$select $1 + $2;$_$;

-- Create a view which references user defined function of a different schema
create schema sch1;
CREATE TABLE sch1.t1 (c1 integer, c2 integer);
CREATE VIEW sch1.v1 AS SELECT c1 FROM sch1.t1 WHERE (sch2.add1(c1, c2) >= 10);

-- Exclude schema sch2 which has the user defined function while dumping
./pg_dump -d postgres -N sch2

You will notice that the schema sch2 and the user defined function in
schema sch2 will not be dumped.

Regards,
Vignesh

Re: Added schema level support for publication.

From

Artur Zakirov

Date:

18 December 2024, 14:04:41

On Tue, 17 Dec 2024 at 10:43, vignesh C <vignesh21@gmail.com> wrote:
> > If I understand your suggestion correctly I think this will break the
> > "--exclude-schema" option of pg_dump. That change will dump all
> > mappings between publications and schemas for publications which are
> > dumped.
> >
> > That solves the issue with special schemas, but restore will fail if
> > some schemas were explicitly excluded. pg_dump will include in the
> > dump ALTER PUBLICATION <pub> ADD TABLES IN SCHEMA <schema> even for
> > those schemas which are not created during restore.
>
> This is already the case in the existing implementation, so users
> should not be surprised by the proposed change.

Currently the behavior isn't the same as the proposed change.

Sorry, I might have been not clear when I described what might be
wrong with this. Here is the example with the proposed patch [1].

Create necessary objects to test:

    create schema nsp;
    create publication pub for tables in schema nsp;

If you run pg_dump excluding the schema "nsp":

    pg_dump -d postgres -U postgres -f backup --exclude-schema=nsp

In the resulting file "backup" you will have:

    ...
    ALTER PUBLICATION pub ADD TABLES IN SCHEMA nsp;
    ...

which you won't have on the current master. And I think this is not
what users might expect and it can break some of the scenarios because
during restore they will have an error:

    ERROR:  schema "nsp" does not exist

1. https://www.postgresql.org/message-id/CALDaNm1TQqBC5ZP5BsNf2LKVu1kEJNJn2spFwbAtyLn1FoAFGQ%40mail.gmail.com

-- 
Kind regards,
Artur

Re: Added schema level support for publication.

From

vignesh C

Date:

18 December 2024, 21:32:34

On Wed, 18 Dec 2024 at 16:34, Artur Zakirov <zaartur@gmail.com> wrote:
>
> On Tue, 17 Dec 2024 at 10:43, vignesh C <vignesh21@gmail.com> wrote:
> > > If I understand your suggestion correctly I think this will break the
> > > "--exclude-schema" option of pg_dump. That change will dump all
> > > mappings between publications and schemas for publications which are
> > > dumped.
> > >
> > > That solves the issue with special schemas, but restore will fail if
> > > some schemas were explicitly excluded. pg_dump will include in the
> > > dump ALTER PUBLICATION <pub> ADD TABLES IN SCHEMA <schema> even for
> > > those schemas which are not created during restore.
> >
> > This is already the case in the existing implementation, so users
> > should not be surprised by the proposed change.
>
> Currently the behavior isn't the same as the proposed change.
>
> Sorry, I might have been not clear when I described what might be
> wrong with this. Here is the example with the proposed patch [1].
>
> Create necessary objects to test:
>
>     create schema nsp;
>     create publication pub for tables in schema nsp;
>
> If you run pg_dump excluding the schema "nsp":
>
>     pg_dump -d postgres -U postgres -f backup --exclude-schema=nsp
>
> In the resulting file "backup" you will have:
>
>     ...
>     ALTER PUBLICATION pub ADD TABLES IN SCHEMA nsp;
>     ...
>
> which you won't have on the current master. And I think this is not
> what users might expect and it can break some of the scenarios because
> during restore they will have an error:
>
>     ERROR:  schema "nsp" does not exist

Yes, this is done intentionally in the proposed patch to keep it
consistent with other scenarios in HEAD.
For example, consider the following case:
-- Create schema and user defined function in schema sch2
create schema sch2;
CREATE FUNCTION sch2.add1(integer, integer)
RETURNS integer
LANGUAGE sql IMMUTABLE STRICT
AS $_$select $1 + $2;$_$;

-- Create a view which references user defined function of a different schema
create schema sch1;
CREATE TABLE sch1.t1 (c1 integer, c2 integer);
CREATE VIEW sch1.v1 AS SELECT c1 FROM sch1.t1 WHERE (sch2.add1(c1, c2) >= 10);

-- Exclude schema sch2 which has the user defined function while dumping
./pg_dump -d postgres -Fc -f dump1 -N sch2

The dump file has the reference to sch2.add1 even though sch2 schema
was excluded, dump will not have the user defined functions defined in
schema sch2:
CREATE VIEW sch1.v1 AS
 SELECT c1
   FROM sch1.t1
  WHERE (sch2.add1(c1, c2) >= 10);

Restore using the above dump that was generated will fail with the below error:
./pg_restore -d test1 dump1
pg_restore: error: could not execute query: ERROR:  schema "sch2" does not exist
LINE 4:   WHERE (sch2.add1(c1, c2) >= 10);
                 ^
Command was: CREATE VIEW sch1.v1 AS
 SELECT c1
   FROM sch1.t1
  WHERE (sch2.add1(c1, c2) >= 10);

The proposed patch is in similar lines.

Regards,
Vignesh

Re: Added schema level support for publication.

From

Amit Kapila

Date:

20 December 2024, 09:04:08

On Thu, Dec 19, 2024 at 12:02 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, 18 Dec 2024 at 16:34, Artur Zakirov <zaartur@gmail.com> wrote:
> >
> > On Tue, 17 Dec 2024 at 10:43, vignesh C <vignesh21@gmail.com> wrote:
> > > > If I understand your suggestion correctly I think this will break the
> > > > "--exclude-schema" option of pg_dump. That change will dump all
> > > > mappings between publications and schemas for publications which are
> > > > dumped.
> > > >
> > > > That solves the issue with special schemas, but restore will fail if
> > > > some schemas were explicitly excluded. pg_dump will include in the
> > > > dump ALTER PUBLICATION <pub> ADD TABLES IN SCHEMA <schema> even for
> > > > those schemas which are not created during restore.
> > >
> > > This is already the case in the existing implementation, so users
> > > should not be surprised by the proposed change.
> >
> > Currently the behavior isn't the same as the proposed change.
> >
> > Sorry, I might have been not clear when I described what might be
> > wrong with this. Here is the example with the proposed patch [1].
> >
> > Create necessary objects to test:
> >
> >     create schema nsp;
> >     create publication pub for tables in schema nsp;
> >
> > If you run pg_dump excluding the schema "nsp":
> >
> >     pg_dump -d postgres -U postgres -f backup --exclude-schema=nsp
> >
> > In the resulting file "backup" you will have:
> >
> >     ...
> >     ALTER PUBLICATION pub ADD TABLES IN SCHEMA nsp;
> >     ...
> >
> > which you won't have on the current master. And I think this is not
> > what users might expect and it can break some of the scenarios because
> > during restore they will have an error:
> >
> >     ERROR:  schema "nsp" does not exist
>
> Yes, this is done intentionally in the proposed patch to keep it
> consistent with other scenarios in HEAD.
> For example, consider the following case:
> -- Create schema and user defined function in schema sch2
> create schema sch2;
> CREATE FUNCTION sch2.add1(integer, integer)
> RETURNS integer
> LANGUAGE sql IMMUTABLE STRICT
> AS $_$select $1 + $2;$_$;
>
> -- Create a view which references user defined function of a different schema
> create schema sch1;
> CREATE TABLE sch1.t1 (c1 integer, c2 integer);
> CREATE VIEW sch1.v1 AS SELECT c1 FROM sch1.t1 WHERE (sch2.add1(c1, c2) >= 10);
>
> -- Exclude schema sch2 which has the user defined function while dumping
> ./pg_dump -d postgres -Fc -f dump1 -N sch2
>
> The dump file has the reference to sch2.add1 even though sch2 schema
> was excluded, dump will not have the user defined functions defined in
> schema sch2:
> CREATE VIEW sch1.v1 AS
>  SELECT c1
>    FROM sch1.t1
>   WHERE (sch2.add1(c1, c2) >= 10);
>
> Restore using the above dump that was generated will fail with the below error:
> ./pg_restore -d test1 dump1
> pg_restore: error: could not execute query: ERROR:  schema "sch2" does not exist
> LINE 4:   WHERE (sch2.add1(c1, c2) >= 10);
>                  ^
> Command was: CREATE VIEW sch1.v1 AS
>  SELECT c1
>    FROM sch1.t1
>   WHERE (sch2.add1(c1, c2) >= 10);
>
> The proposed patch is in similar lines.
>

I agree with the proposed patch and this behavior. It is on the lines
of what Tom proposed as one of the ways to address the issue raised in
his initial email. Also, it follows the existing behavior in cases
like view<->function dependency as shown by you in this email, and to
some extent subscription<->publication dependency as shown in email
[1].

Let's wait till the beginning of the next CF to see if there are other
suggestions or any arguments against this proposed change.

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BZYanA51c9NzKM31AqJSw-j0-edGz91%2BVh-nsoKdzKfQ%40mail.gmail.com
--
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

Amit Kapila

Date:

18 February, 12:26:07

On Fri, Dec 20, 2024 at 11:34 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I agree with the proposed patch and this behavior. It is on the lines
> of what Tom proposed as one of the ways to address the issue raised in
> his initial email. Also, it follows the existing behavior in cases
> like view<->function dependency as shown by you in this email, and to
> some extent subscription<->publication dependency as shown in email
> [1].
>
> Let's wait till the beginning of the next CF to see if there are other
> suggestions or any arguments against this proposed change.
>

I am planning to push Vignesh's proposed patch in the email (1) in a
day or two unless there are objections or any other comments on the
patch.

(1) - https://www.postgresql.org/message-id/CALDaNm1TQqBC5ZP5BsNf2LKVu1kEJNJn2spFwbAtyLn1FoAFGQ%40mail.gmail.com

--
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

Amit Kapila

Date:

19 February, 13:26:55

On Wed, Dec 18, 2024 at 12:51 PM vignesh C <vignesh21@gmail.com> wrote:
>
> The attached patch has the changes for the same.
>

@@ -3037,10 +3037,6 @@ my %tests = (
  \QALTER PUBLICATION pub1 ADD TABLE ONLY dump_test.test_table;\E
  /xm,
  like => { %full_runs, section_post_data => 1, },
- unlike => {
- exclude_dump_test_schema => 1,
- exclude_test_table => 1,
- },

Shall we change the above 'unlike' to 'like' instead of removing it
with an additional comment as to why we expect a different behavior
here?

--
With Regards,
Amit Kapila.

Re: Added schema level support for publication.

From

vignesh C

Date:

19 February, 15:48:15

On Wed, 19 Feb 2025 at 15:57, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Dec 18, 2024 at 12:51 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > The attached patch has the changes for the same.
> >
>
> @@ -3037,10 +3037,6 @@ my %tests = (
>   \QALTER PUBLICATION pub1 ADD TABLE ONLY dump_test.test_table;\E
>   /xm,
>   like => { %full_runs, section_post_data => 1, },
> - unlike => {
> - exclude_dump_test_schema => 1,
> - exclude_test_table => 1,
> - },
>
> Shall we change the above 'unlike' to 'like' instead of removing it
> with an additional comment as to why we expect a different behavior
> here?

The updated v2 version patch has the changes for the same.

Regards,
Vignesh

Attachment

v2-0001-Include-information_schema-publications-and-exclu.patch