Thread: Added schema level support for publication.

Added schema level support for publication.

From
vignesh C
Date:
Hi,

This feature adds schema option while creating publication. Users will
be able to specify one or more schemas while creating publication,
when the user specifies schema option, then the data changes for the
tables present in the schema specified by the user will be replicated
to the subscriber. Few examples have been listed below:

Create a publication that publishes all changes for all the tables
present in production schema:
CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;

Create a publication that publishes all changes for all the tables
present in marketing and sales schemas:
CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales;

Add some schemas to the publication:
ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june;

Drop some schema from the publication:
ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july;

Attached is a POC patch for the same. I felt this feature would be quite useful.
Thoughts?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

Attachment

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Thu, Jan 7, 2021 at 10:03 PM vignesh C <vignesh21@gmail.com> wrote:
>
> This feature adds schema option while creating publication. Users will
> be able to specify one or more schemas while creating publication,
> when the user specifies schema option, then the data changes for the
> tables present in the schema specified by the user will be replicated
> to the subscriber. Few examples have been listed below:
>
> Create a publication that publishes all changes for all the tables
> present in production schema:
> CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;
>
> Create a publication that publishes all changes for all the tables
> present in marketing and sales schemas:
> CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales;
>
> Add some schemas to the publication:
> ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june;
>
> Drop some schema from the publication:
> ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july;
>
> Attached is a POC patch for the same. I felt this feature would be quite useful.
>

What do we do if the user Drops the schema? Do we automatically remove
it from the publication?

I see some use of such a feature but you haven't described the use
case or how did you arrive at the conclusion that it would be quite
useful?

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Jan 8, 2021 at 4:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jan 7, 2021 at 10:03 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > This feature adds schema option while creating publication. Users will
> > be able to specify one or more schemas while creating publication,
> > when the user specifies schema option, then the data changes for the
> > tables present in the schema specified by the user will be replicated
> > to the subscriber. Few examples have been listed below:
> >
> > Create a publication that publishes all changes for all the tables
> > present in production schema:
> > CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;
> >
> > Create a publication that publishes all changes for all the tables
> > present in marketing and sales schemas:
> > CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales;
> >
> > Add some schemas to the publication:
> > ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june;
> >
> > Drop some schema from the publication:
> > ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july;
> >
> > Attached is a POC patch for the same. I felt this feature would be quite useful.
> >
>
> What do we do if the user Drops the schema? Do we automatically remove
> it from the publication?
>
I have not yet handled this scenario yet, I will handle this and
adding of tests in the next patch.

> I see some use of such a feature but you haven't described the use
> case or how did you arrive at the conclusion that it would be quite
> useful?
>
Currently there are a couple of options "FOR All TABLES" and "FOR
TABLE" when a user creates a publication, 1) either to subscribe to
the changes of all the tables or 2) subscribe to a few tables. There
is no option for users to subscribe to relations present in the
schemas. User has to manually identify the list of tables present in
the schema and specify the list of tables in that schema using the
"FOR TABLE" option. Similarly if a user wants to subscribe to n number
of schemas, the user has to do this for the required schemas, this is
a tedious process. This feature helps the user to take care of this
internally using schema option.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com



Re: Added schema level support for publication.

From
Bharath Rupireddy
Date:
On Sat, Jan 9, 2021 at 5:21 PM vignesh C <vignesh21@gmail.com> wrote:
> > What do we do if the user Drops the schema? Do we automatically remove
> > it from the publication?
> >
> I have not yet handled this scenario yet, I will handle this and
> adding of tests in the next patch.
>
> > I see some use of such a feature but you haven't described the use
> > case or how did you arrive at the conclusion that it would be quite
> > useful?
> >
> Currently there are a couple of options "FOR All TABLES" and "FOR
> TABLE" when a user creates a publication, 1) either to subscribe to
> the changes of all the tables or 2) subscribe to a few tables. There
> is no option for users to subscribe to relations present in the
> schemas. User has to manually identify the list of tables present in
> the schema and specify the list of tables in that schema using the
> "FOR TABLE" option. Similarly if a user wants to subscribe to n number
> of schemas, the user has to do this for the required schemas, this is
> a tedious process. This feature helps the user to take care of this
> internally using schema option.

I think this feature can be useful, in case a user has a lot of tables
to publish inside a schema. Having said that, I wonder if this feature
mandates users to create the same schema with same
permissions/authorizations manually on the subscriber, because logical
replication doesn't propagate any ddl's so are the schema or schema
changes? Or is it that the list of tables from the publisher can go
into a different schema on the subscriber?

Since the schema can have other objects such as data types, functions,
operators, I'm sure with your feature, non-table objects will be
skipped.

As Amit pointed out earlier, the behaviour when schema dropped, I
think we should also consider when schema is altered, say altered to a
different name, maybe we should change that in the publication too.

In general, what happens if we have some temporary tables or foreign
tables inside the schema, will they be allowed to send the data to
subscribers?

And, with this feature, since there can be many huge tables inside a
schema, the initial table sync phase of the replication can take a
while.

Say a user has created a publication for a schema with hundreds of
tables in it, at some point later, can he stop replicating a single or
some tables from that schema?

IMO, it's better to have the syntax - CREATE PUBLICATION
production_publication FOR ALL TABLES IN SCHEMA production - just
added IN between for all tables and schema.

Say a user has a schema with 121 tables in it, and wants to replicate
only 120 or 199 or even lesser tables out of it, so can we have some
skip option to the new syntax, something like below?
CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA
production WITH skip = marketing, accounts, sales;  --> meaning is,
replicate all the tables in the schema production except marketing,
accounts, sales tables.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Added schema level support for publication.

From
Bharath Rupireddy
Date:
On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> I think this feature can be useful, in case a user has a lot of tables
> to publish inside a schema. Having said that, I wonder if this feature
> mandates users to create the same schema with same
> permissions/authorizations manually on the subscriber, because logical
> replication doesn't propagate any ddl's so are the schema or schema
> changes? Or is it that the list of tables from the publisher can go
> into a different schema on the subscriber?
>
> Since the schema can have other objects such as data types, functions,
> operators, I'm sure with your feature, non-table objects will be
> skipped.
>
> As Amit pointed out earlier, the behaviour when schema dropped, I
> think we should also consider when schema is altered, say altered to a
> different name, maybe we should change that in the publication too.
>
> In general, what happens if we have some temporary tables or foreign
> tables inside the schema, will they be allowed to send the data to
> subscribers?
>
> And, with this feature, since there can be many huge tables inside a
> schema, the initial table sync phase of the replication can take a
> while.
>
> Say a user has created a publication for a schema with hundreds of
> tables in it, at some point later, can he stop replicating a single or
> some tables from that schema?
>
> IMO, it's better to have the syntax - CREATE PUBLICATION
> production_publication FOR ALL TABLES IN SCHEMA production - just
> added IN between for all tables and schema.
>
> Say a user has a schema with 121 tables in it, and wants to replicate
> only 120 or 199 or even lesser tables out of it, so can we have some
> skip option to the new syntax, something like below?
> CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA
> production WITH skip = marketing, accounts, sales;  --> meaning is,
> replicate all the tables in the schema production except marketing,
> accounts, sales tables.

One more point - if the publication is created for a schema with no or
some initial tables, will all the future tables that may get added to
the schema will be replicated too?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Added schema level support for publication.

From
Dilip Kumar
Date:
On Sat, Jan 9, 2021 at 8:14 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> > I think this feature can be useful, in case a user has a lot of tables
> > to publish inside a schema. Having said that, I wonder if this feature
> > mandates users to create the same schema with same
> > permissions/authorizations manually on the subscriber, because logical
> > replication doesn't propagate any ddl's so are the schema or schema
> > changes? Or is it that the list of tables from the publisher can go
> > into a different schema on the subscriber?
> >
> > Since the schema can have other objects such as data types, functions,
> > operators, I'm sure with your feature, non-table objects will be
> > skipped.
> >
> > As Amit pointed out earlier, the behaviour when schema dropped, I
> > think we should also consider when schema is altered, say altered to a
> > different name, maybe we should change that in the publication too.
> >
> > In general, what happens if we have some temporary tables or foreign
> > tables inside the schema, will they be allowed to send the data to
> > subscribers?
> >
> > And, with this feature, since there can be many huge tables inside a
> > schema, the initial table sync phase of the replication can take a
> > while.
> >
> > Say a user has created a publication for a schema with hundreds of
> > tables in it, at some point later, can he stop replicating a single or
> > some tables from that schema?
> >
> > IMO, it's better to have the syntax - CREATE PUBLICATION
> > production_publication FOR ALL TABLES IN SCHEMA production - just
> > added IN between for all tables and schema.
> >
> > Say a user has a schema with 121 tables in it, and wants to replicate
> > only 120 or 199 or even lesser tables out of it, so can we have some
> > skip option to the new syntax, something like below?
> > CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA
> > production WITH skip = marketing, accounts, sales;  --> meaning is,
> > replicate all the tables in the schema production except marketing,
> > accounts, sales tables.
>
> One more point - if the publication is created for a schema with no or
> some initial tables, will all the future tables that may get added to
> the schema will be replicated too?
>

I expect this should be the behavior.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Added schema level support for publication.

From
vignesh C
Date:
Thanks for your comments Bharath, please find my opinion below.

On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> I think this feature can be useful, in case a user has a lot of tables
> to publish inside a schema. Having said that, I wonder if this feature
> mandates users to create the same schema with same
> permissions/authorizations manually on the subscriber, because logical
> replication doesn't propagate any ddl's so are the schema or schema
> changes? Or is it that the list of tables from the publisher can go
> into a different schema on the subscriber?
>

DDL's will not be propagated to the subscriber. Users have to create
the schema & tables in the subscriber. No change in
Permissions/authorizations handling, it will be the same as the
existing behavior for relations.

> Since the schema can have other objects such as data types, functions,
> operators, I'm sure with your feature, non-table objects will be
> skipped.
>

Yes, only table data will be sent to subscribers, non-table objects
will be skipped.

> As Amit pointed out earlier, the behaviour when schema dropped, I
> think we should also consider when schema is altered, say altered to a
> different name, maybe we should change that in the publication too.
>

I agree that when schema is altered the renamed schema should be
reflected in the publication.

> In general, what happens if we have some temporary tables or foreign
> tables inside the schema, will they be allowed to send the data to
> subscribers?
>

Temporary tables & foreign tables will not be added to the publications.

> And, with this feature, since there can be many huge tables inside a
> schema, the initial table sync phase of the replication can take a
> while.
>

Yes this is required.

> Say a user has created a publication for a schema with hundreds of
> tables in it, at some point later, can he stop replicating a single or
> some tables from that schema?
>

There is no provision for this currently.

> IMO, it's better to have the syntax - CREATE PUBLICATION
> production_publication FOR ALL TABLES IN SCHEMA production - just
> added IN between for all tables and schema.
>

I'm ok with the proposed syntax, I would like others' opinion too
before making the change.

> Say a user has a schema with 121 tables in it, and wants to replicate
> only 120 or 199 or even lesser tables out of it, so can we have some
> skip option to the new syntax, something like below?
> CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA
> production WITH skip = marketing, accounts, sales;  --> meaning is,
> replicate all the tables in the schema production except marketing,
> accounts, sales tables.
>

Yes this is a good use case, will include this change.

Thanks for the comments, I will handle the comments and post a patch for this.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com



Re: Added schema level support for publication.

From
vignesh C
Date:
On Sat, Jan 9, 2021 at 8:14 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> One more point - if the publication is created for a schema with no or
> some initial tables, will all the future tables that may get added to
> the schema will be replicated too?
>

I agree on this, when a relation is added to the schema it should be
added to the publication.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com



Re: Added schema level support for publication.

From
Bharath Rupireddy
Date:
On Sun, Jan 10, 2021 at 11:21 PM vignesh C <vignesh21@gmail.com> wrote:
> On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> > I think this feature can be useful, in case a user has a lot of tables
> > to publish inside a schema. Having said that, I wonder if this feature
> > mandates users to create the same schema with same
> > permissions/authorizations manually on the subscriber, because logical
> > replication doesn't propagate any ddl's so are the schema or schema
> > changes? Or is it that the list of tables from the publisher can go
> > into a different schema on the subscriber?
> >
>
> DDL's will not be propagated to the subscriber. Users have to create
> the schema & tables in the subscriber. No change in
> Permissions/authorizations handling, it will be the same as the
> existing behavior for relations.

Looks like the existing behaviour already requires users to create the
schema on the subscriber when publishing the tables from that schema.
Otherwise, an error is thrown on the subscriber [1].

[1] on publisher:
CREATE SCHEMA myschema;
CREATE TABLE myschema.t1(a1 int, b1 int);
INSERT INTO myschema.t1_myschema SELECT i, i+10 FROM generate_series(1,10) i;
CREATE PUBLICATION testpub FOR TABLE myschema.t1;

on subscriber:
postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost
dbname=postgres user=bharath port=5432' PUBLICATION testpub;
ERROR:  schema "myschema" does not exist
CREATE SCHEMA myschema;
CREATE TABLE myschema.t1(a1 int, b1 int);
postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost
dbname=postgres user=bharath port=5432' PUBLICATION testpub;
NOTICE:  created replication slot "testsub" on publisher
CREATE SUBSCRIPTION

> > Since the schema can have other objects such as data types, functions,
> > operators, I'm sure with your feature, non-table objects will be
> > skipped.
> >
>
> Yes, only table data will be sent to subscribers, non-table objects
> will be skipped.

Looks like the existing CREATE PUBLICATION FOR ALL TABLES, which is
for all the tables in the database, does this i.e. skips non-table
objects and temporary tables, foreign tables and so on. So, your
feature also can behave the same way, but within the scope of the
given schema/s.

> > As Amit pointed out earlier, the behaviour when schema dropped, I
> > think we should also consider when schema is altered, say altered to a
> > different name, maybe we should change that in the publication too.
> >
>
> I agree that when schema is altered the renamed schema should be
> reflected in the publication.

I think, it's not only making sure that the publisher side has the new
altered schema, but also the subscriber needs those alters. Having
said that, since these alters come under DDL changes and in logical
replication we don't publish the scheme changes to the subscriber, we
may not need to anything extra for informing the schema alters to the
subscriber from the publisher, the users might have to do the same
schema alter on the subscriber and then a ALTER SUBSCRIPTION testsub
REFRESH PUBLICATION;  should work for them? If this understanding is
correct, then we should document this.

> > In general, what happens if we have some temporary tables or foreign
> > tables inside the schema, will they be allowed to send the data to
> > subscribers?
> >
>
> Temporary tables & foreign tables will not be added to the publications.

Yes the existing logical replication framework doesn't allow
replication of temporary, unlogged, foreign tables and other non-table
relations such as materialized views, indexes etc [1]. The CREATE
PUBLICATION statement either fails in check_publication_add_relation
or before that.

CREATE PUBLICATION testpub FOR TABLE tab1, throwing the error if the
single table tab1 is any of the above restricted tables, seems fine.
But, if there's a list of tables with CREATE PUBLICATION testpub FOR
TABLE normal_tab1, temp_tab2, normal_tab3, foreign_tab4,
unlogged_tab5, normal_tab6, normal_tab7 ......; This query fails on
first encounter of the restricted table, say at temp_tab2. Whereas,
CREATE PUBLICATION testpub FOR ALL TABLES; would skip the restricted
tables and continue to add the accepted tables into the publication
within the database.

IMHO, if there's a list of tables specified with FOR TABLE, then
instead of throwing an error in case of any restricted table, we can
issue a warning and continue with the addition of accepted tables into
the publication. If done, this behaviour will be in sync with FOR ALL
TABLES;

Thoughts? If okay, I can work on a patch.

Related to error messages: when foreign table is specified in CREATE
PUBLICATION statement, then "ERROR:  "f1" is not a table", is thrown
[1], how about the error message "ERROR: foerign table "f1" cannot be
replicated". In general, it would be good if we could have the error
messages something like in [2] instead of the existing [1].

Thoughts? If okay, I can work on a patch.

[1]
t1 is a temporary table:
postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
ERROR:  table "t1" cannot be replicated
DETAIL:  Temporary and unlogged relations cannot be replicated.

t1 is an unlogged table:
postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
ERROR:  table "t1" cannot be replicated
DETAIL:  Temporary and unlogged relations cannot be replicated.

f1 is a foreign table:
postgres=# CREATE PUBLICATION testpub FOR TABLE f1;
ERROR:  "f1" is not a table
DETAIL:  Only tables can be added to publications.

mv1 is a mat view:
postgres=# CREATE PUBLICATION testpub FOR TABLE mv1;
ERROR:  "mv1" is not a table

idx1 is an index:
postgres=# CREATE PUBLICATION testpub FOR TABLE idx1;
ERROR:  "idx1" is an index

[2]
t1 is a temporary table:
postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
ERROR:  temporary table "t1" cannot be replicated
DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.

t1 is an unlogged table:
postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
ERROR:  unlogged table "t1" cannot be replicated
DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.

f1 is a foreign table:
postgres=# CREATE PUBLICATION testpub FOR TABLE f1;
ERROR:  foreign table "f1" cannot be replicated
DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.

> > Say a user has created a publication for a schema with hundreds of
> > tables in it, at some point later, can he stop replicating a single or
> > some tables from that schema?
> >
>
> There is no provision for this currently.

The documentation [1] says, we can ALTER PUBLICATION testpub DROP
TABLE t1; which removes the table from the list of published tables,
but looks like it requires ALTER SUBSCRIPTION testsub REFRESH
PUBLICATION; for the changes to become effective on the subscriber. I
have done some testing for this case:
1) created publication for table t1, see \d+ t1, the associated
publication is visible in the output
2) created subscription on the subscriber, initial available data from
the publisher for table t1 is received
3) insert into table t1 on the publisher
4) inserted data in (3) is received in the subscriber table t1
5) alter publication to drop the table t1 on the publisher, see \d+
t1, there will not be any associated publication in the output
6) execute alter subscription refresh publication on the subscriber,
with the expectation that it should not receive the data from the
publisher for the table t1 since it's dropped from the publication in
(5)
7) insert into table t1 on the publisher
8) still the newly inserted data in (7) from the publisher, will be
received into the table t1 in the subscriber

IIUC, the behaviour of ALTER PUBLICATION DROP TABLE from the docs and
the above use case, it looks like a bug to me. If I'm wrong, can
someone correct me?

Thoughts?

[1] - https://www.postgresql.org/docs/devel/sql-alterpublication.html

> > IMO, it's better to have the syntax - CREATE PUBLICATION
> > production_publication FOR ALL TABLES IN SCHEMA production - just
> > added IN between for all tables and schema.
> >
>
> I'm ok with the proposed syntax, I would like others' opinion too
> before making the change.

Thanks.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Added schema level support for publication.

From
japin
Date:
On Mon, 11 Jan 2021 at 14:15, Bharath Rupireddy wrote:
> On Sun, Jan 10, 2021 at 11:21 PM vignesh C <vignesh21@gmail.com> wrote:
>> On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
>> <bharath.rupireddyforpostgres@gmail.com> wrote:
>> > I think this feature can be useful, in case a user has a lot of tables
>> > to publish inside a schema. Having said that, I wonder if this feature
>> > mandates users to create the same schema with same
>> > permissions/authorizations manually on the subscriber, because logical
>> > replication doesn't propagate any ddl's so are the schema or schema
>> > changes? Or is it that the list of tables from the publisher can go
>> > into a different schema on the subscriber?
>> >
>>
>> DDL's will not be propagated to the subscriber. Users have to create
>> the schema & tables in the subscriber. No change in
>> Permissions/authorizations handling, it will be the same as the
>> existing behavior for relations.
>
> Looks like the existing behaviour already requires users to create the
> schema on the subscriber when publishing the tables from that schema.
> Otherwise, an error is thrown on the subscriber [1].
>
> [1] on publisher:
> CREATE SCHEMA myschema;
> CREATE TABLE myschema.t1(a1 int, b1 int);
> INSERT INTO myschema.t1_myschema SELECT i, i+10 FROM generate_series(1,10) i;
> CREATE PUBLICATION testpub FOR TABLE myschema.t1;
>
> on subscriber:
> postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost
> dbname=postgres user=bharath port=5432' PUBLICATION testpub;
> ERROR:  schema "myschema" does not exist
> CREATE SCHEMA myschema;
> CREATE TABLE myschema.t1(a1 int, b1 int);
> postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost
> dbname=postgres user=bharath port=5432' PUBLICATION testpub;
> NOTICE:  created replication slot "testsub" on publisher
> CREATE SUBSCRIPTION
>
>> > Since the schema can have other objects such as data types, functions,
>> > operators, I'm sure with your feature, non-table objects will be
>> > skipped.
>> >
>>
>> Yes, only table data will be sent to subscribers, non-table objects
>> will be skipped.
>
> Looks like the existing CREATE PUBLICATION FOR ALL TABLES, which is
> for all the tables in the database, does this i.e. skips non-table
> objects and temporary tables, foreign tables and so on. So, your
> feature also can behave the same way, but within the scope of the
> given schema/s.
>
>> > As Amit pointed out earlier, the behaviour when schema dropped, I
>> > think we should also consider when schema is altered, say altered to a
>> > different name, maybe we should change that in the publication too.
>> >
>>
>> I agree that when schema is altered the renamed schema should be
>> reflected in the publication.
>
> I think, it's not only making sure that the publisher side has the new
> altered schema, but also the subscriber needs those alters. Having
> said that, since these alters come under DDL changes and in logical
> replication we don't publish the scheme changes to the subscriber, we
> may not need to anything extra for informing the schema alters to the
> subscriber from the publisher, the users might have to do the same
> schema alter on the subscriber and then a ALTER SUBSCRIPTION testsub
> REFRESH PUBLICATION;  should work for them? If this understanding is
> correct, then we should document this.
>
>> > In general, what happens if we have some temporary tables or foreign
>> > tables inside the schema, will they be allowed to send the data to
>> > subscribers?
>> >
>>
>> Temporary tables & foreign tables will not be added to the publications.
>
> Yes the existing logical replication framework doesn't allow
> replication of temporary, unlogged, foreign tables and other non-table
> relations such as materialized views, indexes etc [1]. The CREATE
> PUBLICATION statement either fails in check_publication_add_relation
> or before that.
>
> CREATE PUBLICATION testpub FOR TABLE tab1, throwing the error if the
> single table tab1 is any of the above restricted tables, seems fine.
> But, if there's a list of tables with CREATE PUBLICATION testpub FOR
> TABLE normal_tab1, temp_tab2, normal_tab3, foreign_tab4,
> unlogged_tab5, normal_tab6, normal_tab7 ......; This query fails on
> first encounter of the restricted table, say at temp_tab2. Whereas,
> CREATE PUBLICATION testpub FOR ALL TABLES; would skip the restricted
> tables and continue to add the accepted tables into the publication
> within the database.
>
> IMHO, if there's a list of tables specified with FOR TABLE, then
> instead of throwing an error in case of any restricted table, we can
> issue a warning and continue with the addition of accepted tables into
> the publication. If done, this behaviour will be in sync with FOR ALL
> TABLES;
>
> Thoughts? If okay, I can work on a patch.
>
> Related to error messages: when foreign table is specified in CREATE
> PUBLICATION statement, then "ERROR:  "f1" is not a table", is thrown
> [1], how about the error message "ERROR: foerign table "f1" cannot be
> replicated". In general, it would be good if we could have the error
> messages something like in [2] instead of the existing [1].
>
> Thoughts? If okay, I can work on a patch.
>
> [1]
> t1 is a temporary table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  table "t1" cannot be replicated
> DETAIL:  Temporary and unlogged relations cannot be replicated.
>
> t1 is an unlogged table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  table "t1" cannot be replicated
> DETAIL:  Temporary and unlogged relations cannot be replicated.
>
> f1 is a foreign table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE f1;
> ERROR:  "f1" is not a table
> DETAIL:  Only tables can be added to publications.
>
> mv1 is a mat view:
> postgres=# CREATE PUBLICATION testpub FOR TABLE mv1;
> ERROR:  "mv1" is not a table
>
> idx1 is an index:
> postgres=# CREATE PUBLICATION testpub FOR TABLE idx1;
> ERROR:  "idx1" is an index
>
> [2]
> t1 is a temporary table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  temporary table "t1" cannot be replicated
> DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.
>
> t1 is an unlogged table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  unlogged table "t1" cannot be replicated
> DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.
>
> f1 is a foreign table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE f1;
> ERROR:  foreign table "f1" cannot be replicated
> DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.
>
>> > Say a user has created a publication for a schema with hundreds of
>> > tables in it, at some point later, can he stop replicating a single or
>> > some tables from that schema?
>> >
>>
>> There is no provision for this currently.
>
> The documentation [1] says, we can ALTER PUBLICATION testpub DROP
> TABLE t1; which removes the table from the list of published tables,
> but looks like it requires ALTER SUBSCRIPTION testsub REFRESH
> PUBLICATION; for the changes to become effective on the subscriber. I
> have done some testing for this case:
> 1) created publication for table t1, see \d+ t1, the associated
> publication is visible in the output
> 2) created subscription on the subscriber, initial available data from
> the publisher for table t1 is received
> 3) insert into table t1 on the publisher
> 4) inserted data in (3) is received in the subscriber table t1
> 5) alter publication to drop the table t1 on the publisher, see \d+
> t1, there will not be any associated publication in the output
> 6) execute alter subscription refresh publication on the subscriber,
> with the expectation that it should not receive the data from the
> publisher for the table t1 since it's dropped from the publication in
> (5)
> 7) insert into table t1 on the publisher
> 8) still the newly inserted data in (7) from the publisher, will be
> received into the table t1 in the subscriber
>
> IIUC, the behaviour of ALTER PUBLICATION DROP TABLE from the docs and
> the above use case, it looks like a bug to me. If I'm wrong, can
> someone correct me?
>

Yes, if we modify the publication, we should refresh the subscription on
each subscriber.  It looks strange for me, especially for partitioned
tables [1].

> Thoughts?
>

Can we trace the different between publication and subscription, and
auto-refresh subscription on subscriber?

[1]
https://www.postgresql.org/message-id/flat/1D6DCFD2-0F44-4A18-BF67-17C2697B1631%40hotmail.com

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Added schema level support for publication.

From
Bharath Rupireddy
Date:
On Mon, Jan 11, 2021 at 1:29 PM japin <japinli@hotmail.com> wrote:
> >> > Say a user has created a publication for a schema with hundreds of
> >> > tables in it, at some point later, can he stop replicating a single or
> >> > some tables from that schema?
> >> >
> >>
> >> There is no provision for this currently.
> >
> > The documentation [1] says, we can ALTER PUBLICATION testpub DROP
> > TABLE t1; which removes the table from the list of published tables,
> > but looks like it requires ALTER SUBSCRIPTION testsub REFRESH
> > PUBLICATION; for the changes to become effective on the subscriber. I
> > have done some testing for this case:
> > 1) created publication for table t1, see \d+ t1, the associated
> > publication is visible in the output
> > 2) created subscription on the subscriber, initial available data from
> > the publisher for table t1 is received
> > 3) insert into table t1 on the publisher
> > 4) inserted data in (3) is received in the subscriber table t1
> > 5) alter publication to drop the table t1 on the publisher, see \d+
> > t1, there will not be any associated publication in the output
> > 6) execute alter subscription refresh publication on the subscriber,
> > with the expectation that it should not receive the data from the
> > publisher for the table t1 since it's dropped from the publication in
> > (5)
> > 7) insert into table t1 on the publisher
> > 8) still the newly inserted data in (7) from the publisher, will be
> > received into the table t1 in the subscriber
> >
> > IIUC, the behaviour of ALTER PUBLICATION DROP TABLE from the docs and
> > the above use case, it looks like a bug to me. If I'm wrong, can
> > someone correct me?
> >
>
> Yes, if we modify the publication, we should refresh the subscription on
> each subscriber.  It looks strange for me, especially for partitioned
> tables [1].
>
> > Thoughts?
> >
>
> Can we trace the different between publication and subscription, and
> auto-refresh subscription on subscriber?
>
> [1]
> https://www.postgresql.org/message-id/flat/1D6DCFD2-0F44-4A18-BF67-17C2697B1631%40hotmail.com

As Amit stated in your thread [1], DDLs like creation of the new
tables or partitions, schema changes etc. on the publisher can not be
replicated automatically by the logical replication framework to the
subscriber. Users have to perform those DDLs on the subscribers by
themselves.

If your point is to at least issue the ALTER SUBSCRIPTION testsub
REFRESH PUBLICATION; from the publication whenever the publication is
altered i.e. added or dropped tables, IMO, we cannot do this, because
running this command on the subscriber only makes sense, after user
runs the same DDLs (which were run on the publisher) also on the
subscriber. To illustrate this:
1) create a new table or partition on the publisher and add it to
publisher, note that the same table has not yet been created on the
subscriber
2) imagine the publisher issuing an auto refresh command to all the
subscribers, then, no point in that right, because the new table or
the partition is not yet created on all the subscribers.

So, IMO, we can not have an auto refresh mechanism, until we have the
feature to replicate the DDL changes to all the subscribers.

What I stated in my earlier mail [1] is that even though we drop the
table from the publication in the publisher and run a refresh
publication on the subscriber, still the data is being replicated from
the publisher to the subscriber table. I just wanted to know whether
this is the expected behaviour or what exactly means. a user running
ALTER PUBLICATION mypub DROP TABLE mytable;

[1] - https://www.postgresql.org/message-id/CALj2ACWAxO3vSToT0o5nXL%3Drz5cNx90zaV-at%3DcvM14Tag4%3DcQ%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Added schema level support for publication.

From
Li Japin
Date:

On Jan 11, 2021, at 5:06 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:

On Mon, Jan 11, 2021 at 1:29 PM japin <japinli@hotmail.com> wrote:
Say a user has created a publication for a schema with hundreds of
tables in it, at some point later, can he stop replicating a single or
some tables from that schema?


There is no provision for this currently.

The documentation [1] says, we can ALTER PUBLICATION testpub DROP
TABLE t1; which removes the table from the list of published tables,
but looks like it requires ALTER SUBSCRIPTION testsub REFRESH
PUBLICATION; for the changes to become effective on the subscriber. I
have done some testing for this case:
1) created publication for table t1, see \d+ t1, the associated
publication is visible in the output
2) created subscription on the subscriber, initial available data from
the publisher for table t1 is received
3) insert into table t1 on the publisher
4) inserted data in (3) is received in the subscriber table t1
5) alter publication to drop the table t1 on the publisher, see \d+
t1, there will not be any associated publication in the output
6) execute alter subscription refresh publication on the subscriber,
with the expectation that it should not receive the data from the
publisher for the table t1 since it's dropped from the publication in
(5)
7) insert into table t1 on the publisher
8) still the newly inserted data in (7) from the publisher, will be
received into the table t1 in the subscriber

IIUC, the behaviour of ALTER PUBLICATION DROP TABLE from the docs and
the above use case, it looks like a bug to me. If I'm wrong, can
someone correct me?


Yes, if we modify the publication, we should refresh the subscription on
each subscriber.  It looks strange for me, especially for partitioned
tables [1].

Thoughts?


Can we trace the different between publication and subscription, and
auto-refresh subscription on subscriber?

[1]
https://www.postgresql.org/message-id/flat/1D6DCFD2-0F44-4A18-BF67-17C2697B1631%40hotmail.com

As Amit stated in your thread [1], DDLs like creation of the new
tables or partitions, schema changes etc. on the publisher can not be
replicated automatically by the logical replication framework to the
subscriber. Users have to perform those DDLs on the subscribers by
themselves.

Yeah, DDLs is not supported now. On publisher, the partitions are added to the
publication automatically.  However, even if we created the partitions on subscriber,
it will not sync the new partitions, because it likes normal table, we must execute
ALTER SUBSCRIPTION my_test REFRESH PUBLICATION; 
I preferred it will automatically add to subscription when we create the new partitions
if the partitions is already in publication.

If your point is to at least issue the ALTER SUBSCRIPTION testsub
REFRESH PUBLICATION; from the publication whenever the publication is
altered i.e. added or dropped tables, IMO, we cannot do this, because
running this command on the subscriber only makes sense, after user
runs the same DDLs (which were run on the publisher) also on the
subscriber. To illustrate this:
1) create a new table or partition on the publisher and add it to
publisher, note that the same table has not yet been created on the
subscriber
2) imagine the publisher issuing an auto refresh command to all the
subscribers, then, no point in that right, because the new table or
the partition is not yet created on all the subscribers.

So, IMO, we can not have an auto refresh mechanism, until we have the
feature to replicate the DDL changes to all the subscribers.

Thanks for clarification. 

What I stated in my earlier mail [1] is that even though we drop the
table from the publication in the publisher and run a refresh
publication on the subscriber, still the data is being replicated from
the publisher to the subscriber table. I just wanted to know whether
this is the expected behaviour or what exactly means. a user running
ALTER PUBLICATION mypub DROP TABLE mytable;

[1] - https://www.postgresql.org/message-id/CALj2ACWAxO3vSToT0o5nXL%3Drz5cNx90zaV-at%3DcvM14Tag4%3DcQ%40mail.gmail.com

Sorry, I misunderstood. After the test (ce6a71fa530). I found that if we do not insert data
between step (5) and (6), it will not ship the new records, however, if we insert
data between step (5) and (6), it will ship the new records.

(1) created publication for table t1, t2
postgres[8765]=# CREATE TABLE t1 (a int);
CREATE TABLE
postgres[8765]=# CREATE TABLE t2 (a int);
CREATE TABLE
postgres[8765]=# INSERT INTO t1 VALUES (1);
INSERT 0 1
postgres[8765]=# INSERT INTO t2 VALUES (1);
INSERT 0 1
postgres[8765]=# CREATE PUBLICATION mypub1 FOR TABLE t1;
CREATE PUBLICATION
postgres[8765]=# CREATE PUBLICATION mypub2 FOR TABLE t2;
CREATE PUBLICATION

(2) created subscription on the subscriber
postgres[9812]=# CREATE TABLE t1 (a int);
CREATE TABLE
postgres[9812]=# CREATE TABLE t2 (a int);
CREATE TABLE
postgres[9812]=# CREATE SUBSCRIPTION mysub1 CONNECTION 'host=localhost port=8765 dbname=postgres' PUBLICATION mypub1;
NOTICE:  created replication slot "mysub1" on publisher
CREATE SUBSCRIPTION
postgres[9812]=# CREATE SUBSCRIPTION mysub2 CONNECTION 'host=localhost port=8765 dbname=postgres' PUBLICATION mypub2;
NOTICE:  created replication slot "mysub2" on publisher
CREATE SUBSCRIPTION
postgres[9812]=# TABLE t1;
 a
---
 1
(1 row)

postgres[9812]=# TABLE t2;
 a
---
 1
(1 row)

(3) insert into table t1, t2 on the publisher
postgres[8765]=# INSERT INTO t1 VALUES (2);
INSERT 0 1
postgres[8765]=# INSERT INTO t2 VALUES (2);
INSERT 0 1

(4) inserted data in (3) is received in the subscriber table t1, t2
postgres[9812]=# TABLE t1;
 a
---
 1
 2
(2 rows)

postgres[9812]=# TABLE t2;
 a
---
 1
 2
(2 rows)

(5) alter publication to drop table, we insert a record into t1 on publisher
postgres[8765]=# ALTER PUBLICATION mypub1 DROP TABLE t1;
ALTER PUBLICATION
postgres[8765]=# ALTER PUBLICATION mypub2 DROP TABLE t2;
ALTER PUBLICATION
postgres[8765]=# INSERT INTO t1 VALUES (3);
INSERT 0 1

(6) check the data on subscriber
postgres[9812]=# TABLE t1;
 a
---
 1
 2
 3
(3 rows)

postgres[9812]=# TABLE t2;
 a
---
 1
 2
(2 rows)

(7) refresh subscription on the subscriber
postgres[9812]=# ALTER SUBSCRIPTION mysub1 REFRESH PUBLICATION;
ALTER SUBSCRIPTION
postgres[9812]=# ALTER SUBSCRIPTION mysub2 REFRESH PUBLICATION;
ALTER SUBSCRIPTION

(8) insert into table t1, t2 on the publisher
postgres[8765]=# INSERT INTO t1 VALUES (4);
INSERT 0 1
postgres[8765]=# INSERT INTO t2 VALUES (4);
INSERT 0 1

(9) the newly inserted data in (5), (7) for table t1 shipped to subscriber, however
t2 doesn’t
postgres[9812]=# TABLE t1;
 a
---
 1
 2
 3
 4
(4 rows)

postgres[9812]=# TABLE t2;
 a
---
 1
 2
(2 rows)

It might be a bug.

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Mon, Jan 11, 2021 at 4:29 PM Li Japin <japinli@hotmail.com> wrote:
>
>
> On Jan 11, 2021, at 5:06 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Mon, Jan 11, 2021 at 1:29 PM japin <japinli@hotmail.com> wrote:
>
> Sorry, I misunderstood. After the test (ce6a71fa530). I found that if we do not insert data
> between step (5) and (6), it will not ship the new records, however, if we insert
> data between step (5) and (6), it will ship the new records.
>
>
..
> It might be a bug.
>

Can you check pg_publication_rel and pg_subscription_rel? Also, this
is not related to the feature proposed in this thread, so it is better
to start a new thread to conclude whether this is a bug or not.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Bharath Rupireddy
Date:
On Mon, Jan 11, 2021 at 5:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Jan 11, 2021 at 4:29 PM Li Japin <japinli@hotmail.com> wrote:
> >
> >
> > On Jan 11, 2021, at 5:06 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Mon, Jan 11, 2021 at 1:29 PM japin <japinli@hotmail.com> wrote:
> >
> > Sorry, I misunderstood. After the test (ce6a71fa530). I found that if we do not insert data
> > between step (5) and (6), it will not ship the new records, however, if we insert
> > data between step (5) and (6), it will ship the new records.
> >
> >
> ..
> > It might be a bug.
> >
>
> Can you check pg_publication_rel and pg_subscription_rel? Also, this
> is not related to the feature proposed in this thread, so it is better
> to start a new thread to conclude whether this is a bug or not.

Thanks Amit, sure I will verify and start a new thread.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Added schema level support for publication.

From
Bharath Rupireddy
Date:
On Mon, Jan 11, 2021 at 5:28 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Mon, Jan 11, 2021 at 5:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Jan 11, 2021 at 4:29 PM Li Japin <japinli@hotmail.com> wrote:
> > >
> > >
> > > On Jan 11, 2021, at 5:06 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> > >
> > > On Mon, Jan 11, 2021 at 1:29 PM japin <japinli@hotmail.com> wrote:
> > >
> > > Sorry, I misunderstood. After the test (ce6a71fa530). I found that if we do not insert data
> > > between step (5) and (6), it will not ship the new records, however, if we insert
> > > data between step (5) and (6), it will ship the new records.
> > >
> > >
> > ..
> > > It might be a bug.
> > >
> >
> > Can you check pg_publication_rel and pg_subscription_rel? Also, this
> > is not related to the feature proposed in this thread, so it is better
> > to start a new thread to conclude whether this is a bug or not.
>
> Thanks Amit, sure I will verify and start a new thread.

I started a new thread [1] for this, please have a look.

[1] - https://www.postgresql.org/message-id/CALj2ACV%2B0UFpcZs5czYgBpujM9p0Hg1qdOZai_43OU7bqHU_xw%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Added schema level support for publication.

From
Rahila Syed
Date:
Hi Vignesh,

I had a look at the patch, please consider following comments.

On Thu, Jan 7, 2021 at 10:03 PM vignesh C <vignesh21@gmail.com> wrote:
Hi,

This feature adds schema option while creating publication. Users will
be able to specify one or more schemas while creating publication,
when the user specifies schema option, then the data changes for the
tables present in the schema specified by the user will be replicated
to the subscriber. Few examples have been listed below:

Create a publication that publishes all changes for all the tables
present in production schema:
CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;

Should it be FOR TABLES IN SCHEMA instead of FOR ALL TABLES SCHEMA?
 
Create a publication that publishes all changes for all the tables
present in marketing and sales schemas:
CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales;

Add some schemas to the publication:
ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june;

As per current implementation this command fails even if one of the schemas does not 
exist. I think this is counterintuitive, it should throw a warning and continue adding the rest.
 
Drop some schema from the publication:
ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july;

Same for drop schema, if one of these schemas does not exist in publication,
the entire DROP operation is aborted.
 
Thank you,
Rahila Syed
 

Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Jan 11, 2021 at 11:45 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Sun, Jan 10, 2021 at 11:21 PM vignesh C <vignesh21@gmail.com> wrote:
> > On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > > I think this feature can be useful, in case a user has a lot of tables
> > > to publish inside a schema. Having said that, I wonder if this feature
> > > mandates users to create the same schema with same
> > > permissions/authorizations manually on the subscriber, because logical
> > > replication doesn't propagate any ddl's so are the schema or schema
> > > changes? Or is it that the list of tables from the publisher can go
> > > into a different schema on the subscriber?
> > >
> >
> > DDL's will not be propagated to the subscriber. Users have to create
> > the schema & tables in the subscriber. No change in
> > Permissions/authorizations handling, it will be the same as the
> > existing behavior for relations.
>
> Looks like the existing behaviour already requires users to create the
> schema on the subscriber when publishing the tables from that schema.
> Otherwise, an error is thrown on the subscriber [1].
>
> [1] on publisher:
> CREATE SCHEMA myschema;
> CREATE TABLE myschema.t1(a1 int, b1 int);
> INSERT INTO myschema.t1_myschema SELECT i, i+10 FROM generate_series(1,10) i;
> CREATE PUBLICATION testpub FOR TABLE myschema.t1;
>
> on subscriber:
> postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost
> dbname=postgres user=bharath port=5432' PUBLICATION testpub;
> ERROR:  schema "myschema" does not exist
> CREATE SCHEMA myschema;
> CREATE TABLE myschema.t1(a1 int, b1 int);
> postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost
> dbname=postgres user=bharath port=5432' PUBLICATION testpub;
> NOTICE:  created replication slot "testsub" on publisher
> CREATE SUBSCRIPTION
>

Yes this feature will also have the same behavior, DDL creation should
be taken care of by DBA similar to how it is handled may be using
pg_dump or use sql scripts/statements to update.

> > > Since the schema can have other objects such as data types, functions,
> > > operators, I'm sure with your feature, non-table objects will be
> > > skipped.
> > >
> >
> > Yes, only table data will be sent to subscribers, non-table objects
> > will be skipped.
>
> Looks like the existing CREATE PUBLICATION FOR ALL TABLES, which is
> for all the tables in the database, does this i.e. skips non-table
> objects and temporary tables, foreign tables and so on. So, your
> feature also can behave the same way, but within the scope of the
> given schema/s.
>

Yes, it will support only normal tables. Non table objects, foreign
tables & temporary tables will not be supported.

> > > As Amit pointed out earlier, the behaviour when schema dropped, I
> > > think we should also consider when schema is altered, say altered to a
> > > different name, maybe we should change that in the publication too.
> > >
> >
> > I agree that when schema is altered the renamed schema should be
> > reflected in the publication.
>
> I think, it's not only making sure that the publisher side has the new
> altered schema, but also the subscriber needs those alters. Having
> said that, since these alters come under DDL changes and in logical
> replication we don't publish the scheme changes to the subscriber, we
> may not need to anything extra for informing the schema alters to the
> subscriber from the publisher, the users might have to do the same
> schema alter on the subscriber and then a ALTER SUBSCRIPTION testsub
> REFRESH PUBLICATION;  should work for them? If this understanding is
> correct, then we should document this.
>

Yes, alter schema changes will be reflected in the publication, the
corresponding change needs to be done by the user on the subscriber
side. Once a user does ALTER SUBSCRIPTION testsub REFRESH PUBLICATION,
the new altered schema changes will be reflected in the subscriber. I
will update the documentation that user need to take care for
subscription refresh.

> > > In general, what happens if we have some temporary tables or foreign
> > > tables inside the schema, will they be allowed to send the data to
> > > subscribers?
> > >
> >
> > Temporary tables & foreign tables will not be added to the publications.
>
> Yes the existing logical replication framework doesn't allow
> replication of temporary, unlogged, foreign tables and other non-table
> relations such as materialized views, indexes etc [1]. The CREATE
> PUBLICATION statement either fails in check_publication_add_relation
> or before that.
>
> CREATE PUBLICATION testpub FOR TABLE tab1, throwing the error if the
> single table tab1 is any of the above restricted tables, seems fine.
> But, if there's a list of tables with CREATE PUBLICATION testpub FOR
> TABLE normal_tab1, temp_tab2, normal_tab3, foreign_tab4,
> unlogged_tab5, normal_tab6, normal_tab7 ......; This query fails on
> first encounter of the restricted table, say at temp_tab2. Whereas,
> CREATE PUBLICATION testpub FOR ALL TABLES; would skip the restricted
> tables and continue to add the accepted tables into the publication
> within the database.
>
> IMHO, if there's a list of tables specified with FOR TABLE, then
> instead of throwing an error in case of any restricted table, we can
> issue a warning and continue with the addition of accepted tables into
> the publication. If done, this behaviour will be in sync with FOR ALL
> TABLES;
>
> Thoughts? If okay, I can work on a patch.
>

I feel we can start a new thread for this to seek opinion if this base
change is required and reach consensus.

> Related to error messages: when foreign table is specified in CREATE
> PUBLICATION statement, then "ERROR:  "f1" is not a table", is thrown
> [1], how about the error message "ERROR: foerign table "f1" cannot be
> replicated". In general, it would be good if we could have the error
> messages something like in [2] instead of the existing [1].
>
> Thoughts? If okay, I can work on a patch.
>

I feel we can start a new thread for this to seek opinion and reach consensus.

> [1]
> t1 is a temporary table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  table "t1" cannot be replicated
> DETAIL:  Temporary and unlogged relations cannot be replicated.
>
> t1 is an unlogged table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  table "t1" cannot be replicated
> DETAIL:  Temporary and unlogged relations cannot be replicated.
>
> f1 is a foreign table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE f1;
> ERROR:  "f1" is not a table
> DETAIL:  Only tables can be added to publications.
>
> mv1 is a mat view:
> postgres=# CREATE PUBLICATION testpub FOR TABLE mv1;
> ERROR:  "mv1" is not a table
>
> idx1 is an index:
> postgres=# CREATE PUBLICATION testpub FOR TABLE idx1;
> ERROR:  "idx1" is an index
>
> [2]
> t1 is a temporary table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  temporary table "t1" cannot be replicated
> DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.
>
> t1 is an unlogged table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  unlogged table "t1" cannot be replicated
> DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.
>
> f1 is a foreign table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE f1;
> ERROR:  foreign table "f1" cannot be replicated
> DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.
>
> > > Say a user has created a publication for a schema with hundreds of
> > > tables in it, at some point later, can he stop replicating a single or
> > > some tables from that schema?
> > >
> >
> > There is no provision for this currently.
>
> The documentation [1] says, we can ALTER PUBLICATION testpub DROP
> TABLE t1; which removes the table from the list of published tables,
> but looks like it requires ALTER SUBSCRIPTION testsub REFRESH
> PUBLICATION; for the changes to become effective on the subscriber. I
> have done some testing for this case:
> 1) created publication for table t1, see \d+ t1, the associated
> publication is visible in the output
> 2) created subscription on the subscriber, initial available data from
> the publisher for table t1 is received
> 3) insert into table t1 on the publisher
> 4) inserted data in (3) is received in the subscriber table t1
> 5) alter publication to drop the table t1 on the publisher, see \d+
> t1, there will not be any associated publication in the output
> 6) execute alter subscription refresh publication on the subscriber,
> with the expectation that it should not receive the data from the
> publisher for the table t1 since it's dropped from the publication in
> (5)
> 7) insert into table t1 on the publisher
> 8) still the newly inserted data in (7) from the publisher, will be
> received into the table t1 in the subscriber
>
> IIUC, the behaviour of ALTER PUBLICATION DROP TABLE from the docs and
> the above use case, it looks like a bug to me. If I'm wrong, can
> someone correct me?
>
> Thoughts?

I think you started a new thread for this, let's conclude on this there.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com



Re: Added schema level support for publication.

From
vignesh C
Date:
Thanks Rahila for your comments, please find my thoughts below.

On Tue, Jan 12, 2021 at 5:16 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
>
> Hi Vignesh,
>
> I had a look at the patch, please consider following comments.
>
> On Thu, Jan 7, 2021 at 10:03 PM vignesh C <vignesh21@gmail.com> wrote:
>>
>> Hi,
>>
>> This feature adds schema option while creating publication. Users will
>> be able to specify one or more schemas while creating publication,
>> when the user specifies schema option, then the data changes for the
>> tables present in the schema specified by the user will be replicated
>> to the subscriber. Few examples have been listed below:
>>
>> Create a publication that publishes all changes for all the tables
>> present in production schema:
>> CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;
>>
> Should it be FOR TABLES IN SCHEMA instead of FOR ALL TABLES SCHEMA?
>  

For adding tables into publication we have syntax like:
CREATE PUBLICATION mypub FOR TABLE tbl1, tbl2;
For all tables we have syntax like:
CREATE PUBLICATION mypub FOR ALL TABLES;

Initial syntax that I proposed was:
CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;

I feel the below syntax is better, as it is consistent with others:
CREATE PUBLICATION mypub FOR SCHEMA sch1, sch2;

>>
>> Create a publication that publishes all changes for all the tables
>> present in marketing and sales schemas:
>> CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales;
>>
>> Add some schemas to the publication:
>> ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june;
>>
> As per current implementation this command fails even if one of the schemas does not
> exist. I think this is counterintuitive, it should throw a warning and continue adding the rest.
>  

We have the similar behavior in case of adding non-existent table while creating a publication:
CREATE PUBLICATION mypub3 FOR TABLE non_existent_table;
ERROR:  relation "non_existent_table" does not exist
I feel we can keep the behavior similarly to maintain the consistency.

>>
>> Drop some schema from the publication:
>> ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july;
>>
> Same for drop schema, if one of these schemas does not exist in publication,
> the entire DROP operation is aborted.

We have similar behavior in case of dropping non-existent table while altering publication
alter publication mypub5 drop table test1,testx;
ERROR:  relation "testx" does not exist
I feel we can keep the behavior similarly to maintain the consistency.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

Re: Added schema level support for publication.

From
vignesh C
Date:
On Sat, Jan 9, 2021 at 5:21 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, Jan 8, 2021 at 4:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Jan 7, 2021 at 10:03 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > This feature adds schema option while creating publication. Users will
> > > be able to specify one or more schemas while creating publication,
> > > when the user specifies schema option, then the data changes for the
> > > tables present in the schema specified by the user will be replicated
> > > to the subscriber. Few examples have been listed below:
> > >
> > > Create a publication that publishes all changes for all the tables
> > > present in production schema:
> > > CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;
> > >
> > > Create a publication that publishes all changes for all the tables
> > > present in marketing and sales schemas:
> > > CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales;
> > >
> > > Add some schemas to the publication:
> > > ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june;
> > >
> > > Drop some schema from the publication:
> > > ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july;
> > >
> > > Attached is a POC patch for the same. I felt this feature would be quite useful.
> > >
> >
> > What do we do if the user Drops the schema? Do we automatically remove
> > it from the publication?
> >
> I have not yet handled this scenario yet, I will handle this and
> adding of tests in the next patch.
>

I have handled the above scenario(drop schema should automatically
remove the schema entry from publication schema relation) & addition
of tests in the new v2 patch attached.
Thoughts?


Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
>
> As Amit pointed out earlier, the behaviour when schema dropped, I
> think we should also consider when schema is altered, say altered to a
> different name, maybe we should change that in the publication too.
>

This scenario is handled now in the patch posted at [1].

>
> Say a user has a schema with 121 tables in it, and wants to replicate
> only 120 or 199 or even lesser tables out of it, so can we have some
> skip option to the new syntax, something like below?
> CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA
> production WITH skip = marketing, accounts, sales;  --> meaning is,
> replicate all the tables in the schema production except marketing,
> accounts, sales tables.

I have not yet handled this, I'm working on this and will try post a patch for this in the next version.

Re: Added schema level support for publication.

From
Rahila Syed
Date:
Hi Vignesh,


I have handled the above scenario(drop schema should automatically
remove the schema entry from publication schema relation) & addition
of tests in the new v2 patch attached.
Thoughts?

Please see some initial comments:

1. I think there should be more tests to show that the schema data is actually replicated 
to the subscriber.  Currently, I am not seeing the data being replicated when I use FOR SCHEMA.

2. How does replication behave when a table is added or removed from a subscribed schema 
using ALTER TABLE SET SCHEMA?

3. Can we have a default schema like a public or current schema that gets replicated in case the user didn't 
specify one, this can be handy to replicate current schema tables.

4. +   The fourth, fifth and sixth variants change which schemas are part of the
+   publication.  The <literal>SET TABLE</literal> clause will replace the list
+   of schemas in the publication with the specified one.  The <literal>ADD

There is a typo above s/SET TABLE/SET SCHEMA

Thank you,
Rahila Syed

Re: Added schema level support for publication.

From
vignesh C
Date:
Thanks Rahila for your comments. Please find my thoughts below:

On Wed, Jan 20, 2021 at 6:27 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
>
> Hi Vignesh,
>
>>
>> I have handled the above scenario(drop schema should automatically
>> remove the schema entry from publication schema relation) & addition
>> of tests in the new v2 patch attached.
>> Thoughts?
>
>
> Please see some initial comments:
>
> 1. I think there should be more tests to show that the schema data is actually replicated
> to the subscriber.  Currently, I am not seeing the data being replicated when I use FOR SCHEMA.
>
I will fix this issue and include more tests in my next version of the patch.

> 2. How does replication behave when a table is added or removed from a subscribed schema
> using ALTER TABLE SET SCHEMA?
>
I would like to keep the behavior similar to the table behavior. I
will post more details for this along with my next version of the
patch.

> 3. Can we have a default schema like a public or current schema that gets replicated in case the user didn't
> specify one, this can be handy to replicate current schema tables.
>
It looks like a good use case, I will check on the feasibility of this
and try to implement this.

> 4. +   The fourth, fifth and sixth variants change which schemas are part of the
> +   publication.  The <literal>SET TABLE</literal> clause will replace the list
> +   of schemas in the publication with the specified one.  The <literal>ADD
>
> There is a typo above s/SET TABLE/SET SCHEMA
I will fix this in the next version of the patch.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com



Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Jan 22, 2021 at 10:01 AM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks Rahila for your comments. Please find my thoughts below:
>
> On Wed, Jan 20, 2021 at 6:27 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
> >
> > Hi Vignesh,
> >
> >>
> >> I have handled the above scenario(drop schema should automatically
> >> remove the schema entry from publication schema relation) & addition
> >> of tests in the new v2 patch attached.
> >> Thoughts?
> >
> >
> > Please see some initial comments:
> >
> > 1. I think there should be more tests to show that the schema data is actually replicated
> > to the subscriber.  Currently, I am not seeing the data being replicated when I use FOR SCHEMA.
> >
> I will fix this issue and include more tests in my next version of the patch.

Modified to handle this and also added a few more tests.

> > 2. How does replication behave when a table is added or removed from a subscribed schema
> > using ALTER TABLE SET SCHEMA?
> >
> I would like to keep the behavior similar to the table behavior. I
> will post more details for this along with my next version of the
> patch.
>

If a table is set to a different schema, after the schema change table
data will not be sent to the subscriber.
When a new table is added to the published schema, the table data will
be sent by the publisher, subscriber will not apply the changes. If
the change needs to be reflected, subscriber's publication should be
refreshed using "alter subscription mysub1 refresh publication". This
relation will be reflected in the subscriber relation when the
subscriber's publication is refreshed.
If a table is dropped, there is no impact on subscriber, This relation
will be present in pg_subscriber_rel after refreshing subscriber
publication.

> > 3. Can we have a default schema like a public or current schema that gets replicated in case the user didn't
> > specify one, this can be handy to replicate current schema tables.
> >
> It looks like a good use case, I will check on the feasibility of this
> and try to implement this.

This can be done, I will handle this later.

> > 4. +   The fourth, fifth and sixth variants change which schemas are part of the
> > +   publication.  The <literal>SET TABLE</literal> clause will replace the list
> > +   of schemas in the publication with the specified one.  The <literal>ADD
> >
> > There is a typo above s/SET TABLE/SET SCHEMA
> I will fix this in the next version of the patch.

Modified it.
I have separated the tests and documentation into a separate patch to
make review easier. Attached v3 patch with the fixes.
Thoughts?

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Sun, Jan 31, 2021 at 11:32 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, Jan 22, 2021 at 10:01 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks Rahila for your comments. Please find my thoughts below:
> >
> > On Wed, Jan 20, 2021 at 6:27 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
> > >
> > > Hi Vignesh,
> > >
> > >>
> > >> I have handled the above scenario(drop schema should automatically
> > >> remove the schema entry from publication schema relation) & addition
> > >> of tests in the new v2 patch attached.
> > >> Thoughts?
> > >
> > >
> > > Please see some initial comments:
> > >
> > > 1. I think there should be more tests to show that the schema data is actually replicated
> > > to the subscriber.  Currently, I am not seeing the data being replicated when I use FOR SCHEMA.
> > >
> > I will fix this issue and include more tests in my next version of the patch.
>
> Modified to handle this and also added a few more tests.
>
> > > 2. How does replication behave when a table is added or removed from a subscribed schema
> > > using ALTER TABLE SET SCHEMA?
> > >
> > I would like to keep the behavior similar to the table behavior. I
> > will post more details for this along with my next version of the
> > patch.
> >
>
> If a table is set to a different schema, after the schema change table
> data will not be sent to the subscriber.
> When a new table is added to the published schema, the table data will
> be sent by the publisher, subscriber will not apply the changes. If
> the change needs to be reflected, subscriber's publication should be
> refreshed using "alter subscription mysub1 refresh publication". This
> relation will be reflected in the subscriber relation when the
> subscriber's publication is refreshed.
> If a table is dropped, there is no impact on subscriber, This relation
> will be present in pg_subscriber_rel after refreshing subscriber
> publication.
>
> > > 3. Can we have a default schema like a public or current schema that gets replicated in case the user didn't
> > > specify one, this can be handy to replicate current schema tables.
> > >
> > It looks like a good use case, I will check on the feasibility of this
> > and try to implement this.
>
> This can be done, I will handle this later.
>
> > > 4. +   The fourth, fifth and sixth variants change which schemas are part of the
> > > +   publication.  The <literal>SET TABLE</literal> clause will replace the list
> > > +   of schemas in the publication with the specified one.  The <literal>ADD
> > >
> > > There is a typo above s/SET TABLE/SET SCHEMA
> > I will fix this in the next version of the patch.
>
> Modified it.
> I have separated the tests and documentation into a separate patch to
> make review easier. Attached v3 patch with the fixes.
> Thoughts?

The earlier patch does not apply on the head. The v4 patch attached
has the following changes:
a) Rebased it on head. b) Removed pubschemas, pubtables columns and
replaced it with pubtype in pg_publication table. c) List the schemas
in describe publication. d) List the publication in list schemas. e)
Add support for "FOR SCHEMA CURRENT_SCHEMA". f) Tab completion for
"FOR SCHEMA" in create publication and alter publication. g) Included
the newly added structure type to typedefs.lst

Regards,
Vignesh

Attachment

RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Monday, May 24, 2021 at 8:31 PM vignesh C <vignesh21@gmail.com> wrote:
> The earlier patch does not apply on the head. The v4 patch attached
> has the following changes:
> a) Rebased it on head. b) Removed pubschemas, pubtables columns and
> replaced it with pubtype in pg_publication table. c) List the schemas
> in describe publication. d) List the publication in list schemas. e)
> Add support for "FOR SCHEMA CURRENT_SCHEMA". f) Tab completion for
> "FOR SCHEMA" in create publication and alter publication. g) Included
> the newly added structure type to typedefs.lst

Thanks for your patch.

I ran "make check-world" after applying your patch but it failed on my machine. I saw the following log:
--------------
parallel group (2 tests):  subscription publication
     publication                  ... FAILED      108 ms
     subscription                 ... ok           87 ms


diff -U3 /home/fnst/data/postgresql_schema/postgresql/src/test/regress/expected/publication.out
/home/fnst/data/postgresql_schema/postgresql/src/test/regress/results/publication.out
--- /home/fnst/data/postgresql_schema/postgresql/src/test/regress/expected/publication.out      2021-05-25
15:44:52.261683712+0800
 
+++ /home/fnst/data/postgresql_schema/postgresql/src/test/regress/results/publication.out       2021-05-25
15:48:41.393672595+0800
 
@@ -359,10 +359,10 @@
     "public"

 \dn public;
- List of schemas
-  Name  |  Owner
---------+---------
- public | vignesh
+List of schemas
+  Name  | Owner
+--------+-------
+ public | fnst
 Publications:
     "testpub3_forschema"
--------------

I think the owner of CURRENT_SCHEMA should not be written into publication.out because the result is related to the
user.
 
Maybe we can use "ALTER SCHEMA public OWNER TO owner" to change its default owner before this test case. Thoughts?

Regards
Tang

Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, May 25, 2021 at 2:47 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Monday, May 24, 2021 at 8:31 PM vignesh C <vignesh21@gmail.com> wrote:
> > The earlier patch does not apply on the head. The v4 patch attached
> > has the following changes:
> > a) Rebased it on head. b) Removed pubschemas, pubtables columns and
> > replaced it with pubtype in pg_publication table. c) List the schemas
> > in describe publication. d) List the publication in list schemas. e)
> > Add support for "FOR SCHEMA CURRENT_SCHEMA". f) Tab completion for
> > "FOR SCHEMA" in create publication and alter publication. g) Included
> > the newly added structure type to typedefs.lst
>
> Thanks for your patch.
>
> I ran "make check-world" after applying your patch but it failed on my machine. I saw the following log:
> --------------
> parallel group (2 tests):  subscription publication
>      publication                  ... FAILED      108 ms
>      subscription                 ... ok           87 ms
>
>
> diff -U3 /home/fnst/data/postgresql_schema/postgresql/src/test/regress/expected/publication.out
/home/fnst/data/postgresql_schema/postgresql/src/test/regress/results/publication.out
> --- /home/fnst/data/postgresql_schema/postgresql/src/test/regress/expected/publication.out      2021-05-25
15:44:52.261683712+0800
 
> +++ /home/fnst/data/postgresql_schema/postgresql/src/test/regress/results/publication.out       2021-05-25
15:48:41.393672595+0800
 
> @@ -359,10 +359,10 @@
>      "public"
>
>  \dn public;
> - List of schemas
> -  Name  |  Owner
> ---------+---------
> - public | vignesh
> +List of schemas
> +  Name  | Owner
> +--------+-------
> + public | fnst
>  Publications:
>      "testpub3_forschema"
> --------------
>
> I think the owner of CURRENT_SCHEMA should not be written into publication.out because the result is related to the
user.
> Maybe we can use "ALTER SCHEMA public OWNER TO owner" to change its default owner before this test case. Thoughts?

Thanks for identifying and reporting this issue. I have \dn with the
equivalent query to display only the publication name. The updated
patch has the fix for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
Bharath Rupireddy
Date:
On Sat, Jun 5, 2021 at 7:02 PM vignesh C <vignesh21@gmail.com> wrote:
> Thanks for identifying and reporting this issue. I have \dn with the
> equivalent query to display only the publication name. The updated
> patch has the fix for the same.

Currently, FOR ALL TABLES is there to add all the tables(existing and
future) in the current database in which the publication is created. I
wonder before providing FOR SCHEMA capability, we better target FOR
DATABASE first, something like CREATE PUBLICATION ... FOR DATABASE
foo, bar, baz, qux; Of course users with the proper permissions on the
specified databases can add them to the publication. This can help to
add all the tables in other databases as well. Then, the CREATE
PUBLICATION ... FOR SCHEMA foo, bar, baz, qux; makes more sense.
Because, my understanding is that: database is a collection of tables,
schema is a collection of databases. I may be wrong here, but it's
just a thought. What do you think?

With Regards,
Bharath Rupireddy.



Re: Added schema level support for publication.

From
Bharath Rupireddy
Date:
On Fri, Jun 11, 2021, 6:22 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
On Sat, Jun 5, 2021 at 7:02 PM vignesh C <vignesh21@gmail.com> wrote:
> Thanks for identifying and reporting this issue. I have \dn with the
> equivalent query to display only the publication name. The updated
> patch has the fix for the same.

Currently, FOR ALL TABLES is there to add all the tables(existing and
future) in the current database in which the publication is created. I
wonder before providing FOR SCHEMA capability, we better target FOR
DATABASE first, something like CREATE PUBLICATION ... FOR DATABASE
foo, bar, baz, qux; Of course users with the proper permissions on the
specified databases can add them to the publication. This can help to
add all the tables in other databases as well. Then, the CREATE
PUBLICATION ... FOR SCHEMA foo, bar, baz, qux; makes more sense.
Because, my understanding is that: database is a collection of tables,
schema is a collection of databases. I may be wrong here, but it's
just a thought. What do you think?

Please ignore above comment. I was confused about what a database and schema is in postgres. I'm sorry for the noise.


Regards,
Bharath Rupireddy.

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> And, with this feature, since there can be many huge tables inside a
> schema, the initial table sync phase of the replication can take a
> while.
>
> Say a user has created a publication for a schema with hundreds of
> tables in it, at some point later, can he stop replicating a single or
> some tables from that schema?
>

Isn't this applies to FOR ALL TABLES syntax as well where the user
might want to exclude one or a few tables? I am not sure if it is a
good idea to deal with this as part of this patch.


-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Ajin Cherian
Date:
On Sat, Jun 5, 2021 at 11:32 PM vignesh C <vignesh21@gmail.com> wrote:

> Thanks for identifying and reporting this issue. I have \dn with the
> equivalent query to display only the publication name. The updated
> patch has the fix for the same.
>

The patch no longer applies, I think a recent change to tab-complete
has broken it.

regards,
Ajin Cherian
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Jun 16, 2021 at 5:12 PM Ajin Cherian <itsajin@gmail.com> wrote:
>
> On Sat, Jun 5, 2021 at 11:32 PM vignesh C <vignesh21@gmail.com> wrote:
>
> > Thanks for identifying and reporting this issue. I have \dn with the
> > equivalent query to display only the publication name. The updated
> > patch has the fix for the same.
> >
>
> The patch no longer applies, I think a recent change to tab-complete
> has broken it.

Thanks for reporting it, the attached patch is a rebased version of
the patch with few review comment fixes, I will reply with the comment
fixes to the respective mails.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Jun 16, 2021 at 4:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > And, with this feature, since there can be many huge tables inside a
> > schema, the initial table sync phase of the replication can take a
> > while.
> >
> > Say a user has created a publication for a schema with hundreds of
> > tables in it, at some point later, can he stop replicating a single or
> > some tables from that schema?
> >
>
> Isn't this applies to FOR ALL TABLES syntax as well where the user
> might want to exclude one or a few tables? I am not sure if it is a
> good idea to deal with this as part of this patch.

Yes, this applies to "FOR ALL TABLES" publications also, i had worked for this, the patch for the same is attached at [1]. I had created a separate patch for this so that it is easier for reviewing.

[1] - https://www.postgresql.org/message-id/CALDaNm10g2h29a-oFHsadk-Du6RDhnVQT_vfTGqR82DsjxQLqQ%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Jan 20, 2021 at 6:27 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
>
> Hi Vignesh,
>
>>
>> I have handled the above scenario(drop schema should automatically
>> remove the schema entry from publication schema relation) & addition
>> of tests in the new v2 patch attached.
>> Thoughts?
>
>
> Please see some initial comments:
>
> 1. I think there should be more tests to show that the schema data is actually replicated
> to the subscriber.  Currently, I am not seeing the data being replicated when I use FOR SCHEMA.
>
> 2. How does replication behave when a table is added or removed from a subscribed schema
> using ALTER TABLE SET SCHEMA?
>
> 3. Can we have a default schema like a public or current schema that gets replicated in case the user didn't
> specify one, this can be handy to replicate current schema tables.

I felt supporting a syntax like below will be useful for supporting current schema:
create publication testpub for schema schema_name
or
create publication testpub for schema CURRENT_SCHEMA

Let the user specify CURRENT_SCHEMA explicitly if required instead of not specifying anything.

I have implemented similar lines in the V6 patch at [1]. Thoughts?

[1] - https://www.postgresql.org/message-id/CALDaNm10g2h29a-oFHsadk-Du6RDhnVQT_vfTGqR82DsjxQLqQ%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From
Ajin Cherian
Date:
On Thu, Jun 17, 2021 at 12:41 AM vignesh C <vignesh21@gmail.com> wrote:

> Thanks for reporting it, the attached patch is a rebased version of
> the patch with few review comment fixes, I will reply with the comment
> fixes to the respective mails.
>

I've applied the patch, it applies cleand and ran "make check" and
tests run fine.

Some comments for patch 1:

In the commit message, some grammar mistakes:

"Changes was done in
pg_dump to handle pubtype updation in pg_publication table while the database
gets upgraded."

-------------- change to --

Changes were done in
pg_dump to handle pubtype updation in pg_publication table while the database
gets upgraded.

=============

Prototypes present in pg_publication.h was moved to publicationcmds.h so
that minimal datastructures ...

----------------- change to --

Prototypes present in pg_publication.h were moved to publicationcmds.h so
that minimal datastructures ..

========================

In patch 1:

In getObjectDescription()

+ if (!nspname)
+ {
+ pfree(pubname);
+ pfree(nspname);
+ ReleaseSysCache(tup);

Why free nspname if it is NULL?

Same comment in getObjectIdentityParts()
============================

In GetAllTablesPublicationRelations()

+ ScanKeyData key[2];
  TableScanDesc scan;
  HeapTuple tuple;
  List    *result = NIL;
+ int keycount = 0;

  classRel = table_open(RelationRelationId, AccessShareLock);

- ScanKeyInit(&key[0],
+ ScanKeyInit(&key[keycount++],

Here you have init key[1], but the code below in the same function
inits key[0]. I am not sure if this will affect that code below.

if (pubviaroot)
{
ScanKeyInit(&key[0],
Anum_pg_class_relkind,
BTEqualStrategyNumber, F_CHAREQ,
CharGetDatum(RELKIND_PARTITIONED_TABLE));
=================================

in UpdatePublicationTypeTupleValue():

+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+
+ /* Update the catalog. */
+ CatalogTupleUpdate(rel, &tup->t_self, tup);

Not sure if this tup needs to be freed or if the memory context will
take care of it.
=====================

regards,
Ajin Cherian
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
Thanks for the comments.

On Fri, Jun 18, 2021 at 5:25 PM Ajin Cherian <itsajin@gmail.com> wrote:
>
> On Thu, Jun 17, 2021 at 12:41 AM vignesh C <vignesh21@gmail.com> wrote:
>
> > Thanks for reporting it, the attached patch is a rebased version of
> > the patch with few review comment fixes, I will reply with the comment
> > fixes to the respective mails.
> >
>
> I've applied the patch, it applies cleand and ran "make check" and
> tests run fine.
>
> Some comments for patch 1:
>
> In the commit message, some grammar mistakes:
>
> "Changes was done in
> pg_dump to handle pubtype updation in pg_publication table while the database
> gets upgraded."
>
> -------------- change to --
>
> Changes were done in
> pg_dump to handle pubtype updation in pg_publication table while the database
> gets upgraded.
>

I will modify this.

> =============
>
> Prototypes present in pg_publication.h was moved to publicationcmds.h so
> that minimal datastructures ...
>
> ----------------- change to --
>
> Prototypes present in pg_publication.h were moved to publicationcmds.h so
> that minimal datastructures ..
>
> ========================

I will modify this.

>
> In patch 1:
>
> In getObjectDescription()
>
> + if (!nspname)
> + {
> + pfree(pubname);
> + pfree(nspname);
> + ReleaseSysCache(tup);
>
> Why free nspname if it is NULL?
>
> Same comment in getObjectIdentityParts()

I will modify this.

> ============================
>
> In GetAllTablesPublicationRelations()
>
> + ScanKeyData key[2];
>   TableScanDesc scan;
>   HeapTuple tuple;
>   List    *result = NIL;
> + int keycount = 0;
>
>   classRel = table_open(RelationRelationId, AccessShareLock);
>
> - ScanKeyInit(&key[0],
> + ScanKeyInit(&key[keycount++],
>
> Here you have init key[1], but the code below in the same function
> inits key[0]. I am not sure if this will affect that code below.
>
> if (pubviaroot)
> {
> ScanKeyInit(&key[0],
> Anum_pg_class_relkind,
> BTEqualStrategyNumber, F_CHAREQ,
> CharGetDatum(RELKIND_PARTITIONED_TABLE));

I felt this is ok as we specify the keycount to be 1, so only the
key[0] will be used. Thoughts?
ScanKeyInit(&key[0],
Anum_pg_class_relkind,
BTEqualStrategyNumber, F_CHAREQ,
CharGetDatum(RELKIND_PARTITIONED_TABLE));

scan = table_beginscan_catalog(classRel, 1, key);

> =================================
>
> in UpdatePublicationTypeTupleValue():
>
> + tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
> + replaces);
> +
> + /* Update the catalog. */
> + CatalogTupleUpdate(rel, &tup->t_self, tup);
>
> Not sure if this tup needs to be freed or if the memory context will
> take care of it.

I felt this is ok, as the cleanup is handled in the caller function
"AlterPublication", thoughts?
/* Cleanup. */
heap_freetuple(tup);
table_close(rel, RowExclusiveLock);

I will post an update patch for the fixes shortly.

Regards,
Vignesh



Re: Added schema level support for publication.

From
Ajin Cherian
Date:
On Mon, Jun 21, 2021 at 3:16 PM vignesh C <vignesh21@gmail.com> wrote:


> I felt this is ok as we specify the keycount to be 1, so only the
> key[0] will be used. Thoughts?
> ScanKeyInit(&key[0],
> Anum_pg_class_relkind,
> BTEqualStrategyNumber, F_CHAREQ,
> CharGetDatum(RELKIND_PARTITIONED_TABLE));
>
> scan = table_beginscan_catalog(classRel, 1, key);
>

It maybe fine, just doesn't look correct when you look at the function
as a whole.

> > =================================
> >
> > in UpdatePublicationTypeTupleValue():
> >
> > + tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
> > + replaces);
> > +
> > + /* Update the catalog. */
> > + CatalogTupleUpdate(rel, &tup->t_self, tup);
> >
> > Not sure if this tup needs to be freed or if the memory context will
> > take care of it.
>
> I felt this is ok, as the cleanup is handled in the caller function
> "AlterPublication", thoughts?
> /* Cleanup. */
> heap_freetuple(tup);
> table_close(rel, RowExclusiveLock);

that should be fine.

regards,
Ajin Cherian
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Jun 22, 2021 at 6:57 AM Ajin Cherian <itsajin@gmail.com> wrote:
>
> On Mon, Jun 21, 2021 at 3:16 PM vignesh C <vignesh21@gmail.com> wrote:
>
>
> > I felt this is ok as we specify the keycount to be 1, so only the
> > key[0] will be used. Thoughts?
> > ScanKeyInit(&key[0],
> > Anum_pg_class_relkind,
> > BTEqualStrategyNumber, F_CHAREQ,
> > CharGetDatum(RELKIND_PARTITIONED_TABLE));
> >
> > scan = table_beginscan_catalog(classRel, 1, key);
> >
>
> It maybe fine, just doesn't look correct when you look at the function
> as a whole.

I have added a local variable for this to avoid confusion.

Updated patch has the fix for this, this also includes the fixes for
the other comments you had given.
I have removed the skip table patches to keep the focus on the main
patch, once this patch gets into committable shape, I will focus on
the skip table patch.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
Bharath Rupireddy
Date:
On Tue, Jun 22, 2021 at 9:45 AM vignesh C <vignesh21@gmail.com> wrote:
> I have removed the skip table patches to keep the focus on the main
> patch, once this patch gets into committable shape, I will focus on
> the skip table patch.

IMO it's a good idea to start a new thread for the "skip table"
feature so that we can discuss it separately. If required you can
specify in that thread that the idea of the "skip table" can be
applied to the "add schema level support" feature.

With Regards,
Bharath Rupireddy.



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Tue, Jun 22, 2021 at 2:15 PM vignesh C <vignesh21@gmail.com> wrote:
>
>
> Updated patch has the fix for this, this also includes the fixes for
> the other comments you had given.
> I have removed the skip table patches to keep the focus on the main
> patch, once this patch gets into committable shape, I will focus on
> the skip table patch.
>

I have the following initial comments on the v7 patches:

v7-0001

(1) The patch comment is pretty rough and needs work.

I suggest something like the following:

This patch adds schema-level support for publication.
A new schema option allows one or more schemas to be specified, whose tables
are selected by the publisher for sending the data to the subscriber.

pg_publication maintains information about the publication. Previously, the
"puballtables" bool column was used to indicate if the publication was the
"FOR ALL TABLES" type (if true) or the "FOR TABLE" type (if false). With the
introduction of the "FOR SCHEMA" publication type, it is not easy to determine
the publication type. Therefore, a new column "pubtype" has been added to the
pg_publication relation to indicate the publication type.
There was the possibility of avoiding addition of this new column, but that
would require checking puballtables of pg_publication and checking
pg_publication_rel for table type publication and then checking
pg_publication_schema for schema type publication. Instead, I preferred to add
the "pubtype" column, which makes things easier, and also will help support
new options in the future.
A new system table "pg_publication_schema" has been added, to maintain the
schemas that the user wants to publish through the publication. The
schema/publication/publication_schema dependency was created to handle the
corresponding renaming/removal of schemas to the publication/publication_schema
when the schema is renamed/dropped. The Decoder identifies if the relation is
part of the publication and replicates it to the subscriber. Changes were made
to pg_dump to handle pubtype updation in the pg_publication table when the
database is upgraded.

Prototypes present in pg_publication.h have been moved to publicationcmds.h so
that minimal data structures are exported to pg_dump and psql clients, as the
rest of the information need not be exported.

CATALOG_VERSION_NO needs to be updated while committing, as this feature
involves a catalog change.

TODO: version checks for psql/pg_dump need to be changed from 140000 to 150000
once the ongoing release is completed.


(2) src/backend/catalog/objectaddress.c
getObjectIdentityParts(), case OCLASS_PUBLICATION_SCHEMA

Shouldn't pubname/nspname be pfree()d, if objargs/objname are NULL?


(3) src/backend/catalog/pg_publication.c

(i)
GetAllTablesPublications


BEFORE:
+ * Gets list of relations published.
AFTER:
+ * Gets the list of relations published.

There are several other cases of newly-added "Gets list of ..." comments.


(ii)
GetAllTablesPublicationRelations

BEFORE:
+ * Gets list of all relation published by FOR SCHEMA publication(s).
AFTER:
+ * Gets the list of all relations published by FOR SCHEMA publication(s).


(4) src/backend/commands/publicationcmds.c

Missing function header for function "UpdatePublicationTypeTupleValue".


(5) src/backend/parser/gram.y

Violation of function header comment format:

BEFORE:
+/* makeSchemaSpec
+ * Create a SchemaSpec with the given type
+ */

AFTER:
+/*
+ * makeSchemaSpec
+ * Create a SchemaSpec with the given type
+ */


(6) src/bin/pg_dump/pg_dump.c

The following code is within a loop that processes schemas.
I think that (in the comment) "Clean up and return" should instead say
"Clean up and process the next schema"
Also, should say "Schema is not a member".

+ /*
+ * Schema is not member of any publications. Clean up and return.
+ */
+ PQclear(res);
+ continue;


(7) src/bin/psql/describe.c

Missing function header for function "addFooterToPublicationDesc".


v7-0002

(1) doc/src/sgml/catalogs.sgml
Typo

BEFORE:
+       respctively. The publication type cannot be changed in other cases.
AFTER:
+       respectively. The publication type cannot be changed in other cases.


(2) doc/src/sgml/ref/create_publication.sgml

BEFORE:
+   Create a publication that publishes all changes for all the tables
present in
+production schema:
AFTER:
+   Create a publication that publishes all changes for all the tables
present in
+the schema "production":


BEFORE:
+   Create a publication that publishes all changes for all the tables
present in
+marketing and sales schemas:
AFTER:
+   Create a publication that publishes all changes for all the tables
present in
+the schemas "marketing" and "sales":


(3) src/test/regress/expected/publication.out

BEFORE:
+-- Drop schema that is not preset in the publication
AFTER:
+-- Drop schema that is not present in the publication

BEFORE:
+--- Check create publication on a object which is not schema
AFTER:
+--- Check create publication on an object which is not schema


(4) src/test/regress/sql/publication.sql

BEFORE:
+-- Drop schema that is not preset in the publication
AFTER:
+-- Drop schema that is not present in the publication


Regards,
Greg Nancarrow
Fujitsu Australia



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
Hi

I applied your V7* patch and complied it. The following warnings came out, please take a look.

>pg_publication.c:688:22: warning: ‘tables’ may be used uninitialized in this function [-Wmaybe-uninitialized]
>   funcctx->user_fctx = (void *) tables;
>   ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
>describe.c: In function ‘describePublications’:
>describe.c:6479:35: warning: ‘pubtype’ may be used uninitialized in this function [-Wmaybe-uninitialized]
>   else if (has_pubtype && pubtype == PUBTYPE_SCHEMA)

For the warning in pg_publication.c, maybe we can replace the following 'else if' with 'else'.
+        else if (publication->pubtype == PUBTYPE_SCHEMA)

For the warning in describe.c, initialization of 'pubtype' is needed.

Regards
Tang

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Thu, Jun 24, 2021 at 4:41 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> I applied your V7* patch and complied it. The following warnings came out, please take a look.
>

I encountered the following warnings when building with the v7 patches applied:
(looks like missing #include "catalog/objectaddress.h")

pg_prewarm.c:109:29: warning: implicit declaration of function
‘get_relkind_objtype’; did you mean ‘get_element_type’?
[-Wimplicit-function-declaration]
   aclcheck_error(aclresult,
get_relkind_objtype(rel->rd_rel->relkind), get_rel_name(relOid));
                             ^~~~~~~~~~~~~~~~~~~
                             get_element_type

heap_surgery.c:391:9: warning: implicit declaration of function
‘get_relkind_objtype’; did you mean ‘get_publication_type’?
[-Wimplicit-function-declaration]
         get_relkind_objtype(rel->rd_rel->relkind),
         ^~~~~~~~~~~~~~~~~~~
         get_publication_type

pgrowlocks.c:136:29: warning: implicit declaration of function
‘get_relkind_objtype’ [-Wimplicit-function-declaration]
   aclcheck_error(aclresult, get_relkind_objtype(rel->rd_rel->relkind),


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Jun 23, 2021 at 2:10 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Tue, Jun 22, 2021 at 2:15 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> >
> > Updated patch has the fix for this, this also includes the fixes for
> > the other comments you had given.
> > I have removed the skip table patches to keep the focus on the main
> > patch, once this patch gets into committable shape, I will focus on
> > the skip table patch.
> >
>
> I have the following initial comments on the v7 patches:
>
> v7-0001
>
> (1) The patch comment is pretty rough and needs work.
>
> I suggest something like the following:
>
> This patch adds schema-level support for publication.
> A new schema option allows one or more schemas to be specified, whose tables
> are selected by the publisher for sending the data to the subscriber.
>
> pg_publication maintains information about the publication. Previously, the
> "puballtables" bool column was used to indicate if the publication was the
> "FOR ALL TABLES" type (if true) or the "FOR TABLE" type (if false). With the
> introduction of the "FOR SCHEMA" publication type, it is not easy to determine
> the publication type. Therefore, a new column "pubtype" has been added to the
> pg_publication relation to indicate the publication type.
> There was the possibility of avoiding addition of this new column, but that
> would require checking puballtables of pg_publication and checking
> pg_publication_rel for table type publication and then checking
> pg_publication_schema for schema type publication. Instead, I preferred to add
> the "pubtype" column, which makes things easier, and also will help support
> new options in the future.
> A new system table "pg_publication_schema" has been added, to maintain the
> schemas that the user wants to publish through the publication. The
> schema/publication/publication_schema dependency was created to handle the
> corresponding renaming/removal of schemas to the publication/publication_schema
> when the schema is renamed/dropped. The Decoder identifies if the relation is
> part of the publication and replicates it to the subscriber. Changes were made
> to pg_dump to handle pubtype updation in the pg_publication table when the
> database is upgraded.
>
> Prototypes present in pg_publication.h have been moved to publicationcmds.h so
> that minimal data structures are exported to pg_dump and psql clients, as the
> rest of the information need not be exported.
>
> CATALOG_VERSION_NO needs to be updated while committing, as this feature
> involves a catalog change.
>
> TODO: version checks for psql/pg_dump need to be changed from 140000 to 150000
> once the ongoing release is completed.
>

Modified.

> (2) src/backend/catalog/objectaddress.c
> getObjectIdentityParts(), case OCLASS_PUBLICATION_SCHEMA
>
> Shouldn't pubname/nspname be pfree()d, if objargs/objname are NULL?
>

Modified

> (3) src/backend/catalog/pg_publication.c
>
> (i)
> GetAllTablesPublications
>
>
> BEFORE:
> + * Gets list of relations published.
> AFTER:
> + * Gets the list of relations published.
>
> There are several other cases of newly-added "Gets list of ..." comments.
>

Modified

> (ii)
> GetAllTablesPublicationRelations
>
> BEFORE:
> + * Gets list of all relation published by FOR SCHEMA publication(s).
> AFTER:
> + * Gets the list of all relations published by FOR SCHEMA publication(s).
>

Modified

> (4) src/backend/commands/publicationcmds.c
>
> Missing function header for function "UpdatePublicationTypeTupleValue".
>

Included function header.

> (5) src/backend/parser/gram.y
>
> Violation of function header comment format:
>
> BEFORE:
> +/* makeSchemaSpec
> + * Create a SchemaSpec with the given type
> + */
>
> AFTER:
> +/*
> + * makeSchemaSpec
> + * Create a SchemaSpec with the given type
> + */
>

Modified it to:
/*
 * makeSchemaSpec - Create a SchemaSpec with the given type
 */

> (6) src/bin/pg_dump/pg_dump.c
>
> The following code is within a loop that processes schemas.
> I think that (in the comment) "Clean up and return" should instead say
> "Clean up and process the next schema"
> Also, should say "Schema is not a member".
>
> + /*
> + * Schema is not member of any publications. Clean up and return.
> + */
> + PQclear(res);
> + continue;
>

Modified.

> (7) src/bin/psql/describe.c
>
> Missing function header for function "addFooterToPublicationDesc".
>

Included it.

> v7-0002
>
> (1) doc/src/sgml/catalogs.sgml
> Typo
>
> BEFORE:
> +       respctively. The publication type cannot be changed in other cases.
> AFTER:
> +       respectively. The publication type cannot be changed in other cases.
>

Modified

> (2) doc/src/sgml/ref/create_publication.sgml
>
> BEFORE:
> +   Create a publication that publishes all changes for all the tables
> present in
> +production schema:
> AFTER:
> +   Create a publication that publishes all changes for all the tables
> present in
> +the schema "production":
>
>
> BEFORE:
> +   Create a publication that publishes all changes for all the tables
> present in
> +marketing and sales schemas:
> AFTER:
> +   Create a publication that publishes all changes for all the tables
> present in
> +the schemas "marketing" and "sales":
>

Modified.

> (3) src/test/regress/expected/publication.out
>
> BEFORE:
> +-- Drop schema that is not preset in the publication
> AFTER:
> +-- Drop schema that is not present in the publication
>
> BEFORE:
> +--- Check create publication on a object which is not schema
> AFTER:
> +--- Check create publication on an object which is not schema
>

Modified.

> (4) src/test/regress/sql/publication.sql
>
> BEFORE:
> +-- Drop schema that is not preset in the publication
> AFTER:
> +-- Drop schema that is not present in the publication
>

Modified.

Thanks for reviewing and providing the comments, Attached v8 patches
have the fixes for the comments.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Jun 24, 2021 at 2:12 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Thu, Jun 24, 2021 at 4:41 PM tanghy.fnst@fujitsu.com
> <tanghy.fnst@fujitsu.com> wrote:
> >
> > I applied your V7* patch and complied it. The following warnings came out, please take a look.
> >
>
> I encountered the following warnings when building with the v7 patches applied:
> (looks like missing #include "catalog/objectaddress.h")
>
> pg_prewarm.c:109:29: warning: implicit declaration of function
> ‘get_relkind_objtype’; did you mean ‘get_element_type’?
> [-Wimplicit-function-declaration]
>    aclcheck_error(aclresult,
> get_relkind_objtype(rel->rd_rel->relkind), get_rel_name(relOid));
>                              ^~~~~~~~~~~~~~~~~~~
>                              get_element_type
>
> heap_surgery.c:391:9: warning: implicit declaration of function
> ‘get_relkind_objtype’; did you mean ‘get_publication_type’?
> [-Wimplicit-function-declaration]
>          get_relkind_objtype(rel->rd_rel->relkind),
>          ^~~~~~~~~~~~~~~~~~~
>          get_publication_type
>
> pgrowlocks.c:136:29: warning: implicit declaration of function
> ‘get_relkind_objtype’ [-Wimplicit-function-declaration]
>    aclcheck_error(aclresult, get_relkind_objtype(rel->rd_rel->relkind),
>

Thanks for reporting these warnings, I have fixed this in the v8 patch attached at [1].

Regards,
Vignesh

Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Jun 24, 2021 at 12:10 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> Hi
>
> I applied your V7* patch and complied it. The following warnings came out, please take a look.
>
> >pg_publication.c:688:22: warning: ‘tables’ may be used uninitialized in this function [-Wmaybe-uninitialized]
> >   funcctx->user_fctx = (void *) tables;
> >   ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
> >describe.c: In function ‘describePublications’:
> >describe.c:6479:35: warning: ‘pubtype’ may be used uninitialized in this function [-Wmaybe-uninitialized]
> >   else if (has_pubtype && pubtype == PUBTYPE_SCHEMA)
>
> For the warning in pg_publication.c, maybe we can replace the following 'else if' with 'else'.
> +               else if (publication->pubtype == PUBTYPE_SCHEMA)
>
> For the warning in describe.c, initialization of 'pubtype' is needed.

Thanks for reporting these warnings, I have fixed this in the v8 patch
attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm044P_cds1OxZvFse5rE_qQfhbUg5MdtMgsa7t_bZGJdw%40mail.gmail.com

Regards,
Vignesh



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Friday, June 25, 2021 2:25 AM vignesh C <vignesh21@gmail.com>wrote:
> 
> Thanks for reporting these warnings, I have fixed this in the v8 patch
> attached at [1].
> [1] - https://www.postgresql.org/message-
> id/CALDaNm044P_cds1OxZvFse5rE_qQfhbUg5MdtMgsa7t_bZGJdw%40mail.
> gmail.com
> 

Thanks for your patch. The warnings are fixed.

But I found an issue while using your V8 patch, which is similar to [1]. The case is as below:
Drop a schema from publication and refresh publication at subscriber, then insert into publisher table,  the inserts
stillreplicated to subscriber. The expect result is that the data is no longer replicated.
 

For example:
------publisher------
create schema s1;
create table s1.t1 (a int primary key);
create publication pub for schema s1;

------subscriber------
create schema s1;
create table s1.t1 (a int primary key);
create subscription sub connection 'dbname=postgres port=5432' publication pub;

------publisher------
insert into s1.t1 values (1);

------subscriber------
postgres=# select * from s1.t1;
 a
---
 1
(1 row)

------publisher------
alter publication pub drop schema s1;
insert into s1.t1 values (2);

------subscriber------
postgres=# select * from s1.t1;
 a
---
 1
 2
(2 rows)


The similar issue [1] (related to "ALTER PUBLICATION .. DROP TABLE") was fixed by modifying in
rel_sync_cache_publication_cbcallback, which is related to PUBLICATIONRELMAP syscache.  In my case, I think it used
PUBLICATIONSCHEMAMAPsyscache, and no callback was registered for it. Should we register a callback for it or fix it in
otherways?
 

[1]https://www.postgresql.org/message-id/flat/CALj2ACV%2B0UFpcZs5czYgBpujM9p0Hg1qdOZai_43OU7bqHU_xw%40mail.gmail.com

Regards
Tang

RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Tuesday, June 29, 2021 11:25 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:
> 
> Thanks for your patch. The warnings are fixed.
> 
> But I found an issue while using your V8 patch, which is similar to [1]. The case
> is as below:
> Drop a schema from publication and refresh publication at subscriber, then
> insert into publisher table,  the inserts still replicated to subscriber. The expect
> result is that the data is no longer replicated.
> 

I also saw a problem while using "ALTER PUBLICATION ... ADD SCHEMA ...". The case is as below:
Add a schema to publication, then refresh publication at subscriber. Insert into publisher table,  the inserts couldn't
replicateto subscriber.
 

Steps to reproduce the case:
------publisher------
CREATE PUBLICATION testpub FOR SCHEMA public;

------subscriber------
CREATE SUBSCRIPTION testsub CONNECTION 'dbname=postgres port=5432' PUBLICATION testpub;

------publisher------
CREATE SCHEMA s1;
CREATE TABLE s1.t1 (a int PRIMARY KEY);
insert into s1.t1 values (1);
ALTER PUBLICATION testpub ADD SCHEMA s1;

------subscriber------
CREATE SCHEMA s1;
CREATE TABLE s1.t1 (a int PRIMARY KEY);
ALTER SUBSCRIPTION testsub REFRESH PUBLICATION;
postgres=# SELECT * FROM s1.t1;
 a
---
 1
(1 row)

------publisher------
insert into s1.t1 values (2);

------subscriber------
postgres=# SELECT * FROM s1.t1;
 a
---
 1
(1 row)

when I executed "ALTER PUBLICATION ... ADD TABLE ...",  rel_sync_cache_publication_cb callback function set
replicate_validto false, then it would validate the entry in get_rel_sync_entry function, and marked the pubactions to
true.so it worked ok.
 

In the case of  "ALTER PUBLICATION ... ADD SCHEMA ...", replicate_valid would not be set to false. Because of this, the
pubactionswere still false in get_rel_sync_entry function.
 
So I think the reason for it is similar to the one I reported before [1].

[1]
https://www.postgresql.org/message-id/OS0PR01MB61134B20314DE45795DD384CFB029%40OS0PR01MB6113.jpnprd01.prod.outlook.com

Regards
Tang

Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Jun 29, 2021 at 8:55 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:
>
> On Friday, June 25, 2021 2:25 AM vignesh C <vignesh21@gmail.com>wrote:
> >
> > Thanks for reporting these warnings, I have fixed this in the v8 patch
> > attached at [1].
> > [1] - https://www.postgresql.org/message-
> > id/CALDaNm044P_cds1OxZvFse5rE_qQfhbUg5MdtMgsa7t_bZGJdw%40mail.
> > gmail.com
> >
>
> Thanks for your patch. The warnings are fixed.
>
> But I found an issue while using your V8 patch, which is similar to [1]. The case is as below:
> Drop a schema from publication and refresh publication at subscriber, then insert into publisher table,  the inserts still replicated to subscriber. The expect result is that the data is no longer replicated.
>
> For example:
> ------publisher------
> create schema s1;
> create table s1.t1 (a int primary key);
> create publication pub for schema s1;
>
> ------subscriber------
> create schema s1;
> create table s1.t1 (a int primary key);
> create subscription sub connection 'dbname=postgres port=5432' publication pub;
>
> ------publisher------
> insert into s1.t1 values (1);
>
> ------subscriber------
> postgres=# select * from s1.t1;
>  a
> ---
>  1
> (1 row)
>
> ------publisher------
> alter publication pub drop schema s1;
> insert into s1.t1 values (2);
>
> ------subscriber------
> postgres=# select * from s1.t1;
>  a
> ---
>  1
>  2
> (2 rows)
>

Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you reported at [1].
Attachment

RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:

On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> wrote:

> Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you reported at [1].

 

Thanks for your patch. I confirmed that the two issues I reported has been fixed.

 

Regards

Tang

Re: Added schema level support for publication.

From
Bharath Rupireddy
Date:
On Tue, Jun 22, 2021 at 10:11 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Jun 22, 2021 at 9:45 AM vignesh C <vignesh21@gmail.com> wrote:
> > I have removed the skip table patches to keep the focus on the main
> > patch, once this patch gets into committable shape, I will focus on
> > the skip table patch.
>
> IMO it's a good idea to start a new thread for the "skip table"
> feature so that we can discuss it separately. If required you can
> specify in that thread that the idea of the "skip table" can be
> applied to the "add schema level support" feature.

Hi Vignesh,

I will find sometime to review the v9 patch set. I'm curious to know
whether the latest v9 patch set has the changes for the "skip table"
feature? Or is it being discussed separately?

Regards,
Bharath Rupireddy.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Jul 1, 2021 at 5:43 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Jun 22, 2021 at 10:11 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Tue, Jun 22, 2021 at 9:45 AM vignesh C <vignesh21@gmail.com> wrote:
> > > I have removed the skip table patches to keep the focus on the main
> > > patch, once this patch gets into committable shape, I will focus on
> > > the skip table patch.
> >
> > IMO it's a good idea to start a new thread for the "skip table"
> > feature so that we can discuss it separately. If required you can
> > specify in that thread that the idea of the "skip table" can be
> > applied to the "add schema level support" feature.
>
> Hi Vignesh,
>
> I will find sometime to review the v9 patch set. I'm curious to know
> whether the latest v9 patch set has the changes for the "skip table"
> feature? Or is it being discussed separately?

The v9 patch does not include the "Skip table" feature, I'm planning
to focus on that once the current patch is stabilized.

Regards,
Vignesh



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you reported
at[1].
 

A comment on v9:

src/bin/psql/describe.c

+                                              if (pset.sversion >= 15000)

I think it should be 150000.

Regards
Tang

Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Jul 2, 2021 at 10:18 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you
reportedat [1].
 
>
> A comment on v9:
>
> src/bin/psql/describe.c
>
> +                                              if (pset.sversion >= 15000)
>
> I think it should be 150000.

Thanks for reporting this, I will fix this in the next version of the patch.

Regards,
Vignesh



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> wrote:
> Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you reported
at[1].
 

Hi,

I had a look at the patch, please consider following comments.

(1)
-            if (pub->alltables)
+            if (pub->pubtype == PUBTYPE_ALLTABLES)
             {
                 publish = true;
                 if (pub->pubviaroot && am_partition)
                     publish_as_relid = llast_oid(get_partition_ancestors(relid));
             }
 
+            if (pub->pubtype == PUBTYPE_SCHEMA)
+            {
+                Oid            schemaId = get_rel_namespace(relid);
+                List       *pubschemas = GetPublicationSchemas(pub->oid);
+
+                if (list_member_oid(pubschemas, schemaId))
+                {

It might be better use "else if" for the second check here.
Like: else if (pub->pubtype == PUBTYPE_SCHEMA)

Besides, we already have the {schemaoid, pubid} set here, it might be
better to scan the cache PUBLICATIONSCHEMAMAP instead of invoking
GetPublicationSchemas() which will scan the whole table.


(2)
+        /* Identify which schemas should be dropped. */
+        foreach(oldlc, oldschemaids)
+        {
+            Oid            oldschemaid = lfirst_oid(oldlc);
+            ListCell   *newlc;
+            bool        found = false;
+
+            foreach(newlc, schemaoidlist)
+            {
+                Oid            newschemaid = lfirst_oid(newlc);
+
+                if (newschemaid == oldschemaid)
+                {
+                    found = true;
+                    break;
+                }
+            }

It seems we can use " if (list_member_oid(schemaoidlist, oldschemaid)) "
to replace the second foreach loop.

(3)
there are some testcases change in 0001 patch, it might be better move them
to 0002 patch.


(4)
+        case OBJECT_PUBLICATION_SCHEMA:
+            objnode = (Node *) list_make2(linitial(name), linitial(args));
+            break;
         case OBJECT_USER_MAPPING:
             objnode = (Node *) list_make2(linitial(name), linitial(args));
             break;

Does it looks better to merge these two switch cases ?
Like:
case OBJECT_PUBLICATION_SCHEMA:
case OBJECT_USER_MAPPING:
    objnode = (Node *) list_make2(linitial(name), linitial(args));
    break;

best regards,
houzj

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Thursday, July 8, 2021 11:47 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote
> On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com>
> wrote:
> > Thanks for reporting this issue, the attached v9 patch fixes this issue. This also
> fixes the other issue you reported at [1].
> 
> Hi,
> 
> I had a look at the patch, please consider following comments.

Some more commets.

Currently, postgres caches publication actions info in the
RelationData::rd_pubactions, but after applying the patch, it seems
rd_pubactions is not initialized when using schema level publication.

It cound result in some unexpected behaviour when checking if command can be
executed with current replica identity.

----
CheckCmdReplicaIdentity
...
    pubactions = GetRelationPublicationActions(rel);
    if (cmd == CMD_UPDATE && pubactions->pubupdate)
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("cannot update table \"%s\" because it does not have a replica identity and publishes
updates",
                        RelationGetRelationName(rel)),
                 errhint("To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.")));
----
Based on the above code, we access rd_pubactions from relcache in function
GetRelationPublicationActions(), if we don't have correct publication
information, we won't get the error in publication, instead we could lead to
the failure in subscription side.

To fix, I think (1) the patch can modify the function
GetRelationPublicationActions() to get the schema level publication related to
the relation, then merge the publication action to the rd_pubactions.

In addition, (2) it seems we also need to add the relcache invalidation code
about schema level publication when alter publication options.

I attached a rough code diff about (1) and (2), the diff is based on the v9 patchset.
I hope it can help fix the above issues.

Best regards,
Hou zhijie

Attachment

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Fri, Jul 9, 2021 at 1:28 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> Currently, postgres caches publication actions info in the
> RelationData::rd_pubactions, but after applying the patch, it seems
> rd_pubactions is not initialized when using schema level publication.
>
> It cound result in some unexpected behaviour when checking if command can be
> executed with current replica identity.
>

While testing this patch, I'm finding that for a FOR SCHEMA
publication, UPDATEs and DELETEs on tables belonging to that schema
are not getting replicated (but INSERTs and TRUNCATEs are).
Could this be related to the issues that Hou-san has identified?

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Jul 9, 2021 at 8:58 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Thursday, July 8, 2021 11:47 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote
> > On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com>
> > wrote:
> > > Thanks for reporting this issue, the attached v9 patch fixes this issue. This also
> > fixes the other issue you reported at [1].
> >
> > Hi,
> >
> > I had a look at the patch, please consider following comments.
>
> Some more commets.
>
> Currently, postgres caches publication actions info in the
> RelationData::rd_pubactions, but after applying the patch, it seems
> rd_pubactions is not initialized when using schema level publication.
>
> It cound result in some unexpected behaviour when checking if command can be
> executed with current replica identity.
>
> ----
> CheckCmdReplicaIdentity
> ...
>         pubactions = GetRelationPublicationActions(rel);
>         if (cmd == CMD_UPDATE && pubactions->pubupdate)
>                 ereport(ERROR,
>                                 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>                                  errmsg("cannot update table \"%s\" because it does not have a replica identity and
publishesupdates",
 
>                                                 RelationGetRelationName(rel)),
>                                  errhint("To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.")));
> ----
> Based on the above code, we access rd_pubactions from relcache in function
> GetRelationPublicationActions(), if we don't have correct publication
> information, we won't get the error in publication, instead we could lead to
> the failure in subscription side.
>
> To fix, I think (1) the patch can modify the function
> GetRelationPublicationActions() to get the schema level publication related to
> the relation, then merge the publication action to the rd_pubactions.
>
> In addition, (2) it seems we also need to add the relcache invalidation code
> about schema level publication when alter publication options.
>
> I attached a rough code diff about (1) and (2), the diff is based on the v9 patchset.
> I hope it can help fix the above issues.

Thanks for identifying the issues and also providing the fix for it.
Those changes are required, I have taken your changes into my patch.
The Attached patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Jul 9, 2021 at 12:12 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Jul 9, 2021 at 1:28 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > Currently, postgres caches publication actions info in the
> > RelationData::rd_pubactions, but after applying the patch, it seems
> > rd_pubactions is not initialized when using schema level publication.
> >
> > It cound result in some unexpected behaviour when checking if command can be
> > executed with current replica identity.
> >
>
> While testing this patch, I'm finding that for a FOR SCHEMA
> publication, UPDATEs and DELETEs on tables belonging to that schema
> are not getting replicated (but INSERTs and TRUNCATEs are).
> Could this be related to the issues that Hou-san has identified?

Thanks for reporting this issue. I felt this issue is the same as the issue which Hou San had reported. This issue is fixed in the v10 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Jul 8, 2021 at 9:16 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> wrote:
> > Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you
reportedat [1].
 
>
> Hi,
>
> I had a look at the patch, please consider following comments.
>
> (1)
> -                       if (pub->alltables)
> +                       if (pub->pubtype == PUBTYPE_ALLTABLES)
>                         {
>                                 publish = true;
>                                 if (pub->pubviaroot && am_partition)
>                                         publish_as_relid = llast_oid(get_partition_ancestors(relid));
>                         }
>
> +                       if (pub->pubtype == PUBTYPE_SCHEMA)
> +                       {
> +                               Oid                     schemaId = get_rel_namespace(relid);
> +                               List       *pubschemas = GetPublicationSchemas(pub->oid);
> +
> +                               if (list_member_oid(pubschemas, schemaId))
> +                               {
>
> It might be better use "else if" for the second check here.
> Like: else if (pub->pubtype == PUBTYPE_SCHEMA)
>
> Besides, we already have the {schemaoid, pubid} set here, it might be
> better to scan the cache PUBLICATIONSCHEMAMAP instead of invoking
> GetPublicationSchemas() which will scan the whole table.

Modified.

> (2)
> +               /* Identify which schemas should be dropped. */
> +               foreach(oldlc, oldschemaids)
> +               {
> +                       Oid                     oldschemaid = lfirst_oid(oldlc);
> +                       ListCell   *newlc;
> +                       bool            found = false;
> +
> +                       foreach(newlc, schemaoidlist)
> +                       {
> +                               Oid                     newschemaid = lfirst_oid(newlc);
> +
> +                               if (newschemaid == oldschemaid)
> +                               {
> +                                       found = true;
> +                                       break;
> +                               }
> +                       }
>
> It seems we can use " if (list_member_oid(schemaoidlist, oldschemaid)) "
> to replace the second foreach loop.
>

Modified.

> (3)
> there are some testcases change in 0001 patch, it might be better move them
> to 0002 patch.

These changes are required to modify the existing tests. I kept it in
0001 so that 0001 patch can be committed independently. I think we can
keep the change as it is, I did not make any changes for  this.

> (4)
> +               case OBJECT_PUBLICATION_SCHEMA:
> +                       objnode = (Node *) list_make2(linitial(name), linitial(args));
> +                       break;
>                 case OBJECT_USER_MAPPING:
>                         objnode = (Node *) list_make2(linitial(name), linitial(args));
>                         break;
>
> Does it looks better to merge these two switch cases ?
> Like:
> case OBJECT_PUBLICATION_SCHEMA:
> case OBJECT_USER_MAPPING:
>         objnode = (Node *) list_make2(linitial(name), linitial(args));
>         break;

Modified.

Thanks for the comments, these comments are fixed as part of the v10
patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Jul 2, 2021 at 10:18 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you
reportedat [1].
 
>
> A comment on v9:
>
> src/bin/psql/describe.c
>
> +                                              if (pset.sversion >= 15000)
>
> I think it should be 150000.

Thanks for reporting this issue, this issue is fixed in the v10 patch
attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com

Regards,
Vignesh



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Monday, July 12, 2021 5:36 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> Thanks for reporting this issue, this issue is fixed in the v10 patch
> attached at [1].
> [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-
> sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com

Thanks for fixing it.

By applying your V10 patch, I saw three problems, please have a look.

1. An issue about pg_dump. 
When public schema was published, the publication was created in the output file, but public schema was not added to
it.(Other schemas could be added as expected.)
 

I looked into it and found that selectDumpableNamespace function marks DUMP_COMPONENT_DEFINITION as needless when the
schemais public, leading to schema public is ignored in getPublicationSchemas. So we'd better check whether schemas
shouldbe dumped in another way.
 

I tried to fix it with the following change, please have a look. (Maybe we also need to add some comments for it.)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f6b4f12648..a327d2568b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4206,7 +4206,8 @@ getPublicationSchemas(Archive *fout, NamespaceInfo nspinfo[], int numSchemas)
                 * Ignore publication membership of schemas whose definitions are not
                 * to be dumped.
                 */
-               if (!(nsinfo->dobj.dump & DUMP_COMPONENT_DEFINITION))
+               if (!((nsinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+                       || (strcmp(nsinfo->dobj.name, "public") == 0 && nsinfo->dobj.dump != DUMP_COMPONENT_NONE)))
                        continue;

                pg_log_info("reading publication membership for schema \"%s\"",

2. improper behavior for system schema
I found I could create publication for system schema, such as pg_catalog. I think it's better to report an error
messagehere, just like creating publication for system table is unallowed.
 

3. fix for dumpPublicationSchema
Should we add a declaration for it and add const decorations to the it's second parameter? Like other similar
functions.

Regards
Tang

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Mon, Jul 12, 2021 at 7:24 PM vignesh C <vignesh21@gmail.com> wrote:
>
>
> Thanks for reporting this issue. I felt this issue is the same as the issue which Hou San had reported. This issue is
fixedin the v10 patch attached at [1].
 
> [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com
>

I did some testing and the issue that I reported does seem to be fixed
by the v10 patch.

I have some patch review comments for the v10 patch:

(1)

The following:

+ if (!OidIsValid(address.objectId))
+ {
+    if (!missing_ok)
+       ereport(ERROR,
+          (errcode(ERRCODE_UNDEFINED_OBJECT),
+          errmsg("publication schema \"%s\" in publication \"%s\"
does not exist",
+             schemaname, pubname)));
+    return address;
+ }
+
+ return address;

could just be simplified to:

+ if (!OidIsValid(address.objectId) && !missing_ok)
+ {
+    ereport(ERROR,
+       (errcode(ERRCODE_UNDEFINED_OBJECT),
+       errmsg("publication schema \"%s\" in publication \"%s\" does not exist",
+          schemaname, pubname)));
+ }
+
+ return address;


(2) src/backend/catalog/objectaddress.c

I think there is a potential illegal memory access (psform->psnspcid)
in the case of "!missing_ok", as the tuple is released from the cache
on the previous line.

+ psform = (Form_pg_publication_schema) GETSTRUCT(tup);
+ pubname = get_publication_name(psform->pspubid, false);
+ nspname = get_namespace_name(psform->psnspcid);
+ if (!nspname)
+ {
+    pfree(pubname);
+    ReleaseSysCache(tup);
+    if (!missing_ok)
+       elog(ERROR, "cache lookup failed for schema %u",
+          psform->psnspcid);
+    break;
+ }


I think this should be:

+ psform = (Form_pg_publication_schema) GETSTRUCT(tup);
+ pubname = get_publication_name(psform->pspubid, false);
+ nspname = get_namespace_name(psform->psnspcid);
+ if (!nspname)
+ {
+    Oid psnspcid = psform->psnspcid;
+
+    pfree(pubname);
+    ReleaseSysCache(tup);
+    if (!missing_ok)
+       elog(ERROR, "cache lookup failed for schema %u",
+          psnspcid);
+    break;
+ }

There are two cases of this that need correction (see: case
OCLASS_PUBLICATION_SCHEMA).

(3) Incomplete function header comment

+ * makeSchemaSpec - Create a SchemaSpec with the given type

Should be:

+ * makeSchemaSpec - Create a SchemaSpec with the given type and location


(4) src/bin/psql/describe.c

Shouldn't the following comment say "version 15"?

+ /* Prior to version 14 check was based on all tables */
+ if ((has_pubtype && pubtype == PUBTYPE_TABLE) ||
+ (!has_pubtype && !puballtables))


(5) typedefs.list

I think you also need to add "Form_pg_publication_schema" to typedefs.list.


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Jul 13, 2021 at 12:06 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Monday, July 12, 2021 5:36 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for reporting this issue, this issue is fixed in the v10 patch
> > attached at [1].
> > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-
> > sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com
>
> Thanks for fixing it.
>
> By applying your V10 patch, I saw three problems, please have a look.
>
> 1. An issue about pg_dump.
> When public schema was published, the publication was created in the output file, but public schema was not added to
it.(Other schemas could be added as expected.) 
>
> I looked into it and found that selectDumpableNamespace function marks DUMP_COMPONENT_DEFINITION as needless when the
schemais public, leading to schema public is ignored in getPublicationSchemas. So we'd better check whether schemas
shouldbe dumped in another way. 
>
> I tried to fix it with the following change, please have a look. (Maybe we also need to add some comments for it.)
>
> diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
> index f6b4f12648..a327d2568b 100644
> --- a/src/bin/pg_dump/pg_dump.c
> +++ b/src/bin/pg_dump/pg_dump.c
> @@ -4206,7 +4206,8 @@ getPublicationSchemas(Archive *fout, NamespaceInfo nspinfo[], int numSchemas)
>                  * Ignore publication membership of schemas whose definitions are not
>                  * to be dumped.
>                  */
> -               if (!(nsinfo->dobj.dump & DUMP_COMPONENT_DEFINITION))
> +               if (!((nsinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
> +                       || (strcmp(nsinfo->dobj.name, "public") == 0 && nsinfo->dobj.dump != DUMP_COMPONENT_NONE)))
>                         continue;
>
>                 pg_log_info("reading publication membership for schema \"%s\"",

I felt it is intentionally done like that as the pubic schema is
created by default, hence it is not required to dump else we will get
errors while restoring. Thougths?

> 2. improper behavior for system schema
> I found I could create publication for system schema, such as pg_catalog. I think it's better to report an error
messagehere, just like creating publication for system table is unallowed. 

Modified.

> 3. fix for dumpPublicationSchema
> Should we add a declaration for it and add const decorations to the it's second parameter? Like other similar
functions.

Modified to include const, declaration is not required as the function
definition is before function call, so not making this change.

Thanks for your comments, the attached v11 patch fixes the issues.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Jul 13, 2021 at 2:22 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Jul 12, 2021 at 7:24 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> >
> > Thanks for reporting this issue. I felt this issue is the same as the issue which Hou San had reported. This issue is fixed in the v10 patch attached at [1].
> > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com
> >
>
> I did some testing and the issue that I reported does seem to be fixed
> by the v10 patch.
>
> I have some patch review comments for the v10 patch:
>
> (1)
>
> The following:
>
> + if (!OidIsValid(address.objectId))
> + {
> +    if (!missing_ok)
> +       ereport(ERROR,
> +          (errcode(ERRCODE_UNDEFINED_OBJECT),
> +          errmsg("publication schema \"%s\" in publication \"%s\"
> does not exist",
> +             schemaname, pubname)));
> +    return address;
> + }
> +
> + return address;
>
> could just be simplified to:
>
> + if (!OidIsValid(address.objectId) && !missing_ok)
> + {
> +    ereport(ERROR,
> +       (errcode(ERRCODE_UNDEFINED_OBJECT),
> +       errmsg("publication schema \"%s\" in publication \"%s\" does not exist",
> +          schemaname, pubname)));
> + }
> +
> + return address;

Modified

> (2) src/backend/catalog/objectaddress.c
>
> I think there is a potential illegal memory access (psform->psnspcid)
> in the case of "!missing_ok", as the tuple is released from the cache
> on the previous line.
>
> + psform = (Form_pg_publication_schema) GETSTRUCT(tup);
> + pubname = get_publication_name(psform->pspubid, false);
> + nspname = get_namespace_name(psform->psnspcid);
> + if (!nspname)
> + {
> +    pfree(pubname);
> +    ReleaseSysCache(tup);
> +    if (!missing_ok)
> +       elog(ERROR, "cache lookup failed for schema %u",
> +          psform->psnspcid);
> +    break;
> + }
>
>
> I think this should be:
>
> + psform = (Form_pg_publication_schema) GETSTRUCT(tup);
> + pubname = get_publication_name(psform->pspubid, false);
> + nspname = get_namespace_name(psform->psnspcid);
> + if (!nspname)
> + {
> +    Oid psnspcid = psform->psnspcid;
> +
> +    pfree(pubname);
> +    ReleaseSysCache(tup);
> +    if (!missing_ok)
> +       elog(ERROR, "cache lookup failed for schema %u",
> +          psnspcid);
> +    break;
> + }
>
> There are two cases of this that need correction (see: case
> OCLASS_PUBLICATION_SCHEMA).

Modified

> (3) Incomplete function header comment
>
> + * makeSchemaSpec - Create a SchemaSpec with the given type
>
> Should be:
>
> + * makeSchemaSpec - Create a SchemaSpec with the given type and location

Modified

> (4) src/bin/psql/describe.c
>
> Shouldn't the following comment say "version 15"?
>
> + /* Prior to version 14 check was based on all tables */
> + if ((has_pubtype && pubtype == PUBTYPE_TABLE) ||
> + (!has_pubtype && !puballtables))

Modified

> (5) typedefs.list
>
> I think you also need to add "Form_pg_publication_schema" to typedefs.list.

Modified.

Thanks for the comments, these comments are fixed in the v11 patch posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm1oZzaEsZC1W8MRNGZ6LWOayC54_UzyRV%2BnCh8w0yW74g%40mail.gmail.com

Regards,
Vignesh

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
Wednesday, July 14, 2021 6:17 PM vignesh C <vignesh21@gmail.com> wrote:
> On Tue, Jul 13, 2021 at 12:06 PM tanghy.fnst@fujitsu.com
> <tanghy.fnst@fujitsu.com> wrote:
> >
> > On Monday, July 12, 2021 5:36 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > Thanks for reporting this issue, this issue is fixed in the v10
> > > patch attached at [1].
> > > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-
> > > sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com
> >
> > Thanks for fixing it.
> >
> > By applying your V10 patch, I saw three problems, please have a look.
> >
> > 1. An issue about pg_dump.
> > When public schema was published, the publication was created in the
> > output file, but public schema was not added to it. (Other schemas
> > could be added as expected.)
> >
> > I looked into it and found that selectDumpableNamespace function marks
> DUMP_COMPONENT_DEFINITION as needless when the schema is public,
> leading to schema public is ignored in getPublicationSchemas. So we'd better
> check whether schemas should be dumped in another way.
> >
> > I tried to fix it with the following change, please have a look.
> > (Maybe we also need to add some comments for it.)
> >
> > diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
> > index f6b4f12648..a327d2568b 100644
> > --- a/src/bin/pg_dump/pg_dump.c
> > +++ b/src/bin/pg_dump/pg_dump.c
> > @@ -4206,7 +4206,8 @@ getPublicationSchemas(Archive *fout,
> NamespaceInfo nspinfo[], int numSchemas)
> >                  * Ignore publication membership of schemas whose
> definitions are not
> >                  * to be dumped.
> >                  */
> > -               if (!(nsinfo->dobj.dump &
> DUMP_COMPONENT_DEFINITION))
> > +               if (!((nsinfo->dobj.dump &
> DUMP_COMPONENT_DEFINITION)
> > +                       || (strcmp(nsinfo->dobj.name, "public") == 0
> > + && nsinfo->dobj.dump != DUMP_COMPONENT_NONE)))
> >                         continue;
> >
> >                 pg_log_info("reading publication membership for schema
> > \"%s\"",
> 
> I felt it is intentionally done like that as the pubic schema is created by default,
> hence it is not required to dump else we will get errors while restoring.
> Thougths?

Thanks for the new patches and I also looked at this issue.

For user defined schema and publication:
--------------------------
create schema s1;
create publication pub2 for SCHEMA s1;
--------------------------

pg_dump will only generate the following SQLs:

------pg_dump result------
CREATE PUBLICATION pub2 WITH (publish = 'insert, update, delete, truncate');
ALTER PUBLICATION pub2 ADD SCHEMA s1;
--------------------------

But for the public schema:
--------------------------
create publication pub for SCHEMA public;
--------------------------

pg_dump will only generate the following SQL:

------pg_dump result------
CREATE PUBLICATION pub WITH (publish = 'insert, update, delete, truncate');
--------------------------

It didn't generate SQL like "ALTER PUBLICATION pub ADD SCHEMA public;" which
means the public schema won't be published after restoring. So, I think we'd
better let the pg_dump generate the ADD SCHEMA public SQL. Thoughts ?


Best regards,
Hou zhijie

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Wed, Jul 14, 2021 at 8:17 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for your comments, the attached v11 patch fixes the issues.
>

Thanks for your work on this.

I have some minor review comments on the documentation:

(1) wrong link (uses altersubscription instead of alterpublication)
doc/src/sgml/catalogs.sgml

BEFORE:
+       created as an empty publication type. When a table or schema is added to
+       the publication using <link linkend="sql-altersubscription">
+       <command>ALTER PUBLICATION</command></link> then the publication type

AFTER:
+       created as an empty publication type. When a table or schema is added to
+       the publication using <link linkend="sql-alterpublication">
+       <command>ALTER PUBLICATION</command></link> then the publication type


(2) Improve wording and suggest "or" instead of "and"
doc/src/sgml/catalogs.sgml

BEFORE:
+       If a publication is created without specifying any of
+       <literal>FOR ALL TABLES</literal>, <literal>FOR TABLE</literal> and
+       <literal>FOR SCHEMA</literal> option, then the publication will be

AFTER:
+       If a publication is created without specifying any of the
+       <literal>FOR ALL TABLES</literal>, <literal>FOR TABLE</literal> or
+       <literal>FOR SCHEMA</literal> options, then the publication will be


(3) space at start of literal
doc/src/sgml/catalogs.sgml

+   and <literal> FOR SCHEMA</literal>, so for such publications there will be a


(4) Should say "variants of this command change ..." ?

+   The fourth, fifth and sixth variants change which schemas are part of the



Also, there seems to be an issue with ALTER PUBLICATION ... SET SCHEMA ...
(PubType is getting set to 'e' instead of 's'

test_pub=# create publication pub1;
CREATE PUBLICATION
test_pub=# create table myschema.test(i int);
CREATE TABLE
test_pub=# alter publication pub1 set schema myschema;
ALTER PUBLICATION
test_pub=# \dRp pub1
                                   List of publications
 Name | Owner | All tables | Inserts | Updates | Deletes | Truncates |
Via root | PubType
------+-------+------------+---------+---------+---------+-----------+----------+---------
 pub1 | gregn | f          | t       | t       | t       | t         |
f        | e
(1 row)

test_pub=# alter publication pub1 add table test;
ALTER PUBLICATION
test_pub=# \dRp pub1
                                   List of publications
 Name | Owner | All tables | Inserts | Updates | Deletes | Truncates |
Via root | PubType
------+-------+------------+---------+---------+---------+-----------+----------+---------
 pub1 | gregn | f          | t       | t       | t       | t         |
f        | t
(1 row)


When I use "ADD SCHEMA" instead of "SET SCHEMA" on an empty
publication, it seems OK.

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Jul 14, 2021 at 6:25 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> Wednesday, July 14, 2021 6:17 PM vignesh C <vignesh21@gmail.com> wrote:
> > On Tue, Jul 13, 2021 at 12:06 PM tanghy.fnst@fujitsu.com
> > <tanghy.fnst@fujitsu.com> wrote:
> > >
> > > On Monday, July 12, 2021 5:36 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > Thanks for reporting this issue, this issue is fixed in the v10
> > > > patch attached at [1].
> > > > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-
> > > > sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com
> > >
> > > Thanks for fixing it.
> > >
> > > By applying your V10 patch, I saw three problems, please have a look.
> > >
> > > 1. An issue about pg_dump.
> > > When public schema was published, the publication was created in the
> > > output file, but public schema was not added to it. (Other schemas
> > > could be added as expected.)
> > >
> > > I looked into it and found that selectDumpableNamespace function marks
> > DUMP_COMPONENT_DEFINITION as needless when the schema is public,
> > leading to schema public is ignored in getPublicationSchemas. So we'd better
> > check whether schemas should be dumped in another way.
> > >
> > > I tried to fix it with the following change, please have a look.
> > > (Maybe we also need to add some comments for it.)
> > >
> > > diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
> > > index f6b4f12648..a327d2568b 100644
> > > --- a/src/bin/pg_dump/pg_dump.c
> > > +++ b/src/bin/pg_dump/pg_dump.c
> > > @@ -4206,7 +4206,8 @@ getPublicationSchemas(Archive *fout,
> > NamespaceInfo nspinfo[], int numSchemas)
> > >                  * Ignore publication membership of schemas whose
> > definitions are not
> > >                  * to be dumped.
> > >                  */
> > > -               if (!(nsinfo->dobj.dump &
> > DUMP_COMPONENT_DEFINITION))
> > > +               if (!((nsinfo->dobj.dump &
> > DUMP_COMPONENT_DEFINITION)
> > > +                       || (strcmp(nsinfo->dobj.name, "public") == 0
> > > + && nsinfo->dobj.dump != DUMP_COMPONENT_NONE)))
> > >                         continue;
> > >
> > >                 pg_log_info("reading publication membership for schema
> > > \"%s\"",
> >
> > I felt it is intentionally done like that as the pubic schema is created by default,
> > hence it is not required to dump else we will get errors while restoring.
> > Thougths?
>
> Thanks for the new patches and I also looked at this issue.
>
> For user defined schema and publication:
> --------------------------
> create schema s1;
> create publication pub2 for SCHEMA s1;
> --------------------------
>
> pg_dump will only generate the following SQLs:
>
> ------pg_dump result------
> CREATE PUBLICATION pub2 WITH (publish = 'insert, update, delete, truncate');
> ALTER PUBLICATION pub2 ADD SCHEMA s1;
> --------------------------
>
> But for the public schema:
> --------------------------
> create publication pub for SCHEMA public;
> --------------------------
>
> pg_dump will only generate the following SQL:
>
> ------pg_dump result------
> CREATE PUBLICATION pub WITH (publish = 'insert, update, delete, truncate');
> --------------------------
>
> It didn't generate SQL like "ALTER PUBLICATION pub ADD SCHEMA public;" which
> means the public schema won't be published after restoring. So, I think we'd
> better let the pg_dump generate the ADD SCHEMA public SQL. Thoughts ?

Thanks for reporting this issue, this issue is fixed in the v12 patch attached.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Jul 16, 2021 at 9:25 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Wed, Jul 14, 2021 at 8:17 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for your comments, the attached v11 patch fixes the issues.
> >
>
> Thanks for your work on this.
>
> I have some minor review comments on the documentation:
>
> (1) wrong link (uses altersubscription instead of alterpublication)
> doc/src/sgml/catalogs.sgml
>
> BEFORE:
> +       created as an empty publication type. When a table or schema is added to
> +       the publication using <link linkend="sql-altersubscription">
> +       <command>ALTER PUBLICATION</command></link> then the publication type
>
> AFTER:
> +       created as an empty publication type. When a table or schema is added to
> +       the publication using <link linkend="sql-alterpublication">
> +       <command>ALTER PUBLICATION</command></link> then the publication type

Modified.

> (2) Improve wording and suggest "or" instead of "and"
> doc/src/sgml/catalogs.sgml
>
> BEFORE:
> +       If a publication is created without specifying any of
> +       <literal>FOR ALL TABLES</literal>, <literal>FOR TABLE</literal> and
> +       <literal>FOR SCHEMA</literal> option, then the publication will be
>
> AFTER:
> +       If a publication is created without specifying any of the
> +       <literal>FOR ALL TABLES</literal>, <literal>FOR TABLE</literal> or
> +       <literal>FOR SCHEMA</literal> options, then the publication will be

Modified.

> (3) space at start of literal
> doc/src/sgml/catalogs.sgml
>
> +   and <literal> FOR SCHEMA</literal>, so for such publications there will be a

Modified.

> (4) Should say "variants of this command change ..." ?
>
> +   The fourth, fifth and sixth variants change which schemas are part of the

Modified.

>
> Also, there seems to be an issue with ALTER PUBLICATION ... SET SCHEMA ...
> (PubType is getting set to 'e' instead of 's'
>
> test_pub=# create publication pub1;
> CREATE PUBLICATION
> test_pub=# create table myschema.test(i int);
> CREATE TABLE
> test_pub=# alter publication pub1 set schema myschema;
> ALTER PUBLICATION
> test_pub=# \dRp pub1
>                                    List of publications
>  Name | Owner | All tables | Inserts | Updates | Deletes | Truncates |
> Via root | PubType
> ------+-------+------------+---------+---------+---------+-----------+----------+---------
>  pub1 | gregn | f          | t       | t       | t       | t         |
> f        | e
> (1 row)
>
> test_pub=# alter publication pub1 add table test;
> ALTER PUBLICATION
> test_pub=# \dRp pub1
>                                    List of publications
>  Name | Owner | All tables | Inserts | Updates | Deletes | Truncates |
> Via root | PubType
> ------+-------+------------+---------+---------+---------+-----------+----------+---------
>  pub1 | gregn | f          | t       | t       | t       | t         |
> f        | t
> (1 row)
>
>
> When I use "ADD SCHEMA" instead of "SET SCHEMA" on an empty
> publication, it seems OK.

Modified.

Thanks for the comments, these issues are fixed as part of the v12 patch posted at [1].
[1]  - https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com

Regards,
Vignesh

RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:

On Friday, July 16, 2021 6:13 PM vignesh C <vignesh21@gmail.com> wrote:

> On Fri, Jul 16, 2021 at 9:25 AM Greg Nancarrow <mailto:gregn4422@gmail.com> wrote:

> >

> > Also, there seems to be an issue with ALTER PUBLICATION ... SET SCHEMA ...

> > (PubType is getting set to 'e' instead of 's'

> >

> > test_pub=# create publication pub1;

> > CREATE PUBLICATION

> > test_pub=# create table myschema.test(i int);

> > CREATE TABLE

> > test_pub=# alter publication pub1 set schema myschema;

> > ALTER PUBLICATION

> > test_pub=# \dRp pub1

> >                                    List of publications

> >  Name | Owner | All tables | Inserts | Updates | Deletes | Truncates |

> > Via root | PubType

> > ------+-------+------------+---------+---------+---------+-----------+----------+---------

> >  pub1 | gregn | f          | t       | t       | t       | t         |

> > f        | e

> > (1 row)

> >

> > test_pub=# alter publication pub1 add table test;

> > ALTER PUBLICATION

> > test_pub=# \dRp pub1

> >                                    List of publications

> >  Name | Owner | All tables | Inserts | Updates | Deletes | Truncates |

> > Via root | PubType

> > ------+-------+------------+---------+---------+---------+-----------+----------+---------

> >  pub1 | gregn | f          | t       | t       | t       | t         |

> > f        | t

> > (1 row)

> >

> >

> > When I use "ADD SCHEMA" instead of "SET SCHEMA" on an empty

> > publication, it seems OK.

>

> Modified.

>

 

Thanks for your patch. But there is a problem about "ALTER PUBLICATION SET TABLE ", which is similar to the issue Greg reported at [1].

 

For example:

postgres=# CREATE TABLE public.t1 (a int);

CREATE TABLE

postgres=# CREATE PUBLICATION pub1;

CREATE PUBLICATION

postgres=# ALTER PUBLICATION pub1 SET TABLE public.t1;

ALTER PUBLICATION

postgres=# \dRp

                                    List of publications

Name |  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root | PubType

------+----------+------------+---------+---------+---------+-----------+----------+---------

pub1 | postgres | f          | t       | t       | t       | t         | f        | e

(1 row)

 

I think PubType in this case should be 't' instead of 'e'. Please have a look.

 

[1] - https://www.postgresql.org/message-id/CAJcOf-ddXvY%3DOFC54CshdMa1bswzFjG9qokjC0aFeiS%3D6CNRzw%40mail.gmail.com

 

Regards,

Tang

 

 

RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Friday, July 16, 2021 6:10 PM vignesh C <vignesh21@gmail.com>
> On Wed, Jul 14, 2021 at 6:25 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > Wednesday, July 14, 2021 6:17 PM vignesh C <vignesh21@gmail.com> wrote:
> > > On Tue, Jul 13, 2021 at 12:06 PM tanghy.fnst@fujitsu.com
> > > <tanghy.fnst@fujitsu.com> wrote:
> > > >
> > > > On Monday, July 12, 2021 5:36 PM vignesh C <vignesh21@gmail.com>
> wrote:
> > > > >
> > > > > Thanks for reporting this issue, this issue is fixed in the v10
> > > > > patch attached at [1].
> > > > > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-
> > > > > sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com
> > > >
> > > > Thanks for fixing it.
> > > >
> > > > By applying your V10 patch, I saw three problems, please have a look.
> > > >
> > > > 1. An issue about pg_dump.
> > > > When public schema was published, the publication was created in the
> > > > output file, but public schema was not added to it. (Other schemas
> > > > could be added as expected.)
> > > >
> > > > I looked into it and found that selectDumpableNamespace function marks
> > > DUMP_COMPONENT_DEFINITION as needless when the schema is public,
> > > leading to schema public is ignored in getPublicationSchemas. So we'd better
> > > check whether schemas should be dumped in another way.
> > > >
> > > > I tried to fix it with the following change, please have a look.
> > > > (Maybe we also need to add some comments for it.)
> > > >
> > > > diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
> > > > index f6b4f12648..a327d2568b 100644
> > > > --- a/src/bin/pg_dump/pg_dump.c
> > > > +++ b/src/bin/pg_dump/pg_dump.c
> > > > @@ -4206,7 +4206,8 @@ getPublicationSchemas(Archive *fout,
> > > NamespaceInfo nspinfo[], int numSchemas)
> > > >                  * Ignore publication membership of schemas whose
> > > definitions are not
> > > >                  * to be dumped.
> > > >                  */
> > > > -               if (!(nsinfo->dobj.dump &
> > > DUMP_COMPONENT_DEFINITION))
> > > > +               if (!((nsinfo->dobj.dump &
> > > DUMP_COMPONENT_DEFINITION)
> > > > +                       || (strcmp(nsinfo->dobj.name, "public") == 0
> > > > + && nsinfo->dobj.dump != DUMP_COMPONENT_NONE)))
> > > >                         continue;
> > > >
> > > >                 pg_log_info("reading publication membership for schema
> > > > \"%s\"",
> > >
> > > I felt it is intentionally done like that as the pubic schema is created by default,
> > > hence it is not required to dump else we will get errors while restoring.
> > > Thougths?
> >
> > Thanks for the new patches and I also looked at this issue.
> >
> > For user defined schema and publication:
> > --------------------------
> > create schema s1;
> > create publication pub2 for SCHEMA s1;
> > --------------------------
> >
> > pg_dump will only generate the following SQLs:
> >
> > ------pg_dump result------
> > CREATE PUBLICATION pub2 WITH (publish = 'insert, update, delete, truncate');
> > ALTER PUBLICATION pub2 ADD SCHEMA s1;
> > --------------------------
> >
> > But for the public schema:
> > --------------------------
> > create publication pub for SCHEMA public;
> > --------------------------
> >
> > pg_dump will only generate the following SQL:
> >
> > ------pg_dump result------
> > CREATE PUBLICATION pub WITH (publish = 'insert, update, delete, truncate');
> > --------------------------
> >
> > It didn't generate SQL like "ALTER PUBLICATION pub ADD SCHEMA public;" which
> > means the public schema won't be published after restoring. So, I think we'd
> > better let the pg_dump generate the ADD SCHEMA public SQL. Thoughts ?
> 
> Thanks for reporting this issue, this issue is fixed in the v12 patch attached.
> 

I tested your v12 patch and found a problem in the following case.

Step 1:
postgres=# create schema s1;
CREATE SCHEMA
postgres=# create table s1.t1 (a int);
CREATE TABLE
postgres=# create publication pub_t for table s1.t1;
CREATE PUBLICATION
postgres=# create publication pub_s for schema s1;
CREATE PUBLICATION

Step 2:
pg_dump -N s1

I dumped and excluded schema s1, pg_dump generated the following SQL:
-------------------------------
ALTER PUBLICATION pub_s ADD SCHEMA s1;

I think it was not expected because SQL like "ALTER PUBLICATION pub_t ADD TABLE s1.t1" was not generated in my case.
Thoughts?

Regards
Tang

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Fri, Jul 16, 2021 at 8:13 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Modified.
>
> Thanks for the comments, these issues are fixed as part of the v12 patch posted at [1].
> [1]  -
https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com
>

There seems to be a problem with ALTER PUBLICATION ... SET TABLE ...
After that command, it still regards it as an empty (e) publication,
so I can then ALTER PUBLICATION ... ADD SCHEMA ...

e.g.

test_pub=# create schema myschema;
CREATE SCHEMA
test_pub=# CREATE TABLE myschema.test (key int, value text, data jsonb);
CREATE TABLE
test_pub=# create publication pub1;
CREATE PUBLICATION
test_pub=# \dRp+ pub1
                                 Publication pub1
 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via
root | Pubtype
-------+------------+---------+---------+---------+-----------+----------+---------
 gregn | f          | t       | t       | t       | t         | f        | e
(1 row)

test_pub=# alter publication pub1 set table myschema.test;
ALTER PUBLICATION
test_pub=# \dRp+ pub1
                                 Publication pub1
 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via
root | Pubtype
-------+------------+---------+---------+---------+-----------+----------+---------
 gregn | f          | t       | t       | t       | t         | f        | e
(1 row)

test_pub=# alter publication pub1 add schema myschema;
ALTER PUBLICATION
test_pub=# \dRp+ pub1
                                 Publication pub1
 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via
root | Pubtype
-------+------------+---------+---------+---------+-----------+----------+---------
 gregn | f          | t       | t       | t       | t         | f        | s
Schemas:
    "myschema"


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
Rahila Syed
Date:


On Mon, Jul 19, 2021 at 2:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Fri, Jul 16, 2021 at 8:13 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Modified.
>
> Thanks for the comments, these issues are fixed as part of the v12 patch posted at [1].
> [1]  - https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com
>

There seems to be a problem with ALTER PUBLICATION ... SET TABLE ...
After that command, it still regards it as an empty (e) publication,
so I can then ALTER PUBLICATION ... ADD SCHEMA ...


One issue here is that the code to update publication type is missing 
in AlterPublicationTables for SET TABLE command.

More broadly, I am not clear about the behaviour of the patch when a
publication is created to publish only certain tables, and is later altered to publish 
a whole schema. I think such behaviour is legitimate. However,
AFAIU as per current code we can't update the publication type 
from PUBTYPE_TABLE to PUBTYPE_SCHEMA. 

I have some review comments as follows:
1. 
In ConvertSchemaSpecListToOidList(List *schemas) function: 
 +     search_path = fetch_search_path(false);
 +                               nspname = get_namespace_name(linitial_oid(search_path));
 +                               if (nspname == NULL)    /* recently-deleted namespace? */
 +                                       ereport(ERROR,
 +                                                       errcode(ERRCODE_UNDEFINED_SCHEMA),
 +                                                       errmsg("no schema has been selected"));
 +
 +                               schemoid = get_namespace_oid(nspname, false);
 +                               break;

The call get_namespace_oid() is perhaps not needed as fetch_search_path already fetches oids and simply
doing Schema oid = liinital_oid(search_path)); should be enough.

2. In the same function should there be an if else condition block instead of a switch case as
there are only two cases.


Thank you,
Rahila Syed

Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Jul 19, 2021 at 8:43 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Friday, July 16, 2021 6:13 PM vignesh C <vignesh21@gmail.com> wrote:
>
> >
>
> > On Fri, Jul 16, 2021 at 9:25 AM Greg Nancarrow <mailto:gregn4422@gmail.com> wrote:
>
> > >
>
> > > Also, there seems to be an issue with ALTER PUBLICATION ... SET SCHEMA ...
>
> > > (PubType is getting set to 'e' instead of 's'
>
> > >
>
> > > test_pub=# create publication pub1;
>
> > > CREATE PUBLICATION
>
> > > test_pub=# create table myschema.test(i int);
>
> > > CREATE TABLE
>
> > > test_pub=# alter publication pub1 set schema myschema;
>
> > > ALTER PUBLICATION
>
> > > test_pub=# \dRp pub1
>
> > >                                    List of publications
>
> > >  Name | Owner | All tables | Inserts | Updates | Deletes | Truncates |
>
> > > Via root | PubType
>
> > > ------+-------+------------+---------+---------+---------+-----------+----------+---------
>
> > >  pub1 | gregn | f          | t       | t       | t       | t         |
>
> > > f        | e
>
> > > (1 row)
>
> > >
>
> > > test_pub=# alter publication pub1 add table test;
>
> > > ALTER PUBLICATION
>
> > > test_pub=# \dRp pub1
>
> > >                                    List of publications
>
> > >  Name | Owner | All tables | Inserts | Updates | Deletes | Truncates |
>
> > > Via root | PubType
>
> > > ------+-------+------------+---------+---------+---------+-----------+----------+---------
>
> > >  pub1 | gregn | f          | t       | t       | t       | t         |
>
> > > f        | t
>
> > > (1 row)
>
> > >
>
> > >
>
> > > When I use "ADD SCHEMA" instead of "SET SCHEMA" on an empty
>
> > > publication, it seems OK.
>
> >
>
> > Modified.
>
> >
>
>
>
> Thanks for your patch. But there is a problem about "ALTER PUBLICATION … SET TABLE …", which is similar to the issue
Gregreported at [1]. 
>
>
>
> For example:
>
> postgres=# CREATE TABLE public.t1 (a int);
>
> CREATE TABLE
>
> postgres=# CREATE PUBLICATION pub1;
>
> CREATE PUBLICATION
>
> postgres=# ALTER PUBLICATION pub1 SET TABLE public.t1;
>
> ALTER PUBLICATION
>
> postgres=# \dRp
>
>                                     List of publications
>
> Name |  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root | PubType
>
> ------+----------+------------+---------+---------+---------+-----------+----------+---------
>
> pub1 | postgres | f          | t       | t       | t       | t         | f        | e
>
> (1 row)
>
>
>
> I think PubType in this case should be 't' instead of 'e'. Please have a look.

Thanks for reporting this issue, this issue is fixed in the attached v13 patch.
I have changed relation name pg_publication_schema to
pg_publication_sch so that the names are in similar lines with
pg_publication_rel relation and similar changes were done for variable
names too.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Jul 19, 2021 at 9:32 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:
>
> On Friday, July 16, 2021 6:10 PM vignesh C <vignesh21@gmail.com>
> > On Wed, Jul 14, 2021 at 6:25 PM houzj.fnst@fujitsu.com
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > > Wednesday, July 14, 2021 6:17 PM vignesh C <vignesh21@gmail.com> wrote:
> > > > On Tue, Jul 13, 2021 at 12:06 PM tanghy.fnst@fujitsu.com
> > > > <tanghy.fnst@fujitsu.com> wrote:
> > > > >
> > > > > On Monday, July 12, 2021 5:36 PM vignesh C <vignesh21@gmail.com>
> > wrote:
> > > > > >
> > > > > > Thanks for reporting this issue, this issue is fixed in the v10
> > > > > > patch attached at [1].
> > > > > > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-
> > > > > > sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com
> > > > >
> > > > > Thanks for fixing it.
> > > > >
> > > > > By applying your V10 patch, I saw three problems, please have a look.
> > > > >
> > > > > 1. An issue about pg_dump.
> > > > > When public schema was published, the publication was created in the
> > > > > output file, but public schema was not added to it. (Other schemas
> > > > > could be added as expected.)
> > > > >
> > > > > I looked into it and found that selectDumpableNamespace function marks
> > > > DUMP_COMPONENT_DEFINITION as needless when the schema is public,
> > > > leading to schema public is ignored in getPublicationSchemas. So we'd better
> > > > check whether schemas should be dumped in another way.
> > > > >
> > > > > I tried to fix it with the following change, please have a look.
> > > > > (Maybe we also need to add some comments for it.)
> > > > >
> > > > > diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
> > > > > index f6b4f12648..a327d2568b 100644
> > > > > --- a/src/bin/pg_dump/pg_dump.c
> > > > > +++ b/src/bin/pg_dump/pg_dump.c
> > > > > @@ -4206,7 +4206,8 @@ getPublicationSchemas(Archive *fout,
> > > > NamespaceInfo nspinfo[], int numSchemas)
> > > > >                  * Ignore publication membership of schemas whose
> > > > definitions are not
> > > > >                  * to be dumped.
> > > > >                  */
> > > > > -               if (!(nsinfo->dobj.dump &
> > > > DUMP_COMPONENT_DEFINITION))
> > > > > +               if (!((nsinfo->dobj.dump &
> > > > DUMP_COMPONENT_DEFINITION)
> > > > > +                       || (strcmp(nsinfo->dobj.name, "public") == 0
> > > > > + && nsinfo->dobj.dump != DUMP_COMPONENT_NONE)))
> > > > >                         continue;
> > > > >
> > > > >                 pg_log_info("reading publication membership for schema
> > > > > \"%s\"",
> > > >
> > > > I felt it is intentionally done like that as the pubic schema is created by default,
> > > > hence it is not required to dump else we will get errors while restoring.
> > > > Thougths?
> > >
> > > Thanks for the new patches and I also looked at this issue.
> > >
> > > For user defined schema and publication:
> > > --------------------------
> > > create schema s1;
> > > create publication pub2 for SCHEMA s1;
> > > --------------------------
> > >
> > > pg_dump will only generate the following SQLs:
> > >
> > > ------pg_dump result------
> > > CREATE PUBLICATION pub2 WITH (publish = 'insert, update, delete, truncate');
> > > ALTER PUBLICATION pub2 ADD SCHEMA s1;
> > > --------------------------
> > >
> > > But for the public schema:
> > > --------------------------
> > > create publication pub for SCHEMA public;
> > > --------------------------
> > >
> > > pg_dump will only generate the following SQL:
> > >
> > > ------pg_dump result------
> > > CREATE PUBLICATION pub WITH (publish = 'insert, update, delete, truncate');
> > > --------------------------
> > >
> > > It didn't generate SQL like "ALTER PUBLICATION pub ADD SCHEMA public;" which
> > > means the public schema won't be published after restoring. So, I think we'd
> > > better let the pg_dump generate the ADD SCHEMA public SQL. Thoughts ?
> >
> > Thanks for reporting this issue, this issue is fixed in the v12 patch attached.
> >
>
> I tested your v12 patch and found a problem in the following case.
>
> Step 1:
> postgres=# create schema s1;
> CREATE SCHEMA
> postgres=# create table s1.t1 (a int);
> CREATE TABLE
> postgres=# create publication pub_t for table s1.t1;
> CREATE PUBLICATION
> postgres=# create publication pub_s for schema s1;
> CREATE PUBLICATION
>
> Step 2:
> pg_dump -N s1
>
> I dumped and excluded schema s1, pg_dump generated the following SQL:
> -------------------------------
> ALTER PUBLICATION pub_s ADD SCHEMA s1;
>
> I think it was not expected because SQL like "ALTER PUBLICATION pub_t ADD TABLE s1.t1" was not generated in my case. Thoughts?

Thanks for reporting this issue, this issue is fixed in the v13 patch posted at [1]
[1] - https://www.postgresql.org/message-id/CALDaNm0%3DMaXyAok5iq_-DeWUd81vpdF47-MZbbrsd%2BzB2P6WwA%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Jul 19, 2021 at 2:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Jul 16, 2021 at 8:13 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Modified.
> >
> > Thanks for the comments, these issues are fixed as part of the v12 patch posted at [1].
> > [1]  -
https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com
> >
>
> There seems to be a problem with ALTER PUBLICATION ... SET TABLE ...
> After that command, it still regards it as an empty (e) publication,
> so I can then ALTER PUBLICATION ... ADD SCHEMA ...
>
> e.g.
>
> test_pub=# create schema myschema;
> CREATE SCHEMA
> test_pub=# CREATE TABLE myschema.test (key int, value text, data jsonb);
> CREATE TABLE
> test_pub=# create publication pub1;
> CREATE PUBLICATION
> test_pub=# \dRp+ pub1
>                                  Publication pub1
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via
> root | Pubtype
> -------+------------+---------+---------+---------+-----------+----------+---------
>  gregn | f          | t       | t       | t       | t         | f        | e
> (1 row)
>
> test_pub=# alter publication pub1 set table myschema.test;
> ALTER PUBLICATION
> test_pub=# \dRp+ pub1
>                                  Publication pub1
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via
> root | Pubtype
> -------+------------+---------+---------+---------+-----------+----------+---------
>  gregn | f          | t       | t       | t       | t         | f        | e
> (1 row)
>
> test_pub=# alter publication pub1 add schema myschema;
> ALTER PUBLICATION
> test_pub=# \dRp+ pub1
>                                  Publication pub1
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via
> root | Pubtype
> -------+------------+---------+---------+---------+-----------+----------+---------
>  gregn | f          | t       | t       | t       | t         | f        | s
> Schemas:
>     "myschema"

Thanks for reporting this issue, this issue is fixed in the v13 patch
posted at [1]
[1] - https://www.postgresql.org/message-id/CALDaNm0%3DMaXyAok5iq_-DeWUd81vpdF47-MZbbrsd%2BzB2P6WwA%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Jul 21, 2021 at 3:14 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
>
>
>
> On Mon, Jul 19, 2021 at 2:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>>
>> On Fri, Jul 16, 2021 at 8:13 PM vignesh C <vignesh21@gmail.com> wrote:
>> >
>> > Modified.
>> >
>> > Thanks for the comments, these issues are fixed as part of the v12 patch posted at [1].
>> > [1]  - https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com
>> >
>>
>> There seems to be a problem with ALTER PUBLICATION ... SET TABLE ...
>> After that command, it still regards it as an empty (e) publication,
>> so I can then ALTER PUBLICATION ... ADD SCHEMA ...
>>
>
> One issue here is that the code to update publication type is missing
> in AlterPublicationTables for SET TABLE command.

Modified.

> More broadly, I am not clear about the behaviour of the patch when a
> publication is created to publish only certain tables, and is later altered to publish
> a whole schema. I think such behaviour is legitimate. However,
> AFAIU as per current code we can't update the publication type
> from PUBTYPE_TABLE to PUBTYPE_SCHEMA.

I initially thought this might not be required for users, I have not made any change for this, I will try to get a few more people's opinion on this and then fix it if required.

> I have some review comments as follows:
> 1.
> In ConvertSchemaSpecListToOidList(List *schemas) function:
>  +     search_path = fetch_search_path(false);
>  +                               nspname = get_namespace_name(linitial_oid(search_path));
>  +                               if (nspname == NULL)    /* recently-deleted namespace? */
>  +                                       ereport(ERROR,
>  +                                                       errcode(ERRCODE_UNDEFINED_SCHEMA),
>  +                                                       errmsg("no schema has been selected"));
>  +
>  +                               schemoid = get_namespace_oid(nspname, false);
>  +                               break;
>
> The call get_namespace_oid() is perhaps not needed as fetch_search_path already fetches oids and simply
> doing Schema oid = liinital_oid(search_path)); should be enough.

Modified

> 2. In the same function should there be an if else condition block instead of a switch case as
> there are only two cases.

Modified.

Thanks for the comments, these comments are fixed in the v13 patch posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0%3DMaXyAok5iq_-DeWUd81vpdF47-MZbbrsd%2BzB2P6WwA%40mail.gmail.com

Regards,
Vignesh

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:

 

 

From: vignesh C <vignesh21@gmail.com>
Sent: Thursday, July 22, 2021 1:38 AM
To: Rahila Syed <rahilasyed90@gmail.com>
Cc: Greg Nancarrow <gregn4422@gmail.com>; Tang, Haiying/
海英 <tanghy.fnst@fujitsu.com>; Ajin Cherian <itsajin@gmail.com>; PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>; Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Subject: Re: Added schema level support for publication.

 

On Wed, Jul 21, 2021 at 3:14 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
>
>
>
> On Mon, Jul 19, 2021 at 2:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>>
>> On Fri, Jul 16, 2021 at 8:13 PM vignesh C <vignesh21@gmail.com> wrote:
>> >
>> > Modified.
>> >
>> > Thanks for the comments, these issues are fixed as part of the v12 patch posted at [1].
>> > [1]  - https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com
>> >
>>
>> There seems to be a problem with ALTER PUBLICATION ... SET TABLE ...
>> After that command, it still regards it as an empty (e) publication,
>> so I can then ALTER PUBLICATION ... ADD SCHEMA ...
>>
>
> One issue here is that the code to update publication type is missing
> in AlterPublicationTables for SET TABLE command.

Modified.

> More broadly, I am not clear about the behaviour of the patch when a
> publication is created to publish only certain tables, and is later altered to publish
> a whole schema. I think such behaviour is legitimate. However,
> AFAIU as per current code we can't update the publication type
> from PUBTYPE_TABLE to PUBTYPE_SCHEMA.

I initially thought this might not be required for users, I have not made any change for this, I will try to get a few more people's opinion on this and then fix it if required.

> I have some review comments as follows:
> 1.
> In ConvertSchemaSpecListToOidList(List *schemas) function:
>  +     search_path = fetch_search_path(false);
>  +                               nspname = get_namespace_name(linitial_oid(search_path));
>  +                               if (nspname == NULL)    /* recently-deleted namespace? */
>  +                                       ereport(ERROR,
>  +                                                       errcode(ERRCODE_UNDEFINED_SCHEMA),
>  +                                                       errmsg("no schema has been selected"));
>  +
>  +                               schemoid = get_namespace_oid(nspname, false);
>  +                               break;
>
> The call get_namespace_oid() is perhaps not needed as fetch_search_path already fetches oids and simply
> doing Schema oid = liinital_oid(search_path)); should be enough.

Modified

> 2. In the same function should there be an if else condition block instead of a switch case as
> there are only two cases.

Modified.

Thanks for the comments, these comments are fixed in the v13 patch posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0%3DMaXyAok5iq_-DeWUd81vpdF47-MZbbrsd%2BzB2P6WwA%40mail.gmail.com

Regards,
Vignesh

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On, July 22, 2021 1:38 AM vignesh C <vignesh21@gmail.com> wrote:
>On Wed, Jul 21, 2021 at 3:14 PM Rahila Syed <mailto:rahilasyed90@gmail.com> wrote:
>> More broadly, I am not clear about the behaviour of the patch when a
>> publication is created to publish only certain tables, and is later altered to publish
>> a whole schema. I think such behaviour is legitimate. However,
>> AFAIU as per current code we can't update the publication type
>> from PUBTYPE_TABLE to PUBTYPE_SCHEMA.

> I initially thought this might not be required for users, I have not made any
> change for this, I will try to get a few more people's opinion on this and then fix it if required.

Currently, It's not allowed to ALTER a FOR TABLE PUBLICATION to a FOR ALL
TABLES PUBLICATION. So, I am not sure it's legitimate to ALTER a FOR TABLE
PUBLICATION to a FOR SCHEMA PUBLICATION. Personally, It sounds more like a
separate feature which can be discussed in a separate thread.

Best regards,
houzj

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On July 22, 2021 1:30 AM vignesh C <vignesh21@gmail.com> wrote
> > I think PubType in this case should be 't' instead of 'e'. Please have a look.
> 
> Thanks for reporting this issue, this issue is fixed in the attached v13 patch.
> I have changed relation name pg_publication_schema to pg_publication_sch
> so that the names are in similar lines with pg_publication_rel relation and similar
> changes were done for variable names too.

Hi,

Thanks for the new version patches.
I had a few comments.

1)
+
+    appendPQExpBuffer(query, "ALTER PUBLICATION %s ", fmtId(pubrinfo->pubname));
+    appendPQExpBuffer(query, "ADD SCHEMA %s;\n", fmtId(schemainfo->dobj.name));
+

It seems we can combine these two function call.
like appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD SCHEMA %s;\n",
                       fmtId(pubrinfo->pubname),
                       fmtId(schemainfo->dobj.name));


2)
+            footers[0] = pstrdup("Publications:");
+

This word seems need to be translated.
footers[0] = pstrdup(_("Publications:"));

3)
I think it might be better to add a testcase to cover the issue
reported before [1].

[1] https://www.postgresql.org/message-id/CAJcOf-dsKOYKmdrU5nwWeFoHvhiACbmw_KU%3DJQMEeDp6WwijqA%40mail.gmail.com

4)
Personally, the new name pg_publication_sch is not very easy to understand.
(Maybe it's because I am not a native english speaker. If others feel ok,
please ignore this comment)

Best regards,
Houzj

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Thu, Jul 22, 2021 at 1:42 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> Personally, the new name pg_publication_sch is not very easy to understand.
> (Maybe it's because I am not a native english speaker. If others feel ok,
> please ignore this comment)
>

I was actually thinking the same thing.
I prefer the full SCHEMA/schema, even for all the internal
variables/definitions which have been changed since the last patch
version.
I think Vignesh was trying to be consistent with pg_publication_rel
and pg_subscription_rel, but maybe "rel" is better understood to be an
abbreviation for "relation" than "sch" for "schema"?
Thoughts from others?

Regards,
Greg Nancarrow
Fujitsu Australia



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Thursday, July 22, 2021 1:30 AM vignesh C <vignesh21@gmail.com> wrote:
> 
> Thanks for reporting this issue, this issue is fixed in the attached v13 patch.
> I have changed relation name pg_publication_schema to
> pg_publication_sch so that the names are in similar lines with
> pg_publication_rel relation and similar changes were done for variable
> names too.

Thanks for your fixing. The issue is fixed as you said.

After applying your V13 patch. I noticed that if I specify duplicate schema names when using "ALTER PUBLICATION ... SET
SCHEMA...", I would get the following error message:
 

postgres=# ALTER PUBLICATION pub1 SET SCHEMA s1,s1;
ERROR:  duplicate key value violates unique constraint "pg_publication_sch_psnspcid_pspubid_index"
DETAIL:  Key (psnspcid, pspubid)=(16406, 16405) already exists.

I think the error message is pretty hard to understand. Maybe we can do sth to improve this scenario.

Here is two proposal:
1. Don't report error message, just add some code to make the above command to be executed successfully, 
   just like "ALTER PUBLICATION ... SET TABLE ..." as follolws:

postgres=# ALTER PUBLICATION pub2 SET TABLE t1,t1;
ALTER PUBLICATION
postgres=# \dRp+ pub2
                                   Publication pub2
  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root | Pubtype
----------+------------+---------+---------+---------+-----------+----------+---------
 postgres | f          | t       | t       | t       | t         | f        | t
Tables:
    "public.t1"

2. Report a easily understandable message like: Schema s1 specified more than once

Thoughts?

Regards
Tang

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Fri, Jul 23, 2021 at 10:56 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
>
> After applying your V13 patch. I noticed that if I specify duplicate schema names when using "ALTER PUBLICATION ...
SETSCHEMA ...", I would get the following error message:
 
>
> postgres=# ALTER PUBLICATION pub1 SET SCHEMA s1,s1;
> ERROR:  duplicate key value violates unique constraint "pg_publication_sch_psnspcid_pspubid_index"
> DETAIL:  Key (psnspcid, pspubid)=(16406, 16405) already exists.
>

That definitely seems to be a bug, since "ALTER PUBLICATION ... SET
TABLE ..." ignores duplicates and there is no ERROR.
"CREATE PUBLICATION ... SET SCHEMA s1, s1;" and "ALTER PUBLICATION ...
ADD SCHEMA s1, s1;"   also give the same kind of error.


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Jul 22, 2021 at 9:12 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On July 22, 2021 1:30 AM vignesh C <vignesh21@gmail.com> wrote
> > > I think PubType in this case should be 't' instead of 'e'. Please have a look.
> >
> > Thanks for reporting this issue, this issue is fixed in the attached v13 patch.
> > I have changed relation name pg_publication_schema to pg_publication_sch
> > so that the names are in similar lines with pg_publication_rel relation and similar
> > changes were done for variable names too.
>
> Hi,
>
> Thanks for the new version patches.
> I had a few comments.
>
> 1)
> +
> +       appendPQExpBuffer(query, "ALTER PUBLICATION %s ", fmtId(pubrinfo->pubname));
> +       appendPQExpBuffer(query, "ADD SCHEMA %s;\n", fmtId(schemainfo->dobj.name));
> +
>
> It seems we can combine these two function call.
> like appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD SCHEMA %s;\n",
>                                            fmtId(pubrinfo->pubname),
>                                            fmtId(schemainfo->dobj.name));
>

Modified.

> 2)
> +                       footers[0] = pstrdup("Publications:");
> +
>
> This word seems need to be translated.
> footers[0] = pstrdup(_("Publications:"));

Modified

> 3)
> I think it might be better to add a testcase to cover the issue
> reported before [1].
>
> [1] https://www.postgresql.org/message-id/CAJcOf-dsKOYKmdrU5nwWeFoHvhiACbmw_KU%3DJQMEeDp6WwijqA%40mail.gmail.com

Added

> 4)
> Personally, the new name pg_publication_sch is not very easy to understand.
> (Maybe it's because I am not a native english speaker. If others feel ok,
> please ignore this comment)

I have changed it to pg_publication_schema as earlier, as Greg also
felt the other name was better.

Thanks for the comments, the attached v14 patch has the fixes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Jul 22, 2021 at 9:38 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Thu, Jul 22, 2021 at 1:42 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > Personally, the new name pg_publication_sch is not very easy to understand.
> > (Maybe it's because I am not a native english speaker. If others feel ok,
> > please ignore this comment)
> >
>
> I was actually thinking the same thing.
> I prefer the full SCHEMA/schema, even for all the internal
> variables/definitions which have been changed since the last patch
> version.
> I think Vignesh was trying to be consistent with pg_publication_rel
> and pg_subscription_rel, but maybe "rel" is better understood to be an
> abbreviation for "relation" than "sch" for "schema"?
> Thoughts from others?

I have changed it to pg_pubication_schema as earlier as both you and Houzj San felt pg_publication_schema was better. This is fixed in the v14 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Jul 23, 2021 at 6:26 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Thursday, July 22, 2021 1:30 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for reporting this issue, this issue is fixed in the attached v13 patch.
> > I have changed relation name pg_publication_schema to
> > pg_publication_sch so that the names are in similar lines with
> > pg_publication_rel relation and similar changes were done for variable
> > names too.
>
> Thanks for your fixing. The issue is fixed as you said.
>
> After applying your V13 patch. I noticed that if I specify duplicate schema names when using "ALTER PUBLICATION ...
SETSCHEMA ...", I would get the following error message:
 
>
> postgres=# ALTER PUBLICATION pub1 SET SCHEMA s1,s1;
> ERROR:  duplicate key value violates unique constraint "pg_publication_sch_psnspcid_pspubid_index"
> DETAIL:  Key (psnspcid, pspubid)=(16406, 16405) already exists.
>
> I think the error message is pretty hard to understand. Maybe we can do sth to improve this scenario.
>
> Here is two proposal:
> 1. Don't report error message, just add some code to make the above command to be executed successfully,
>    just like "ALTER PUBLICATION ... SET TABLE ..." as follolws:
>
> postgres=# ALTER PUBLICATION pub2 SET TABLE t1,t1;
> ALTER PUBLICATION
> postgres=# \dRp+ pub2
>                                    Publication pub2
>   Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root | Pubtype
> ----------+------------+---------+---------+---------+-----------+----------+---------
>  postgres | f          | t       | t       | t       | t         | f        | t
> Tables:
>     "public.t1"
>
> 2. Report a easily understandable message like: Schema s1 specified more than once
>

I have changed it to not report any error, this issue is fixed in the
v14 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.gmail.com

Regards,
Vignesh



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Friday, July 23, 2021 8:18 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> I have changed it to not report any error, this issue is fixed in the
> v14 patch attached at [1].
> [1] - https://www.postgresql.org/message-
> id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.g
> mail.com
> 

Thanks for your new patch. But there's a conflict when apply patch v14 on the latest HEAD (it seems caused by commit
#678f5448c2d869),please rebase it.
 

Besides, I tested your patch on old HEAD and confirmed that the issue I reported was fixed.

Regards
Tang

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Fri, Jul 23, 2021 at 10:16 PM vignesh C <vignesh21@gmail.com> wrote:
>
> I have changed it to pg_pubication_schema as earlier as both you and Houzj San felt pg_publication_schema was better.
Thisis fixed in the v14 patch attached at [1].
 
> [1] - https://www.postgresql.org/message-id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.gmail.com
>

Thanks, I think it looks better.

On another issue, there seems to be problems with pg_dump support for
FOR SCHEMA publications.

For example:

   CREATE PUBLICATION p FOR SCHEMA myschema;

pg_dump is dumping:

   CREATE PUBLICATION p WITH (publish = 'insert, update, delete, truncate');
   ...
   ALTER PUBLICATION p ADD SCHEMA p;

Obviously that last line should instead be "ALTER PUBLICATION p ADD
SCHEMA myschema;"

I think the bug is caused by the following:

+ appendPQExpBuffer(query,
+   "ALTER PUBLICATION %s ADD SCHEMA %s;\n",
+   fmtId(pubsinfo->pubname), fmtId(schemainfo->dobj.name));

The comment for fmtId() says:

    *  Note that the returned string must be used before calling fmtId again,
    *  since we re-use the same return buffer each time.

So I think there was a reason why the ALTER PUBLICATION and ADD SCHEMA
were previously formatted separately (and so should NOT have been
combined).
It should be restored to how it was in the previous patch version.


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Jul 26, 2021 at 8:17 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Jul 23, 2021 at 10:16 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have changed it to pg_pubication_schema as earlier as both you and Houzj San felt pg_publication_schema was
better.This is fixed in the v14 patch attached at [1].
 
> > [1] - https://www.postgresql.org/message-id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.gmail.com
> >
>
> Thanks, I think it looks better.
>
> On another issue, there seems to be problems with pg_dump support for
> FOR SCHEMA publications.
>
> For example:
>
>    CREATE PUBLICATION p FOR SCHEMA myschema;
>
> pg_dump is dumping:
>
>    CREATE PUBLICATION p WITH (publish = 'insert, update, delete, truncate');
>    ...
>    ALTER PUBLICATION p ADD SCHEMA p;
>
> Obviously that last line should instead be "ALTER PUBLICATION p ADD
> SCHEMA myschema;"
>
> I think the bug is caused by the following:
>
> + appendPQExpBuffer(query,
> +   "ALTER PUBLICATION %s ADD SCHEMA %s;\n",
> +   fmtId(pubsinfo->pubname), fmtId(schemainfo->dobj.name));
>
> The comment for fmtId() says:
>
>     *  Note that the returned string must be used before calling fmtId again,
>     *  since we re-use the same return buffer each time.
>
> So I think there was a reason why the ALTER PUBLICATION and ADD SCHEMA
> were previously formatted separately (and so should NOT have been
> combined).
> It should be restored to how it was in the previous patch version.

Thanks for the comment, this is modified in the v15 patch attached.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Jul 26, 2021 at 7:41 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:
>
> On Friday, July 23, 2021 8:18 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have changed it to not report any error, this issue is fixed in the
> > v14 patch attached at [1].
> > [1] - https://www.postgresql.org/message-
> > id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.g
> > mail.com
> >
>
> Thanks for your new patch. But there's a conflict when apply patch v14 on the latest HEAD (it seems caused by commit #678f5448c2d869), please rebase it.
>

Thanks for reporting it, it is rebased at the v15 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm27LRWF9ney%3DcVeD-0jc2%2BJ5Y0wNQhighZB%3DAat4VbNBA%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Mon, Jul 26, 2021 at 3:21 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for the comment, this is modified in the v15 patch attached.
>

I have several minor review comments.

(1) src/backend/catalog/objectaddress.c
Should start comment sentences with an uppercase letter, for consistency.

+ /* fetch publication name and schema oid from input list */

I also notice that some 1-sentence comments end with "." (full-stop)
and others don't. It seems to alternate all over the place, and so is
quite noticeable.
Unfortunately, it already seems to be like this in much of the code
that this patch patches.
Ideally (at least my personal preference is) 1-sentence comments
should not end with a ".".

(2) src/backend/catalog/pg_publication.c
errdetail message

I think the following should say "Temporary schemas ..." (since the
existing error message for tables says "System tables cannot be added
to publications.").

+   errdetail("Temporary schema cannot be added to publications.")));


(3) src/backend/commands/publicationcmds.c
PublicationAddTables

I think that the Assert below is not correct (i.e. not restrictive enough).
Although the condition passes, it is allowing, for example,
stmt->for_all_tables==true if stmt->schemas==NIL, and that doesn't
seem to be correct.
I suggest the following change:

BEFORE:
+ Assert(!stmt || !stmt->for_all_tables || !stmt->schemas);
AFTER:
+ Assert(!stmt || (!stmt->for_all_tables && !stmt->schemas));


(4) src/backend/commands/publicationcmds.c
PublicationAddSchemas

Similarly, I think that the Assert below is not restrictive enough,
and think it should be changed:

BEFORE:
+ Assert(!stmt || !stmt->for_all_tables || !stmt->tables);
AFTER:
+ Assert(!stmt || (!stmt->for_all_tables && !stmt->tables));


(5) src/bin/pg_dump/common.c

Spelling mistake.

BEFORE:
+ pg_log_info("reading publciation schemas");
AFTER:
+ pg_log_info("reading publication schemas");


Regards,
Greg Nancarrow
Fujitsu Australia



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:

On Monday, July 26, 2021 1:25 PM vignesh C <vignesh21@gmail.com> wrote:

> On Mon, Jul 26, 2021 at 7:41 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:

> >

> > On Friday, July 23, 2021 8:18 PM vignesh C <vignesh21@gmail.com> wrote:

> > >

> > > I have changed it to not report any error, this issue is fixed in the

> > > v14 patch attached at [1].

> > > [1] - https://www.postgresql.org/message-

> > > id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.g

> > > mail.com

> > >

> >

> > Thanks for your new patch. But there's a conflict when apply patch v14 on the latest HEAD (it seems caused by commit #678f5448c2d869), please rebase it.

> >

>

> Thanks for reporting it, it is rebased at the v15 patch attached at [1].

> [1] - https://www.postgresql.org/message-id/CALDaNm27LRWF9ney%3DcVeD-0jc2%2BJ5Y0wNQhighZB%3DAat4VbNBA%40mail.gmail.com

 

Thanks for your new patch. I applied your patch and it succeeded.

Here are some comments on the tests in your patch. The attached file included changes about the comments, please have a look.

 

1.  src/test/regress/sql/publication.sql

There are some existing tests to verify that we can't add table to for all tables publication', should we add some tests about adding schema to for all tables publication / for table publication?

I added some cases in the attached file.

 

Besides, the following existing comment seems not suitable. It uses 'SET' but the comment says 'add'. And since we can add table or schema, I think we should point out what we add is table, thoughts?

 

-- fail - can't add to for all tables publication

ALTER PUBLICATION testpub_foralltables SET TABLE pub_test.testpub_nopk;

 

 

2. src/test/subscription/t/001_rep_changes.pl

2.1

+# Insert some data into few tables and verify that inserted data is replicated.

+$node_publisher->safe_psql('postgres', "INSERT INTO  sch1.tab1 VALUES(generate_series(11,20))");

+$node_publisher->safe_psql('postgres', "INSERT INTO  sch2.tab1 VALUES(generate_series(11,20))");

+

+$node_publisher->wait_for_catchup('tap_sub_schema');

 

There are two spaces between "INTO" and "sch1.tab1".

 

2.2

Most of publication names are lowercase, and some of them are uppercase. I think it will be better if all of them are lowercase.

I modified them in the attached file.

 

2.3

+# Verify that the subscription relation list is updated after refresh.

+$result = $node_subscriber->safe_psql('postgres',

+        "SELECT count(*) FROM pg_subscription_rel WHERE srsubid IN (SELECT oid FROM pg_subscription WHERE subname = 'tap_sub_schema')");

+is($result, qq(5),

+        'check subscription relation status was dropped on subscriber');

 

Should it be 'check subscription relation status is not yet dropped on subscriber' here?

 

2.4

+# Drop publications as we don't need them anymore

+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_schema");

 

There is only one publication, so the comment here should be 'Drop publication as we don't need it anymore'.

 

3.

There are some existing test cases about publication for table and publication for all tables in 002_pg_dump.pl, so I think we could add some test cases about publication for schema.

I tried to add some cases in the attached file.

 

Regards

Tang

 

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Jul 27, 2021 at 12:21 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote
>
> Thanks for your new patch. I applied your patch and it succeeded.
>
> Here are some comments on the tests in your patch. The attached file included changes about the comments, please have
alook. 
>
>
>
> 1.  src/test/regress/sql/publication.sql
>
> There are some existing tests to verify that we can't add table to ‘for all tables publication', should we add some
testsabout adding schema to ‘for all tables publication’ / ‘for table publication’? 
>
> I added some cases in the attached file.

I have taken the changes.

>
> Besides, the following existing comment seems not suitable. It uses 'SET' but the comment says 'add'. And since we
canadd table or schema, I think we should point out what we add is table, thoughts? 
>
> -- fail - can't add to for all tables publication
>
> ALTER PUBLICATION testpub_foralltables SET TABLE pub_test.testpub_nopk;
>

Since this is in base code, you might want to post a patch on a separate thread.

> 2. src/test/subscription/t/001_rep_changes.pl
>
> 2.1
>
> +# Insert some data into few tables and verify that inserted data is replicated.
>
> +$node_publisher->safe_psql('postgres', "INSERT INTO  sch1.tab1 VALUES(generate_series(11,20))");
>
> +$node_publisher->safe_psql('postgres', "INSERT INTO  sch2.tab1 VALUES(generate_series(11,20))");
>
> +
>
> +$node_publisher->wait_for_catchup('tap_sub_schema');
>
>
>
> There are two spaces between "INTO" and "sch1.tab1".

I have taken the changes.

> 2.2
>
> Most of publication names are lowercase, and some of them are uppercase. I think it will be better if all of them are
lowercase.
>
> I modified them in the attached file.

I have taken the changes.

> 2.3
>
> +# Verify that the subscription relation list is updated after refresh.
>
> +$result = $node_subscriber->safe_psql('postgres',
>
> +        "SELECT count(*) FROM pg_subscription_rel WHERE srsubid IN (SELECT oid FROM pg_subscription WHERE subname =
'tap_sub_schema')");
>
> +is($result, qq(5),
>
> +        'check subscription relation status was dropped on subscriber');
>
>
>
> Should it be 'check subscription relation status is not yet dropped on subscriber' here?

I have taken the changes.

> 2.4
>
> +# Drop publications as we don't need them anymore
>
> +$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_schema");
>
>
>
> There is only one publication, so the comment here should be 'Drop publication as we don't need it anymore'.

I have taken the changes.

> 3.
>
> There are some existing test cases about publication for table and publication for all tables in 002_pg_dump.pl, so I
thinkwe could add some test cases about publication for schema. 
>
> I tried to add some cases in the attached file.

Thanks for the patch, I have merged the changes. Attached v16 patch
has the fixes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Jul 27, 2021 at 5:11 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Jul 26, 2021 at 3:21 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comment, this is modified in the v15 patch attached.
> >
>
> I have several minor review comments.
>
> (1) src/backend/catalog/objectaddress.c
> Should start comment sentences with an uppercase letter, for consistency.
>
> + /* fetch publication name and schema oid from input list */
>
> I also notice that some 1-sentence comments end with "." (full-stop)
> and others don't. It seems to alternate all over the place, and so is
> quite noticeable.
> Unfortunately, it already seems to be like this in much of the code
> that this patch patches.
> Ideally (at least my personal preference is) 1-sentence comments
> should not end with a ".".

Modified.

> (2) src/backend/catalog/pg_publication.c
> errdetail message
>
> I think the following should say "Temporary schemas ..." (since the
> existing error message for tables says "System tables cannot be added
> to publications.").
>
> +   errdetail("Temporary schema cannot be added to publications.")));
>

Modified.

> (3) src/backend/commands/publicationcmds.c
> PublicationAddTables
>
> I think that the Assert below is not correct (i.e. not restrictive enough).
> Although the condition passes, it is allowing, for example,
> stmt->for_all_tables==true if stmt->schemas==NIL, and that doesn't
> seem to be correct.
> I suggest the following change:
>
> BEFORE:
> + Assert(!stmt || !stmt->for_all_tables || !stmt->schemas);
> AFTER:
> + Assert(!stmt || (!stmt->for_all_tables && !stmt->schemas));
>

Modified.

> (4) src/backend/commands/publicationcmds.c
> PublicationAddSchemas
>
> Similarly, I think that the Assert below is not restrictive enough,
> and think it should be changed:
>
> BEFORE:
> + Assert(!stmt || !stmt->for_all_tables || !stmt->tables);
> AFTER:
> + Assert(!stmt || (!stmt->for_all_tables && !stmt->tables));
>

Modified.

> (5) src/bin/pg_dump/common.c
>
> Spelling mistake.
>
> BEFORE:
> + pg_log_info("reading publciation schemas");
> AFTER:
> + pg_log_info("reading publication schemas");

Modified.

Thanks for the comments, the comments are fixed in the v16 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm2LgV5XcLF80rJ60NwnjKpZj%3D%3DLxJpO4W2TG2G5XmUtDA%40mail.gmail.com

Regards,
Vignesh

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On July 28, 2021 6:44 PM vignesh C <vignesh21@gmail.com> wrote:
> Thanks for the patch, I have merged the changes. Attached v16 patch has the
> fixes for the same.

Thanks for the new version patches.
Here are a few comments:

1)

+       /* Identify which schemas should be dropped */
+       foreach(oldlc, oldschemaids)
+       {
+           Oid         oldschemaid = lfirst_oid(oldlc);
+
+           if (!list_member_oid(schemaoidlist, oldschemaid))
+               delschemas = lappend_oid(delschemas, oldschemaid);
+       }
+

We can use list_difference here to simplify the code.

2)

+
+       /* Filter out duplicates if user specifies "sch1, sch1" */
+       if (list_member_oid(schemaoidlist, schemaoid))
+           continue;
+
+       schemaoidlist = lappend_oid(schemaoidlist, schemaoid);

It might be more concise to use list_append_unique_oid() here.

3)

+  <para>
+   Create a publication that publishes all changes for all the tables present in
+the schema "production":
+<programlisting>

The second line seems not aligned.
After:
+  <para>
+   Create a publication that publishes all changes for all the tables present in
+   the schema "production":
+<programlisting>

4)

+       resetPQExpBuffer(query);
+
+       /* Get the publication membership for the schema */
+       appendPQExpBuffer(query,
+                         "SELECT ps.psnspcid, ps.oid, p.pubname, p.oid AS pubid "
+                         "FROM pg_publication_schema ps, pg_publication p "
+                         "WHERE ps.psnspcid = '%u' "
+                         "AND p.oid = ps.pspubid",
+                         nsinfo->dobj.catId.oid);

It seems we can use printfPQExpBuffer() here which is a convenience routine
that does the same thing as resetPQExpBuffer() followed by appendPQExpBuffer().

Best regards,
Houzj

Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Jul 30, 2021 at 12:12 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On July 28, 2021 6:44 PM vignesh C <vignesh21@gmail.com> wrote:
> > Thanks for the patch, I have merged the changes. Attached v16 patch has the
> > fixes for the same.
>
> Thanks for the new version patches.
> Here are a few comments:
>
> 1)
>
> +       /* Identify which schemas should be dropped */
> +       foreach(oldlc, oldschemaids)
> +       {
> +           Oid         oldschemaid = lfirst_oid(oldlc);
> +
> +           if (!list_member_oid(schemaoidlist, oldschemaid))
> +               delschemas = lappend_oid(delschemas, oldschemaid);
> +       }
> +
>
> We can use list_difference here to simplify the code.

Modified.

> 2)
>
> +
> +       /* Filter out duplicates if user specifies "sch1, sch1" */
> +       if (list_member_oid(schemaoidlist, schemaoid))
> +           continue;
> +
> +       schemaoidlist = lappend_oid(schemaoidlist, schemaoid);
>
> It might be more concise to use list_append_unique_oid() here.

Modified.

> 3)
>
> +  <para>
> +   Create a publication that publishes all changes for all the tables present in
> +the schema "production":
> +<programlisting>
>
> The second line seems not aligned.
> After:
> +  <para>
> +   Create a publication that publishes all changes for all the tables present in
> +   the schema "production":
> +<programlisting>

Modified.

> 4)
>
> +       resetPQExpBuffer(query);
> +
> +       /* Get the publication membership for the schema */
> +       appendPQExpBuffer(query,
> +                         "SELECT ps.psnspcid, ps.oid, p.pubname, p.oid AS pubid "
> +                         "FROM pg_publication_schema ps, pg_publication p "
> +                         "WHERE ps.psnspcid = '%u' "
> +                         "AND p.oid = ps.pspubid",
> +                         nsinfo->dobj.catId.oid);
>
> It seems we can use printfPQExpBuffer() here which is a convenience routine
> that does the same thing as resetPQExpBuffer() followed by appendPQExpBuffer().

Modified.

Thanks for the comments, attached v17 patches has the fixes for the same.

Regards,
Vignesh

Attachment

RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Monday, August 2, 2021 11:40 PM vignesh C <vignesh21@gmail.com>wrote:
> 
> Thanks for the comments, attached v17 patches has the fixes for the same.

Thanks for your new patch.

I saw the following warning when compiling. It seems we don't need this variable any more.

publicationcmds.c: In function ‘AlterPublicationSchemas’:
publicationcmds.c:592:15: warning: unused variable ‘oldlc’ [-Wunused-variable]
   ListCell   *oldlc;
               ^~~~~

Regards
Tang

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On August 2, 2021 11:40 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> Thanks for the comments, attached v17 patches has the fixes for the same.

Thanks for the new patch, it looks good to me except one minor thing:
It might be better to add the [CREATE PUBLICATION xxx FOR SCHEMA ] in tab-complete.c

Best regards,
houzj

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Tuesday, August 3, 2021 4:10 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> On August 2, 2021 11:40 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comments, attached v17 patches has the fixes for the same.
> 
> Thanks for the new patch, it looks good to me except one minor thing:
> It might be better to add the [CREATE PUBLICATION xxx FOR SCHEMA ] in
> tab-complete.c

Sorry, the patch already had the logic, please ignore this comment.

Best regards,
houzj

Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Aug 3, 2021 at 12:00 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Monday, August 2, 2021 11:40 PM vignesh C <vignesh21@gmail.com>wrote:
> >
> > Thanks for the comments, attached v17 patches has the fixes for the same.
>
> Thanks for your new patch.
>
> I saw the following warning when compiling. It seems we don't need this variable any more.
>
> publicationcmds.c: In function ‘AlterPublicationSchemas’:
> publicationcmds.c:592:15: warning: unused variable ‘oldlc’ [-Wunused-variable]
>    ListCell   *oldlc;
>                ^~~~~

Thanks for reporting this, this is fixed in the v18 patch attached.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for reporting this, this is fixed in the v18 patch attached.
>

I have started looking into this patch and below are some initial comments.

1.
+ /* Fetch publication name and schema oid from input list */
+ schemaname = strVal(linitial(object));
+ pubname = strVal(lsecond(object));

I think the comment should be: "Fetch schema name and publication name
from input list"

2.
@@ -3902,6 +3958,46 @@ getObjectDescription(const ObjectAddress
*object, bool missing_ok)
  break;
  }

+ case OCLASS_PUBLICATION_SCHEMA:
+ {
+ HeapTuple tup;
+ char    *pubname;
+ Form_pg_publication_schema psform;
+ char    *nspname;
+
+ tup = SearchSysCache1(PUBLICATIONSCHEMA,
+   ObjectIdGetDatum(object->objectId));
+ if (!HeapTupleIsValid(tup))
+ {
+ if (!missing_ok)
+ elog(ERROR, "cache lookup failed for publication schema %u",
+ object->objectId);
+ break;
+ }
+
+ psform = (Form_pg_publication_schema) GETSTRUCT(tup);
+ pubname = get_publication_name(psform->pspubid, false);
+ nspname = get_namespace_name(psform->psnspcid);
+ if (!nspname)
+ {
+ Oid psnspcid = psform->psnspcid;
+
+ pfree(pubname);
+ ReleaseSysCache(tup);
+ if (!missing_ok)
+ elog(ERROR, "cache lookup failed for schema %u",
+ psnspcid);
+ break;
+ }

The above code in getObjectDescription looks quite similar to what you
have in getObjectIdentityParts(). Can we extract the common code into
a separate function?

3. Can we use column name pubkind (similar to relkind in pg_class)
instead of pubtype? If so, please change PUBTYPE_ALLTABLES and similar
other defines to PUBKIND_*.


4.
@@ -3632,6 +3650,7 @@ typedef struct CreatePublicationStmt
  List    *options; /* List of DefElem nodes */
  List    *tables; /* Optional list of tables to add */
  bool for_all_tables; /* Special publication for all tables in db */
+ List    *schemas; /* Optional list of schemas */
 } CreatePublicationStmt;

Isn't it better to keep a schemas list after tables?

5.
@@ -1163,12 +1168,27 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
  Publication *pub = lfirst(lc);
  bool publish = false;

- if (pub->alltables)
+ if (pub->pubtype == PUBTYPE_ALLTABLES)
  {
  publish = true;
  if (pub->pubviaroot && am_partition)
  publish_as_relid = llast_oid(get_partition_ancestors(relid));
  }
+ else if (pub->pubtype == PUBTYPE_SCHEMA)
+ {
+ Oid schemaId = get_rel_namespace(relid);
+ Oid psid = GetSysCacheOid2(PUBLICATIONSCHEMAMAP,
+    Anum_pg_publication_schema_oid,
+    ObjectIdGetDatum(schemaId),
+    ObjectIdGetDatum(pub->oid));
+
+ if (OidIsValid(psid))
+ {
+ publish = true;
+ if (pub->pubviaroot && am_partition)
+ publish_as_relid = llast_oid(get_partition_ancestors(relid));
+ }
+ }

Isn't it better to get schema publications once as we get relation
publications via GetRelationPublications and then decide whether to
publish or not? I think that will save repeated cache searches for
each publication requested by the subscriber?

6.
+ {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */
+ PublicationSchemaPsnspcidPspubidIndexId,
+ 2,
+ {
+ Anum_pg_publication_schema_psnspcid,
+ Anum_pg_publication_schema_pspubid,
+ 0,
+ 0
+ },

Why don't we keep pubid as the first column in this index?

7.
getPublicationSchemas()
{
..
+ /* Get the publication membership for the schema */
+ printfPQExpBuffer(query,
+   "SELECT ps.psnspcid, ps.oid, p.pubname, p.oid AS pubid "
+   "FROM pg_publication_schema ps, pg_publication p "
+   "WHERE ps.psnspcid = '%u' "
+   "AND p.oid = ps.pspubid",
+   nsinfo->dobj.catId.oid);
..
}

Why do you need to use join here? Why the query and another handling
in this function be similar to what we have in getPublicationTables?
Also, there is another function GetPublicationSchemas() in the patch,
can we name one of these differently for the purpose of easy
grepability?

-- 
With Regards,
Amit Kapila.



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Tuesday, August 3, 2021 11:08 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> Thanks for reporting this, this is fixed in the v18 patch attached.

Thanks for fixing it.

Few suggestions for V18:

1. 
+# Clean up the tables on both publisher and subscriber as we don't need them
+$node_publisher->safe_psql('postgres', "DROP SCHEMA sch1 cascade");
+$node_publisher->safe_psql('postgres', "DROP SCHEMA sch2 cascade");
+$node_publisher->safe_psql('postgres', "DROP SCHEMA sch3 cascade");
+$node_subscriber->safe_psql('postgres', "DROP SCHEMA sch1 cascade");
+$node_subscriber->safe_psql('postgres', "DROP SCHEMA sch2 cascade");
+$node_subscriber->safe_psql('postgres', "DROP SCHEMA sch3 cascade");

Should we change the comment to "Clean up the schemas ... ", instead of 'tables'?

2.
+$result = $node_subscriber->safe_psql('postgres',
+        "SELECT count(*) FROM sch1.tab3");

Spaces are used here(and some other places), but in most places we use a TAB, so
I think it's better to change it to a TAB.

3.
    List       *tables;            /* List of tables to add/drop */
     bool        for_all_tables; /* Special publication for all tables in db */
     DefElemAction tableAction;    /* What action to perform with the tables */
+    List       *schemas;        /* Optional list of schemas */
 } AlterPublicationStmt;

Should we change the comment here to 'List of schemas to add/drop', then it can
be consistent with the comment for 'tables'.

I also noticed that 'tableAction' variable is used when we add/drop/set schemas,
so maybe the variable name is not suitable any more.

Besides, the following comment is above these codes. Should we add some comments
for schema?

/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */

And it says 'add/drop', do we need to add 'set'? (it's not related to this
patch, so I think I can add it in another thread[1] if needed, which is related
to comment improvement)

4.
I saw the existing tests about permissions in publication.sql, should we add
tests for schema publication? Like this:

diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 33dbdf7bed..c19337631e 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -160,16 +160,19 @@ GRANT CREATE ON DATABASE regression TO regress_publication_user2;
 SET ROLE regress_publication_user2;
 SET client_min_messages = 'ERROR';
 CREATE PUBLICATION testpub2;  -- ok
+CREATE PUBLICATION testpub3;  -- ok
 RESET client_min_messages;

 ALTER PUBLICATION testpub2 ADD TABLE testpub_tbl1;  -- fail
+ALTER PUBLICATION testpub3 ADD SCHEMA pub_test;  -- fail

 SET ROLE regress_publication_user;
 GRANT regress_publication_user TO regress_publication_user2;
 SET ROLE regress_publication_user2;
 ALTER PUBLICATION testpub2 ADD TABLE testpub_tbl1;  -- ok
+ALTER PUBLICATION testpub3 ADD SCHEMA pub_test;  -- ok

-DROP PUBLICATION testpub2;
+DROP PUBLICATION testpub2, testpub3;

 SET ROLE regress_publication_user;
 REVOKE CREATE ON DATABASE regression FROM regress_publication_user2;


[1]
https://www.postgresql.org/message-id/flat/OS0PR01MB6113480F937572BF1216DD61FBEF9%40OS0PR01MB6113.jpnprd01.prod.outlook.com

Regards
Tang

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for reporting this, this is fixed in the v18 patch attached.
> >
>
> I have started looking into this patch and below are some initial comments.
>

Few more comments:
===================
1. Do we need the previous column 'puballtables' after adding pubtype
or pubkind in pg_publication?

2.
@@ -224,6 +279,20 @@ CreatePublication(ParseState *pstate,
CreatePublicationStmt *stmt)
..
+ nspcrel = table_open(NamespaceRelationId, ShareUpdateExclusiveLock);
+ PublicationAddSchemas(puboid, schemaoidlist, true, NULL);
+ table_close(nspcrel, ShareUpdateExclusiveLock);

What is the reason for opening and taking a lock on
NamespaceRelationId? Do you want to avoid dropping the corresponding
schema during this duration? If so, that is not sufficient because
what if somebody drops it just before you took lock on
NamespaceRelationId. I think you need to use LockDatabaseObject to
avoid dropping the schema and note that it should be unlocked only at
the end of the transaction similar to what we do for tables. I guess
you need to add this locking inside the function
PublicationAddSchemas. Also, it would be good if you can add few
comments in this part of the code to explain the reason for locking.

3. The function PublicationAddSchemas is called from multiple places
in the patch but the locking protection is not there at all places. I
think if you add locking as suggested in the previous point then it
should be okay. I think you need similar locking for
PublicationDropSchemas.

4.
@@ -421,16 +537,84 @@ AlterPublicationTables(AlterPublicationStmt
*stmt, Relation rel,
  PublicationAddTables(pubid, rels, true, stmt);

  CloseTableList(delrels);
+ if (pubform->pubtype == PUBTYPE_EMPTY)
+ UpdatePublicationTypeTupleValue(rel, tup,
+ Anum_pg_publication_pubtype,
+ PUBTYPE_TABLE);
  }

At the above and all other similar places, the patch first updates the
pg_publication after performing the rel/schema operation. Isn't it
better to first update pg_publication to remain in sync with how
CreatePublication works? I am not able to see any issue with the way
you have it in the patch but it is better to keep the code consistent
across various similar functions to avoid confusion in the future.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Masahiko Sawada
Date:
On Wed, Aug 4, 2021 at 12:08 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, Aug 3, 2021 at 12:00 PM tanghy.fnst@fujitsu.com
> <tanghy.fnst@fujitsu.com> wrote:
> >
> > On Monday, August 2, 2021 11:40 PM vignesh C <vignesh21@gmail.com>wrote:
> > >
> > > Thanks for the comments, attached v17 patches has the fixes for the same.
> >
> > Thanks for your new patch.
> >
> > I saw the following warning when compiling. It seems we don't need this variable any more.
> >
> > publicationcmds.c: In function ‘AlterPublicationSchemas’:
> > publicationcmds.c:592:15: warning: unused variable ‘oldlc’ [-Wunused-variable]
> >    ListCell   *oldlc;
> >                ^~~~~
>
> Thanks for reporting this, this is fixed in the v18 patch attached.

I've also started reviewing this patch. I've not looked at the patch
yet but here are initial comments/questions based on using this
feature:

pg_publication catalog still has puballtables column but it's still
necessary? IIUC since pubtype = 'a' means publishing all tables in the
database puballtables seems no longer necessary.

---
Suppose that a parent table and its child table are defined in
different schemas, there is a publication for the schema where only
the parent table is defined, and the subscriber subscribes to the
publication, should changes for its child table be replicated to the
subscriber?

In FOR TABLE cases, i.g., where the subscriber subscribes to the
publication that is only for the parent table, changes for its child
table are replicated to the subscriber.

As far as I tested v18 patch, changes for the child table are not
replicated in FOR SCHEMA cases. Here is the test script:

On publisher and subscriber:
create schema p_schema;
create schema c_schema;
create table p_schema.p (a int) partition by list (a);
create table c_schema.c partition of p_schema.p for values in (1);

On publisher:
create publication pub_p_schema for schema p_schema;

On subscriber:
create subscription pub connection 'dbname=postgres' publication pub_p_schema;

On publisher:
insert into p_schema.p values (1);
select * from p_schema.p;
 a
---
 1
(1 row)

On subscriber:
select * from p_schema.p;
 a
---

(0 rows)

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for reporting this, this is fixed in the v18 patch attached.
> >
>
> I have started looking into this patch and below are some initial comments.
>
> 1.
> + /* Fetch publication name and schema oid from input list */
> + schemaname = strVal(linitial(object));
> + pubname = strVal(lsecond(object));
>
> I think the comment should be: "Fetch schema name and publication name
> from input list"
>

Modified.

> 2.
> @@ -3902,6 +3958,46 @@ getObjectDescription(const ObjectAddress
> *object, bool missing_ok)
>   break;
>   }
>
> + case OCLASS_PUBLICATION_SCHEMA:
> + {
> + HeapTuple tup;
> + char    *pubname;
> + Form_pg_publication_schema psform;
> + char    *nspname;
> +
> + tup = SearchSysCache1(PUBLICATIONSCHEMA,
> +   ObjectIdGetDatum(object->objectId));
> + if (!HeapTupleIsValid(tup))
> + {
> + if (!missing_ok)
> + elog(ERROR, "cache lookup failed for publication schema %u",
> + object->objectId);
> + break;
> + }
> +
> + psform = (Form_pg_publication_schema) GETSTRUCT(tup);
> + pubname = get_publication_name(psform->pspubid, false);
> + nspname = get_namespace_name(psform->psnspcid);
> + if (!nspname)
> + {
> + Oid psnspcid = psform->psnspcid;
> +
> + pfree(pubname);
> + ReleaseSysCache(tup);
> + if (!missing_ok)
> + elog(ERROR, "cache lookup failed for schema %u",
> + psnspcid);
> + break;
> + }
>
> The above code in getObjectDescription looks quite similar to what you
> have in getObjectIdentityParts(). Can we extract the common code into
> a separate function?

Modified.

> 3. Can we use column name pubkind (similar to relkind in pg_class)
> instead of pubtype? If so, please change PUBTYPE_ALLTABLES and similar
> other defines to PUBKIND_*.
>

Modified.

> 4.
> @@ -3632,6 +3650,7 @@ typedef struct CreatePublicationStmt
>   List    *options; /* List of DefElem nodes */
>   List    *tables; /* Optional list of tables to add */
>   bool for_all_tables; /* Special publication for all tables in db */
> + List    *schemas; /* Optional list of schemas */
>  } CreatePublicationStmt;
>
> Isn't it better to keep a schemas list after tables?

Modified.

> 5.
> @@ -1163,12 +1168,27 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
>   Publication *pub = lfirst(lc);
>   bool publish = false;
>
> - if (pub->alltables)
> + if (pub->pubtype == PUBTYPE_ALLTABLES)
>   {
>   publish = true;
>   if (pub->pubviaroot && am_partition)
>   publish_as_relid = llast_oid(get_partition_ancestors(relid));
>   }
> + else if (pub->pubtype == PUBTYPE_SCHEMA)
> + {
> + Oid schemaId = get_rel_namespace(relid);
> + Oid psid = GetSysCacheOid2(PUBLICATIONSCHEMAMAP,
> +    Anum_pg_publication_schema_oid,
> +    ObjectIdGetDatum(schemaId),
> +    ObjectIdGetDatum(pub->oid));
> +
> + if (OidIsValid(psid))
> + {
> + publish = true;
> + if (pub->pubviaroot && am_partition)
> + publish_as_relid = llast_oid(get_partition_ancestors(relid));
> + }
> + }
>
> Isn't it better to get schema publications once as we get relation
> publications via GetRelationPublications and then decide whether to
> publish or not? I think that will save repeated cache searches for
> each publication requested by the subscriber?

Modified.

> 6.
> + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */
> + PublicationSchemaPsnspcidPspubidIndexId,
> + 2,
> + {
> + Anum_pg_publication_schema_psnspcid,
> + Anum_pg_publication_schema_pspubid,
> + 0,
> + 0
> + },
>
> Why don't we keep pubid as the first column in this index?

I wanted to keep it similar to PUBLICATIONRELMAP, should we keep it as
it is, thoughts?

> 7.
> getPublicationSchemas()
> {
> ..
> + /* Get the publication membership for the schema */
> + printfPQExpBuffer(query,
> +   "SELECT ps.psnspcid, ps.oid, p.pubname, p.oid AS pubid "
> +   "FROM pg_publication_schema ps, pg_publication p "
> +   "WHERE ps.psnspcid = '%u' "
> +   "AND p.oid = ps.pspubid",
> +   nsinfo->dobj.catId.oid);
> ..
> }
>
> Why do you need to use join here? Why the query and another handling
> in this function be similar to what we have in getPublicationTables?
> Also, there is another function GetPublicationSchemas() in the patch,
> can we name one of these differently for the purpose of easy
> grepability?

Modified it similar to getPublicationTables without joins. The
function is renamed to getPublicationNamespaces.

Thanks for the comments, the attached v19 patch has the fixes for the comments.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Aug 4, 2021 at 8:08 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:
>
> On Tuesday, August 3, 2021 11:08 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for reporting this, this is fixed in the v18 patch attached.
>
> Thanks for fixing it.
>
> Few suggestions for V18:
>
> 1.
> +# Clean up the tables on both publisher and subscriber as we don't need them
> +$node_publisher->safe_psql('postgres', "DROP SCHEMA sch1 cascade");
> +$node_publisher->safe_psql('postgres', "DROP SCHEMA sch2 cascade");
> +$node_publisher->safe_psql('postgres', "DROP SCHEMA sch3 cascade");
> +$node_subscriber->safe_psql('postgres', "DROP SCHEMA sch1 cascade");
> +$node_subscriber->safe_psql('postgres', "DROP SCHEMA sch2 cascade");
> +$node_subscriber->safe_psql('postgres', "DROP SCHEMA sch3 cascade");
>
> Should we change the comment to "Clean up the schemas ... ", instead of 'tables'?

Modified.

> 2.
> +$result = $node_subscriber->safe_psql('postgres',
> +        "SELECT count(*) FROM sch1.tab3");
>
> Spaces are used here(and some other places), but in most places we use a TAB, so
> I think it's better to change it to a TAB.

Modified.

> 3.
>         List       *tables;                     /* List of tables to add/drop */
>         bool            for_all_tables; /* Special publication for all tables in db */
>         DefElemAction tableAction;      /* What action to perform with the tables */
> +       List       *schemas;            /* Optional list of schemas */
>  } AlterPublicationStmt;
>
> Should we change the comment here to 'List of schemas to add/drop', then it can
> be consistent with the comment for 'tables'.

Modified.

> I also noticed that 'tableAction' variable is used when we add/drop/set schemas,
> so maybe the variable name is not suitable any more.

Changed the variable name.

> Besides, the following comment is above these codes. Should we add some comments
> for schema?
>
> /* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */

Modified

> And it says 'add/drop', do we need to add 'set'? (it's not related to this
> patch, so I think I can add it in another thread[1] if needed, which is related
> to comment improvement)

You can include the change in the patch posted.

> 4.
> I saw the existing tests about permissions in publication.sql, should we add
> tests for schema publication? Like this:
>
> diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
> index 33dbdf7bed..c19337631e 100644
> --- a/src/test/regress/sql/publication.sql
> +++ b/src/test/regress/sql/publication.sql
> @@ -160,16 +160,19 @@ GRANT CREATE ON DATABASE regression TO regress_publication_user2;
>  SET ROLE regress_publication_user2;
>  SET client_min_messages = 'ERROR';
>  CREATE PUBLICATION testpub2;  -- ok
> +CREATE PUBLICATION testpub3;  -- ok
>  RESET client_min_messages;
>
>  ALTER PUBLICATION testpub2 ADD TABLE testpub_tbl1;  -- fail
> +ALTER PUBLICATION testpub3 ADD SCHEMA pub_test;  -- fail
>
>  SET ROLE regress_publication_user;
>  GRANT regress_publication_user TO regress_publication_user2;
>  SET ROLE regress_publication_user2;
>  ALTER PUBLICATION testpub2 ADD TABLE testpub_tbl1;  -- ok
> +ALTER PUBLICATION testpub3 ADD SCHEMA pub_test;  -- ok
>
> -DROP PUBLICATION testpub2;
> +DROP PUBLICATION testpub2, testpub3;
>
>  SET ROLE regress_publication_user;
>  REVOKE CREATE ON DATABASE regression FROM regress_publication_user2;

Added.

The changes for the above are available in the v19 patch posted at [1].

[1] - https://www.postgresql.org/message-id/CALDaNm3BMLBpWOSdS3Q2vwpsM%3D0yovpJm8dKHRqNyFpANbrhpw%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Aug 5, 2021 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > Thanks for reporting this, this is fixed in the v18 patch attached.
> > >
> >
> > I have started looking into this patch and below are some initial comments.
> >
>
> Few more comments:
> ===================
> 1. Do we need the previous column 'puballtables' after adding pubtype
> or pubkind in pg_publication?

I felt this should be retained as old client will still be using puballtables, like in case of old client executing \dRp+  commands.

> 2.
> @@ -224,6 +279,20 @@ CreatePublication(ParseState *pstate,
> CreatePublicationStmt *stmt)
> ..
> + nspcrel = table_open(NamespaceRelationId, ShareUpdateExclusiveLock);
> + PublicationAddSchemas(puboid, schemaoidlist, true, NULL);
> + table_close(nspcrel, ShareUpdateExclusiveLock);
>
> What is the reason for opening and taking a lock on
> NamespaceRelationId? Do you want to avoid dropping the corresponding
> schema during this duration? If so, that is not sufficient because
> what if somebody drops it just before you took lock on
> NamespaceRelationId. I think you need to use LockDatabaseObject to
> avoid dropping the schema and note that it should be unlocked only at
> the end of the transaction similar to what we do for tables. I guess
> you need to add this locking inside the function
> PublicationAddSchemas. Also, it would be good if you can add few
> comments in this part of the code to explain the reason for locking.

Modified.

> 3. The function PublicationAddSchemas is called from multiple places
> in the patch but the locking protection is not there at all places. I
> think if you add locking as suggested in the previous point then it
> should be okay. I think you need similar locking for
> PublicationDropSchemas.

Modified.

> 4.
> @@ -421,16 +537,84 @@ AlterPublicationTables(AlterPublicationStmt
> *stmt, Relation rel,
>   PublicationAddTables(pubid, rels, true, stmt);
>
>   CloseTableList(delrels);
> + if (pubform->pubtype == PUBTYPE_EMPTY)
> + UpdatePublicationTypeTupleValue(rel, tup,
> + Anum_pg_publication_pubtype,
> + PUBTYPE_TABLE);
>   }
>
> At the above and all other similar places, the patch first updates the
> pg_publication after performing the rel/schema operation. Isn't it
> better to first update pg_publication to remain in sync with how
> CreatePublication works? I am not able to see any issue with the way
> you have it in the patch but it is better to keep the code consistent
> across various similar functions to avoid confusion in the future.

Modified.

Thanks for the comments, the changes for the above are available in the v19 patch posted at [1].

[1] - https://www.postgresql.org/message-id/CALDaNm3BMLBpWOSdS3Q2vwpsM%3D0yovpJm8dKHRqNyFpANbrhpw%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Fri, Aug 6, 2021 at 2:16 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, Aug 5, 2021 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Few more comments:
> > ===================
> > 1. Do we need the previous column 'puballtables' after adding pubtype
> > or pubkind in pg_publication?
>
> I felt this should be retained as old client will still be using puballtables, like in case of old client executing
\dRp+ commands.
 
>

But do we guarantee that old clients work with newer server versions?
For example, psql docs say: "psql works best with servers of the same
or an older major version. Backslash commands are particularly likely
to fail if the server is of a newer version than psql itself." [1]
(See Notes Section). Similarly, pg_dump docs say: "However, pg_dump
cannot dump from PostgreSQL servers newer than its own major version;
it will refuse to even try, rather than risk making an invalid dump."
[2] (See Notes Section).

It seems Sawada-San has the same question and IIUC docs suggest we
don't need such compatibility, so what makes you think we need it?

[1] - https://www.postgresql.org/docs/devel/app-psql.html
[2] - https://www.postgresql.org/docs/devel/app-pgdump.html

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Fri, Aug 6, 2021 at 2:02 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote:
>
> > 6.
> > + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */
> > + PublicationSchemaPsnspcidPspubidIndexId,
> > + 2,
> > + {
> > + Anum_pg_publication_schema_psnspcid,
> > + Anum_pg_publication_schema_pspubid,
> > + 0,
> > + 0
> > + },
> >
> > Why don't we keep pubid as the first column in this index?
>
> I wanted to keep it similar to PUBLICATIONRELMAP, should we keep it as
> it is, thoughts?
>

Okay, I see your point. I think for PUBLICATIONRELMAP, we need it
because it is searched using the only relid in
GetRelationPublications, so, similarly, in the patch, you are using
schema_oid in GetSchemaPublications, so probably that will help. I was
wondering why you haven't directly used the cache in
GetSchemaPublications similar to GetRelationPublications? It seems
there is a danger for concurrent object drop. Can you please check how
the safety is ensured say when either one wants to drop the
corresponding relation/schema or publication? Another point is why
can't we use the other index (where the index is on relid or
schema_oid (PublicationSchemaObjectIndexId))?

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Aug 6, 2021 at 4:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 6, 2021 at 2:16 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Thu, Aug 5, 2021 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Few more comments:
> > > ===================
> > > 1. Do we need the previous column 'puballtables' after adding pubtype
> > > or pubkind in pg_publication?
> >
> > I felt this should be retained as old client will still be using puballtables, like in case of old client executing
\dRp+ commands.
 
> >
>
> But do we guarantee that old clients work with newer server versions?
> For example, psql docs say: "psql works best with servers of the same
> or an older major version. Backslash commands are particularly likely
> to fail if the server is of a newer version than psql itself." [1]
> (See Notes Section). Similarly, pg_dump docs say: "However, pg_dump
> cannot dump from PostgreSQL servers newer than its own major version;
> it will refuse to even try, rather than risk making an invalid dump."
> [2] (See Notes Section).
>
> It seems Sawada-San has the same question and IIUC docs suggest we
> don't need such compatibility, so what makes you think we need it?

Ok, I was not sure if we can remove any system table columns, hence
had retained it. It seems that my understanding was wrong. I will
remove this column in the next version of the patch.

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Aug 6, 2021 at 4:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 6, 2021 at 2:02 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > > 6.
> > > + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */
> > > + PublicationSchemaPsnspcidPspubidIndexId,
> > > + 2,
> > > + {
> > > + Anum_pg_publication_schema_psnspcid,
> > > + Anum_pg_publication_schema_pspubid,
> > > + 0,
> > > + 0
> > > + },
> > >
> > > Why don't we keep pubid as the first column in this index?
> >
> > I wanted to keep it similar to PUBLICATIONRELMAP, should we keep it as
> > it is, thoughts?
> >
>
> Okay, I see your point. I think for PUBLICATIONRELMAP, we need it
> because it is searched using the only relid in
> GetRelationPublications, so, similarly, in the patch, you are using
> schema_oid in GetSchemaPublications, so probably that will help. I was
> wondering why you haven't directly used the cache in
> GetSchemaPublications similar to GetRelationPublications?

Both of the approaches work, I was not sure which one is better, If
the approach in GetRelationPublications is better, I will change it to
something similar to GetRelationPublications. Thoughts?

It seems
> there is a danger for concurrent object drop. Can you please check how
> the safety is ensured say when either one wants to drop the
> corresponding relation/schema or publication?

If a table is dropped concurrently from another session during logical
replication of some operation in that table, while we get
get_rel_sync_entry the cache invalidations(rel_sync_cache_relation_cb)
happen.  The cache entry will be marked as false, also the schema_sent
will be marked as false. It will resend the relation using the
relation that was prepared while processing this transaction from
ReorderBufferProcessTXN. I felt this is ok since the relation is
dropped after the operation on the table. Similarly if the publication
is dropped concurrently from another session during logical
replication of some operation in that table, while we get
get_rel_sync_entry the cache
invalidations(publication_invalidation_cb) happen. The publications
will be reloaded and validated again, the data will be replicated to
the server. I felt this behavior is fine since the publication is
dropped after the operation on the table.

 Another point is why
> can't we use the other index (where the index is on relid or
> schema_oid (PublicationSchemaObjectIndexId))?

I felt this cannot be used because In this case the index is in the
oid column of pg_publication_schema and not on psnspcid column.

Regards,
Vignesh



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Sun, Aug 8, 2021 at 2:52 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, Aug 6, 2021 at 4:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Aug 6, 2021 at 2:02 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > > 6.
> > > > + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */
> > > > + PublicationSchemaPsnspcidPspubidIndexId,
> > > > + 2,
> > > > + {
> > > > + Anum_pg_publication_schema_psnspcid,
> > > > + Anum_pg_publication_schema_pspubid,
> > > > + 0,
> > > > + 0
> > > > + },
> > > >
> > > > Why don't we keep pubid as the first column in this index?
> > >
> > > I wanted to keep it similar to PUBLICATIONRELMAP, should we keep it as
> > > it is, thoughts?
> > >
> >
> > Okay, I see your point. I think for PUBLICATIONRELMAP, we need it
> > because it is searched using the only relid in
> > GetRelationPublications, so, similarly, in the patch, you are using
> > schema_oid in GetSchemaPublications, so probably that will help. I was
> > wondering why you haven't directly used the cache in
> > GetSchemaPublications similar to GetRelationPublications?
>
> Both of the approaches work, I was not sure which one is better, If
> the approach in GetRelationPublications is better, I will change it to
> something similar to GetRelationPublications. Thoughts?
>

I think it is better to use the cache as if we don't find the entry in
the cache, then we will anyway search the required entry via sys table
scan, see SearchCatCacheList. I think the point I wanted to ensure was
that a concurrent session won't blow up the entry while we are looking
at it. How is that ensured?


-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Aug 6, 2021 at 2:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Aug 4, 2021 at 12:08 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, Aug 3, 2021 at 12:00 PM tanghy.fnst@fujitsu.com
> > <tanghy.fnst@fujitsu.com> wrote:
> > >
> > > On Monday, August 2, 2021 11:40 PM vignesh C <vignesh21@gmail.com>wrote:
> > > >
> > > > Thanks for the comments, attached v17 patches has the fixes for the same.
> > >
> > > Thanks for your new patch.
> > >
> > > I saw the following warning when compiling. It seems we don't need this variable any more.
> > >
> > > publicationcmds.c: In function ‘AlterPublicationSchemas’:
> > > publicationcmds.c:592:15: warning: unused variable ‘oldlc’ [-Wunused-variable]
> > >    ListCell   *oldlc;
> > >                ^~~~~
> >
> > Thanks for reporting this, this is fixed in the v18 patch attached.
>
> I've also started reviewing this patch. I've not looked at the patch
> yet but here are initial comments/questions based on using this
> feature:
>
> pg_publication catalog still has puballtables column but it's still
> necessary? IIUC since pubtype = 'a' means publishing all tables in the
> database puballtables seems no longer necessary.

I will remove puballtables  in my next version of the patch.

> ---
> Suppose that a parent table and its child table are defined in
> different schemas, there is a publication for the schema where only
> the parent table is defined, and the subscriber subscribes to the
> publication, should changes for its child table be replicated to the
> subscriber?

I felt that in this case only the table data that is present in the
publish schema should be sent to the subscriber. Since the child table
schema is not part of the publication, I felt this child table data
should not be replicated. Thoughts?
I have kept the above same behavior in the case of publication created
using PUBLISH_VIA_PARTITION_ROOT option i.e the child table data will
not be sent.  But now I'm feeling we should send the child table data
since it is being sent through the parent table which is part of the
publication. Also this way users can use this option if the user has
the table having partitions designed across the schemas. Thoughts?

> In FOR TABLE cases, i.g., where the subscriber subscribes to the
> publication that is only for the parent table, changes for its child
> table are replicated to the subscriber.
>
> As far as I tested v18 patch, changes for the child table are not
> replicated in FOR SCHEMA cases. Here is the test script:
>
> On publisher and subscriber:
> create schema p_schema;
> create schema c_schema;
> create table p_schema.p (a int) partition by list (a);
> create table c_schema.c partition of p_schema.p for values in (1);
>
> On publisher:
> create publication pub_p_schema for schema p_schema;
>
> On subscriber:
> create subscription pub connection 'dbname=postgres' publication pub_p_schema;
>
> On publisher:
> insert into p_schema.p values (1);
> select * from p_schema.p;
>  a
> ---
>  1
> (1 row)
>
> On subscriber:
> select * from p_schema.p;
>  a
> ---
>
> (0 rows)

I have kept this behavior intentionally, details explained above. Thoughts?

Regards,
Vignesh



Re: Added schema level support for publication.

From
Mark Dilger
Date:

> On Aug 6, 2021, at 1:32 AM, vignesh C <vignesh21@gmail.com> wrote:
>
> the attached v19 patch

With v19 applied, a schema owner can publish the contents of a table regardless of ownership or permissions on that
table:

+CREATE ROLE user1;
+GRANT CREATE ON DATABASE regression TO user1;
+CREATE ROLE user2;
+GRANT CREATE ON DATABASE regression TO user2;
+SET SESSION AUTHORIZATION user1;
+CREATE SCHEMA user1schema;
+GRANT CREATE, USAGE ON SCHEMA user1schema TO user2;
+RESET SESSION AUTHORIZATION;
+SET SESSION AUTHORIZATION user2;
+CREATE TABLE user1schema.user2private (junk text);
+REVOKE ALL PRIVILEGES ON user1schema.user2private FROM PUBLIC;
+REVOKE ALL PRIVILEGES ON user1schema.user2private FROM user1;
+CREATE TABLE user1schema.user2public (junk text);
+GRANT SELECT ON user1schema.user2public TO PUBLIC;
+RESET SESSION AUTHORIZATION;
+SET SESSION AUTHORIZATION user1;
+SELECT * FROM user1schema.user2private;
+ERROR:  permission denied for table user2private
+SELECT * FROM user1schema.user2public;
+ junk
+------
+(0 rows)
+
+CREATE PUBLICATION user1pub;
+WARNING:  wal_level is insufficient to publish logical changes
+HINT:  Set wal_level to logical before creating subscriptions.
+ALTER PUBLICATION user1pub
+   ADD TABLE user1schema.user2public;
+ERROR:  must be owner of table user2public
+ALTER PUBLICATION user1pub
+   ADD TABLE user1schema.user2private, user1schema.user2public;
+ERROR:  must be owner of table user2private
+SELECT * FROM pg_catalog.pg_publication_tables
+   WHERE pubname = 'user1pub';
+ pubname | schemaname | tablename
+---------+------------+-----------
+(0 rows)
+
+ALTER PUBLICATION user1pub ADD SCHEMA user1schema;
+SELECT * FROM pg_catalog.pg_publication_tables
+   WHERE pubname = 'user1pub';
+ pubname  | schemaname  |  tablename
+----------+-------------+--------------
+ user1pub | user1schema | user2private
+ user1pub | user1schema | user2public
+(2 rows)

It is a bit counterintuitive that schema owners do not have administrative privileges over tables within their schemas,
butthat's how it is.  The design of this patch seems to assume otherwise.  Perhaps ALTER PUBLICATION ... ADD SCHEMA
shouldbe restricted to superusers, just as FOR ALL TABLES? 

Alternatively, you could add ownership checks per table to mirror the behavior of ALTER PUBLICATION ... ADD TABLE, but
thatwould foreclose the option of automatically updating the list of tables in the publication as new tables are added
tothe schema, since those new tables would not necessarily belong to the schema owner, and having a error thrown during
CREATETABLE would be quite unfriendly.  I think until this is hammered out, it is safer to require superuser privileges
andthen we can revisit this issue and loosen the requirement in a subsequent commit. 

What do you think?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Mon, Aug 9, 2021 at 9:50 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
> > On Aug 6, 2021, at 1:32 AM, vignesh C <vignesh21@gmail.com> wrote:
> >
> > the attached v19 patch
>
> With v19 applied, a schema owner can publish the contents of a table regardless of ownership or permissions on that
table:
>
...
...
>
> It is a bit counterintuitive that schema owners do not have administrative privileges over tables within their
schemas,but that's how it is.  The design of this patch seems to assume otherwise.  Perhaps ALTER PUBLICATION ... ADD
SCHEMAshould be restricted to superusers, just as FOR ALL TABLES? 
>

+1. Your suggestion sounds reasonable to me.

> Alternatively, you could add ownership checks per table to mirror the behavior of ALTER PUBLICATION ... ADD TABLE,
butthat would foreclose the option of automatically updating the list of tables in the publication as new tables are
addedto the schema, since those new tables would not necessarily belong to the schema owner, and having a error thrown
duringCREATE TABLE would be quite unfriendly.  I think until this is hammered out, it is safer to require superuser
privilegesand then we can revisit this issue and loosen the requirement in a subsequent commit. 
>

I think the same argument can be made for "FOR ALL TABLES .." as well.
So, let's leave such a requirement for another patch.


--
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Fri, Aug 6, 2021 at 6:32 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for the comments, the attached v19 patch has the fixes for the comments.
>

Some more review comments, this time for the v19 patch:


(1) In patch v19-0002, there's still a couple of instances where it
says "publication type" instead of "publication kind".

(2) src/backend/catalog/pg_publication.c

"This should only be used for normal publications."

What exactly is meant by that - what type is considered normal? Maybe
that comment should be more specific.

(3) src/backend/catalog/pg_publication.c
GetSchemaPublications

Function header says "Gets list of publication oids for publications
marked as FOR SCHEMA."

Shouldn't it say something like: "Gets the list of FOR SCHEMA
publication oids associated with a specified schema oid." or something
like that?
(since the function accepts a schemaid parameter)

(4) src/backend/commands/publicationcmds.c

In AlterPublicationSchemas(), I notice that the two error cases
"cannot be added to or dropped ..." don't check stmt->action for
DEFELEM_ADD/DEFELEM_DROP.
Is that OK? (i.e. should these cases error out if stmt->action is not
DEFELEM_ADD/DEFELEM_DROP?)
Also, I assume that the else part (not DEFELEM_ADD/DEFELEM_DROP) is
the "Set" case? Maybe a comment should be added to the top of the else
part.

(5) src/backend/commands/publicationcmds.c
Typo (same in 2 places): "automaically" -> "automatically"

+  * will be released automaically at the end of create publication

See functions:
(i) CreatePublication
(ii) AlterPublicationSchemas

(6) src/backend/commands/publicationcmds.c
LockSchemaList

Function header says "are locked in ShareUpdateExclusiveLock mode" but
then code calls LockDatabaseObject using "AccessShareLock".


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Mon, Aug 9, 2021 at 11:31 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, Aug 6, 2021 at 2:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > ---
> > Suppose that a parent table and its child table are defined in
> > different schemas, there is a publication for the schema where only
> > the parent table is defined, and the subscriber subscribes to the
> > publication, should changes for its child table be replicated to the
> > subscriber?
>
> I felt that in this case only the table data that is present in the
> publish schema should be sent to the subscriber. Since the child table
> schema is not part of the publication, I felt this child table data
> should not be replicated.
>

But, as point out by Sawada-San, the same is true for FOR TABLE case.
I think we should be consistent here and should publish the data for
the child table if the parent table's schema is published.

> I have kept the above same behavior in the case of publication created
> using PUBLISH_VIA_PARTITION_ROOT option i.e the child table data will
> not be sent.  But now I'm feeling we should send the child table data
> since it is being sent through the parent table which is part of the
> publication. Also this way users can use this option if the user has
> the table having partitions designed across the schemas.
>

This sounds fine to me.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Aug 9, 2021 at 9:50 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
>
>
> > On Aug 6, 2021, at 1:32 AM, vignesh C <vignesh21@gmail.com> wrote:
> >
> > the attached v19 patch
>
> With v19 applied, a schema owner can publish the contents of a table regardless of ownership or permissions on that
table:
>
> +CREATE ROLE user1;
> +GRANT CREATE ON DATABASE regression TO user1;
> +CREATE ROLE user2;
> +GRANT CREATE ON DATABASE regression TO user2;
> +SET SESSION AUTHORIZATION user1;
> +CREATE SCHEMA user1schema;
> +GRANT CREATE, USAGE ON SCHEMA user1schema TO user2;
> +RESET SESSION AUTHORIZATION;
> +SET SESSION AUTHORIZATION user2;
> +CREATE TABLE user1schema.user2private (junk text);
> +REVOKE ALL PRIVILEGES ON user1schema.user2private FROM PUBLIC;
> +REVOKE ALL PRIVILEGES ON user1schema.user2private FROM user1;
> +CREATE TABLE user1schema.user2public (junk text);
> +GRANT SELECT ON user1schema.user2public TO PUBLIC;
> +RESET SESSION AUTHORIZATION;
> +SET SESSION AUTHORIZATION user1;
> +SELECT * FROM user1schema.user2private;
> +ERROR:  permission denied for table user2private
> +SELECT * FROM user1schema.user2public;
> + junk
> +------
> +(0 rows)
> +
> +CREATE PUBLICATION user1pub;
> +WARNING:  wal_level is insufficient to publish logical changes
> +HINT:  Set wal_level to logical before creating subscriptions.
> +ALTER PUBLICATION user1pub
> +   ADD TABLE user1schema.user2public;
> +ERROR:  must be owner of table user2public
> +ALTER PUBLICATION user1pub
> +   ADD TABLE user1schema.user2private, user1schema.user2public;
> +ERROR:  must be owner of table user2private
> +SELECT * FROM pg_catalog.pg_publication_tables
> +   WHERE pubname = 'user1pub';
> + pubname | schemaname | tablename
> +---------+------------+-----------
> +(0 rows)
> +
> +ALTER PUBLICATION user1pub ADD SCHEMA user1schema;
> +SELECT * FROM pg_catalog.pg_publication_tables
> +   WHERE pubname = 'user1pub';
> + pubname  | schemaname  |  tablename
> +----------+-------------+--------------
> + user1pub | user1schema | user2private
> + user1pub | user1schema | user2public
> +(2 rows)
>
> It is a bit counterintuitive that schema owners do not have administrative privileges over tables within their
schemas,but that's how it is.  The design of this patch seems to assume otherwise.  Perhaps ALTER PUBLICATION ... ADD
SCHEMAshould be restricted to superusers, just as FOR ALL TABLES? 

I will handle this in the next version of the patch.
Additionally I will add this check for "Alter publication add schema"
and "Alter publication set schema". I'm not planning to add this check
for "Alter publication drop schema" to keep the behavior similar to
"Alter publication drop table".
Also, the behavior of "Alter publication drop table" for which the
user is not the owner is successful, Is this behavior correct?
create table tbl1(c1 int);
create table tbl2(c1 int);
create publication mypub1 for table tbl1,tbl2;
SET SESSION AUTHORIZATION user2;
alter table tbl2 owner to user2;
RESET SESSION AUTHORIZATION;
postgres=> alter publication mypub1  drop table tbl2;
ALTER PUBLICATION
postgres=> alter publication mypub1 add table tbl2;
ERROR:  must be owner of table tbl2

Thoughts?

Regards,
Vignesh



Re: Added schema level support for publication.

From
Mark Dilger
Date:

> On Aug 10, 2021, at 10:59 PM, vignesh C <vignesh21@gmail.com> wrote:
>
> Also, the behavior of "Alter publication drop table" for which the
> user is not the owner is successful, Is this behavior correct?

I think that dropping a table from a publication should be allowed for the publication owner, without regard to the
ownerof the table.  Adding a table to a publication is different, as it exposes the contents of the table. 

Consider the following:

+create user user1;
+create user user2;
+create table tbl1(c1 int);
+create table tbl2(c1 int);
+create publication pub1 for table tbl1,tbl2;
+WARNING:  wal_level is insufficient to publish logical changes
+HINT:  Set wal_level to logical before creating subscriptions.
+alter table tbl1 owner to user1;
+alter publication pub1 owner to user1;
+alter table tbl2 owner to user2;
+SET SESSION AUTHORIZATION user2;
+alter publication pub1 drop table tbl1;
+ERROR:  must be owner of publication pub1
+alter publication pub1 drop table tbl2;
+ERROR:  must be owner of publication pub1
+alter publication pub1 add table tbl1;
+ERROR:  must be owner of publication pub1
+alter publication pub1 add table tbl2;
+ERROR:  must be owner of publication pub1
+RESET SESSION AUTHORIZATION;
+SET SESSION AUTHORIZATION user1;
+alter publication pub1 drop table tbl1;
+alter publication pub1 drop table tbl2;
+alter publication pub1 add table tbl1;
+alter publication pub1 add table tbl2;
+ERROR:  must be owner of table tbl2

Clearly user2 cannot modify pub1, not even with respect to user2's own table.  user1 can modify its own publication
exceptfor adding someone else's table.  This seems correct to me. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Aug 11, 2021 at 7:45 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
>
> > On Aug 10, 2021, at 10:59 PM, vignesh C <vignesh21@gmail.com> wrote:
> >
> > Also, the behavior of "Alter publication drop table" for which the
> > user is not the owner is successful, Is this behavior correct?
>
> I think that dropping a table from a publication should be allowed for the publication owner, without regard to the
ownerof the table.
 
> Adding a table to a publication is different, as it exposes the contents of the table.
>
..
..
>
> Clearly user2 cannot modify pub1, not even with respect to user2's own table.  user1 can modify its own publication
exceptfor adding someone else's table.  This seems correct to me.
 
>

That looks reasonable to me as well and I think we should follow the
same for the schema.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Masahiko Sawada
Date:
On Mon, Aug 9, 2021 at 1:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Aug 8, 2021 at 2:52 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Fri, Aug 6, 2021 at 4:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Aug 6, 2021 at 2:02 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > > 6.
> > > > > + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */
> > > > > + PublicationSchemaPsnspcidPspubidIndexId,
> > > > > + 2,
> > > > > + {
> > > > > + Anum_pg_publication_schema_psnspcid,
> > > > > + Anum_pg_publication_schema_pspubid,
> > > > > + 0,
> > > > > + 0
> > > > > + },
> > > > >
> > > > > Why don't we keep pubid as the first column in this index?
> > > >
> > > > I wanted to keep it similar to PUBLICATIONRELMAP, should we keep it as
> > > > it is, thoughts?
> > > >
> > >
> > > Okay, I see your point. I think for PUBLICATIONRELMAP, we need it
> > > because it is searched using the only relid in
> > > GetRelationPublications, so, similarly, in the patch, you are using
> > > schema_oid in GetSchemaPublications, so probably that will help. I was
> > > wondering why you haven't directly used the cache in
> > > GetSchemaPublications similar to GetRelationPublications?
> >
> > Both of the approaches work, I was not sure which one is better, If
> > the approach in GetRelationPublications is better, I will change it to
> > something similar to GetRelationPublications. Thoughts?
> >
>
> I think it is better to use the cache as if we don't find the entry in
> the cache, then we will anyway search the required entry via sys table
> scan, see SearchCatCacheList.

+1

I had the same comment while reading the v19 patch.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Added schema level support for publication.

From
Masahiko Sawada
Date:
On Fri, Aug 6, 2021 at 5:33 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for the comments, the attached v19 patch has the fixes for the comments.

Thank you for updating the patch!

Here are some comments on v19 patch:

+                case OCLASS_PUBLICATION_SCHEMA:
+                        RemovePublicationSchemaById(object->objectId);
+                        break;

+void
+RemovePublicationSchemaById(Oid psoid)
+{
+        Relation       rel;
+        HeapTuple      tup;
+
+        rel = table_open(PublicationSchemaRelationId, RowExclusiveLock);
+
+        tup = SearchSysCache1(PUBLICATIONSCHEMA, ObjectIdGetDatum(psoid));
+
+        if (!HeapTupleIsValid(tup))
+                elog(ERROR, "cache lookup failed for publication schema %u",
+                         psoid);
+
+        CatalogTupleDelete(rel, &tup->t_self);
+
+        ReleaseSysCache(tup);
+
+        table_close(rel, RowExclusiveLock);
+}

Since RemovePublicationSchemaById() does simple catalog tuple
deletion, it seems to me that we can DropObjectById() to delete the
row of pg_publication_schema.

 ---
         {
-                ScanKeyInit(&key[0],
+                ScanKeyData skey[1];
+
+                ScanKeyInit(&skey[0],
                                         Anum_pg_class_relkind,
                                         BTEqualStrategyNumber, F_CHAREQ,

CharGetDatum(RELKIND_PARTITIONED_TABLE));

-                scan = table_beginscan_catalog(classRel, 1, key);
+                scan = table_beginscan_catalog(classRel, 1, skey);

Since we specify 1 as the number of keys in table_beginscan_catalog(),
can we reuse 'key' instead of using 'skey'?

---
Even if we drop all tables added to the publication from it, 'pubkind'
doesn't go back to 'empty'. Is that intentional behavior? If we do
that, we can save the lookup of pg_publication_rel and
pg_publication_schema in some cases, and we can switch the publication
that was created as FOR SCHEMA to FOR TABLE and vice versa.

---
+static void
+UpdatePublicationKindTupleValue(Relation rel, HeapTuple tup, int col,
+                                                                char pubkind)

Since all callers of this function specify Anum_pg_publication_pubkind
to 'col', it seems 'col' is not necessary.

---
+static void
+AlterPublicationSchemas(AlterPublicationStmt *stmt, Relation rel,
+                                                HeapTuple tup,
Form_pg_publication pubform)

I think we don't need to pass 'pubform' to this function since we can
get it by GETSTRUCT(tup).

---
+                Oid                    schemaId = get_rel_namespace(relid);
                 List      *pubids = GetRelationPublications(relid);
+                List      *schemaPubids = GetSchemaPublications(schemaId);

Can we defer to get the list of schema publications (i.g.,
'schemaPubids') until we find the PUBKIND_SCHEMA publication? Perhaps
the same is true for building 'pubids'.

---
+                                                   List of publications
+        Name        |          Owner           | All tables | Inserts
| Updates | Deletes | Truncates | Via root | PubKind

+--------------------+--------------------------+------------+---------+---------+---------+-----------+----------+---------
+ testpib_ins_trunct | regress_publication_user | f          | t
| f       | f       | f         | f        | e
+ testpub_default    | regress_publication_user | f          | f
| t       | f       | f         | f        | e

I think it's more readable if \dRp shows 'all tables', 'table',
'schema', and 'empty' in PubKind instead of the single character.

I think 'Pub kind' is more consistent with other column names.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Thu, Aug 12, 2021 at 5:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Aug 6, 2021 at 5:33 PM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> ---
> Even if we drop all tables added to the publication from it, 'pubkind'
> doesn't go back to 'empty'. Is that intentional behavior? If we do
> that, we can save the lookup of pg_publication_rel and
> pg_publication_schema in some cases, and we can switch the publication
> that was created as FOR SCHEMA to FOR TABLE and vice versa.
>

Do we really want to allow users to change a publication that is FOR
SCHEMA to FOR TABLE? I see that we don't allow to do that FOR TABLES.
postgres=# Alter Publication pub add table tbl1;
ERROR:  publication "pub" is defined as FOR ALL TABLES
DETAIL:  Tables cannot be added to or dropped from FOR ALL TABLES publications.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Peter Eisentraut
Date:
On 13.08.21 04:59, Amit Kapila wrote:
>> Even if we drop all tables added to the publication from it, 'pubkind'
>> doesn't go back to 'empty'. Is that intentional behavior? If we do
>> that, we can save the lookup of pg_publication_rel and
>> pg_publication_schema in some cases, and we can switch the publication
>> that was created as FOR SCHEMA to FOR TABLE and vice versa.
>>
> Do we really want to allow users to change a publication that is FOR
> SCHEMA to FOR TABLE? I see that we don't allow to do that FOR TABLES.
> postgres=# Alter Publication pub add table tbl1;
> ERROR:  publication "pub" is defined as FOR ALL TABLES
> DETAIL:  Tables cannot be added to or dropped from FOR ALL TABLES publications.

I think the strict separation between publication-for-tables vs. 
publication-for-schemas is a mistake.  Why can't I have a publication 
that publishes tables t1, t2, t3, *and* schemas s1, s2, s3.  Also note 
that we have a pending patch to add sequences support to logical 
replication.  So eventually, a publication will be able to contain a 
bunch of different objects of different kinds.



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Sat, Aug 14, 2021 at 3:02 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 13.08.21 04:59, Amit Kapila wrote:
> >> Even if we drop all tables added to the publication from it, 'pubkind'
> >> doesn't go back to 'empty'. Is that intentional behavior? If we do
> >> that, we can save the lookup of pg_publication_rel and
> >> pg_publication_schema in some cases, and we can switch the publication
> >> that was created as FOR SCHEMA to FOR TABLE and vice versa.
> >>
> > Do we really want to allow users to change a publication that is FOR
> > SCHEMA to FOR TABLE? I see that we don't allow to do that FOR TABLES.
> > postgres=# Alter Publication pub add table tbl1;
> > ERROR:  publication "pub" is defined as FOR ALL TABLES
> > DETAIL:  Tables cannot be added to or dropped from FOR ALL TABLES publications.
>
> I think the strict separation between publication-for-tables vs.
> publication-for-schemas is a mistake.  Why can't I have a publication
> that publishes tables t1, t2, t3, *and* schemas s1, s2, s3.  Also note
> that we have a pending patch to add sequences support to logical
> replication.  So eventually, a publication will be able to contain a
> bunch of different objects of different kinds.
>

Valid point.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Aug 6, 2021 at 2:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Aug 4, 2021 at 12:08 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, Aug 3, 2021 at 12:00 PM tanghy.fnst@fujitsu.com
> > <tanghy.fnst@fujitsu.com> wrote:
> > >
> > > On Monday, August 2, 2021 11:40 PM vignesh C <vignesh21@gmail.com>wrote:
> > > >
> > > > Thanks for the comments, attached v17 patches has the fixes for the same.
> > >
> > > Thanks for your new patch.
> > >
> > > I saw the following warning when compiling. It seems we don't need this variable any more.
> > >
> > > publicationcmds.c: In function ‘AlterPublicationSchemas’:
> > > publicationcmds.c:592:15: warning: unused variable ‘oldlc’ [-Wunused-variable]
> > >    ListCell   *oldlc;
> > >                ^~~~~
> >
> > Thanks for reporting this, this is fixed in the v18 patch attached.
>
> I've also started reviewing this patch. I've not looked at the patch
> yet but here are initial comments/questions based on using this
> feature:
>
> pg_publication catalog still has puballtables column but it's still
> necessary? IIUC since pubtype = 'a' means publishing all tables in the
> database puballtables seems no longer necessary.

Removed puballtables.

> ---
> Suppose that a parent table and its child table are defined in
> different schemas, there is a publication for the schema where only
> the parent table is defined, and the subscriber subscribes to the
> publication, should changes for its child table be replicated to the
> subscriber?
>
> In FOR TABLE cases, i.g., where the subscriber subscribes to the
> publication that is only for the parent table, changes for its child
> table are replicated to the subscriber.

Modified it to keep the behavior similar to FOR Table publication.

> As far as I tested v18 patch, changes for the child table are not
> replicated in FOR SCHEMA cases. Here is the test script:
>
> On publisher and subscriber:
> create schema p_schema;
> create schema c_schema;
> create table p_schema.p (a int) partition by list (a);
> create table c_schema.c partition of p_schema.p for values in (1);
>
> On publisher:
> create publication pub_p_schema for schema p_schema;
>
> On subscriber:
> create subscription pub connection 'dbname=postgres' publication pub_p_schema;
>
> On publisher:
> insert into p_schema.p values (1);
> select * from p_schema.p;
>  a
> ---
>  1
> (1 row)
>
> On subscriber:
> select * from p_schema.p;
>  a
> ---
>
> (0 rows)

Modified to handle this.
Thanks for the comments, the attached v20 patch handles the above issues.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Aug 9, 2021 at 9:50 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
>
>
> > On Aug 6, 2021, at 1:32 AM, vignesh C <vignesh21@gmail.com> wrote:
> >
> > the attached v19 patch
>
> With v19 applied, a schema owner can publish the contents of a table regardless of ownership or permissions on that table:
>
> +CREATE ROLE user1;
> +GRANT CREATE ON DATABASE regression TO user1;
> +CREATE ROLE user2;
> +GRANT CREATE ON DATABASE regression TO user2;
> +SET SESSION AUTHORIZATION user1;
> +CREATE SCHEMA user1schema;
> +GRANT CREATE, USAGE ON SCHEMA user1schema TO user2;
> +RESET SESSION AUTHORIZATION;
> +SET SESSION AUTHORIZATION user2;
> +CREATE TABLE user1schema.user2private (junk text);
> +REVOKE ALL PRIVILEGES ON user1schema.user2private FROM PUBLIC;
> +REVOKE ALL PRIVILEGES ON user1schema.user2private FROM user1;
> +CREATE TABLE user1schema.user2public (junk text);
> +GRANT SELECT ON user1schema.user2public TO PUBLIC;
> +RESET SESSION AUTHORIZATION;
> +SET SESSION AUTHORIZATION user1;
> +SELECT * FROM user1schema.user2private;
> +ERROR:  permission denied for table user2private
> +SELECT * FROM user1schema.user2public;
> + junk
> +------
> +(0 rows)
> +
> +CREATE PUBLICATION user1pub;
> +WARNING:  wal_level is insufficient to publish logical changes
> +HINT:  Set wal_level to logical before creating subscriptions.
> +ALTER PUBLICATION user1pub
> +   ADD TABLE user1schema.user2public;
> +ERROR:  must be owner of table user2public
> +ALTER PUBLICATION user1pub
> +   ADD TABLE user1schema.user2private, user1schema.user2public;
> +ERROR:  must be owner of table user2private
> +SELECT * FROM pg_catalog.pg_publication_tables
> +   WHERE pubname = 'user1pub';
> + pubname | schemaname | tablename
> +---------+------------+-----------
> +(0 rows)
> +
> +ALTER PUBLICATION user1pub ADD SCHEMA user1schema;
> +SELECT * FROM pg_catalog.pg_publication_tables
> +   WHERE pubname = 'user1pub';
> + pubname  | schemaname  |  tablename  
> +----------+-------------+--------------
> + user1pub | user1schema | user2private
> + user1pub | user1schema | user2public
> +(2 rows)
>
> It is a bit counterintuitive that schema owners do not have administrative privileges over tables within their schemas, but that's how it is.  The design of this patch seems to assume otherwise.  Perhaps ALTER PUBLICATION ... ADD SCHEMA should be restricted to superusers, just as FOR ALL TABLES?

Thanks for the comment, this is handled in the v20 patch attached at [1].

> Alternatively, you could add ownership checks per table to mirror the behavior of ALTER PUBLICATION ... ADD TABLE, but that would foreclose the option of automatically updating the list of tables in the publication as new tables are added to the schema, since those new tables would not necessarily belong to the schema owner, and having a error thrown during CREATE TABLE would be quite unfriendly.  I think until this is hammered out, it is safer to require superuser privileges and then we can revisit this issue and loosen the requirement in a subsequent commit.

I agree with Amit on this point for handling this as a separate patch, I have not made any change for this, kept the behavior as is. 

[1] - https://www.postgresql.org/message-id/CALDaNm00X9SQBokUTy1OxN1Sa2DFsK8rg8j_wLgc-7ZuKcuh0Q%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Aug 10, 2021 at 1:40 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Aug 6, 2021 at 6:32 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comments, the attached v19 patch has the fixes for the comments.
> >
>
> Some more review comments, this time for the v19 patch:
>
>
> (1) In patch v19-0002, there's still a couple of instances where it
> says "publication type" instead of "publication kind".

Modified

> (2) src/backend/catalog/pg_publication.c
>
> "This should only be used for normal publications."
>
> What exactly is meant by that - what type is considered normal? Maybe
> that comment should be more specific.

Modified

> (3) src/backend/catalog/pg_publication.c
> GetSchemaPublications
>
> Function header says "Gets list of publication oids for publications
> marked as FOR SCHEMA."
>
> Shouldn't it say something like: "Gets the list of FOR SCHEMA
> publication oids associated with a specified schema oid." or something
> like that?
> (since the function accepts a schemaid parameter)

Modfified

> (4) src/backend/commands/publicationcmds.c
>
> In AlterPublicationSchemas(), I notice that the two error cases
> "cannot be added to or dropped ..." don't check stmt->action for
> DEFELEM_ADD/DEFELEM_DROP.
> Is that OK? (i.e. should these cases error out if stmt->action is not
> DEFELEM_ADD/DEFELEM_DROP?)
> Also, I assume that the else part (not DEFELEM_ADD/DEFELEM_DROP) is
> the "Set" case? Maybe a comment should be added to the top of the else
> part.

The error message should also include set, I have modified the error message accordingly.

> (5) src/backend/commands/publicationcmds.c
> Typo (same in 2 places): "automaically" -> "automatically"
>
> +  * will be released automaically at the end of create publication
>
> See functions:
> (i) CreatePublication
> (ii) AlterPublicationSchemas

Modified.

> (6) src/backend/commands/publicationcmds.c
> LockSchemaList
>
> Function header says "are locked in ShareUpdateExclusiveLock mode" but
> then code calls LockDatabaseObject using "AccessShareLock".

Modified.

Thanks for the comments, these issues are fixed in the v20 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm00X9SQBokUTy1OxN1Sa2DFsK8rg8j_wLgc-7ZuKcuh0Q%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Aug 12, 2021 at 5:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Aug 6, 2021 at 5:33 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comments, the attached v19 patch has the fixes for the comments.
>
> Thank you for updating the patch!
>
> Here are some comments on v19 patch:
>
> +                case OCLASS_PUBLICATION_SCHEMA:
> +                        RemovePublicationSchemaById(object->objectId);
> +                        break;
>
> +void
> +RemovePublicationSchemaById(Oid psoid)
> +{
> +        Relation       rel;
> +        HeapTuple      tup;
> +
> +        rel = table_open(PublicationSchemaRelationId, RowExclusiveLock);
> +
> +        tup = SearchSysCache1(PUBLICATIONSCHEMA, ObjectIdGetDatum(psoid));
> +
> +        if (!HeapTupleIsValid(tup))
> +                elog(ERROR, "cache lookup failed for publication schema %u",
> +                         psoid);
> +
> +        CatalogTupleDelete(rel, &tup->t_self);
> +
> +        ReleaseSysCache(tup);
> +
> +        table_close(rel, RowExclusiveLock);
> +}
>
> Since RemovePublicationSchemaById() does simple catalog tuple
> deletion, it seems to me that we can DropObjectById() to delete the
> row of pg_publication_schema.

Relation cache invalidations were missing in the function, I have added and retained the function with invalidation changes.

>  ---
>          {
> -                ScanKeyInit(&key[0],
> +                ScanKeyData skey[1];
> +
> +                ScanKeyInit(&skey[0],
>                                          Anum_pg_class_relkind,
>                                          BTEqualStrategyNumber, F_CHAREQ,
>
> CharGetDatum(RELKIND_PARTITIONED_TABLE));
>
> -                scan = table_beginscan_catalog(classRel, 1, key);
> +                scan = table_beginscan_catalog(classRel, 1, skey);
>
> Since we specify 1 as the number of keys in table_beginscan_catalog(),
> can we reuse 'key' instead of using 'skey'?

Modified.

> ---
> Even if we drop all tables added to the publication from it, 'pubkind'
> doesn't go back to 'empty'. Is that intentional behavior? If we do
> that, we can save the lookup of pg_publication_rel and
> pg_publication_schema in some cases, and we can switch the publication
> that was created as FOR SCHEMA to FOR TABLE and vice versa.

I felt this can be handled as a separate patch as the same scenario applies for all tables publication too. Thoughts?

> ---
> +static void
> +UpdatePublicationKindTupleValue(Relation rel, HeapTuple tup, int col,
> +                                                                char pubkind)
>
> Since all callers of this function specify Anum_pg_publication_pubkind
> to 'col', it seems 'col' is not necessary.

Modified

> ---
> +static void
> +AlterPublicationSchemas(AlterPublicationStmt *stmt, Relation rel,
> +                                                HeapTuple tup,
> Form_pg_publication pubform)
>
> I think we don't need to pass 'pubform' to this function since we can
> get it by GETSTRUCT(tup).

Modified.

> ---
> +                Oid                    schemaId = get_rel_namespace(relid);
>                  List      *pubids = GetRelationPublications(relid);
> +                List      *schemaPubids = GetSchemaPublications(schemaId);
>
> Can we defer to get the list of schema publications (i.g.,
> 'schemaPubids') until we find the PUBKIND_SCHEMA publication? Perhaps
> the same is true for building 'pubids'.

I felt that we can get the publication information and use it whenever required instead of querying in the loop. Thoughts?

> ---
> +                                                   List of publications
> +        Name        |          Owner           | All tables | Inserts
> | Updates | Deletes | Truncates | Via root | PubKind
> +--------------------+--------------------------+------------+---------+---------+---------+-----------+----------+---------
> + testpib_ins_trunct | regress_publication_user | f          | t
> | f       | f       | f         | f        | e
> + testpub_default    | regress_publication_user | f          | f
> | t       | f       | f         | f        | e
>
> I think it's more readable if \dRp shows 'all tables', 'table',
> 'schema', and 'empty' in PubKind instead of the single character.

Modified

> I think 'Pub kind' is more consistent with other column names.

Modified

Thanks for the comments, these issues are fixed in the v20 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm00X9SQBokUTy1OxN1Sa2DFsK8rg8j_wLgc-7ZuKcuh0Q%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Aug 9, 2021 at 10:23 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Aug 8, 2021 at 2:52 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Fri, Aug 6, 2021 at 4:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Aug 6, 2021 at 2:02 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > > 6.
> > > > > + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */
> > > > > + PublicationSchemaPsnspcidPspubidIndexId,
> > > > > + 2,
> > > > > + {
> > > > > + Anum_pg_publication_schema_psnspcid,
> > > > > + Anum_pg_publication_schema_pspubid,
> > > > > + 0,
> > > > > + 0
> > > > > + },
> > > > >
> > > > > Why don't we keep pubid as the first column in this index?
> > > >
> > > > I wanted to keep it similar to PUBLICATIONRELMAP, should we keep it as
> > > > it is, thoughts?
> > > >
> > >
> > > Okay, I see your point. I think for PUBLICATIONRELMAP, we need it
> > > because it is searched using the only relid in
> > > GetRelationPublications, so, similarly, in the patch, you are using
> > > schema_oid in GetSchemaPublications, so probably that will help. I was
> > > wondering why you haven't directly used the cache in
> > > GetSchemaPublications similar to GetRelationPublications?
> >
> > Both of the approaches work, I was not sure which one is better, If
> > the approach in GetRelationPublications is better, I will change it to
> > something similar to GetRelationPublications. Thoughts?
> >
>
> I think it is better to use the cache as if we don't find the entry in
> the cache, then we will anyway search the required entry via sys table
> scan, see SearchCatCacheList.

Modified. This is handled in the v20 patch posted at [1].

I think the point I wanted to ensure was
> that a concurrent session won't blow up the entry while we are looking
> at it. How is that ensured?

The concurrency points occur at two places, Walsender session and user session:
For Walsender process when we query the data from the cache we will get the results based on historic snapshot. I also debugged and verified that we get the results based on historic snapshot, if we check the cache during our operation (insert is just before drop) we will be able to get the dropped record from the cache as this drop occurred after our insert. And if we query the cache after the drop, we will not get the dropped information from the cache. So I feel our existing code is good enough which handles the concurrency through the historic snapshot.
For user sessions, user session checks for replica identity for update/delete operations. To prevent concurrency issues, when schema is added to the publication, the rel cache invalidation happens in publication_add_schema by calling InvalidatePublicationRels, similarly when a schema is dropped from the publication, the rel cache invalidation is handled in RemovePublicationSchemaById by calling InvalidatePublicationRels. Once the invalidation happens it will check the system tables again before deciding. I felt this rel cache invalidation will prevent concurrency issues.

[1] - https://www.postgresql.org/message-id/CALDaNm00X9SQBokUTy1OxN1Sa2DFsK8rg8j_wLgc-7ZuKcuh0Q%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Aug 6, 2021 at 4:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 6, 2021 at 2:16 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Thu, Aug 5, 2021 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Few more comments:
> > > ===================
> > > 1. Do we need the previous column 'puballtables' after adding pubtype
> > > or pubkind in pg_publication?

Removed puballtables, this is handled as part of v20 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm00X9SQBokUTy1OxN1Sa2DFsK8rg8j_wLgc-7ZuKcuh0Q%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> I think the strict separation between publication-for-tables vs. 
> publication-for-schemas is a mistake.  Why can't I have a publication 
> that publishes tables t1, t2, t3, *and* schemas s1, s2, s3.  Also note 
> that we have a pending patch to add sequences support to logical 
> replication.  So eventually, a publication will be able to contain a 
> bunch of different objects of different kinds.

This seems like it's going to create a mess, because the meaning of
"include schema S" will change over time as we add more features.
That is, with the present patch (I suppose, haven't read it) we have
"schema S" meaning "publish all tables in schema S".  When that other
patch lands, presumably that same publication definition would start
meaning "publish all tables and sequences in schema S".  And a few years
down the road it might start meaning something else again.  That sounds
like the sort of potentially app-breaking change that we don't like
to make.

We could avoid that bug-in-waiting if the syntax were more like
"FOR ALL TABLES IN SCHEMA s", which could later extend to
"FOR ALL SEQUENCES IN SCHEMA s", etc.  This is then a very clean
intermediate step between publishing one table and "FOR ALL TABLES"
without a schema limitation.

I tend to agree that a single publication should be able to incorporate
any of these options.

            regards, tom lane



Re: Added schema level support for publication.

From
Masahiko Sawada
Date:
On Fri, Aug 13, 2021 at 11:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Aug 12, 2021 at 5:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Aug 6, 2021 at 5:33 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> >
> > ---
> > Even if we drop all tables added to the publication from it, 'pubkind'
> > doesn't go back to 'empty'. Is that intentional behavior? If we do
> > that, we can save the lookup of pg_publication_rel and
> > pg_publication_schema in some cases, and we can switch the publication
> > that was created as FOR SCHEMA to FOR TABLE and vice versa.
> >
>
> Do we really want to allow users to change a publication that is FOR
> SCHEMA to FOR TABLE? I see that we don't allow to do that FOR TABLES.
> postgres=# Alter Publication pub add table tbl1;
> ERROR:  publication "pub" is defined as FOR ALL TABLES
> DETAIL:  Tables cannot be added to or dropped from FOR ALL TABLES publications.

When it comes to FOR ALL TABLES, we can neither add/drop tables
to/from the publication. So it makes sense to me that it never changes
to 'empty'. I'm not sure there are use cases in practice where to
change FOR SCHEMA to FOR TABLE. But it seems a bit weird to me that
pubkind doesn't change to "empty" even if the publication actually
gets empty. It might be just that "empty" is misleading, though.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Added schema level support for publication.

From
Masahiko Sawada
Date:
On Sun, Aug 15, 2021 at 12:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> > I think the strict separation between publication-for-tables vs.
> > publication-for-schemas is a mistake.  Why can't I have a publication
> > that publishes tables t1, t2, t3, *and* schemas s1, s2, s3.  Also note
> > that we have a pending patch to add sequences support to logical
> > replication.  So eventually, a publication will be able to contain a
> > bunch of different objects of different kinds.
>
> This seems like it's going to create a mess, because the meaning of
> "include schema S" will change over time as we add more features.
> That is, with the present patch (I suppose, haven't read it) we have
> "schema S" meaning "publish all tables in schema S".  When that other
> patch lands, presumably that same publication definition would start
> meaning "publish all tables and sequences in schema S".  And a few years
> down the road it might start meaning something else again.  That sounds
> like the sort of potentially app-breaking change that we don't like
> to make.
>
> We could avoid that bug-in-waiting if the syntax were more like
> "FOR ALL TABLES IN SCHEMA s", which could later extend to
> "FOR ALL SEQUENCES IN SCHEMA s", etc.  This is then a very clean
> intermediate step between publishing one table and "FOR ALL TABLES"
> without a schema limitation.

+1

> I tend to agree that a single publication should be able to incorporate
> any of these options.

I personally prefer that a single publication can include both all
tables and all sequences in a database or a schema. It would be a more
convenient way to specify replicating all objects (tables and
sequences) in a database or a schema.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Added schema level support for publication.

From
Peter Smith
Date:
On Sun, Aug 15, 2021 at 1:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> > I think the strict separation between publication-for-tables vs.
> > publication-for-schemas is a mistake.  Why can't I have a publication
> > that publishes tables t1, t2, t3, *and* schemas s1, s2, s3.  Also note
> > that we have a pending patch to add sequences support to logical
> > replication.  So eventually, a publication will be able to contain a
> > bunch of different objects of different kinds.
>
> This seems like it's going to create a mess, because the meaning of
> "include schema S" will change over time as we add more features.
> That is, with the present patch (I suppose, haven't read it) we have
> "schema S" meaning "publish all tables in schema S".  When that other
> patch lands, presumably that same publication definition would start
> meaning "publish all tables and sequences in schema S".  And a few years
> down the road it might start meaning something else again.  That sounds
> like the sort of potentially app-breaking change that we don't like
> to make.
>
> We could avoid that bug-in-waiting if the syntax were more like
> "FOR ALL TABLES IN SCHEMA s", which could later extend to
> "FOR ALL SEQUENCES IN SCHEMA s", etc.  This is then a very clean
> intermediate step between publishing one table and "FOR ALL TABLES"
> without a schema limitation.
>
> I tend to agree that a single publication should be able to incorporate
> any of these options.
>

How about if the improved syntax from Tom Lane [1] also allowed an
"AND" keyword for combining whatever you wish?

Then the question from Peter E. [2] "Why can't I have a publication
that publishes tables t1, t2, t3, *and* schemas s1, s2, s3." would
have an intuitive solution like:

CREATE PUBLICATION pub1
FOR TABLE t1,t2,t3 AND
FOR ALL TABLES IN SCHEMA s1,s2,s3;

------
[1] https://www.postgresql.org/message-id/155565.1628954580%40sss.pgh.pa.us
[2] https://www.postgresql.org/message-id/4fb39707-dca9-1563-4482-b7a8315c36ca%40enterprisedb.com

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Added schema level support for publication.

From
Tom Lane
Date:
Peter Smith <smithpb2250@gmail.com> writes:
> Then the question from Peter E. [2] "Why can't I have a publication
> that publishes tables t1, t2, t3, *and* schemas s1, s2, s3." would
> have an intuitive solution like:

> CREATE PUBLICATION pub1
> FOR TABLE t1,t2,t3 AND
> FOR ALL TABLES IN SCHEMA s1,s2,s3;

That seems a bit awkward, since the existing precedent is
to use commas.  We shouldn't need more than one FOR noise-word,
either.  So I was imagining syntax more like, say,

    CREATE PUBLICATION pub1 FOR
      TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2,
          SEQUENCE seq1,seq2, ALL SEQUENCES IN SCHEMA s3,s4;

Abstractly it'd be

createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]

cpitem := ALL TABLES |
          TABLE name |
          ALL TABLES IN SCHEMA name |
          ALL SEQUENCES |
          SEQUENCE name |
          ALL SEQUENCES IN SCHEMA name |
          name

The grammar output would need some post-analysis to attribute the
right type to bare "name" items, but that doesn't seem difficult.

            regards, tom lane



Re: Added schema level support for publication.

From
Peter Smith
Date:
On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Peter Smith <smithpb2250@gmail.com> writes:
> > Then the question from Peter E. [2] "Why can't I have a publication
> > that publishes tables t1, t2, t3, *and* schemas s1, s2, s3." would
> > have an intuitive solution like:
>
> > CREATE PUBLICATION pub1
> > FOR TABLE t1,t2,t3 AND
> > FOR ALL TABLES IN SCHEMA s1,s2,s3;
>
> That seems a bit awkward, since the existing precedent is
> to use commas.  We shouldn't need more than one FOR noise-word,
> either.  So I was imagining syntax more like, say,

When I wrote that "AND" suggestion I had in mind that commas may get
weird if there were objects with keyword names. e.g. if there was a
schema called SEQUENCE and a SEQUENCE called  SEQUENCE then this will
be allowed.

CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA SEQUENCE, SEQUENCE SEQUENCE;

But probably I was just overthinking it.

>
>         CREATE PUBLICATION pub1 FOR
>           TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2,
>           SEQUENCE seq1,seq2, ALL SEQUENCES IN SCHEMA s3,s4;
>
> Abstractly it'd be
>
> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]
>
> cpitem := ALL TABLES |
>           TABLE name |
>           ALL TABLES IN SCHEMA name |
>           ALL SEQUENCES |
>           SEQUENCE name |
>           ALL SEQUENCES IN SCHEMA name |
>           name
>
> The grammar output would need some post-analysis to attribute the
> right type to bare "name" items, but that doesn't seem difficult.

That last bare "name" cpitem. looks like it would permit the following syntax:

CREATE PUBLICATION pub FOR a,b,c;

Was that intentional?

------
Kind Regards,
Peter Smith.
Fujitsu Australia.



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> >
> >         CREATE PUBLICATION pub1 FOR
> >           TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2,
> >           SEQUENCE seq1,seq2, ALL SEQUENCES IN SCHEMA s3,s4;
> >
> > Abstractly it'd be
> >
> > createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]
> >
> > cpitem := ALL TABLES |
> >           TABLE name |
> >           ALL TABLES IN SCHEMA name |
> >           ALL SEQUENCES |
> >           SEQUENCE name |
> >           ALL SEQUENCES IN SCHEMA name |
> >           name
> >
> > The grammar output would need some post-analysis to attribute the
> > right type to bare "name" items, but that doesn't seem difficult.
>
> That last bare "name" cpitem. looks like it would permit the following syntax:
>
> CREATE PUBLICATION pub FOR a,b,c;
>
> Was that intentional?
>

I think so. IIUC, the idea is that after parsing we find out whether
the given name is table, sequence, or any other object. Here, I think
the name could be either of table or sequence because, for schema, we
won't be knowing whether to include tables, sequences, or both in the
schema. Also, we can have the same name for schema and table so it
might be tricky to distinguish among those unless we give priority to
one of those.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Tom Lane
Date:
Amit Kapila <amit.kapila16@gmail.com> writes:
> On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
>> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Abstractly it'd be
>>>
>>> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]
>>>
>>> cpitem := ALL TABLES |
>>>     TABLE name |
>>>     ALL TABLES IN SCHEMA name |
>>>     ALL SEQUENCES |
>>>     SEQUENCE name |
>>>     ALL SEQUENCES IN SCHEMA name |
>>>     name
>>>
>>> The grammar output would need some post-analysis to attribute the
>>> right type to bare "name" items, but that doesn't seem difficult.

>> That last bare "name" cpitem. looks like it would permit the following syntax:
>> CREATE PUBLICATION pub FOR a,b,c;
>> Was that intentional?

> I think so.

I had supposed that we could throw an error at the post-processing stage,
or alternatively default to assuming that such names are tables.

Now you could instead make the grammar work like

cpitem := ALL TABLES |
      TABLE name [, ...] |
      ALL TABLES IN SCHEMA name [, ...] |
      ALL SEQUENCES |
      SEQUENCE name [, ...] |
      ALL SEQUENCES IN SCHEMA name [, ...]

which would result in a two-level-list data structure.  I'm not sure
that this is better, as any sort of mistake would result in a very
uninformative generic "syntax error" from Bison.  Errors out of a
post-processing stage could be more specific than that.

(Perhaps, though, we should *document* it like the latter way,
even if the actual implementation is more like the first way.)

            regards, tom lane



Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Aug 17, 2021 at 6:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> Abstractly it'd be
> >>>
> >>> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]
> >>>
> >>> cpitem := ALL TABLES |
> >>>     TABLE name |
> >>>     ALL TABLES IN SCHEMA name |
> >>>     ALL SEQUENCES |
> >>>     SEQUENCE name |
> >>>     ALL SEQUENCES IN SCHEMA name |
> >>>     name
> >>>
> >>> The grammar output would need some post-analysis to attribute the
> >>> right type to bare "name" items, but that doesn't seem difficult.
>
> >> That last bare "name" cpitem. looks like it would permit the following syntax:
> >> CREATE PUBLICATION pub FOR a,b,c;
> >> Was that intentional?
>
> > I think so.
>
> I had supposed that we could throw an error at the post-processing stage,
> or alternatively default to assuming that such names are tables.
>
> Now you could instead make the grammar work like
>
> cpitem := ALL TABLES |
>           TABLE name [, ...] |
>           ALL TABLES IN SCHEMA name [, ...] |
>           ALL SEQUENCES |
>           SEQUENCE name [, ...] |
>           ALL SEQUENCES IN SCHEMA name [, ...]
>
> which would result in a two-level-list data structure.  I'm not sure
> that this is better, as any sort of mistake would result in a very
> uninformative generic "syntax error" from Bison.  Errors out of a
> post-processing stage could be more specific than that.

I preferred the implementation in the lines Tom Lane had proposed at [1]. Is it ok if the implementation is something like below:
CreatePublicationStmt:
CREATE PUBLICATION name FOR pub_obj_list opt_definition
{
CreatePublicationStmt *n = makeNode(CreatePublicationStmt);
n->pubname = $3;
n->options = $6;
n->pubobjects = (List *)$5;
$$ = (Node *)n;
}
;
pub_obj_list: PublicationObjSpec
{ $$ = list_make1($1); }
| pub_obj_list ',' PublicationObjSpec
{ $$ = lappend($1, $3); }
;
/* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */
PublicationObjSpec: TABLE pubobj_expr
{ ....}
| ALL TABLES IN_P SCHEMA pubobj_expr
{ ....}
| pubobj_expr
{ ....}
;
pubobj_expr:
any_name
{ ....}
| any_name '*'
{ ....}
| ONLY any_name
{ ....}
| ONLY '(' any_name ')'
{ ....}
| CURRENT_SCHEMA
{ ....}
;

I needed pubobj_expr to support the existing syntaxes supported by relation_expr and also to handle CURRENT_SCHEMA support in case of the "FOR ALL TABLES IN SCHEMA" feature. I changed the name to any_name to support objects like "sch1.t1".
I felt if a user specified "FOR ALL TABLES", the user should not be allowed to combine it with "FOR TABLE" and "FOR ALL TABLES IN SCHEMA" as "FOR ALL TABLES" anyway will include all the tables.
Should we support the similar syntax in case of alter publication, like "ALTER PUBLICATION pub1 ADD TABLE t1,t2, ALL TABLES IN SCHEMA sch1, sch2" or shall we keep these separate like "ALTER PUBLICATION pub1 ADD TABLE t1, t2"  and "ALTER PUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1, sch2". I preferred to keep it separate as we have kept ADD/DROP separately which cannot be combined currently. I have not mentioned SEQUENCE handling separately, the sequence will be extended in similar lines. I thought of throwing an error at post processing, the first option that Tom Lane had suggested if the user specified syntax like: create publication pub1 for t1,t2;
Thoughts?

[1] - https://www.postgresql.org/message-id/877603.1629120678%40sss.pgh.pa.us

Regards,
Vignesh

Re: Added schema level support for publication.

From
Masahiko Sawada
Date:
On Mon, Aug 23, 2021 at 11:16 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, Aug 17, 2021 at 6:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > >> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >>> Abstractly it'd be
> > >>>
> > >>> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]
> > >>>
> > >>> cpitem := ALL TABLES |
> > >>>     TABLE name |
> > >>>     ALL TABLES IN SCHEMA name |
> > >>>     ALL SEQUENCES |
> > >>>     SEQUENCE name |
> > >>>     ALL SEQUENCES IN SCHEMA name |
> > >>>     name
> > >>>
> > >>> The grammar output would need some post-analysis to attribute the
> > >>> right type to bare "name" items, but that doesn't seem difficult.
> >
> > >> That last bare "name" cpitem. looks like it would permit the following syntax:
> > >> CREATE PUBLICATION pub FOR a,b,c;
> > >> Was that intentional?
> >
> > > I think so.
> >
> > I had supposed that we could throw an error at the post-processing stage,
> > or alternatively default to assuming that such names are tables.
> >
> > Now you could instead make the grammar work like
> >
> > cpitem := ALL TABLES |
> >           TABLE name [, ...] |
> >           ALL TABLES IN SCHEMA name [, ...] |
> >           ALL SEQUENCES |
> >           SEQUENCE name [, ...] |
> >           ALL SEQUENCES IN SCHEMA name [, ...]
> >
> > which would result in a two-level-list data structure.  I'm not sure
> > that this is better, as any sort of mistake would result in a very
> > uninformative generic "syntax error" from Bison.  Errors out of a
> > post-processing stage could be more specific than that.
>
> I preferred the implementation in the lines Tom Lane had proposed at [1]. Is it ok if the implementation is something
likebelow: 
> CreatePublicationStmt:
> CREATE PUBLICATION name FOR pub_obj_list opt_definition
> {
> CreatePublicationStmt *n = makeNode(CreatePublicationStmt);
> n->pubname = $3;
> n->options = $6;
> n->pubobjects = (List *)$5;
> $$ = (Node *)n;
> }
> ;
> pub_obj_list: PublicationObjSpec
> { $$ = list_make1($1); }
> | pub_obj_list ',' PublicationObjSpec
> { $$ = lappend($1, $3); }
> ;
> /* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */
> PublicationObjSpec: TABLE pubobj_expr
> { ....}
> | ALL TABLES IN_P SCHEMA pubobj_expr
> { ....}
> | pubobj_expr
> { ....}
> ;
> pubobj_expr:
> any_name
> { ....}
> | any_name '*'
> { ....}
> | ONLY any_name
> { ....}
> | ONLY '(' any_name ')'
> { ....}
> | CURRENT_SCHEMA
> { ....}
> ;

"FOR ALL TABLES” (that includes all tables in the database) is missing
in this syntax?

>
> I needed pubobj_expr to support the existing syntaxes supported by relation_expr and also to handle CURRENT_SCHEMA
supportin case of the "FOR ALL TABLES IN SCHEMA" feature. I changed the name to any_name to support objects like
"sch1.t1".

I think that relation_expr also accepts objects like "sch1.t1", no?

> I felt if a user specified "FOR ALL TABLES", the user should not be allowed to combine it with "FOR TABLE" and "FOR
ALLTABLES IN SCHEMA" as "FOR ALL TABLES" anyway will include all the tables. 

I think so too.

> Should we support the similar syntax in case of alter publication, like "ALTER PUBLICATION pub1 ADD TABLE t1,t2, ALL
TABLESIN SCHEMA sch1, sch2" or shall we keep these separate like "ALTER PUBLICATION pub1 ADD TABLE t1, t2"  and "ALTER
PUBLICATIONpub1 ADD ALL TABLES IN SCHEMA sch1, sch2". I preferred to keep it separate as we have kept ADD/DROP
separatelywhich cannot be combined currently. 

If we support the former syntax, the latter two syntaxes are also
supported. Why do we want to support only the latter separate two
syntaxes?

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Aug 25, 2021 at 1:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Aug 23, 2021 at 11:16 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, Aug 17, 2021 at 6:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >
> > > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > > On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > > >> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > >>> Abstractly it'd be
> > > >>>
> > > >>> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]
> > > >>>
> > > >>> cpitem := ALL TABLES |
> > > >>>     TABLE name |
> > > >>>     ALL TABLES IN SCHEMA name |
> > > >>>     ALL SEQUENCES |
> > > >>>     SEQUENCE name |
> > > >>>     ALL SEQUENCES IN SCHEMA name |
> > > >>>     name
> > > >>>
> > > >>> The grammar output would need some post-analysis to attribute the
> > > >>> right type to bare "name" items, but that doesn't seem difficult.
> > >
> > > >> That last bare "name" cpitem. looks like it would permit the following syntax:
> > > >> CREATE PUBLICATION pub FOR a,b,c;
> > > >> Was that intentional?
> > >
> > > > I think so.
> > >
> > > I had supposed that we could throw an error at the post-processing stage,
> > > or alternatively default to assuming that such names are tables.
> > >
> > > Now you could instead make the grammar work like
> > >
> > > cpitem := ALL TABLES |
> > >           TABLE name [, ...] |
> > >           ALL TABLES IN SCHEMA name [, ...] |
> > >           ALL SEQUENCES |
> > >           SEQUENCE name [, ...] |
> > >           ALL SEQUENCES IN SCHEMA name [, ...]
> > >
> > > which would result in a two-level-list data structure.  I'm not sure
> > > that this is better, as any sort of mistake would result in a very
> > > uninformative generic "syntax error" from Bison.  Errors out of a
> > > post-processing stage could be more specific than that.
> >
> > I preferred the implementation in the lines Tom Lane had proposed at [1]. Is it ok if the implementation is
somethinglike below: 
> > CreatePublicationStmt:
> > CREATE PUBLICATION name FOR pub_obj_list opt_definition
> > {
> > CreatePublicationStmt *n = makeNode(CreatePublicationStmt);
> > n->pubname = $3;
> > n->options = $6;
> > n->pubobjects = (List *)$5;
> > $$ = (Node *)n;
> > }
> > ;
> > pub_obj_list: PublicationObjSpec
> > { $$ = list_make1($1); }
> > | pub_obj_list ',' PublicationObjSpec
> > { $$ = lappend($1, $3); }
> > ;
> > /* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */
> > PublicationObjSpec: TABLE pubobj_expr
> > { ....}
> > | ALL TABLES IN_P SCHEMA pubobj_expr
> > { ....}
> > | pubobj_expr
> > { ....}
> > ;
> > pubobj_expr:
> > any_name
> > { ....}
> > | any_name '*'
> > { ....}
> > | ONLY any_name
> > { ....}
> > | ONLY '(' any_name ')'
> > { ....}
> > | CURRENT_SCHEMA
> > { ....}
> > ;
>
> "FOR ALL TABLES” (that includes all tables in the database) is missing
> in this syntax?

"FOR ALL TABLES" is present in CreatePublicationStmt rule, sorry for
not including all of CreatePublicationStmt rule in the previous mail,
I thought of keeping the contents shorter:

CreatePublicationStmt:
CREATE PUBLICATION name opt_definition
{....}
| CREATE PUBLICATION name FOR ALL TABLES opt_definition
{....}
| CREATE PUBLICATION name FOR pub_obj_list opt_definition
{....}
;

It is not in  pub_obj_list as the user will be able to specify either
of "FOR ALL TABLES" or "FOR TABLE/ FOR ALL TABLES IN SCHEMA" along
with create publication.

> >
> > I needed pubobj_expr to support the existing syntaxes supported by relation_expr and also to handle CURRENT_SCHEMA
supportin case of the "FOR ALL TABLES IN SCHEMA" feature. I changed the name to any_name to support objects like
"sch1.t1".
>
> I think that relation_expr also accepts objects like "sch1.t1", no?

Earlier syntax only supported relations, the relations were parsed
into RangeVar datatype. The new feature supports schema for which only
the schema name is required. To keep the parsing rule common, I used
any_name which will store the dotted name into a list for both
relation and schema. I will later convert it into rangevar for
relation and schema oid for schema names during the processing and
create the publication. I felt relation_expr was able to handle dotted
names because of qualified_name having "ColId indirection", here
indirection rule takes care of handling the dotted names.

> > I felt if a user specified "FOR ALL TABLES", the user should not be allowed to combine it with "FOR TABLE" and "FOR
ALLTABLES IN SCHEMA" as "FOR ALL TABLES" anyway will include all the tables. 
>
> I think so too.
>
> > Should we support the similar syntax in case of alter publication, like "ALTER PUBLICATION pub1 ADD TABLE t1,t2,
ALLTABLES IN SCHEMA sch1, sch2" or shall we keep these separate like "ALTER PUBLICATION pub1 ADD TABLE t1, t2"  and
"ALTERPUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1, sch2". I preferred to keep it separate as we have kept ADD/DROP
separatelywhich cannot be combined currently. 
>
> If we support the former syntax, the latter two syntaxes are also
> supported. Why do we want to support only the latter separate two
> syntaxes?

We can support either syntax, I was not sure which one is better. If
alter also should support similar syntax I can do it as a separate
patch so as to not increase the main patch size. Thoughts?

Attached v21 patch has the changes based on the new syntax and fixes
few of the other review comments provided by reviewers.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Sat, Aug 14, 2021 at 3:02 PM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
>
> On 13.08.21 04:59, Amit Kapila wrote:
> >> Even if we drop all tables added to the publication from it, 'pubkind'
> >> doesn't go back to 'empty'. Is that intentional behavior? If we do
> >> that, we can save the lookup of pg_publication_rel and
> >> pg_publication_schema in some cases, and we can switch the publication
> >> that was created as FOR SCHEMA to FOR TABLE and vice versa.
> >>
> > Do we really want to allow users to change a publication that is FOR
> > SCHEMA to FOR TABLE? I see that we don't allow to do that FOR TABLES.
> > postgres=# Alter Publication pub add table tbl1;
> > ERROR:  publication "pub" is defined as FOR ALL TABLES
> > DETAIL:  Tables cannot be added to or dropped from FOR ALL TABLES publications.
>
> I think the strict separation between publication-for-tables vs.
> publication-for-schemas is a mistake.  Why can't I have a publication
> that publishes tables t1, t2, t3, *and* schemas s1, s2, s3.  Also note
> that we have a pending patch to add sequences support to logical
> replication.  So eventually, a publication will be able to contain a
> bunch of different objects of different kinds.

Thanks for the feedback, now the same publication can handle schemas as well as tables, and can be extended further to other objects. This is handled in the v21 patch attached at [1]. It is supported by the following syntax:
CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA s1,s2,s3, TABLE t1,t2,t3;

[1] - https://www.postgresql.org/message-id/CALDaNm2iKJvSdCyh0S%2BwYgFjMNB4hu3kYjk%3DYrEkpqTJY9zW%2Bw%40mail.gmail.com

Regards,
Vignesh

RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Wednesday, August 25, 2021 5:37 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> Attached v21 patch has the changes based on the new syntax and fixes
> few of the other review comments provided by reviewers.
> 

Thanks for your new patch. I saw the following warning when building, please have a look.

publicationcmds.c: In function ‘ConvertPubObjSpecListToOidList’:
publicationcmds.c:212:23: warning: ‘prevobjtype’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    pubobj->pubobjtype = prevobjtype;
    ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~

Regards
Tang

Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Aug 25, 2021 at 3:07 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, Aug 25, 2021 at 1:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Aug 23, 2021 at 11:16 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Tue, Aug 17, 2021 at 6:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > >
> > > > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > > > On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > > > >> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > > >>> Abstractly it'd be
> > > > >>>
> > > > >>> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]
> > > > >>>
> > > > >>> cpitem := ALL TABLES |
> > > > >>>     TABLE name |
> > > > >>>     ALL TABLES IN SCHEMA name |
> > > > >>>     ALL SEQUENCES |
> > > > >>>     SEQUENCE name |
> > > > >>>     ALL SEQUENCES IN SCHEMA name |
> > > > >>>     name
> > > > >>>
> > > > >>> The grammar output would need some post-analysis to attribute the
> > > > >>> right type to bare "name" items, but that doesn't seem difficult.
> > > >
> > > > >> That last bare "name" cpitem. looks like it would permit the following syntax:
> > > > >> CREATE PUBLICATION pub FOR a,b,c;
> > > > >> Was that intentional?
> > > >
> > > > > I think so.
> > > >
> > > > I had supposed that we could throw an error at the post-processing stage,
> > > > or alternatively default to assuming that such names are tables.
> > > >
> > > > Now you could instead make the grammar work like
> > > >
> > > > cpitem := ALL TABLES |
> > > >           TABLE name [, ...] |
> > > >           ALL TABLES IN SCHEMA name [, ...] |
> > > >           ALL SEQUENCES |
> > > >           SEQUENCE name [, ...] |
> > > >           ALL SEQUENCES IN SCHEMA name [, ...]
> > > >
> > > > which would result in a two-level-list data structure.  I'm not sure
> > > > that this is better, as any sort of mistake would result in a very
> > > > uninformative generic "syntax error" from Bison.  Errors out of a
> > > > post-processing stage could be more specific than that.
> > >
> > > I preferred the implementation in the lines Tom Lane had proposed at [1]. Is it ok if the implementation is
somethinglike below: 
> > > CreatePublicationStmt:
> > > CREATE PUBLICATION name FOR pub_obj_list opt_definition
> > > {
> > > CreatePublicationStmt *n = makeNode(CreatePublicationStmt);
> > > n->pubname = $3;
> > > n->options = $6;
> > > n->pubobjects = (List *)$5;
> > > $$ = (Node *)n;
> > > }
> > > ;
> > > pub_obj_list: PublicationObjSpec
> > > { $$ = list_make1($1); }
> > > | pub_obj_list ',' PublicationObjSpec
> > > { $$ = lappend($1, $3); }
> > > ;
> > > /* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */
> > > PublicationObjSpec: TABLE pubobj_expr
> > > { ....}
> > > | ALL TABLES IN_P SCHEMA pubobj_expr
> > > { ....}
> > > | pubobj_expr
> > > { ....}
> > > ;
> > > pubobj_expr:
> > > any_name
> > > { ....}
> > > | any_name '*'
> > > { ....}
> > > | ONLY any_name
> > > { ....}
> > > | ONLY '(' any_name ')'
> > > { ....}
> > > | CURRENT_SCHEMA
> > > { ....}
> > > ;
> >
> > "FOR ALL TABLES” (that includes all tables in the database) is missing
> > in this syntax?
>
> "FOR ALL TABLES" is present in CreatePublicationStmt rule, sorry for
> not including all of CreatePublicationStmt rule in the previous mail,
> I thought of keeping the contents shorter:
>
> CreatePublicationStmt:
> CREATE PUBLICATION name opt_definition
> {....}
> | CREATE PUBLICATION name FOR ALL TABLES opt_definition
> {....}
> | CREATE PUBLICATION name FOR pub_obj_list opt_definition
> {....}
> ;
>
> It is not in  pub_obj_list as the user will be able to specify either
> of "FOR ALL TABLES" or "FOR TABLE/ FOR ALL TABLES IN SCHEMA" along
> with create publication.
>
> > >
> > > I needed pubobj_expr to support the existing syntaxes supported by relation_expr and also to handle
CURRENT_SCHEMAsupport in case of the "FOR ALL TABLES IN SCHEMA" feature. I changed the name to any_name to support
objectslike "sch1.t1". 
> >
> > I think that relation_expr also accepts objects like "sch1.t1", no?
>
> Earlier syntax only supported relations, the relations were parsed
> into RangeVar datatype. The new feature supports schema for which only
> the schema name is required. To keep the parsing rule common, I used
> any_name which will store the dotted name into a list for both
> relation and schema. I will later convert it into rangevar for
> relation and schema oid for schema names during the processing and
> create the publication. I felt relation_expr was able to handle dotted
> names because of qualified_name having "ColId indirection", here
> indirection rule takes care of handling the dotted names.
>
> > > I felt if a user specified "FOR ALL TABLES", the user should not be allowed to combine it with "FOR TABLE" and
"FORALL TABLES IN SCHEMA" as "FOR ALL TABLES" anyway will include all the tables. 
> >
> > I think so too.
> >
> > > Should we support the similar syntax in case of alter publication, like "ALTER PUBLICATION pub1 ADD TABLE t1,t2,
ALLTABLES IN SCHEMA sch1, sch2" or shall we keep these separate like "ALTER PUBLICATION pub1 ADD TABLE t1, t2"  and
"ALTERPUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1, sch2". I preferred to keep it separate as we have kept ADD/DROP
separatelywhich cannot be combined currently. 
> >
> > If we support the former syntax, the latter two syntaxes are also
> > supported. Why do we want to support only the latter separate two
> > syntaxes?
>
> We can support either syntax, I was not sure which one is better. If
> alter also should support similar syntax I can do it as a separate
> patch so as to not increase the main patch size. Thoughts?

I have implemented this in the 0003 patch, I have kept it separate to
reduce the testing effort and also it will be easier if someone
disagrees with the syntax. I will merge it to the main patch later
based on the feedback. Attached v22 patch has the changes for the
same.
Thoughts?

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Aug 26, 2021 at 7:52 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Wednesday, August 25, 2021 5:37 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v21 patch has the changes based on the new syntax and fixes
> > few of the other review comments provided by reviewers.
> >
>
> Thanks for your new patch. I saw the following warning when building, please have a look.
>
> publicationcmds.c: In function ‘ConvertPubObjSpecListToOidList’:
> publicationcmds.c:212:23: warning: ‘prevobjtype’ may be used uninitialized in this function [-Wmaybe-uninitialized]
>     pubobj->pubobjtype = prevobjtype;
>     ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~

Thanks for reporting this, I have fixed this in the v22 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm1rNDbGwoo0FC9vF1BmueUy__u1ZM5yYOjEQW1Of6zdWQ%40mail.gmail.com

Regards,
VIgnesh



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Fri, Aug 27, 2021 at 11:43 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, Aug 25, 2021 at 3:07 PM vignesh C <vignesh21@gmail.com> wrote:
>
> I have implemented this in the 0003 patch, I have kept it separate to
> reduce the testing effort and also it will be easier if someone
> disagrees with the syntax. I will merge it to the main patch later
> based on the feedback. Attached v22 patch has the changes for the
> same.
>

Few comments on v22-0001-Added-schema-level-support-for-publication:
========================================================
1. Why in publication_add_schema(), you are registering invalidation
for all schema relations? It seems this is to allow rebuilding the
publication info for decoding sessions. But that is not required as
you are registering rel_sync_cache_publication_cb for
publication_schema relation. In rel_sync_cache_publication_cb, we are
marking replicate_valid as false for each entry which will allow
publication info to be rebuilt in get_rel_sync_entry.

I see that it is done for a single relation in the current code in
function publication_add_relation but I guess that is also not
required. You can once test this. If you still think it is required,
can you please explain me why and then we can accordingly add some
comments in the patch.

Peter E., Sawada-San, can you please let me know if I am missing
something in this regard? In the code, I see a comment "/* Invalidate
relcache so that publication info is rebuilt. */" in function
publication_add_relation() but I don't understand why it is required
as per my explanation above?

2.
+ * Publication object type
+ */
+typedef enum PublicationObjSpecType
+{
+ PUBLICATIONOBJ_TABLE, /* Table type */
+ PUBLICATIONOBJ_SCHEMA, /* Schema type */
+ PUBLICATIONOBJ_SEQUENCE, /* Sequence type */

Why add anything related to the sequence in this patch?

3.
+get_object_address_publication_schema(List *object, bool missing_ok)
+{
+ ObjectAddress address;
+ char    *pubname;
+ Publication *pub;
+ char    *schemaname;
+ Oid schemaoid;
+
+ ObjectAddressSet(address, PublicationSchemaRelationId, InvalidOid);
+
+ /* Fetch schema name and publication name from input list */
+ schemaname = strVal(linitial(object));
+ pubname = strVal(lsecond(object));
+
+ schemaoid = get_namespace_oid(schemaname, false);
+
+ /* Now look up the pg_publication tuple */
+ pub = GetPublicationByName(pubname, missing_ok);
+ if (!pub)
+ return address;

Why the schema name handling is different from publication name? Why
can't we pass missing_ok for schema api and handle it similar
publication api?

4. In getPublicationSchemaInfo(), why the missing_ok flag is not used
in get_publication_name() whereas it is used for all other syscache
searches in that function?

5. Don't we need to expose a view for publication schemas similar to
pg_publication_tables?

6.
publication_add_schema()
{
..
+ /* Can't be system namespace */
+ if (IsCatalogNamespace(schemaoid) || IsToastNamespace(schemaoid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"%s\" is a system schema",
+ get_namespace_name(schemaoid)),
+ errdetail("System schemas cannot be added to publications.")));
+
+ /* Can't be temporary namespace */
+ if (isAnyTempNamespace(schemaoid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"%s\" is a temporary schema",
+ get_namespace_name(schemaoid)),
+ errdetail("Temporary schemas cannot be added to publications.")));
..
}

Can we change the first detail message as: "This operation is not
supported for system schemas." and the second detail message as:
"Temporary schemas cannot be replicated."? This is to make these
messages similar to corresponding messages for relations in function
check_publication_add_relation(). Can we move these checks to a
separate function?

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Aug 27, 2021 at 4:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 27, 2021 at 11:43 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, Aug 25, 2021 at 3:07 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have implemented this in the 0003 patch, I have kept it separate to
> > reduce the testing effort and also it will be easier if someone
> > disagrees with the syntax. I will merge it to the main patch later
> > based on the feedback. Attached v22 patch has the changes for the
> > same.
> >
>
> Few comments on v22-0001-Added-schema-level-support-for-publication:
> ========================================================
> 1. Why in publication_add_schema(), you are registering invalidation
> for all schema relations? It seems this is to allow rebuilding the
> publication info for decoding sessions. But that is not required as
> you are registering rel_sync_cache_publication_cb for
> publication_schema relation. In rel_sync_cache_publication_cb, we are
> marking replicate_valid as false for each entry which will allow
> publication info to be rebuilt in get_rel_sync_entry.
>
> I see that it is done for a single relation in the current code in
> function publication_add_relation but I guess that is also not
> required. You can once test this. If you still think it is required,
> can you please explain me why and then we can accordingly add some
> comments in the patch.

I felt this is required for handling the following concurrency scenario:
create schema sch1;
create table sch1.t1(c1 int);
insert into sch1.t1 values(10);
update sch1.t1 set c1 = 11;
# update will be successful and relation cache will update publication
actions based on the current state i.e no publication
create publication pub1 for all tables in schema sch1;
# After the above publication is created the relations present in this
schema should be invalidated so that the next update should fail. If
the relations are not invalidated the updates will be successful based
on previous publication actions.
update sch1.t1 set c1 = 11;
I will add comments to mention the above details. Thoughts?

> Peter E., Sawada-San, can you please let me know if I am missing
> something in this regard? In the code, I see a comment "/* Invalidate
> relcache so that publication info is rebuilt. */" in function
> publication_add_relation() but I don't understand why it is required
> as per my explanation above?
>
> 2.
> + * Publication object type
> + */
> +typedef enum PublicationObjSpecType
> +{
> + PUBLICATIONOBJ_TABLE, /* Table type */
> + PUBLICATIONOBJ_SCHEMA, /* Schema type */
> + PUBLICATIONOBJ_SEQUENCE, /* Sequence type */
>
> Why add anything related to the sequence in this patch?

I will handle this in my next version.

> 3.
> +get_object_address_publication_schema(List *object, bool missing_ok)
> +{
> + ObjectAddress address;
> + char    *pubname;
> + Publication *pub;
> + char    *schemaname;
> + Oid schemaoid;
> +
> + ObjectAddressSet(address, PublicationSchemaRelationId, InvalidOid);
> +
> + /* Fetch schema name and publication name from input list */
> + schemaname = strVal(linitial(object));
> + pubname = strVal(lsecond(object));
> +
> + schemaoid = get_namespace_oid(schemaname, false);
> +
> + /* Now look up the pg_publication tuple */
> + pub = GetPublicationByName(pubname, missing_ok);
> + if (!pub)
> + return address;
>
> Why the schema name handling is different from publication name? Why
> can't we pass missing_ok for schema api and handle it similar
> publication api?

I will handle this in my next version.

> 4. In getPublicationSchemaInfo(), why the missing_ok flag is not used
> in get_publication_name() whereas it is used for all other syscache
> searches in that function?

I will handle this in my next version.

> 5. Don't we need to expose a view for publication schemas similar to
> pg_publication_tables?

pg_publication_tables is a common view for both "FOR TABLE", "FOR ALL
TABLES" and "FOR ALL TABLES IN SCHEMA", this view will internally
access pg_publication_rel and pg_publication_schema to get the
corresponding tables. I felt we don't need a separate view for
publication schemas. Thoughts?

> 6.
> publication_add_schema()
> {
> ..
> + /* Can't be system namespace */
> + if (IsCatalogNamespace(schemaoid) || IsToastNamespace(schemaoid))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("\"%s\" is a system schema",
> + get_namespace_name(schemaoid)),
> + errdetail("System schemas cannot be added to publications.")));
> +
> + /* Can't be temporary namespace */
> + if (isAnyTempNamespace(schemaoid))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("\"%s\" is a temporary schema",
> + get_namespace_name(schemaoid)),
> + errdetail("Temporary schemas cannot be added to publications.")));
> ..
> }
>
> Can we change the first detail message as: "This operation is not
> supported for system schemas." and the second detail message as:
> "Temporary schemas cannot be replicated."? This is to make these
> messages similar to corresponding messages for relations in function
> check_publication_add_relation(). Can we move these checks to a
> separate function?

I will handle this in my next version.

Regards,
Vignesh



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Fri, Aug 27, 2021 at 6:09 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, Aug 27, 2021 at 4:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Aug 27, 2021 at 11:43 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Wed, Aug 25, 2021 at 3:07 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > I have implemented this in the 0003 patch, I have kept it separate to
> > > reduce the testing effort and also it will be easier if someone
> > > disagrees with the syntax. I will merge it to the main patch later
> > > based on the feedback. Attached v22 patch has the changes for the
> > > same.
> > >
> >
> > Few comments on v22-0001-Added-schema-level-support-for-publication:
> > ========================================================
> > 1. Why in publication_add_schema(), you are registering invalidation
> > for all schema relations? It seems this is to allow rebuilding the
> > publication info for decoding sessions. But that is not required as
> > you are registering rel_sync_cache_publication_cb for
> > publication_schema relation. In rel_sync_cache_publication_cb, we are
> > marking replicate_valid as false for each entry which will allow
> > publication info to be rebuilt in get_rel_sync_entry.
> >
> > I see that it is done for a single relation in the current code in
> > function publication_add_relation but I guess that is also not
> > required. You can once test this. If you still think it is required,
> > can you please explain me why and then we can accordingly add some
> > comments in the patch.
>
> I felt this is required for handling the following concurrency scenario:
> create schema sch1;
> create table sch1.t1(c1 int);
> insert into sch1.t1 values(10);
> update sch1.t1 set c1 = 11;
> # update will be successful and relation cache will update publication
> actions based on the current state i.e no publication
> create publication pub1 for all tables in schema sch1;
> # After the above publication is created the relations present in this
> schema should be invalidated so that the next update should fail.
>

What do you mean by update should fail? I think all the relations in
RelationSyncCache via rel_sync_cache_publication_cb because you have
updated pg_publication_schema and that should register syscache
invalidation.

> If
> the relations are not invalidated the updates will be successful based
> on previous publication actions.
> update sch1.t1 set c1 = 11;
>

I think even without special relcache invalidations the relations
should be invalidated because of syscache invalidation as mentioned in
the previous point. Am I missing something here?

>
> > 5. Don't we need to expose a view for publication schemas similar to
> > pg_publication_tables?
>
> pg_publication_tables is a common view for both "FOR TABLE", "FOR ALL
> TABLES" and "FOR ALL TABLES IN SCHEMA", this view will internally
> access pg_publication_rel and pg_publication_schema to get the
> corresponding tables. I felt we don't need a separate view for
> publication schemas. Thoughts?
>

Don't you think some users might want to know all the schema names for
a publication? I am not completely sure on this point but I think it
is good to have information for users. It might be also useful to have
pg_publication_objects where we can display object types (like table,
schema, sequence, etc) and then object names. If you are not convinced
then we can wait and see what others think about this.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Sat, Aug 28, 2021 at 3:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 27, 2021 at 6:09 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Fri, Aug 27, 2021 at 4:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Aug 27, 2021 at 11:43 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > On Wed, Aug 25, 2021 at 3:07 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > I have implemented this in the 0003 patch, I have kept it separate to
> > > > reduce the testing effort and also it will be easier if someone
> > > > disagrees with the syntax. I will merge it to the main patch later
> > > > based on the feedback. Attached v22 patch has the changes for the
> > > > same.
> > > >
> > >
> > > Few comments on v22-0001-Added-schema-level-support-for-publication:
> > > ========================================================
> > > 1. Why in publication_add_schema(), you are registering invalidation
> > > for all schema relations? It seems this is to allow rebuilding the
> > > publication info for decoding sessions. But that is not required as
> > > you are registering rel_sync_cache_publication_cb for
> > > publication_schema relation. In rel_sync_cache_publication_cb, we are
> > > marking replicate_valid as false for each entry which will allow
> > > publication info to be rebuilt in get_rel_sync_entry.
> > >
> > > I see that it is done for a single relation in the current code in
> > > function publication_add_relation but I guess that is also not
> > > required. You can once test this. If you still think it is required,
> > > can you please explain me why and then we can accordingly add some
> > > comments in the patch.
> >
> > I felt this is required for handling the following concurrency scenario:
> > create schema sch1;
> > create table sch1.t1(c1 int);
> > insert into sch1.t1 values(10);
> > update sch1.t1 set c1 = 11;
> > # update will be successful and relation cache will update publication
> > actions based on the current state i.e no publication
> > create publication pub1 for all tables in schema sch1;
> > # After the above publication is created the relations present in this
> > schema should be invalidated so that the next update should fail.
> >
>
> What do you mean by update should fail? I think all the relations in
> RelationSyncCache via rel_sync_cache_publication_cb because you have
> updated pg_publication_schema and that should register syscache
> invalidation.

By update should fail, I meant the updates without setting replica
identity before creating the decoding context. The scenario is like
below (all in the same session, the subscription is not created):
create schema sch1;
create table sch1.t1(c1 int);
insert into sch1.t1 values(10);
# Before updating we will check CheckCmdReplicaIdentity, as there are
no publications on this table rd_pubactions will be set accordingly in
relcache entry.
update sch1.t1 set c1 = 11;
# Now we will create the publication after rd_pubactions has been set
in the cache. Now when we create this publication we should invalidate
the relations present in the schema, this is required so that when the
next update happens, we should check the publication actions again in
CheckCmdReplicaIdentity and fail the update which does not  set
replica identity after the publication is created.
create publication pub1 for all tables in schema sch1;
# After the above publication is created the relations present in this
schema will be invalidated. Now we will check the publication actions
again in CheckCmdReplicaIdentity and the update will fail.
update sch1.t1 set c1 = 11;
ERROR:  cannot update table "t1" because it does not have a replica
identity and publishes updates
HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.

The rel_sync_cache_publication_cb invalidations are registered once
the decoding context is created. Since the decoding context is not
created at this point of time we should explicitly invalidate all the
relations present in this schema by calling InvalidatePublicationRels.

> > If
> > the relations are not invalidated the updates will be successful based
> > on previous publication actions.
> > update sch1.t1 set c1 = 11;
> >
>
> I think even without special relcache invalidations the relations
> should be invalidated because of syscache invalidation as mentioned in
> the previous point. Am I missing something here?

The rel_sync_cache_publication_cb invalidations are registered once
the decoding context is created. Since the decoding context is not
created at this point of time we should explicitly invalidate all the
relations present in this schema by calling InvalidatePublicationRels.
This is to prevent updates for which the replica identity is not set,
I have mentioned more details above.

> >
> > > 5. Don't we need to expose a view for publication schemas similar to
> > > pg_publication_tables?
> >
> > pg_publication_tables is a common view for both "FOR TABLE", "FOR ALL
> > TABLES" and "FOR ALL TABLES IN SCHEMA", this view will internally
> > access pg_publication_rel and pg_publication_schema to get the
> > corresponding tables. I felt we don't need a separate view for
> > publication schemas. Thoughts?
> >
>
> Don't you think some users might want to know all the schema names for
> a publication? I am not completely sure on this point but I think it
> is good to have information for users. It might be also useful to have
> pg_publication_objects where we can display object types (like table,
> schema, sequence, etc) and then object names. If you are not convinced
> then we can wait and see what others think about this.

Ok, I will add this view in the next version.

Regards,
Vignesh



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Fri, Aug 27, 2021 at 4:13 PM vignesh C <vignesh21@gmail.com> wrote:
>
> I have implemented this in the 0003 patch, I have kept it separate to
> reduce the testing effort and also it will be easier if someone
> disagrees with the syntax. I will merge it to the main patch later
> based on the feedback. Attached v22 patch has the changes for the
> same.

Just experimenting with the new syntax so far, and seeing some new
messages and docs, I have the following suggestions for improvements:

src/backend/commands/publicationcmds.c

(1)
BEFORE:
for table/for all tables in schema should be specified before the object
AFTER:
FOR TABLE / FOR ALL TABLES IN SCHEMA should be specified before the
table/schema name(s)

(2)
BEFORE:
Tables cannot be added, dropped or set on FOR ALL TABLES publications.
AFTER:
Tables cannot be added to, dropped from, or set on FOR ALL TABLES publications.

(3)
BEFORE:
Schemas cannot be added, dropped or set on FOR ALL TABLES publications.
AFTER:
Schemas cannot be added to, dropped from, or set on FOR ALL TABLES publications.


v22-0002

doc/src/sgml/ref/create_publication.sgml

(1)
BEFORE:
+   Create a publication that publishes all changes for users and departments
+   table and that publishes all changes for all the tables present in the
AFTER:
+   Create a publication that publishes all changes for tables "users" and
+   "departments" and that publishes all changes for all the tables
present in the


Regards,
Greg Nancarrow
Fujitsu Australia



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Friday, August 27, 2021 2:13 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> I have implemented this in the 0003 patch, I have kept it separate to reduce the
> testing effort and also it will be easier if someone disagrees with the syntax. I
> will merge it to the main patch later based on the feedback. Attached v22 patch
> has the changes for the same.
> Thoughts?

Hi,

Here are some comments for the new version patches.

About  0001
1)
+            rel->relpersistence = RELPERSISTENCE_PERMANENT;

It seems we don't need to set this since makeRangeVarFromNameList()
already set it.


2)
+            if (!relids || !schemarelids)
+                tables = list_concat(relids, schemarelids);
+            else
+                tables = list_concat_unique_oid(relids, schemarelids);
+        }

It seems we can simplify the above code like the following: 
tables = list_concat_unique_oid(relids, schemarelids);


3)
+        relids = GetPublicationRelations(pubform->oid,
+                                         PUBLICATION_PART_ALL);
+        schemarelids = GetAllSchemasPublicationRelations(pubform->oid,
+                                                         PUBLICATION_PART_ALL);
+        relids = list_concat(relids, schemarelids);

should we invoke list_concat_unique_oid here ?

4)

+                search_path = fetch_search_path(false);
+                if (search_path == NIL) /* nothing valid in search_path? */

It might be better to list_free(search_path) when not used.

5)
+            if (list_length(pubobj->name) == 1 &&
+                (strcmp(relname, "CURRENT_SCHEMA") == 0))
+                ereport(ERROR,
+                        errcode(ERRCODE_SYNTAX_ERROR),
+                        errmsg("invalid relation name at or near"),
+                        parser_errposition(pstate, pubobj->location));

Maybe we don't need this check, because it will report an error in
OpenTableList() anyway, "relation "CURRENT_SCHEMA" does not exist" , and that
message seems readable to me.


About  0002
6)

diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 0c84d87873..0a479dfe36 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -6,7 +6,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 32;
+use Test::More tests => 46;

I think it might be better to move these testcases create a separate perl file.

About  0003
7) 
The v22-0003 seems simple and can remove lots of code in patch v22-0001, so
maybe we can merge 0001 and 0003 into one patch ?

Best regards,
Hou zj

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Sun, Aug 29, 2021 at 7:22 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Sat, Aug 28, 2021 at 3:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Aug 27, 2021 at 6:09 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> >
> > What do you mean by update should fail? I think all the relations in
> > RelationSyncCache via rel_sync_cache_publication_cb because you have
> > updated pg_publication_schema and that should register syscache
> > invalidation.
>
> By update should fail, I meant the updates without setting replica
> identity before creating the decoding context. The scenario is like
> below (all in the same session, the subscription is not created):
> create schema sch1;
> create table sch1.t1(c1 int);
> insert into sch1.t1 values(10);
> # Before updating we will check CheckCmdReplicaIdentity, as there are
> no publications on this table rd_pubactions will be set accordingly in
> relcache entry.
> update sch1.t1 set c1 = 11;
> # Now we will create the publication after rd_pubactions has been set
> in the cache. Now when we create this publication we should invalidate
> the relations present in the schema, this is required so that when the
> next update happens, we should check the publication actions again in
> CheckCmdReplicaIdentity and fail the update which does not  set
> replica identity after the publication is created.
> create publication pub1 for all tables in schema sch1;
> # After the above publication is created the relations present in this
> schema will be invalidated. Now we will check the publication actions
> again in CheckCmdReplicaIdentity and the update will fail.
> update sch1.t1 set c1 = 11;
> ERROR:  cannot update table "t1" because it does not have a replica
> identity and publishes updates
> HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
>

Okay, I got it but let's add few comments in the code related to it.
Also, I noticed that the code in InvalidatePublicationRels() already
exists in AlterPublicationOptions(). You can try to refactor the
existing code as a separate initial patch.

BTW, I noticed that "for all tables", we don't register invalidations
in the above scenario, and then later that causes conflict on the
subscriber. I think that is a bug in the current code and we can deal
with that separately.

-- 
With Regards,
Amit Kapila.



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Friday, August 27, 2021 2:13 PM vignesh C <vignesh21@gmail.com> wrote:
>
> I have implemented this in the 0003 patch, I have kept it separate to
> reduce the testing effort and also it will be easier if someone
> disagrees with the syntax. I will merge it to the main patch later
> based on the feedback. Attached v22 patch has the changes for the
> same.
> Thoughts?
> 

Thanks for your new patch. Here are some suggestions:

1. 
If a publication published a table and the schema where the table belonged to, the
publication name would show twice when using '\d+' for the table.
Maybe we should add some check to avoid the duplication. Thought?

For example:

create schema sch1;
create table sch1.tbl(a int);
create publication pub for table sch1.tbl, all tables in schema sch1;

postgres=# \d+ sch1.tbl
                                            Table "sch1.tbl"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | integer |           |          |         | plain   |             |              |
Publications:
    "pub"
    "pub"
Access method: heap

2. doc/src/sgml/catalogs.sgml
@@ -6169,6 +6174,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
        publication instead of its own.
       </para></entry>
      </row>
+
     </tbody>
    </tgroup>
   </table>

It seems that we don't need this change.

3. src/bin/psql/tab-complete.c

+    /* Complete "CREATE PUBLICATION <name> FOR SCHEMA <schema>, ..." */
+    else if (HeadMatches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES", "IN", "SCHEMA"))
+        COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+                            " UNION SELECT 'CURRENT_SCHEMA' "
+                            "UNION SELECT 'WITH ('");

The comment should be updated to "FOR ALL TABLES IN SCHEMA".

Regards
Tang

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Fri, Aug 27, 2021 at 4:13 PM vignesh C <vignesh21@gmail.com> wrote:
>
> I have implemented this in the 0003 patch, I have kept it separate to
> reduce the testing effort and also it will be easier if someone
> disagrees with the syntax. I will merge it to the main patch later
> based on the feedback. Attached v22 patch has the changes for the
> same.

I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
TABLE sc1.test;"  maintains the table separately and results in the
following in the \dRp+ output:

Tables:
    "sc1.test"
Schemas:
    "sc1"

and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;"
still leaves the "sc1.test" table in the publication.

Is there a reason why we don't/can't support "ALTER SUBSCRIPTION ...
SET ALL TABLES;"?
(I know it wasn't supported before, but now "ALTER SUBSCRIPTION ...
SET ALL TABLES IN SCHEMA ..." is being supported)
I notice that the v22-0003 documentation updates for ALTER
SUBSCRIPTION are missing - but you're probably waiting on all feedback
before proceeding with that.


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Aug 27, 2021 at 4:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 27, 2021 at 11:43 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, Aug 25, 2021 at 3:07 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have implemented this in the 0003 patch, I have kept it separate to
> > reduce the testing effort and also it will be easier if someone
> > disagrees with the syntax. I will merge it to the main patch later
> > based on the feedback. Attached v22 patch has the changes for the
> > same.
> >
>
> Few comments on v22-0001-Added-schema-level-support-for-publication:
> ========================================================
> 1. Why in publication_add_schema(), you are registering invalidation
> for all schema relations? It seems this is to allow rebuilding the
> publication info for decoding sessions. But that is not required as
> you are registering rel_sync_cache_publication_cb for
> publication_schema relation. In rel_sync_cache_publication_cb, we are
> marking replicate_valid as false for each entry which will allow
> publication info to be rebuilt in get_rel_sync_entry.
>
> I see that it is done for a single relation in the current code in
> function publication_add_relation but I guess that is also not
> required. You can once test this. If you still think it is required,
> can you please explain me why and then we can accordingly add some
> comments in the patch.

Added a comment for this.

> 2.
> + * Publication object type
> + */
> +typedef enum PublicationObjSpecType
> +{
> + PUBLICATIONOBJ_TABLE, /* Table type */
> + PUBLICATIONOBJ_SCHEMA, /* Schema type */
> + PUBLICATIONOBJ_SEQUENCE, /* Sequence type */
>
> Why add anything related to the sequence in this patch?

This is removed.

> 3.
> +get_object_address_publication_schema(List *object, bool missing_ok)
> +{
> + ObjectAddress address;
> + char    *pubname;
> + Publication *pub;
> + char    *schemaname;
> + Oid schemaoid;
> +
> + ObjectAddressSet(address, PublicationSchemaRelationId, InvalidOid);
> +
> + /* Fetch schema name and publication name from input list */
> + schemaname = strVal(linitial(object));
> + pubname = strVal(lsecond(object));
> +
> + schemaoid = get_namespace_oid(schemaname, false);
> +
> + /* Now look up the pg_publication tuple */
> + pub = GetPublicationByName(pubname, missing_ok);
> + if (!pub)
> + return address;
>
> Why the schema name handling is different from publication name? Why
> can't we pass missing_ok for schema api and handle it similar
> publication api?

Modified to handle it similarly.

> 4. In getPublicationSchemaInfo(), why the missing_ok flag is not used
> in get_publication_name() whereas it is used for all other syscache
> searches in that function?

Modified it to use missing_ok.

> 5. Don't we need to expose a view for publication schemas similar to
> pg_publication_tables?

I have not handled in this version, I will do it in the next version.

> 6.
> publication_add_schema()
> {
> ..
> + /* Can't be system namespace */
> + if (IsCatalogNamespace(schemaoid) || IsToastNamespace(schemaoid))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("\"%s\" is a system schema",
> + get_namespace_name(schemaoid)),
> + errdetail("System schemas cannot be added to publications.")));
> +
> + /* Can't be temporary namespace */
> + if (isAnyTempNamespace(schemaoid))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("\"%s\" is a temporary schema",
> + get_namespace_name(schemaoid)),
> + errdetail("Temporary schemas cannot be added to publications.")));
> ..
> }
>
> Can we change the first detail message as: "This operation is not
> supported for system schemas." and the second detail message as:
> "Temporary schemas cannot be replicated."? This is to make these
> messages similar to corresponding messages for relations in function
> check_publication_add_relation(). Can we move these checks to a
> separate function?

Modified.

Attached v23 patch has the fixes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:


On Mon, Aug 30, 2021 at 8:23 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Aug 27, 2021 at 4:13 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have implemented this in the 0003 patch, I have kept it separate to
> > reduce the testing effort and also it will be easier if someone
> > disagrees with the syntax. I will merge it to the main patch later
> > based on the feedback. Attached v22 patch has the changes for the
> > same.
>
> Just experimenting with the new syntax so far, and seeing some new
> messages and docs, I have the following suggestions for improvements:
>
> src/backend/commands/publicationcmds.c
>
> (1)
> BEFORE:
> for table/for all tables in schema should be specified before the object
> AFTER:
> FOR TABLE / FOR ALL TABLES IN SCHEMA should be specified before the
> table/schema name(s)

Modified it to "FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before the table/schema name(s)", I felt we need not have space before and after "/", I had seen it is specified similarly in few other places.

> (2)
> BEFORE:
> Tables cannot be added, dropped or set on FOR ALL TABLES publications.
> AFTER:
> Tables cannot be added to, dropped from, or set on FOR ALL TABLES publications.

Modified

> (3)
> BEFORE:
> Schemas cannot be added, dropped or set on FOR ALL TABLES publications.
> AFTER:
> Schemas cannot be added to, dropped from, or set on FOR ALL TABLES publications.

Modified

> v22-0002
>
> doc/src/sgml/ref/create_publication.sgml
>
> (1)
> BEFORE:
> +   Create a publication that publishes all changes for users and departments
> +   table and that publishes all changes for all the tables present in the
> AFTER:
> +   Create a publication that publishes all changes for tables "users" and
> +   "departments" and that publishes all changes for all the tables
> present in the

Modified.

I have fixed these comments as part of v23 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw%40mail.gmail.com

Regards,
Vignesh

Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Aug 30, 2021 at 9:10 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Friday, August 27, 2021 2:13 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have implemented this in the 0003 patch, I have kept it separate to reduce the
> > testing effort and also it will be easier if someone disagrees with the syntax. I
> > will merge it to the main patch later based on the feedback. Attached v22 patch
> > has the changes for the same.
> > Thoughts?
>
> Hi,
>
> Here are some comments for the new version patches.
>
> About  0001
> 1)
> +                       rel->relpersistence = RELPERSISTENCE_PERMANENT;
>
> It seems we don't need to set this since makeRangeVarFromNameList()
> already set it.

Modified

> 2)
> +                       if (!relids || !schemarelids)
> +                               tables = list_concat(relids, schemarelids);
> +                       else
> +                               tables = list_concat_unique_oid(relids, schemarelids);
> +               }
>
> It seems we can simplify the above code like the following:
> tables = list_concat_unique_oid(relids, schemarelids);

Modified

> 3)
> +               relids = GetPublicationRelations(pubform->oid,
> +                                                                                PUBLICATION_PART_ALL);
> +               schemarelids = GetAllSchemasPublicationRelations(pubform->oid,
> +
PUBLICATION_PART_ALL);
> +               relids = list_concat(relids, schemarelids);
>
> should we invoke list_concat_unique_oid here ?

Modified

> 4)
>
> +                               search_path = fetch_search_path(false);
> +                               if (search_path == NIL) /* nothing valid in search_path? */
>
> It might be better to list_free(search_path) when not used.

Modified

> 5)
> +                       if (list_length(pubobj->name) == 1 &&
> +                               (strcmp(relname, "CURRENT_SCHEMA") == 0))
> +                               ereport(ERROR,
> +                                               errcode(ERRCODE_SYNTAX_ERROR),
> +                                               errmsg("invalid relation name at or near"),
> +                                               parser_errposition(pstate, pubobj->location));
>
> Maybe we don't need this check, because it will report an error in
> OpenTableList() anyway, "relation "CURRENT_SCHEMA" does not exist" , and that
> message seems readable to me.

Allowing CURRENT_SCHEMA is required to support current schema for
schema publications, currently I'm allowing this syntax during parsing
and this error is thrown for relations later, this is done to keep the
similar error as earlier before this feature support. I felt we can
keep it like this to maintain the similar error. Thoughts?

> About  0002
> 6)
>
> diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
> index 0c84d87873..0a479dfe36 100644
> --- a/src/test/subscription/t/001_rep_changes.pl
> +++ b/src/test/subscription/t/001_rep_changes.pl
> @@ -6,7 +6,7 @@ use strict;
>  use warnings;
>  use PostgresNode;
>  use TestLib;
> -use Test::More tests => 32;
> +use Test::More tests => 46;
>
> I think it might be better to move these testcases create a separate perl file.

Modified, added to 025_rep_changes_for_schema.pl

> About  0003
> 7)
> The v22-0003 seems simple and can remove lots of code in patch v22-0001, so
> maybe we can merge 0001 and 0003 into one patch ?

I agree that the code becomes simpler, it reduces a lot of code. I had
kept it like that as the testing effort might be more and also I was
waiting if there was no objection for that syntax from anyone else. I
will wait for a few more reviews and merge it to 0001 if there are no
objections.

I have fixed these comments as part of v23 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Aug 30, 2021 at 1:31 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Friday, August 27, 2021 2:13 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have implemented this in the 0003 patch, I have kept it separate to
> > reduce the testing effort and also it will be easier if someone
> > disagrees with the syntax. I will merge it to the main patch later
> > based on the feedback. Attached v22 patch has the changes for the
> > same.
> > Thoughts?
> >
>
> Thanks for your new patch. Here are some suggestions:
>
> 1.
> If a publication published a table and the schema where the table belonged to, the
> publication name would show twice when using '\d+' for the table.
> Maybe we should add some check to avoid the duplication. Thought?
>
> For example:
>
> create schema sch1;
> create table sch1.tbl(a int);
> create publication pub for table sch1.tbl, all tables in schema sch1;
>
> postgres=# \d+ sch1.tbl
>                                             Table "sch1.tbl"
>  Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
> --------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
>  a      | integer |           |          |         | plain   |             |              |
> Publications:
>     "pub"
>     "pub"
> Access method: heap

Modified

> 2. doc/src/sgml/catalogs.sgml
> @@ -6169,6 +6174,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
>         publication instead of its own.
>        </para></entry>
>       </row>
> +
>      </tbody>
>     </tgroup>
>    </table>
>
> It seems that we don't need this change.

Modified

> 3. src/bin/psql/tab-complete.c
>
> +       /* Complete "CREATE PUBLICATION <name> FOR SCHEMA <schema>, ..." */
> +       else if (HeadMatches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES", "IN", "SCHEMA"))
> +               COMPLETE_WITH_QUERY(Query_for_list_of_schemas
> +                                                       " UNION SELECT 'CURRENT_SCHEMA' "
> +                                                       "UNION SELECT 'WITH ('");
>
> The comment should be updated to "FOR ALL TABLES IN SCHEMA".

Modified.

I have fixed these comments as part of v23 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Aug 30, 2021 at 2:14 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Aug 27, 2021 at 4:13 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have implemented this in the 0003 patch, I have kept it separate to
> > reduce the testing effort and also it will be easier if someone
> > disagrees with the syntax. I will merge it to the main patch later
> > based on the feedback. Attached v22 patch has the changes for the
> > same.
>
> I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> TABLE sc1.test;"  maintains the table separately and results in the
> following in the \dRp+ output:
>
> Tables:
>     "sc1.test"
> Schemas:
>     "sc1"
>
> and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;"
> still leaves the "sc1.test" table in the publication.

I had intentionally implemented this way, the reason being it gives
the flexibility to modify the publications based on the way the
publication is created. My idea was that if a user specified a
table/schema of the same schema while creating the publication, the
user should be allowed to drop any of them at any time. In the above
case if we don't maintain the results separately, users will not be
able to drop the table from the publication at a later point of time.
Thoughts?

> Is there a reason why we don't/can't support "ALTER SUBSCRIPTION ...
> SET ALL TABLES;"?
> (I know it wasn't supported before, but now "ALTER SUBSCRIPTION ...
> SET ALL TABLES IN SCHEMA ..." is being supported)

I agree this can be implemented with the current design. I felt I can
work on getting the current patches into a committable shape and then
work on this after the current patch series is finished. Thoughts?

> I notice that the v22-0003 documentation updates for ALTER
> SUBSCRIPTION are missing - but you're probably waiting on all feedback
> before proceeding with that.

I have fixed this as part of v23 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Aug 30, 2021 at 12:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>
> Okay, I got it but let's add few comments in the code related to it.
> Also, I noticed that the code in InvalidatePublicationRels() already
> exists in AlterPublicationOptions(). You can try to refactor the
> existing code as a separate initial patch.

I have made these changes at the v23 patch attached at [1].

> BTW, I noticed that "for all tables", we don't register invalidations
> in the above scenario, and then later that causes conflict on the
> subscriber. I think that is a bug in the current code and we can deal
> with that separately.

I agree that the cache invalidation has been missed in case of "for all tables" publication, I have fixed these and posted a patch for the same at [2].

[1] - https://www.postgresql.org/message-id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CALDaNm0zkQznFrxzHBoWZUGsf%3DnKSxhEZZhZ1eTDWLpFok6zZw%40mail.gmail.com

Regards,
Vigneshhas

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Tue, Aug 31, 2021 at 1:41 PM vignesh C <vignesh21@gmail.com> wrote:
>
> >
> > I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> > TABLE sc1.test;"  maintains the table separately and results in the
> > following in the \dRp+ output:
> >
> > Tables:
> >     "sc1.test"
> > Schemas:
> >     "sc1"
> >
> > and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;"
> > still leaves the "sc1.test" table in the publication.
>
> I had intentionally implemented this way, the reason being it gives
> the flexibility to modify the publications based on the way the
> publication is created. My idea was that if a user specified a
> table/schema of the same schema while creating the publication, the
> user should be allowed to drop any of them at any time. In the above
> case if we don't maintain the results separately, users will not be
> able to drop the table from the publication at a later point of time.
> Thoughts?
>

Hmmm. I'm not sure it should work like that (but maybe I'm wrong -
what do others think???).
I thought that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
TABLE sc1.test;" should silently just ignore the "TABLE sc1.test"
part, as that is a table in schema sc1, so it's effectively a
duplicate.

Also, I noticed the following:

postgres=# CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
postgres-# TABLE sc1.test;
CREATE PUBLICATION
postgres=# \dRp+
                            Publication pub1
 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
-------+------------+---------+---------+---------+-----------+----------
 gregn | f          | t       | t       | t       | t         | f
Tables:
    "sc1.test"
Schemas:
    "sc1"

postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;
ALTER PUBLICATION
postgres=# \dRp+
                            Publication pub1
 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
-------+------------+---------+---------+---------+-----------+----------
 gregn | f          | t       | t       | t       | t         | f
Tables:
    "sc1.test"

postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;
ERROR:  schema "sc1" is not part of the publication


I think it seems odd to not drop table "sc1.test" from the publication
after "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;".
Also, after running that command again, it seems odd to report that
schema sc1 is not part of the publication, when there remains one
table from that schema in the publication.
And shouldn't it say "tables from schema ... are not part of the
publication" rather than "schema ... is not part of the publication"?
I think the former is better and more accurate. Schemas can contain
database objects other than tables.
Similarly, I'm also thinking that in the "\dRp+" output, it should say
"Tables from schemas:" instead of "Schemas:".


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Tue, Aug 31, 2021 at 10:50 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Tue, Aug 31, 2021 at 1:41 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > >
> > > I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> > > TABLE sc1.test;"  maintains the table separately and results in the
> > > following in the \dRp+ output:
> > >
> > > Tables:
> > >     "sc1.test"
> > > Schemas:
> > >     "sc1"
> > >
> > > and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;"
> > > still leaves the "sc1.test" table in the publication.
> >
> > I had intentionally implemented this way, the reason being it gives
> > the flexibility to modify the publications based on the way the
> > publication is created. My idea was that if a user specified a
> > table/schema of the same schema while creating the publication, the
> > user should be allowed to drop any of them at any time. In the above
> > case if we don't maintain the results separately, users will not be
> > able to drop the table from the publication at a later point of time.
> > Thoughts?
> >
>
> Hmmm. I'm not sure it should work like that (but maybe I'm wrong -
> what do others think???).
> I thought that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> TABLE sc1.test;" should silently just ignore the "TABLE sc1.test"
> part, as that is a table in schema sc1, so it's effectively a
> duplicate.
>

I find the way it is implemented to be more intuitive as that gives
users more flexibility to retain certain tables from the schema and
appears to be exactly what users intended by the command. I don't
think finding duplicates among different object lists (schema, table)
is a good idea because tomorrow for some other objects the same thing
can happen. It might be better to get some other opinions on this
matter though.

> Also, I noticed the following:
>
> postgres=# CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> postgres-# TABLE sc1.test;
> CREATE PUBLICATION
> postgres=# \dRp+
>                             Publication pub1
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> -------+------------+---------+---------+---------+-----------+----------
>  gregn | f          | t       | t       | t       | t         | f
> Tables:
>     "sc1.test"
> Schemas:
>     "sc1"
>
> postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;
> ALTER PUBLICATION
> postgres=# \dRp+
>                             Publication pub1
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> -------+------------+---------+---------+---------+-----------+----------
>  gregn | f          | t       | t       | t       | t         | f
> Tables:
>     "sc1.test"
>
> postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;
> ERROR:  schema "sc1" is not part of the publication
>

What will happen if you second time run the command as ALTER
PUBLICATION pub1 DROP Table sc1.test? If that works, I think the
behavior should be fine.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Tue, Aug 31, 2021 at 9:15 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Aug 30, 2021 at 12:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Okay, I got it but let's add few comments in the code related to it.
> > Also, I noticed that the code in InvalidatePublicationRels() already
> > exists in AlterPublicationOptions(). You can try to refactor the
> > existing code as a separate initial patch.
>
> I have made these changes at the v23 patch attached at [1].
>

*
+/*
+ * Invalidate the relations.
+ */
+static inline void
+InvalidatePublicationRels(List *relids)

I don't see the need to make this an inline function.

* On similar lines, the code in function
GetPubPartitionOptionRelations seems to be extracted from existing
function GetPublicationRelations(). Can't we move it into the 0001
patch?

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Aug 31, 2021 at 4:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Aug 31, 2021 at 10:50 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Tue, Aug 31, 2021 at 1:41 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > >
> > > > I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> > > > TABLE sc1.test;"  maintains the table separately and results in the
> > > > following in the \dRp+ output:
> > > >
> > > > Tables:
> > > >     "sc1.test"
> > > > Schemas:
> > > >     "sc1"
> > > >
> > > > and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;"
> > > > still leaves the "sc1.test" table in the publication.
> > >
> > > I had intentionally implemented this way, the reason being it gives
> > > the flexibility to modify the publications based on the way the
> > > publication is created. My idea was that if a user specified a
> > > table/schema of the same schema while creating the publication, the
> > > user should be allowed to drop any of them at any time. In the above
> > > case if we don't maintain the results separately, users will not be
> > > able to drop the table from the publication at a later point of time.
> > > Thoughts?
> > >
> >
> > Hmmm. I'm not sure it should work like that (but maybe I'm wrong -
> > what do others think???).
> > I thought that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> > TABLE sc1.test;" should silently just ignore the "TABLE sc1.test"
> > part, as that is a table in schema sc1, so it's effectively a
> > duplicate.
> >
>
> I find the way it is implemented to be more intuitive as that gives
> users more flexibility to retain certain tables from the schema and
> appears to be exactly what users intended by the command. I don't
> think finding duplicates among different object lists (schema, table)
> is a good idea because tomorrow for some other objects the same thing
> can happen. It might be better to get some other opinions on this
> matter though.
>
> > Also, I noticed the following:
> >
> > postgres=# CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> > postgres-# TABLE sc1.test;
> > CREATE PUBLICATION
> > postgres=# \dRp+
> >                             Publication pub1
> >  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> > -------+------------+---------+---------+---------+-----------+----------
> >  gregn | f          | t       | t       | t       | t         | f
> > Tables:
> >     "sc1.test"
> > Schemas:
> >     "sc1"
> >
> > postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;
> > ALTER PUBLICATION
> > postgres=# \dRp+
> >                             Publication pub1
> >  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> > -------+------------+---------+---------+---------+-----------+----------
> >  gregn | f          | t       | t       | t       | t         | f
> > Tables:
> >     "sc1.test"
> >
> > postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;
> > ERROR:  schema "sc1" is not part of the publication
> >
>
> What will happen if you second time run the command as ALTER
> PUBLICATION pub1 DROP Table sc1.test? If that works, I think the
> behavior should be fine.

Alter publication drop table works fine:
postgres=# ALTER PUBLICATION pub1 DROP table sc1.test ;
ALTER PUBLICATION
postgres=# \dRp+
                             Publication pub1
  Owner  | All tables | Inserts | Updates | Deletes | Truncates | Via root
---------+------------+---------+---------+---------+-----------+----------
 vignesh | f          | t       | t       | t       | t         | f
(1 row)

Regards,
Vignesh



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
From Mon, Aug 30, 2021 11:26 PM vignesh C <vignesh21@gmail.com> wrote:
> On Mon, Aug 30, 2021 at 9:10 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> >
> > 5)
> > +                       if (list_length(pubobj->name) == 1 &&
> > +                               (strcmp(relname, "CURRENT_SCHEMA") ==
> 0))
> > +                               ereport(ERROR,
> > +
> errcode(ERRCODE_SYNTAX_ERROR),
> > +                                               errmsg("invalid relation
> name at or near"),
> > +
> > + parser_errposition(pstate, pubobj->location));
> >
> > Maybe we don't need this check, because it will report an error in
> > OpenTableList() anyway, "relation "CURRENT_SCHEMA" does not exist" ,
> > and that message seems readable to me.
> 
> Allowing CURRENT_SCHEMA is required to support current schema for schema
> publications, currently I'm allowing this syntax during parsing and this error is
> thrown for relations later, this is done to keep the similar error as earlier before
> this feature support. I felt we can keep it like this to maintain the similar error.
> Thoughts?

Thanks for the explanation, I got the point.

Here are some other comments for v23-000x patches.

1)

@@ -6225,6 +6342,9 @@ describePublications(const char *pattern)
     bool        has_pubtruncate;
     bool        has_pubviaroot;
 
+    PQExpBufferData title;
+    printTableContent cont;
+
     if (pset.sversion < 100000)
     {
        ...
        PQExpBufferData title;
         printTableOpt myopt = pset.popt.topt;
         printTableContent cont;

Should we delete the inner declaration of 'title' and 'cont' ?

2)
-    /* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
+    /* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE/SCHEMA */

SCHEMA => ALL TABLES IN SCHEMA

3)

+                              .description = "PUBLICATION SCHEMA",
+                              .section = SECTION_POST_DATA,
+                              .createStmt = query->data));

Is it better to use something like 'PUBLICATION TABLES IN SCHEMA' to describe
the schema level table publication ? Because there could be some other type
publication such as 'ALL SEQUENCES IN SCHEMA' in the future, it will be better
to make it clear that we only publish table in schema in this patch.

Best regards,
Hou zj

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Tue, Aug 31, 2021 at 8:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I find the way it is implemented to be more intuitive as that gives
> users more flexibility to retain certain tables from the schema and
> appears to be exactly what users intended by the command. I don't
> think finding duplicates among different object lists (schema, table)
> is a good idea because tomorrow for some other objects the same thing
> can happen. It might be better to get some other opinions on this
> matter though.
>

I think that such functionality needs to be clearly documented (but
currently the documentation doesn't sufficiently explain it).
Yes, I would definitely like to hear other opinions on this.

Note also that currently parts of the documentation are still
referring to "ADD SCHEMA/DROP SCHEMA/SET SCHEMA" instead of the new
syntax "ADD ALL TABLES IN SCHEMA/DROP ALL TABLES IN SCHEMA/SET ALL
TABLES IN SCHEMA":

e.g.
v23-0003:
doc/src/sgml/ref/alter_publication.sgml
+   The fourth, fifth and sixth variants of this command change which schemas
+   are part of the publication.  The <literal>SET SCHEMA</literal> clause will
+   replace the list of schemas in the publication with the specified one.
+   The <literal>ADD SCHEMA</literal> and <literal>DROP SCHEMA</literal> clauses
+   will add and remove one or more schemas from the publication.  Note that
+   adding schemas to a publication that is already subscribed to will require
+   a <literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on
+   the subscribing side in order to become effective.

With the new syntax changes (SCHEMA -> ALL TABLES IN SCHEMA), it seems
less intuitive that there are separate schema and table operations on
the publication.

I'd expect a lot of users to naturally think that "ALTER PUBLICATION
pub1 DROP ALL TABLES IN SCHEMA sc1;" would drop from the publication
all tables that are in schema "sc1", which is not what it is currently
doing.
Since the syntax was changed to specifically refer to FOR ALL TABLES
IN SCHEMA rather than FOR SCHEMA, then now it's clear we're referring
to tables only, when specifying "... FOR ALL TABLES in sc1, TABLE
sc1.test", so IMHO it's reasonable to remove duplicates here, rather
than treating these as somehow separate ways of referencing the same
table.

One thing the current scheme doesn't allow is to publish all tables in
a schema except for certain table(s) - and you can't achieve that by
adding all tables from a schema to the publication and then
selectively dropping some of those tables. I thought that this would
be a more common pattern than adding separate tables from schemas that
are already included as part of the publication, in order to "retain"
them if "all tables from schema ..." are later dropped.

postgres=# create schema sc1;
CREATE SCHEMA
postgres=# create table sc1.test(i int);
CREATE TABLE
postgres=# create publication pub1 for all tables in schema sc1;
CREATE PUBLICATION
postgres=# \dRp+

                            Publication pub1
 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
-------+------------+---------+---------+---------+-----------+----------
 gregn | f          | t       | t       | t       | t         | f
Schemas:
    "sc1"

postgres=# alter publication pub1 drop table sc1.test;
ERROR:  relation "test" is not part of the publication

The above error message seems slightly misleading (as that table IS
published by that publication) and also note the relation is not
schema-qualified.



Regards,
Greg Nancarrow
Fujitsu Australia



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
> On Monday, August 30, 2021 11:28 PM vignesh C <vignesh21@gmail.com> wrote:
>
> I have fixed these comments as part of v23 patch attached at [1].
> [1] - https://www.postgresql.org/message-
> id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw
> %40mail.gmail.com
> 

Thanks for your new patch. Here are some comments on v23 patch. 

1. doc/src/sgml/ref/alter_publication.sgml
+  <para>
+   Add some schemas to the publication:
+<programlisting>
+ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing_june, sales_june;
+</programlisting>
+  </para>

This change seems to be added twice, both 0003 and 0004 patch have this change.

2. src/sgml/ref/create_publication.sgml
There is the following description about "FOR TABLE" parameter:
      Only persistent base tables and partitioned tables can be part of a
      publication.  Temporary tables, unlogged tables, foreign tables,
      materialized views, and regular views cannot be part of a publication.

"FOR ALL TABLES IN SCHEMA" parameter also have restrictions, should we add
some doc description for it?

3. When using '\dn+', I noticed that the list of publications only contains the
publications for "SCHEMA", "FOR ALL TABLES" publications are not shown. Is it designed on purpose?
(The result of '\d+' lists the publications of "SCHEAME" and "FOR ALL TABLES").

For example:
create schema sch1;
create table sch1.tbl(a int);
create publication pub_schema for all tables in schema sch1;
create publication pub_all_tables for all tables;

postgres=# \d+ sch1.tbl
                                            Table "sch1.tbl"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | integer |           |          |         | plain   |             |              |
Publications:
    "pub_all_tables"
    "pub_schema"
Access method: heap

postgres=# \dn+ sch1
                  List of schemas
 Name |  Owner   | Access privileges | Description
------+----------+-------------------+-------------
 sch1 | postgres |                   |
Publications:
    "pub_schema"

Regards
Tang

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Sep 1, 2021 at 8:52 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Tue, Aug 31, 2021 at 8:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > I find the way it is implemented to be more intuitive as that gives
> > users more flexibility to retain certain tables from the schema and
> > appears to be exactly what users intended by the command. I don't
> > think finding duplicates among different object lists (schema, table)
> > is a good idea because tomorrow for some other objects the same thing
> > can happen. It might be better to get some other opinions on this
> > matter though.
> >
>
> I think that such functionality needs to be clearly documented (but
> currently the documentation doesn't sufficiently explain it).
>

I agree that we should document it clearly if we decide to go in this direction.

> Yes, I would definitely like to hear other opinions on this.
>
> Note also that currently parts of the documentation are still
> referring to "ADD SCHEMA/DROP SCHEMA/SET SCHEMA" instead of the new
> syntax "ADD ALL TABLES IN SCHEMA/DROP ALL TABLES IN SCHEMA/SET ALL
> TABLES IN SCHEMA":
>
> e.g.
> v23-0003:
> doc/src/sgml/ref/alter_publication.sgml
> +   The fourth, fifth and sixth variants of this command change which schemas
> +   are part of the publication.  The <literal>SET SCHEMA</literal> clause will
> +   replace the list of schemas in the publication with the specified one.
> +   The <literal>ADD SCHEMA</literal> and <literal>DROP SCHEMA</literal> clauses
> +   will add and remove one or more schemas from the publication.  Note that
> +   adding schemas to a publication that is already subscribed to will require
> +   a <literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on
> +   the subscribing side in order to become effective.
>
> With the new syntax changes (SCHEMA -> ALL TABLES IN SCHEMA), it seems
> less intuitive that there are separate schema and table operations on
> the publication.
>

I think the documentation could be improved w.r.t above points because
it is quite clear that we support schema and tables separately.


> I'd expect a lot of users to naturally think that "ALTER PUBLICATION
> pub1 DROP ALL TABLES IN SCHEMA sc1;" would drop from the publication
> all tables that are in schema "sc1", which is not what it is currently
> doing.
> Since the syntax was changed to specifically refer to FOR ALL TABLES
> IN SCHEMA rather than FOR SCHEMA, then now it's clear we're referring
> to tables only, when specifying "... FOR ALL TABLES in sc1, TABLE
> sc1.test", so IMHO it's reasonable to remove duplicates here, rather
> than treating these as somehow separate ways of referencing the same
> table.
>

I see your point and if we decide to go this path then it is better to
give an error than silently removing duplicates.

> One thing the current scheme doesn't allow is to publish all tables in
> a schema except for certain table(s)
>

True, we can always support that as a separate feature. Note that same
is true for existing FOR ALL TABLES syntax where users could expect to
leave few tables with syntax like FOR ALL TABLES EXCEPT t1,t2,... but
we don't have that yet.

> - and you can't achieve that by
> adding all tables from a schema to the publication and then
> selectively dropping some of those tables. I thought that this would
> be a more common pattern than adding separate tables from schemas that
> are already included as part of the publication, in order to "retain"
> them if "all tables from schema ..." are later dropped.
>
> postgres=# create schema sc1;
> CREATE SCHEMA
> postgres=# create table sc1.test(i int);
> CREATE TABLE
> postgres=# create publication pub1 for all tables in schema sc1;
> CREATE PUBLICATION
> postgres=# \dRp+
>
>                             Publication pub1
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> -------+------------+---------+---------+---------+-----------+----------
>  gregn | f          | t       | t       | t       | t         | f
> Schemas:
>     "sc1"
>
> postgres=# alter publication pub1 drop table sc1.test;
> ERROR:  relation "test" is not part of the publication
>
> The above error message seems slightly misleading (as that table IS
> published by that publication) and also note the relation is not
> schema-qualified.
>

I think we should try to improve this message, maybe we can give
detail message similar to what we give For ALL Tables case (DETAIL:
Tables that are part of SCHEMA cannot be dropped independently).

--
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Peter Smith
Date:
On Tue, Aug 31, 2021 at 1:41 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Aug 30, 2021 at 2:14 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Fri, Aug 27, 2021 at 4:13 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > I have implemented this in the 0003 patch, I have kept it separate to
> > > reduce the testing effort and also it will be easier if someone
> > > disagrees with the syntax. I will merge it to the main patch later
> > > based on the feedback. Attached v22 patch has the changes for the
> > > same.
> >
> > I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1,
> > TABLE sc1.test;"  maintains the table separately and results in the
> > following in the \dRp+ output:
> >
> > Tables:
> >     "sc1.test"
> > Schemas:
> >     "sc1"
> >
> > and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;"
> > still leaves the "sc1.test" table in the publication.
>
> I had intentionally implemented this way, the reason being it gives
> the flexibility to modify the publications based on the way the
> publication is created. My idea was that if a user specified a
> table/schema of the same schema while creating the publication, the
> user should be allowed to drop any of them at any time. In the above
> case if we don't maintain the results separately, users will not be
> able to drop the table from the publication at a later point of time.
> Thoughts?

I think ALL should mean ALL.

When you say CREATE PUBLICATION pub1 FOR ALL TABLES; you know it means
ALL tables;

When you say CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1; you
know it means ALL tables in the schema sc1;

Similarly, when you say ALTER PUBLICATION pub1 DROP ALL TABLES IN
SCHEMA sc1; I would expect it means to drop ALL tables in sc1 - not
all of them sometimes but not all of them at other times or even none
of them.

~~

I thought a motivation for this patch was to make it easy for the user
to specify tables en-masse. e.g, saying FOR ALL TABLES IN SCHEMA sc1
is a convenience instead of having to specify every schema table
explicitly.

e.g. What if there are 100s of tables explicitly in a publication.
1. CREATE PUBLICATION pub FOR TABLE sc1.t1,sc1,t2,sc1.t3,....,sc1.t999;
2. ALTER PUBLICATION pub DROP ALL TABLES IN SCHEMA sc1;

IIUC the current patch behaviour for step 2 will do nothing at all.
That doesn't seem right to me. Where is the user-convenience factor
for dropping tables here?

~~

If the rule was simply "ALL means ALL" that hardly even needs
documentation to describe it. OTOH, current patch behaviour is quirky
wrt how the publication was created and would need to be carefully
documented.

It is easy to imagine a user unfamiliar with how the publication was
originally created will be confused when ALTER PUBLICATION DROP ALL
TABLES IN SCHEMA sc1 still leaves some sc1 tables lurking.

~~

Schema objects are not part of the publication. Current only TABLES
are in publications, so I thought that \dRp+ output would just be the
of "Tables" in the publication. Schemas would not even be displayed at
all (except in the table name).

Consider that everything is just going to get messier when SEQUENCES
are added. If you list Schemas like is done now then what's that going
to look like later? I think you'd need to display many lists like -
"Tables" and "Tables for Schemas" and "Sequences" and "Sequences for
Schema"...

IMO it's all beginning to sound like it would become overly complex

~~

For all the above reasons I think ALL should mean ALL, and also \dRp+
should display tables only. (and later table and sequences only)

YMMV.

------
Kind Regards,
Peter Smith
Fujitsu Australia



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Sep 1, 2021 at 12:06 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Similarly, when you say ALTER PUBLICATION pub1 DROP ALL TABLES IN
> SCHEMA sc1; I would expect it means to drop ALL tables in sc1 - not
> all of them sometimes but not all of them at other times or even none
> of them.
>
> ~~
>
> I thought a motivation for this patch was to make it easy for the user
> to specify tables en-masse. e.g, saying FOR ALL TABLES IN SCHEMA sc1
> is a convenience instead of having to specify every schema table
> explicitly.
>
> e.g. What if there are 100s of tables explicitly in a publication.
> 1. CREATE PUBLICATION pub FOR TABLE sc1.t1,sc1,t2,sc1.t3,....,sc1.t999;
> 2. ALTER PUBLICATION pub DROP ALL TABLES IN SCHEMA sc1;
>
> IIUC the current patch behaviour for step 2 will do nothing at all.
> That doesn't seem right to me. Where is the user-convenience factor
> for dropping tables here?
>


It will give an error (ERROR:  schema "sc1" is not part of the
publication), we can probably add DETAIL and HINT message (like try
dropping using DROP TABLE syntax) indicating the reason and what user
need to do in this regard. I think if the user has specified tables
individually instead of ALL TABLES IN SCHEMA, she should drop them
individually rather than expecting them to be dropped via SCHEMA
command. I think in a similar context you can also argue that we
should have DROP ALL TABLES syntax to drop all the tables that were
individually specified by the user with FOR TABLE command.

I think this should be documented as well to avoid any confusion.

> ~~
>
> If the rule was simply "ALL means ALL" that hardly even needs
> documentation to describe it. OTOH, current patch behaviour is quirky
> wrt how the publication was created and would need to be carefully
> documented.
>
> It is easy to imagine a user unfamiliar with how the publication was
> originally created will be confused when ALTER PUBLICATION DROP ALL
> TABLES IN SCHEMA sc1 still leaves some sc1 tables lurking.
>
> ~~
>
> Schema objects are not part of the publication. Current only TABLES
> are in publications, so I thought that \dRp+ output would just be the
> of "Tables" in the publication. Schemas would not even be displayed at
> all (except in the table name).
>

Hmm, I think this will lead to a lot of table names in output. I think
displaying schema names separately is a better approach here.

> Consider that everything is just going to get messier when SEQUENCES
> are added. If you list Schemas like is done now then what's that going
> to look like later? I think you'd need to display many lists like -
> "Tables" and "Tables for Schemas" and "Sequences" and "Sequences for
> Schema"...
>

I think that would be better than displaying all the tables and
sequences as that will be very difficult for users to view/read.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Aug 31, 2021 at 6:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> >
> > I have made these changes at the v23 patch attached at [1].
> >
>
> *
> +/*
> + * Invalidate the relations.
> + */
> +static inline void
> +InvalidatePublicationRels(List *relids)
>
> I don't see the need to make this an inline function.

Modified

> * On similar lines, the code in function
> GetPubPartitionOptionRelations seems to be extracted from existing
> function GetPublicationRelations(). Can't we move it into the 0001
> patch?

Modified

Thanks for the comments, the attached v24 patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Sat, Aug 28, 2021 at 3:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Don't you think some users might want to know all the schema names for
> a publication? I am not completely sure on this point but I think it
> is good to have information for users. It might be also useful to have
> pg_publication_objects where we can display object types (like table,
> schema, sequence, etc) and then object names. If you are not convinced
> then we can wait and see what others think about this.

Thanks for the comment, this is handled in the v24 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm27bs40Rxpy4oKfV97UgsPG%3DvVoZ5bj9pP_4BxnO-6DYA%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Sep 1, 2021 at 6:58 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> Here are some other comments for v23-000x patches.
>
> 1)
>
> @@ -6225,6 +6342,9 @@ describePublications(const char *pattern)
>         bool            has_pubtruncate;
>         bool            has_pubviaroot;
>
> +       PQExpBufferData title;
> +       printTableContent cont;
> +
>         if (pset.sversion < 100000)
>         {
>                 ...
>                 PQExpBufferData title;
>                 printTableOpt myopt = pset.popt.topt;
>                 printTableContent cont;
>
> Should we delete the inner declaration of 'title' and 'cont' ?

Modified

> 2)
> -       /* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
> +       /* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE/SCHEMA */
>
> SCHEMA => ALL TABLES IN SCHEMA

Modified

> 3)
>
> +                                                         .description = "PUBLICATION SCHEMA",
> +                                                         .section = SECTION_POST_DATA,
> +                                                         .createStmt = query->data));
>
> Is it better to use something like 'PUBLICATION TABLES IN SCHEMA' to describe
> the schema level table publication ? Because there could be some other type
> publication such as 'ALL SEQUENCES IN SCHEMA' in the future, it will be better
> to make it clear that we only publish table in schema in this patch.

Modified

Thanks for the comments, the v24 patch attached at [1] handles the comments.
[1] - https://www.postgresql.org/message-id/CALDaNm27bs40Rxpy4oKfV97UgsPG%3DvVoZ5bj9pP_4BxnO-6DYA%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Sep 1, 2021 at 11:14 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> > On Monday, August 30, 2021 11:28 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have fixed these comments as part of v23 patch attached at [1].
> > [1] - https://www.postgresql.org/message-
> > id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw
> > %40mail.gmail.com
> >
>
> Thanks for your new patch. Here are some comments on v23 patch.
>
> 1. doc/src/sgml/ref/alter_publication.sgml
> +  <para>
> +   Add some schemas to the publication:
> +<programlisting>
> +ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing_june, sales_june;
> +</programlisting>
> +  </para>
>
> This change seems to be added twice, both 0003 and 0004 patch have this change.

Modified

> 2. src/sgml/ref/create_publication.sgml
> There is the following description about "FOR TABLE" parameter:
>       Only persistent base tables and partitioned tables can be part of a
>       publication.  Temporary tables, unlogged tables, foreign tables,
>       materialized views, and regular views cannot be part of a publication.
>
> "FOR ALL TABLES IN SCHEMA" parameter also have restrictions, should we add
> some doc description for it?

Modified

> 3. When using '\dn+', I noticed that the list of publications only contains the
> publications for "SCHEMA", "FOR ALL TABLES" publications are not shown. Is it designed on purpose?
> (The result of '\d+' lists the publications of "SCHEAME" and "FOR ALL TABLES").
>
> For example:
> create schema sch1;
> create table sch1.tbl(a int);
> create publication pub_schema for all tables in schema sch1;
> create publication pub_all_tables for all tables;

I'm not sure if it is intentional or not, Do you want to post the
question in a separate thread and see if that should be handled?

Thanks for the comments, the v24 patch attached at [1] handles the comments.
[1] - https://www.postgresql.org/message-id/CALDaNm27bs40Rxpy4oKfV97UgsPG%3DvVoZ5bj9pP_4BxnO-6DYA%40mail.gmail.com

Regards,
Vignesh



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
From Wed, Sep 1, 2021 2:36 PM Peter Smith <smithpb2250@gmail.com> wrote:
> Schema objects are not part of the publication. Current only TABLES are in
> publications, so I thought that \dRp+ output would just be the of "Tables" in
> the publication. Schemas would not even be displayed at all (except in the
> table name).

I think one use case of schema level publication is it can automatically
publish new table created in the shcema(same as ALL TABLE publication). So,
IMO, \dRp+ should output Schema level publication separately to make the user
aware of it.

Best regards,
Hou zj

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Thu, Sep 2, 2021 at 11:58 AM vignesh C <vignesh21@gmail.com> wrote:
>

Below are few comments on v23. If you have already addressed anything
in v24, then ignore it.

1. The commit message says: A new system table "pg_publication_schema"
has been added, to maintain the schemas that the user wants to publish
through the publication.". The alternative name for this system table
could be "pg_publication_namespace". The reason why this alternative
comes to my mind is that the current system table to store schema
information is named pg_namespace. So shouldn't we be consistent here?
What do others think about this?

2. In function check_publication_add_schema(), the primary error
message should be "cannot add schema \"%s\" to publication". See
check_publication_add_relation() for similar error messages.

3.
+ObjectAddress
+publication_add_schema(Oid pubid, Oid schemaoid, bool if_not_exists)

Isn't it better to use 'schemaid' so that it is consistent with 'pubid'?

4.
ConvertPubObjSpecListToOidList()
{
..
+ schemaoid = linitial_oid(search_path);
+ nspname = get_namespace_name(schemaoid);
+ if (nspname == NULL) /* recently-deleted namespace? */
+ ereport(ERROR,
+ errcode(ERRCODE_UNDEFINED_SCHEMA),
+ errmsg("no schema has been selected"));
+
+ list_free(search_path);
..
}

You can free the memory for 'nspname' as that is not used afterward.

5.
+ schemaRels = GetSchemaPublicationRelations(schemaoid, PUBLICATION_PART_ALL);
+
+ /* Invalidate relcache so that publication info is rebuilt. */
+ InvalidatePublicationRels(schemaRels);

It is better to write this comment above
GetSchemaPublicationRelations, something like below:

+ /* Invalidate relcache so that publication info is rebuilt. */
+ schemaRels = GetSchemaPublicationRelations(schemaoid, PUBLICATION_PART_ALL);
+ InvalidatePublicationRels(schemaRels);

6.
+static List *
+GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
+    Oid relid)

I think it is better to name this function as
GetPublicationPartOptRelations as that way it will be more consistent
with existing functions and structure name PublicationPartOpt.

7. All the callers of PublicationAddSchemas() have a superuser check,
then why there is again a check of owner/superuser in that function?

8.
+/*
+ * Gets the list of FOR ALL TABLES IN SCHEMA publication oids associated with a
+ * specified schema oid
+ */
+List *
+GetSchemaPublications(Oid schemaid)

I find it a bit difficult to read this comment. Can we omit "FOR ALL
TABLES IN SCHEMA" from this comment?

9. In the doc patch
(v23-0003-Tests-and-documentation-for-schema-level-support), I see
below line:
   <para>
-   To add a table to a publication, the invoking user must have ownership
-   rights on the table.  The <command>FOR ALL TABLES</command> clause requires
-   the invoking user to be a superuser.
+   To add a table/schema to a publication, the invoking user must have
+   ownership rights on the table/schema.  The <command>FOR ALL TABLES</command>
+   and <command>FOR ALL TABLES IN SCHEMA</command> clause requires the invoking
+   user to be a superuser.

Is it correct to specify the schema in the first line? AFAIU, all
forms of schema addition requires superuser privilege.


-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Peter Smith
Date:
On Thu, Sep 2, 2021 at 6:50 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> From Wed, Sep 1, 2021 2:36 PM Peter Smith <smithpb2250@gmail.com> wrote:
> > Schema objects are not part of the publication. Current only TABLES are in
> > publications, so I thought that \dRp+ output would just be the of "Tables" in
> > the publication. Schemas would not even be displayed at all (except in the
> > table name).
>
> I think one use case of schema level publication is it can automatically
> publish new table created in the shcema(same as ALL TABLE publication). So,
> IMO, \dRp+ should output Schema level publication separately to make the user
> aware of it.

OK. That is a fair point.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Mon, Aug 30, 2021 at 8:56 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Aug 30, 2021 at 9:10 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
>
> > 5)
> > +                       if (list_length(pubobj->name) == 1 &&
> > +                               (strcmp(relname, "CURRENT_SCHEMA") == 0))
> > +                               ereport(ERROR,
> > +                                               errcode(ERRCODE_SYNTAX_ERROR),
> > +                                               errmsg("invalid relation name at or near"),
> > +                                               parser_errposition(pstate, pubobj->location));
> >
> > Maybe we don't need this check, because it will report an error in
> > OpenTableList() anyway, "relation "CURRENT_SCHEMA" does not exist" , and that
> > message seems readable to me.
>
> Allowing CURRENT_SCHEMA is required to support current schema for
> schema publications, currently I'm allowing this syntax during parsing
> and this error is thrown for relations later, this is done to keep the
> similar error as earlier before this feature support. I felt we can
> keep it like this to maintain the similar error. Thoughts?
>

I find this check quite ad-hoc in the code and I am not sure if we
need to be consistent for the exact message in this case. So, I think
it is better to remove it.

>
> > About  0003
> > 7)
> > The v22-0003 seems simple and can remove lots of code in patch v22-0001, so
> > maybe we can merge 0001 and 0003 into one patch ?
>
> I agree that the code becomes simpler, it reduces a lot of code. I had
> kept it like that as the testing effort might be more and also I was
> waiting if there was no objection for that syntax from anyone else. I
> will wait for a few more reviews and merge it to 0001 if there are no
> objections.
>

+1 to merge the patch as suggested by Hou-San.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Sep 1, 2021 at 12:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Sep 1, 2021 at 8:52 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
>
> > I'd expect a lot of users to naturally think that "ALTER PUBLICATION
> > pub1 DROP ALL TABLES IN SCHEMA sc1;" would drop from the publication
> > all tables that are in schema "sc1", which is not what it is currently
> > doing.
> > Since the syntax was changed to specifically refer to FOR ALL TABLES
> > IN SCHEMA rather than FOR SCHEMA, then now it's clear we're referring
> > to tables only, when specifying "... FOR ALL TABLES in sc1, TABLE
> > sc1.test", so IMHO it's reasonable to remove duplicates here, rather
> > than treating these as somehow separate ways of referencing the same
> > table.
> >
>
> I see your point and if we decide to go this path then it is better to
> give an error than silently removing duplicates.
>

Today, I have thought about this point again and it seems better to
give an error in this case and let the user take the action rather
than silently removing such tables to avoid any confusion.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Thu, Sep 2, 2021 at 5:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 2, 2021 at 11:58 AM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> Below are few comments on v23. If you have already addressed anything
> in v24, then ignore it.
>

More Review comments (on latest version v24):
=======================================
1.
+    Oid            psnspcid BKI_LOOKUP(pg_class);    /* Oid of the schema */
+} FormData_pg_publication_schema;

Why in the above structure you have used pg_class instead of pg_namespace?

2. GetSchemaPublicationRelations() uses two different ways to skip
non-publishable relations in two while loops. Both are correct but I
think it would be easier to follow if they use the same way and in
this case I would prefer a check like if (is_publishable_class(relid,
relForm)). The comments atop function could also be modified to :"Get
the list of publishable relation oids for a specified schema.". I
think it is better to write some comments on why you need to scan and
loop twice.

3. The other point about GetSchemaPublicationRelations() is that I am
afraid that it will collect duplicate relation oids in the final list
when the partitioned table and child tables are in the same schema.

4. In GetRelationPublicationActions(), the handling related to
partitions seems to be missing for the schema. It won't be able to
take care of child tables when they are in a different schema than the
parent table.

5.
If I modify the search path to remove public schema then I get the
below error message:
postgres=# Create publication mypub for all tables in schema current_schema;
ERROR:  no schema has been selected

I think this message is not very clear. How about changing to
something like "current_schema doesn't contain any valid schema"? This
message is used in more than one place, so let's keep it the same at
all the places if you agree to change it.

6. How about naming ConvertPubObjSpecListToOidList() as
ObjectsInPublicationToOids()? I see somewhat similarly named functions
in the code like objectsInSchemaToOids, objectNamesToOids.

7.
+ /*
+ * Schema lock is held until the publication is created to prevent
+ * concurrent schema deletion. No need to unlock the schemas, the
+ * locks will be released automatically at the end of create
+ * publication command.
+ */

In this comment no need to say create publication command as that is
implicit, we can just write ".... at the end of command".

8. Can you please extract changes like tab-completion, dump/restore in
separate patches? This can help to review the core (backend) part of
the patch in more detail.

-- 
With Regards,
Amit Kapila.



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
From Thur, Sep 2, 2021 2:33 PM vignesh C <vignesh21@gmail.com> wrote:
> On Wed, Sep 1, 2021 at 6:58 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> >
> > Here are some other comments for v23-000x patches.
> > 3)
> >
> > +                                                         .description =
> "PUBLICATION SCHEMA",
> > +                                                         .section =
> SECTION_POST_DATA,
> > +                                                         .createStmt
> > + = query->data));
> >
> > Is it better to use something like 'PUBLICATION TABLES IN SCHEMA' to
> > describe the schema level table publication ? Because there could be
> > some other type publication such as 'ALL SEQUENCES IN SCHEMA' in the
> > future, it will be better to make it clear that we only publish table in schema in
> this patch.
> 
> Modified

Thanks for updating the patch.

I think we might also need to mention the publication object 'table' in the
following types:

1)
+    /* OCLASS_PUBLICATION_SCHEMA */
+    {
+        "publication schema", OBJECT_PUBLICATION_SCHEMA
+    },

2)
+    PUBLICATIONOBJ_SCHEMA,        /* Schema type */
+    PUBLICATIONOBJ_UNKNOWN        /* Unknown type */
+} PublicationObjSpecType;

3)
+    DO_PUBLICATION_SCHEMA,

I think it might be to change the typename like XX_REL_IN_SCHEMA,
and adjust the comments.

Best regards,
Hou zj


RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
From Thur, Sep 2, 2021 7:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, Sep 2, 2021 at 11:58 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> 
> Below are few comments on v23. If you have already addressed anything in v24,
> then ignore it.
> 
> 1. The commit message says: A new system table "pg_publication_schema"
> has been added, to maintain the schemas that the user wants to publish
> through the publication.". The alternative name for this system table could be
> "pg_publication_namespace". The reason why this alternative comes to my
> mind is that the current system table to store schema information is named
> pg_namespace. So shouldn't we be consistent here?
> What do others think about this?

Since the new system table refer to pg_namespace, so, personally, I am +1 for
"pg_publication_namespace".

Best regards,
Hou zj

RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Thursday, September 2, 2021 2:36 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> On Wed, Sep 1, 2021 at 11:14 AM tanghy.fnst@fujitsu.com
> <tanghy.fnst@fujitsu.com> wrote:
> >
> > 3. When using '\dn+', I noticed that the list of publications only contains the
> > publications for "SCHEMA", "FOR ALL TABLES" publications are not shown. Is it
> designed on purpose?
> > (The result of '\d+' lists the publications of "SCHEAME" and "FOR ALL TABLES").
> >
> > For example:
> > create schema sch1;
> > create table sch1.tbl(a int);
> > create publication pub_schema for all tables in schema sch1;
> > create publication pub_all_tables for all tables;
> 
> I'm not sure if it is intentional or not, Do you want to post the
> question in a separate thread and see if that should be handled?
> 

Sorry, maybe I didn't make my last question clearly enough. 

In HEAD(where schema level is not supported for publication), there is no publication
information in the result of '\dn+'. 

With this schema patch, '\dn+' shows the publications related to the schema, but ALL
TABLES publications are not shown. Do you think we should add ALL TABLES publications, too?

Regards
Tang

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Mon, Sep 6, 2021 at 9:53 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Thursday, September 2, 2021 2:36 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, Sep 1, 2021 at 11:14 AM tanghy.fnst@fujitsu.com
> > <tanghy.fnst@fujitsu.com> wrote:
> > >
> > > 3. When using '\dn+', I noticed that the list of publications only contains the
> > > publications for "SCHEMA", "FOR ALL TABLES" publications are not shown. Is it
> > designed on purpose?
> > > (The result of '\d+' lists the publications of "SCHEAME" and "FOR ALL TABLES").
> > >
> > > For example:
> > > create schema sch1;
> > > create table sch1.tbl(a int);
> > > create publication pub_schema for all tables in schema sch1;
> > > create publication pub_all_tables for all tables;
> >
> > I'm not sure if it is intentional or not, Do you want to post the
> > question in a separate thread and see if that should be handled?
> >
>
> Sorry, maybe I didn't make my last question clearly enough.
>
> In HEAD(where schema level is not supported for publication), there is no publication
> information in the result of '\dn+'.
>
> With this schema patch, '\dn+' shows the publications related to the schema, but ALL
> TABLES publications are not shown. Do you think we should add ALL TABLES publications, too?
>

No, I don't think we need to display For All Tables publication under
\dn+. It is already shown with \d+ <table_name> command.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Thu, Sep 2, 2021 at 4:28 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for the comments, the attached v24 patch has the changes for the same.
>

Just a note that these patches no longer apply due to a recent commit
related to logical replication (0c6828fa98).

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Sep 2, 2021 at 5:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 2, 2021 at 11:58 AM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> Below are few comments on v23. If you have already addressed anything
> in v24, then ignore it.
>
> 1. The commit message says: A new system table "pg_publication_schema"
> has been added, to maintain the schemas that the user wants to publish
> through the publication.". The alternative name for this system table
> could be "pg_publication_namespace". The reason why this alternative
> comes to my mind is that the current system table to store schema
> information is named pg_namespace. So shouldn't we be consistent here?
> What do others think about this?

Modified

> 2. In function check_publication_add_schema(), the primary error
> message should be "cannot add schema \"%s\" to publication". See
> check_publication_add_relation() for similar error messages.

Modified

> 3.
> +ObjectAddress
> +publication_add_schema(Oid pubid, Oid schemaoid, bool if_not_exists)
>
> Isn't it better to use 'schemaid' so that it is consistent with 'pubid'?

Modified

> 4.
> ConvertPubObjSpecListToOidList()
> {
> ..
> + schemaoid = linitial_oid(search_path);
> + nspname = get_namespace_name(schemaoid);
> + if (nspname == NULL) /* recently-deleted namespace? */
> + ereport(ERROR,
> + errcode(ERRCODE_UNDEFINED_SCHEMA),
> + errmsg("no schema has been selected"));
> +
> + list_free(search_path);
> ..
> }
>
> You can free the memory for 'nspname' as that is not used afterward.

I have removed get_namespace_name, no need to validate the schemaid,
fetch_search_path will return valid schemaids.


> 5.
> + schemaRels = GetSchemaPublicationRelations(schemaoid, PUBLICATION_PART_ALL);
> +
> + /* Invalidate relcache so that publication info is rebuilt. */
> + InvalidatePublicationRels(schemaRels);
>
> It is better to write this comment above
> GetSchemaPublicationRelations, something like below:
>
> + /* Invalidate relcache so that publication info is rebuilt. */
> + schemaRels = GetSchemaPublicationRelations(schemaoid, PUBLICATION_PART_ALL);
> + InvalidatePublicationRels(schemaRels);

Modified

> 6.
> +static List *
> +GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
> +    Oid relid)
>
> I think it is better to name this function as
> GetPublicationPartOptRelations as that way it will be more consistent
> with existing functions and structure name PublicationPartOpt.

Modified

> 7. All the callers of PublicationAddSchemas() have a superuser check,
> then why there is again a check of owner/superuser in that function?

Modified to remove the check

> 8.
> +/*
> + * Gets the list of FOR ALL TABLES IN SCHEMA publication oids associated with a
> + * specified schema oid
> + */
> +List *
> +GetSchemaPublications(Oid schemaid)
>
> I find it a bit difficult to read this comment. Can we omit "FOR ALL
> TABLES IN SCHEMA" from this comment?

Modified.

> 9. In the doc patch
> (v23-0003-Tests-and-documentation-for-schema-level-support), I see
> below line:
>    <para>
> -   To add a table to a publication, the invoking user must have ownership
> -   rights on the table.  The <command>FOR ALL TABLES</command> clause requires
> -   the invoking user to be a superuser.
> +   To add a table/schema to a publication, the invoking user must have
> +   ownership rights on the table/schema.  The <command>FOR ALL TABLES</command>
> +   and <command>FOR ALL TABLES IN SCHEMA</command> clause requires the invoking
> +   user to be a superuser.
>
> Is it correct to specify the schema in the first line? AFAIU, all
> forms of schema addition requires superuser privilege.

That is not required, modified.

Attached v25 patch has the changes for the same.
Currently I have used a different way to parse which does not require
"Add PublicationTable and PublicationRelInfo structs" committed
changes, I have currently removed the changes in this patch. I will
analyze further and use whichever parsing is better. I will handle
this in the next version.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Sep 3, 2021 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Aug 30, 2021 at 8:56 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Mon, Aug 30, 2021 at 9:10 AM houzj.fnst@fujitsu.com
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> >
> > > 5)
> > > +                       if (list_length(pubobj->name) == 1 &&
> > > +                               (strcmp(relname, "CURRENT_SCHEMA") == 0))
> > > +                               ereport(ERROR,
> > > +                                               errcode(ERRCODE_SYNTAX_ERROR),
> > > +                                               errmsg("invalid relation name at or near"),
> > > +                                               parser_errposition(pstate, pubobj->location));
> > >
> > > Maybe we don't need this check, because it will report an error in
> > > OpenTableList() anyway, "relation "CURRENT_SCHEMA" does not exist" , and that
> > > message seems readable to me.
> >
> > Allowing CURRENT_SCHEMA is required to support current schema for
> > schema publications, currently I'm allowing this syntax during parsing
> > and this error is thrown for relations later, this is done to keep the
> > similar error as earlier before this feature support. I felt we can
> > keep it like this to maintain the similar error. Thoughts?
> >
>
> I find this check quite ad-hoc in the code and I am not sure if we
> need to be consistent for the exact message in this case. So, I think
> it is better to remove it.

Modified

> > > About  0003
> > > 7)
> > > The v22-0003 seems simple and can remove lots of code in patch v22-0001, so
> > > maybe we can merge 0001 and 0003 into one patch ?
> >
> > I agree that the code becomes simpler, it reduces a lot of code. I had
> > kept it like that as the testing effort might be more and also I was
> > waiting if there was no objection for that syntax from anyone else. I
> > will wait for a few more reviews and merge it to 0001 if there are no
> > objections.
> >
>
> +1 to merge the patch as suggested by Hou-San.

Modified

This is handled as part of v25 patch attached at [1]

[1] - https://www.postgresql.org/message-id/CALDaNm2SytXy2TDnzzYkXWKgNp74ssPBXrkMXEyac1qVYSRkbw%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Sep 6, 2021 at 6:56 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> From Thur, Sep 2, 2021 7:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Thu, Sep 2, 2021 at 11:58 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> >
> > Below are few comments on v23. If you have already addressed anything in v24,
> > then ignore it.
> >
> > 1. The commit message says: A new system table "pg_publication_schema"
> > has been added, to maintain the schemas that the user wants to publish
> > through the publication.". The alternative name for this system table could be
> > "pg_publication_namespace". The reason why this alternative comes to my
> > mind is that the current system table to store schema information is named
> > pg_namespace. So shouldn't we be consistent here?
> > What do others think about this?
>
> Since the new system table refer to pg_namespace, so, personally, I am +1 for
> "pg_publication_namespace".

I have changed it as part of v25 patch attached at [1]
[1] - https://www.postgresql.org/message-id/CALDaNm2SytXy2TDnzzYkXWKgNp74ssPBXrkMXEyac1qVYSRkbw%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Sep 3, 2021 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 2, 2021 at 5:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Sep 2, 2021 at 11:58 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> >
> > Below are few comments on v23. If you have already addressed anything
> > in v24, then ignore it.
> >
>
> More Review comments (on latest version v24):
> =======================================
> 1.
> +    Oid            psnspcid BKI_LOOKUP(pg_class);    /* Oid of the schema */
> +} FormData_pg_publication_schema;
>
> Why in the above structure you have used pg_class instead of pg_namespace?

It should be pg_namespace, I have changed it.

> 2. GetSchemaPublicationRelations() uses two different ways to skip
> non-publishable relations in two while loops. Both are correct but I
> think it would be easier to follow if they use the same way and in
> this case I would prefer a check like if (is_publishable_class(relid,
> relForm)). The comments atop function could also be modified to :"Get
> the list of publishable relation oids for a specified schema.". I
> think it is better to write some comments on why you need to scan and
> loop twice.

Modified

> 3. The other point about GetSchemaPublicationRelations() is that I am
> afraid that it will collect duplicate relation oids in the final list
> when the partitioned table and child tables are in the same schema.

Modified it to prepare a list without duplicate relation ids.

> 4. In GetRelationPublicationActions(), the handling related to
> partitions seems to be missing for the schema. It won't be able to
> take care of child tables when they are in a different schema than the
> parent table.

Modified

> 5.
> If I modify the search path to remove public schema then I get the
> below error message:
> postgres=# Create publication mypub for all tables in schema current_schema;
> ERROR:  no schema has been selected
>
> I think this message is not very clear. How about changing to
> something like "current_schema doesn't contain any valid schema"? This
> message is used in more than one place, so let's keep it the same at
> all the places if you agree to change it.

I would prefer to use the existing messages as we have used this in a
few other places similarly. Thoughts?

> 6. How about naming ConvertPubObjSpecListToOidList() as
> ObjectsInPublicationToOids()? I see somewhat similarly named functions
> in the code like objectsInSchemaToOids, objectNamesToOids.

Modified

> 7.
> + /*
> + * Schema lock is held until the publication is created to prevent
> + * concurrent schema deletion. No need to unlock the schemas, the
> + * locks will be released automatically at the end of create
> + * publication command.
> + */
>
> In this comment no need to say create publication command as that is
> implicit, we can just write ".... at the end of command".

Modified

> 8. Can you please extract changes like tab-completion, dump/restore in
> separate patches? This can help to review the core (backend) part of
> the patch in more detail.

Modified

This is handled as part of v25 patch attached at [1]
[1] - https://www.postgresql.org/message-id/CALDaNm2SytXy2TDnzzYkXWKgNp74ssPBXrkMXEyac1qVYSRkbw%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Sep 3, 2021 at 4:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Sep 1, 2021 at 12:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Sep 1, 2021 at 8:52 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > >
> >
> > > I'd expect a lot of users to naturally think that "ALTER PUBLICATION
> > > pub1 DROP ALL TABLES IN SCHEMA sc1;" would drop from the publication
> > > all tables that are in schema "sc1", which is not what it is currently
> > > doing.
> > > Since the syntax was changed to specifically refer to FOR ALL TABLES
> > > IN SCHEMA rather than FOR SCHEMA, then now it's clear we're referring
> > > to tables only, when specifying "... FOR ALL TABLES in sc1, TABLE
> > > sc1.test", so IMHO it's reasonable to remove duplicates here, rather
> > > than treating these as somehow separate ways of referencing the same
> > > table.
> > >
> >
> > I see your point and if we decide to go this path then it is better to
> > give an error than silently removing duplicates.
> >
>
> Today, I have thought about this point again and it seems better to
> give an error in this case and let the user take the action rather
> than silently removing such tables to avoid any confusion.

I have handled this to throw an error. This is handled as part of v25
patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm2SytXy2TDnzzYkXWKgNp74ssPBXrkMXEyac1qVYSRkbw%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Tue, Sep 7, 2021 at 12:45 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, Sep 3, 2021 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> > 5.
> > If I modify the search path to remove public schema then I get the
> > below error message:
> > postgres=# Create publication mypub for all tables in schema current_schema;
> > ERROR:  no schema has been selected
> >
> > I think this message is not very clear. How about changing to
> > something like "current_schema doesn't contain any valid schema"? This
> > message is used in more than one place, so let's keep it the same at
> > all the places if you agree to change it.
>
> I would prefer to use the existing messages as we have used this in a
> few other places similarly. Thoughts?
>

Yeah, I also see the same message in code but I think here usage is a
bit different. If you see a similar SQL statement that causes the same
error message then can you please give an example?

Few comments on latest patch
(v25-0002-Added-schema-level-support-for-publication):
=====================================================================
1.
getPublicationSchemaInfo()
..
+ *nspname = get_namespace_name(pnform->pnnspcid);
+ if (!(*nspname))
+ {
+ Oid schemaid = pnform->pnpubid;
+
+ pfree(*pubname);
+ ReleaseSysCache(tup);
+ if (!missing_ok)
+ elog(ERROR, "cache lookup failed for schema %u",
+ schemaid);

Why are you using pnform->pnpubid to get schemaid? I think you need
pnform->pnnspcid here and it was like that in the previous version,
not sure, why you changed it?

2.
@@ -369,15 +531,20 @@ AlterPublicationTables(AlterPublicationStmt
*stmt, Relation rel,
  (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
  errmsg("publication \"%s\" is defined as FOR ALL TABLES",
  NameStr(pubform->pubname)),
- errdetail("Tables cannot be added to or dropped from FOR ALL TABLES
publications.")));
+ errdetail("Tables cannot be added to, dropped from, or set on FOR
ALL TABLES publications.")));

Why is this message changed? Have we changed anything related to this
as part of this patch?

3.
+ Oid pnnspcid BKI_LOOKUP(pg_namespace); /* Oid of the schema */
+} FormData_pg_publication_namespace;
+
...
...
+DECLARE_UNIQUE_INDEX(pg_publication_namespace_pnnspcid_pnpubid_index,
8903, PublicationNamespacePnnspcidPnpubidIndexId, on
pg_publication_namespace using btree(pnnspcid oid_ops, pnpubid
oid_ops));

Let's use nspid instead nspcid at all places as before we refer that
way in the code. The way you have done is also okay but it is better
to be consistent with existing code.

4. Need to think of comments in GetSchemaPublicationRelations.
+ /* get all the ordinary tables present in schema schemaid */
..

Let's change the above comment to something like: "get all the
relations present in the given schema"

+
+ /*
+ * Get all relations that inherit from the partition table, directly or
+ * indirectly.
+ */
+ scan = table_beginscan_catalog(classRel, keycount, key);


Let's change the above comment to something like: "It is quite
possible that some of the partitions are in a different schema than
the parent table, so we need to get such partitions separately."

5.
+ if (list_member_oid(schemaidlist, relSchemaId))
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("cannot add relation \"%s.%s\" to publication",
+    get_namespace_name(relSchemaId),
+    RelationGetRelationName(rel)),
+ errdetail("Table's schema is already included as part of ALL TABLES
IN SCHEMA option."));

I think the better errdetail would be: "Table's schema \"%s\" is
already part of the publication."

+ if (list_member_oid(schemaidlist, relSchemaId))
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("cannot add schema \"%s\" to publication",
+    get_namespace_name(get_rel_namespace(tableOid))),
+ errdetail("Table \"%s.%s\" is part of publication, adding same
schema \"%s\" is not supported",
+   get_namespace_name(get_rel_namespace(tableOid)),
+   get_rel_name(tableOid),
+   get_namespace_name(get_rel_namespace(tableOid))));

I think this errdetail message can also be improved. One idea could
be: "Table \"%s\" in schema \"%s\" is already part of the publication,
adding the same schema is not supported.". Do you or anyone else have
better ideas?

6. I think instead of two different functions
CheckRelschemaInSchemaList and CheckSchemaInRels, let's have a single
function RelSchemaIsMemberOfSchemaList and have a boolean variable to
distinguish the two cases.


-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Sep 7, 2021 at 5:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 7, 2021 at 12:45 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Fri, Sep 3, 2021 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > 5.
> > > If I modify the search path to remove public schema then I get the
> > > below error message:
> > > postgres=# Create publication mypub for all tables in schema current_schema;
> > > ERROR:  no schema has been selected
> > >
> > > I think this message is not very clear. How about changing to
> > > something like "current_schema doesn't contain any valid schema"? This
> > > message is used in more than one place, so let's keep it the same at
> > > all the places if you agree to change it.
> >
> > I would prefer to use the existing messages as we have used this in a
> > few other places similarly. Thoughts?
> >
>
> Yeah, I also see the same message in code but I think here usage is a
> bit different. If you see a similar SQL statement that causes the same
> error message then can you please give an example?

I was referring to the error message in create table
postgres=# set search_path='non_existent_schema';
SET
postgres=# create table t1(c1 int);
ERROR:  no schema has been selected to create in
LINE 1: create table t1(c1 int);

If it is not very useful in the case of creating a publication, then I
can change it. Thoughts?

Regards,
Vignesh



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Sep 8, 2021 at 10:48 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, Sep 7, 2021 at 5:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Sep 7, 2021 at 12:45 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Fri, Sep 3, 2021 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > >
> > > > 5.
> > > > If I modify the search path to remove public schema then I get the
> > > > below error message:
> > > > postgres=# Create publication mypub for all tables in schema current_schema;
> > > > ERROR:  no schema has been selected
> > > >
> > > > I think this message is not very clear. How about changing to
> > > > something like "current_schema doesn't contain any valid schema"? This
> > > > message is used in more than one place, so let's keep it the same at
> > > > all the places if you agree to change it.
> > >
> > > I would prefer to use the existing messages as we have used this in a
> > > few other places similarly. Thoughts?
> > >
> >
> > Yeah, I also see the same message in code but I think here usage is a
> > bit different. If you see a similar SQL statement that causes the same
> > error message then can you please give an example?
>
> I was referring to the error message in create table
> postgres=# set search_path='non_existent_schema';
> SET
> postgres=# create table t1(c1 int);
> ERROR:  no schema has been selected to create in
> LINE 1: create table t1(c1 int);
>
> If it is not very useful in the case of creating a publication, then I
> can change it. Thoughts?
>

If you want to be consistent with the existing message then why did
you left the trailing part (".... to create in") of the sentence?

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Sep 7, 2021 at 5:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 7, 2021 at 12:45 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Fri, Sep 3, 2021 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > 5.
> > > If I modify the search path to remove public schema then I get the
> > > below error message:
> > > postgres=# Create publication mypub for all tables in schema current_schema;
> > > ERROR:  no schema has been selected
> > >
> > > I think this message is not very clear. How about changing to
> > > something like "current_schema doesn't contain any valid schema"? This
> > > message is used in more than one place, so let's keep it the same at
> > > all the places if you agree to change it.
> >
> > I would prefer to use the existing messages as we have used this in a
> > few other places similarly. Thoughts?
> >
>
> Yeah, I also see the same message in code but I think here usage is a
> bit different. If you see a similar SQL statement that causes the same
> error message then can you please give an example?

Changed it to "no schema has been selected for CURRENT_SCHEMA"

> Few comments on latest patch
> (v25-0002-Added-schema-level-support-for-publication):
> =====================================================================
> 1.
> getPublicationSchemaInfo()
> ..
> + *nspname = get_namespace_name(pnform->pnnspcid);
> + if (!(*nspname))
> + {
> + Oid schemaid = pnform->pnpubid;
> +
> + pfree(*pubname);
> + ReleaseSysCache(tup);
> + if (!missing_ok)
> + elog(ERROR, "cache lookup failed for schema %u",
> + schemaid);
>
> Why are you using pnform->pnpubid to get schemaid? I think you need
> pnform->pnnspcid here and it was like that in the previous version,
> not sure, why you changed it?

Modified

> 2.
> @@ -369,15 +531,20 @@ AlterPublicationTables(AlterPublicationStmt
> *stmt, Relation rel,
>   (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>   errmsg("publication \"%s\" is defined as FOR ALL TABLES",
>   NameStr(pubform->pubname)),
> - errdetail("Tables cannot be added to or dropped from FOR ALL TABLES
> publications.")));
> + errdetail("Tables cannot be added to, dropped from, or set on FOR
> ALL TABLES publications.")));
>
> Why is this message changed? Have we changed anything related to this
> as part of this patch?

This change is not required in this patch, reverted

> 3.
> + Oid pnnspcid BKI_LOOKUP(pg_namespace); /* Oid of the schema */
> +} FormData_pg_publication_namespace;
> +
> ...
> ...
> +DECLARE_UNIQUE_INDEX(pg_publication_namespace_pnnspcid_pnpubid_index,
> 8903, PublicationNamespacePnnspcidPnpubidIndexId, on
> pg_publication_namespace using btree(pnnspcid oid_ops, pnpubid
> oid_ops));
>
> Let's use nspid instead nspcid at all places as before we refer that
> way in the code. The way you have done is also okay but it is better
> to be consistent with existing code.

Modified

> 4. Need to think of comments in GetSchemaPublicationRelations.
> + /* get all the ordinary tables present in schema schemaid */
> ..
>
> Let's change the above comment to something like: "get all the
> relations present in the given schema"
>
> +
> + /*
> + * Get all relations that inherit from the partition table, directly or
> + * indirectly.
> + */
> + scan = table_beginscan_catalog(classRel, keycount, key);
>
>
> Let's change the above comment to something like: "It is quite
> possible that some of the partitions are in a different schema than
> the parent table, so we need to get such partitions separately."

Modified

> 5.
> + if (list_member_oid(schemaidlist, relSchemaId))
> + ereport(ERROR,
> + errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("cannot add relation \"%s.%s\" to publication",
> +    get_namespace_name(relSchemaId),
> +    RelationGetRelationName(rel)),
> + errdetail("Table's schema is already included as part of ALL TABLES
> IN SCHEMA option."));
>
> I think the better errdetail would be: "Table's schema \"%s\" is
> already part of the publication."
>
> + if (list_member_oid(schemaidlist, relSchemaId))
> + ereport(ERROR,
> + errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("cannot add schema \"%s\" to publication",
> +    get_namespace_name(get_rel_namespace(tableOid))),
> + errdetail("Table \"%s.%s\" is part of publication, adding same
> schema \"%s\" is not supported",
> +   get_namespace_name(get_rel_namespace(tableOid)),
> +   get_rel_name(tableOid),
> +   get_namespace_name(get_rel_namespace(tableOid))));
>
> I think this errdetail message can also be improved. One idea could
> be: "Table \"%s\" in schema \"%s\" is already part of the publication,
> adding the same schema is not supported.". Do you or anyone else have
> better ideas?

Modified

> 6. I think instead of two different functions
> CheckRelschemaInSchemaList and CheckSchemaInRels, let's have a single
> function RelSchemaIsMemberOfSchemaList and have a boolean variable to
> distinguish the two cases.

Modified
Thanks for the comments, the attached v26 patch has the changes for the same.

Regards,
VIgnesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Sep 6, 2021 at 6:56 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> From Thur, Sep 2, 2021 2:33 PM vignesh C <vignesh21@gmail.com> wrote:
> > On Wed, Sep 1, 2021 at 6:58 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> > >
> > > Here are some other comments for v23-000x patches.
> > > 3)
> > >
> > > +                                                         .description =
> > "PUBLICATION SCHEMA",
> > > +                                                         .section =
> > SECTION_POST_DATA,
> > > +                                                         .createStmt
> > > + = query->data));
> > >
> > > Is it better to use something like 'PUBLICATION TABLES IN SCHEMA' to
> > > describe the schema level table publication ? Because there could be
> > > some other type publication such as 'ALL SEQUENCES IN SCHEMA' in the
> > > future, it will be better to make it clear that we only publish table in schema in
> > this patch.
> >
> > Modified
>
> Thanks for updating the patch.
>
> I think we might also need to mention the publication object 'table' in the
> following types:
>
> 1)
> +       /* OCLASS_PUBLICATION_SCHEMA */
> +       {
> +               "publication schema", OBJECT_PUBLICATION_SCHEMA
> +       },
>
> 2)
> +       PUBLICATIONOBJ_SCHEMA,          /* Schema type */
> +       PUBLICATIONOBJ_UNKNOWN          /* Unknown type */
> +} PublicationObjSpecType;
>
> 3)
> +       DO_PUBLICATION_SCHEMA,
>
> I think it might be to change the typename like XX_REL_IN_SCHEMA,
> and adjust the comments.

Thanks for the comments, this is handled in the v26 patch attached at [1]
[1] - https://www.postgresql.org/message-id/CALDaNm3EwAVma8n4YpV1%2BQWiccuVPxpqNfbbrUU3s3XTHcTXew%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Mon, Aug 16, 2021 at 7:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Peter Smith <smithpb2250@gmail.com> writes:
> > Then the question from Peter E. [2] "Why can't I have a publication
> > that publishes tables t1, t2, t3, *and* schemas s1, s2, s3." would
> > have an intuitive solution like:
>
> > CREATE PUBLICATION pub1
> > FOR TABLE t1,t2,t3 AND
> > FOR ALL TABLES IN SCHEMA s1,s2,s3;
>
> That seems a bit awkward, since the existing precedent is
> to use commas.
>

AFAICS, the closest to this proposal we have is Grant/Revoke syntax
where we can give privilege on individual objects and all objects in
the schema. Is that you are referring to existing precedent or
something else?

>  We shouldn't need more than one FOR noise-word,
> either.  So I was imagining syntax more like, say,
>
>         CREATE PUBLICATION pub1 FOR
>           TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2,
>           SEQUENCE seq1,seq2, ALL SEQUENCES IN SCHEMA s3,s4;
>
> Abstractly it'd be
>
> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]
>
> cpitem := ALL TABLES |
>           TABLE name |
>           ALL TABLES IN SCHEMA name |
>           ALL SEQUENCES |
>           SEQUENCE name |
>           ALL SEQUENCES IN SCHEMA name |
>           name
>
> The grammar output would need some post-analysis to attribute the
> right type to bare "name" items, but that doesn't seem difficult.
>

The current patch (v26-0002-Added-schema-level-support-for-publication
at [1]) implements this syntax in roughly the way you have proposed
here. But, one thing I find a bit awkward is how it needs to keep a
separate flag to distinguish between names of different objects for
the post-analysis phase. The reason is that in CREATE PUBLICATION
syntax [2] one could supply additional decorators like *, ONLY with
table name but the same shouldn't be allowed be with schema name or
other object names. Is that okay or do you have any better ideas about
the same?

OTOH, if we implement something like Grant/Revoke where we can give
privilege on individual objects and all objects in the schema but not
in the same statement then such special flags won't be required to
distinguish different object names and we can build something on the
lines of current "privilege_target:" in gram.y.

[1] - https://www.postgresql.org/message-id/CALDaNm3EwAVma8n4YpV1%2BQWiccuVPxpqNfbbrUU3s3XTHcTXew%40mail.gmail.com
[2] - https://www.postgresql.org/docs/devel/sql-createpublication.html

-- 
With Regards,
Amit Kapila.



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
From Wed, Sept 8, 2021 7:44 PM vignesh C <vignesh21@gmail.com> wrote:
> Modified
> Thanks for the comments, the attached v26 patch has the changes for the same.

Hi,

Thanks for updating the patch, I have a suggestion for the gram.y.

Currently, we have the following two members in PublicationObjSpec to distinguish
between names of different objects for the post-analysis phase.

>bool        inh;
>bool        spl_rel_type_syn;

I was thinking do we have another way to distinguish that which can make code
smaller. I tried serval approaches and found a possible better approach.

First, I refer to the design of Grant/Revoke syntax, it use two members
'targtype' and 'objtype' to mark different type. 'targtype' can be
ACL_TARGET_OBJECT(single target) or ACL_TARGET_ALL_IN_SCHEMA(schema level)
.'objtype' is the actual type which can be OBJECT_SEQUENCE or OBJECT_TABLE or
... . I think if we follow this way, the code could be cleaner.

Second, we can move the special relation expression into a separate rule
'speical_relation_expr' like the following, this can remove duplicate code
used by pubobj_expr.
------
relation_expr:
            qualified_name
                {}
            | speical_relation_expr
                {
                    $$ = $1;
                }
        ;
speical_relation_expr:
            qualified_name '*'
                {}
        | ONLY qualified_name
        {}
        | ONLY '(' qualified_name ')'
        {}
    ;
------

Finally, the gram.y will look like the following.
Personally, the code looks cleaner in this approach.

-----
pubobj_expr:
            any_name
                {
                    /* inheritance query, implicitly */
                    $$ = makeNode(PublicationObjSpec);
                    $$->targettype = TARGETOBJ_UNKNOWN;
                    $$->object = $1;
                }
            | special_relation_expr
                {
                    $$ = makeNode(PublicationObjSpec);
                    $$->targettype = TARGETOBJ_TABLE;
                    $$->object = $1;
                }
            | CURRENT_SCHEMA
                {
                    $$ = makeNode(PublicationObjSpec);
                    $$->object = list_make1(makeString("CURRENT_SCHEMA"));
                    $$->targettype = TARGETOBJ_SCHEMA;
                }
        ;

/* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */
PublicationObjSpec:    TABLE pubobj_expr
                    {
                        $$ = $2;
                        $$->pubobjtype = PUBLICATIONOBJ_TABLE_LIST;
                        $$->location = @1;
                    }

            | ALL TABLES IN_P SCHEMA pubobj_expr
                    {
                        $$ = $5;
                        $$->pubobjtype = PUBLICATIONOBJ_SCHEMA_LIST;
                        $$->location = @1;
                    }
            | pubobj_expr
                    {
                        $$ = $1;
                        $$->pubobjtype = PUBLICATIONOBJ_UNKNOWN;
                        $$->location = @1;
                    }
        ;

Attach a diff patch based on v26-0002 patch, which change the corresponding
code based on the design above. Please have a look.

Best regards,
Hou zj

Attachment

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
From Friday, September 10, 2021 10:33 AM Hou Zhijie<houzj.fnst@fujitsu.com> wrote:
> From Wed, Sept 8, 2021 7:44 PM vignesh C <vignesh21@gmail.com> wrote:
> > Modified
> > Thanks for the comments, the attached v26 patch has the changes for the
> same.
> 
> Hi,
> 
> Thanks for updating the patch, I have a suggestion for the gram.y.
> 
> Currently, we have the following two members in PublicationObjSpec to
> distinguish
> between names of different objects for the post-analysis phase.
> 
> >bool        inh;
> >bool        spl_rel_type_syn;
> 
> I was thinking do we have another way to distinguish that which can make code
> smaller. I tried serval approaches and found a possible better approach.
> 
> First, I refer to the design of Grant/Revoke syntax, it use two members
> 'targtype' and 'objtype' to mark different type. 'targtype' can be
> ACL_TARGET_OBJECT(single target) or ACL_TARGET_ALL_IN_SCHEMA(schema
> level)
> .'objtype' is the actual type which can be OBJECT_SEQUENCE or OBJECT_TABLE
> or
> ... . I think if we follow this way, the code could be cleaner.
> 
> Second, we can move the special relation expression into a separate rule
> 'speical_relation_expr' like the following, this can remove duplicate code
> used by pubobj_expr.
> ------
> relation_expr:
>             qualified_name
>                 {}
>             | speical_relation_expr
>                 {
>                     $$ = $1;
>                 }
>         ;
> speical_relation_expr:
>             qualified_name '*'
>                 {}
>         | ONLY qualified_name
>         {}
>         | ONLY '(' qualified_name ')'
>         {}
>     ;
> ------
> 
> Finally, the gram.y will look like the following.
> Personally, the code looks cleaner in this approach.

Besides, If we don't want to use a new flag to distinguish tablename and schemaname,
We can only check the NodeTag to distinguish the difference.

Attach two diff patches based on the latest schema patch
which change the code with a flag and without a flag.

Best regards,
Hou zj

Attachment

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Fri, Sep 10, 2021 at 8:54 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> From Friday, September 10, 2021 10:33 AM Hou Zhijie<houzj.fnst@fujitsu.com> wrote:
>
> Besides, If we don't want to use a new flag to distinguish tablename and schemaname,
> We can only check the NodeTag to distinguish the difference.
>
> Attach two diff patches based on the latest schema patch
> which change the code with a flag and without a flag.
>

I would prefer a version without additional flags unless you think it
is difficult to extend it in the future for other objects like
sequences which as far as I can see shouldn't be the case. Is there a
reason to define pubobj_name similar to any_name? If so, then please
do add the comments. One reason I could think of is that any_name is
not used for schema names currently which might have motivated you to
define a separate naming convention for publication.

-- 
With Regards,
Amit Kapila.



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
From Friday, September 10, 2021 1:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Sep 10, 2021 at 8:54 AM Hou zhijie <houzj.fnst@fujitsu.com> wrote:
> >
> > From Friday, September 10, 2021 10:33 AM Hou
> Zhijie<houzj.fnst@fujitsu.com> wrote:
> >
> > Besides, If we don't want to use a new flag to distinguish tablename and
> schemaname,
> > We can only check the NodeTag to distinguish the difference.
> >
> > Attach two diff patches based on the latest schema patch
> > which change the code with a flag and without a flag.
> >
> 
> I would prefer a version without additional flags unless you think it
> is difficult to extend it in the future for other objects like
> sequences which as far as I can see shouldn't be the case.

Ok, I agreed.

> Is there a
> reason to define pubobj_name similar to any_name? If so, then please
> do add the comments. One reason I could think of is that any_name is
> not used for schema names currently which might have motivated you to
> define a separate naming convention for publication.

When I used any_name, Bison reported that the dot('.') in rule attr
would have a shift/reduce conflict with the dot('.') in rule indirection_el
which also used in pubobj_expr. So, I declared a new rule which will directly
use indirection_el to resolve the conflicts.

Attach the without-flag version and add comments about the pubobj_name.

Best regards,
Hou zj

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Sep 10, 2021 at 11:21 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> From Friday, September 10, 2021 1:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Fri, Sep 10, 2021 at 8:54 AM Hou zhijie <houzj.fnst@fujitsu.com> wrote:
> > >
> > > From Friday, September 10, 2021 10:33 AM Hou
> > Zhijie<houzj.fnst@fujitsu.com> wrote:
> > >
> > > Besides, If we don't want to use a new flag to distinguish tablename and
> > schemaname,
> > > We can only check the NodeTag to distinguish the difference.
> > >
> > > Attach two diff patches based on the latest schema patch
> > > which change the code with a flag and without a flag.
> > >
> >
> > I would prefer a version without additional flags unless you think it
> > is difficult to extend it in the future for other objects like
> > sequences which as far as I can see shouldn't be the case.
>
> Ok, I agreed.
>
> > Is there a
> > reason to define pubobj_name similar to any_name? If so, then please
> > do add the comments. One reason I could think of is that any_name is
> > not used for schema names currently which might have motivated you to
> > define a separate naming convention for publication.
>
> When I used any_name, Bison reported that the dot('.') in rule attr
> would have a shift/reduce conflict with the dot('.') in rule indirection_el
> which also used in pubobj_expr. So, I declared a new rule which will directly
> use indirection_el to resolve the conflicts.
>
> Attach the without-flag version and add comments about the pubobj_name.

Thanks for the changes, the suggested changes make the parsing code
simpler. I have merged the changes to the main patch. Attached v27
patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Sun, Sep 12, 2021 at 8:43 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for the changes, the suggested changes make the parsing code
> simpler. I have merged the changes to the main patch. Attached v27
> patch has the changes for the same.
>

+pubobj_name: ColId { $$ = list_make1(makeString($1)); }
+ | ColId indirection { $$ = lcons(makeString($1), $2); }
  ;

I think "ColId indirection" should handle catalog and schema name as
we are doing in qualified_name. See attached (this can be applied atop
v27-0002). The one other improvement we can do here is to extract the
common code from qualified_name (especially for "ColId indirection")
and pubobj_name. What do you think?

-- 
With Regards,
Amit Kapila.

Attachment

RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Sunday, September 12, 2021 11:13 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> Thanks for the changes, the suggested changes make the parsing code
> simpler. I have merged the changes to the main patch. Attached v27
> patch has the changes for the same.
> 

Thanks for your new patch. I had a look at the changes related to document and tried
the patch. Here are several comments.

1.
   <para>
    You must own the publication to use <command>ALTER PUBLICATION</command>.
    Adding a table to a publication additionally requires owning that table.
+   The <literal>ADD SCHEMA</literal> and <literal>SET SCHEMA</literal> to a
+   publication requires the invoking user to be a superuser.
    To alter the owner, you must also be a direct or indirect member of the new
    owning role. The new owner must have <literal>CREATE</literal> privilege on
    the database.  Also, the new owner of a <literal>FOR ALL TABLES</literal>
    publication must be a superuser.  However, a superuser can change the
    ownership of a publication regardless of these restrictions.
   </para>

ADD SCHEMA
->
ADD ALL TABLES IN SCHEMA

SET SCHEMA
->
SET ALL TABLES IN SCHEMA

2.
+  <para>
+   ADD some tables and schemas to the publication:
+<programlisting>
+ALTER PUBLICATION production_publication ADD TABLE users, departments, ALL TABLES IN SCHEMA production;
+</programlisting>
+  </para>

ADD some tables and schemas to the publication:
->
Add some tables and schemas to the publication:

3.
+  <para>
+   Drop some schema from the publication:
+<programlisting>
+ALTER PUBLICATION production_quarterly_publication DROP ALL TABLES IN SCHEMA production_july;
+</programlisting>
+  </para>

Drop some schema from the publication:
->
Drop some schemas from the publication:

4.
+   The catalog <structname>pg_publication_namespace</structname> contains the
+   mapping between schemas and publications in the database.  This is a
+   many-to-many mapping.  

There are two Spaces at the end of the paragraph.

5.
Adding a table and the schema where the table belonged to is not supported. But
it didn't report error message when I try to add them in the same statement by
using 'ALTER PUBLICATION'.

For example:
postgres=# create publication pub;
CREATE PUBLICATION
postgres=# alter publication pub add  all tables in schema s1, table s1.tbl;
ALTER PUBLICATION
postgres=# \dRp+
                              Publication pub
  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+----------
 postgres | f          | t       | t       | t       | t         | f
Tables:
    "s1.tbl"
Tables from schemas:
    "s1"

It didn't check if table 's1.tbl' is member of schema 's1'.


6.
I think if I use 'ALTER PUBLICATION ... SET', both the list of tables and the
list of all tables in schemas should be reset. The publication should only
contain the tables and all tables in schemas which user specified. If user only
specified all tables in schema, and didn't specify tables, the tables which used
to be part of the publication should be dropped, too. But currently, if I didn't
specify tables, the list of tables wouldn't be set to empty. Thoughts?

For example:
postgres=# create publication pub for table s1.tbl;
CREATE PUBLICATION
postgres=# alter publication pub set all tables in schema s2;
ALTER PUBLICATION
postgres=# \dRp+
                              Publication pub
  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+----------
 postgres | f          | t       | t       | t       | t         | f
Tables:
    "s1.tbl"
Tables from schemas:
    "s2"

Regards
Tang

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
From Sun, Sept 12, 2021 11:13 PM vignesh C <vignesh21@gmail.com> wrote:
> On Fri, Sep 10, 2021 at 11:21 AM Hou Zhijie <houzj.fnst@fujitsu.com> wrote:
> > Attach the without-flag version and add comments about the pubobj_name.
> 
> Thanks for the changes, the suggested changes make the parsing code simpler.
> I have merged the changes to the main patch. Attached v27 patch has the
> changes for the same.

Hi,

I have some suggestions for the documents and comments of the new syntax.

IMO, the document could be clearer in the following format.
------
Synopsis
CREATE PUBLICATION name
    [ FOR ALL TABLES 
      | FOR publication object [, ... ] ]
    [ WITH ( publication_parameter [= value] [, ... ] ) ]

where publication object is one of:

    TABLE [ ONLY ] table_name [ * ] [, ... ]
    ALL TABLES IN SCHEMA { schema_name | CURRENT_SCHEMA } [, ... ]
------

Attach a diff(based on v27-*) which change the doc and comments like the
following.

Best regards,
Hou zj

Attachment

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Mon, Sep 13, 2021 at 7:06 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> 6.
> I think if I use 'ALTER PUBLICATION ... SET', both the list of tables and the
> list of all tables in schemas should be reset. The publication should only
> contain the tables and all tables in schemas which user specified. If user only
> specified all tables in schema, and didn't specify tables, the tables which used
> to be part of the publication should be dropped, too. But currently, if I didn't
> specify tables, the list of tables wouldn't be set to empty. Thoughts?
>

I think we can go either way here but it seems like we should drop the
tables in the case you mentioned. The idea is that the SET variant in
ALTER PUBLICATION should replace the set of tables and schemas for the
publication which seems to be in line with the current behavior where
we replace the set of tables.

Anyone else wants to weigh in on this?

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Tom Lane
Date:
Amit Kapila <amit.kapila16@gmail.com> writes:
> On Mon, Sep 13, 2021 at 7:06 PM tanghy.fnst@fujitsu.com
> <tanghy.fnst@fujitsu.com> wrote:
>> I think if I use 'ALTER PUBLICATION ... SET', both the list of tables and the
>> list of all tables in schemas should be reset. The publication should only
>> contain the tables and all tables in schemas which user specified. If user only
>> specified all tables in schema, and didn't specify tables, the tables which used
>> to be part of the publication should be dropped, too. But currently, if I didn't
>> specify tables, the list of tables wouldn't be set to empty. Thoughts?

> I think we can go either way here but it seems like we should drop the
> tables in the case you mentioned. The idea is that the SET variant in
> ALTER PUBLICATION should replace the set of tables and schemas for the
> publication which seems to be in line with the current behavior where
> we replace the set of tables.

Yeah, I think it's sensible to define that there is just one SET variant
that replaces both the list-of-tables and the list-of-schemas.  (Of
course, the syntax for it has to permit both lists to be written.)
You could imagine having two independent SET commands for the two lists,
but that seems fairly confusing.

            regards, tom lane



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Mon, Sep 13, 2021 at 5:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Sep 12, 2021 at 8:43 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the changes, the suggested changes make the parsing code
> > simpler. I have merged the changes to the main patch. Attached v27
> > patch has the changes for the same.
> >
>
> +pubobj_name: ColId { $$ = list_make1(makeString($1)); }
> + | ColId indirection { $$ = lcons(makeString($1), $2); }
>   ;
>
> I think "ColId indirection" should handle catalog and schema name as
> we are doing in qualified_name. See attached (this can be applied atop
> v27-0002). The one other improvement we can do here is to extract the
> common code from qualified_name (especially for "ColId indirection")
> and pubobj_name. What do you think?
>

One more suggestion for changes in gram.y:
@@ -12430,7 +12476,14 @@ relation_expr:
  $$->inh = true;
  $$->alias = NULL;
  }
- | qualified_name '*'
+ | special_relation_expr
+ {
+ $$ = $1;
+ }
+ ;
+
+special_relation_expr:
+ qualified_name '*'

I am not sure if naming the above rule as special_relation_expr is a
good idea. Can we name it as extended_relation_expr?

-- 
With Regards,
Amit Kapila.



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
From Tue, Sept 14, 2021 11:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Mon, Sep 13, 2021 at 7:06 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:
> >
> > 6.
> > I think if I use 'ALTER PUBLICATION ... SET', both the list of tables
> > and the list of all tables in schemas should be reset. The publication
> > should only contain the tables and all tables in schemas which user
> > specified. If user only specified all tables in schema, and didn't
> > specify tables, the tables which used to be part of the publication
> > should be dropped, too. But currently, if I didn't specify tables, the list of
> tables wouldn't be set to empty. Thoughts?
> >
> 
> I think we can go either way here but it seems like we should drop the tables in
> the case you mentioned. The idea is that the SET variant in ALTER PUBLICATION
> should replace the set of tables and schemas for the publication which seems
> to be in line with the current behavior where we replace the set of tables.
> 
> Anyone else wants to weigh in on this?

I agree that the one SET variant should replaces both the list-of-tables and the list-of-schemas.

Best regards,
Hou zj

Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Sep 13, 2021 at 5:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Sep 12, 2021 at 8:43 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the changes, the suggested changes make the parsing code
> > simpler. I have merged the changes to the main patch. Attached v27
> > patch has the changes for the same.
> >
>
> +pubobj_name: ColId { $$ = list_make1(makeString($1)); }
> + | ColId indirection { $$ = lcons(makeString($1), $2); }
>   ;
>
> I think "ColId indirection" should handle catalog and schema name as
> we are doing in qualified_name. See attached (this can be applied atop
> v27-0002). The one other improvement we can do here is to extract the
> common code from qualified_name (especially for "ColId indirection")
> and pubobj_name. What do you think?

I have handled this in the patch attached.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Sep 13, 2021 at 7:06 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Sunday, September 12, 2021 11:13 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the changes, the suggested changes make the parsing code
> > simpler. I have merged the changes to the main patch. Attached v27
> > patch has the changes for the same.
> >
>
> Thanks for your new patch. I had a look at the changes related to document and tried
> the patch. Here are several comments.
>
> 1.
>    <para>
>     You must own the publication to use <command>ALTER PUBLICATION</command>.
>     Adding a table to a publication additionally requires owning that table.
> +   The <literal>ADD SCHEMA</literal> and <literal>SET SCHEMA</literal> to a
> +   publication requires the invoking user to be a superuser.
>     To alter the owner, you must also be a direct or indirect member of the new
>     owning role. The new owner must have <literal>CREATE</literal> privilege on
>     the database.  Also, the new owner of a <literal>FOR ALL TABLES</literal>
>     publication must be a superuser.  However, a superuser can change the
>     ownership of a publication regardless of these restrictions.
>    </para>
>
> ADD SCHEMA
> ->
> ADD ALL TABLES IN SCHEMA
>
> SET SCHEMA
> ->
> SET ALL TABLES IN SCHEMA

Modified

> 2.
> +  <para>
> +   ADD some tables and schemas to the publication:
> +<programlisting>
> +ALTER PUBLICATION production_publication ADD TABLE users, departments, ALL TABLES IN SCHEMA production;
> +</programlisting>
> +  </para>
>
> ADD some tables and schemas to the publication:
> ->
> Add some tables and schemas to the publication:

Modified

> 3.
> +  <para>
> +   Drop some schema from the publication:
> +<programlisting>
> +ALTER PUBLICATION production_quarterly_publication DROP ALL TABLES IN SCHEMA production_july;
> +</programlisting>
> +  </para>
>
> Drop some schema from the publication:
> ->
> Drop some schemas from the publication:

Modified

> 4.
> +   The catalog <structname>pg_publication_namespace</structname> contains the
> +   mapping between schemas and publications in the database.  This is a
> +   many-to-many mapping.
>
> There are two Spaces at the end of the paragraph.

I felt this is ok, as both single space and double space are used at
various places.

> 5.
> Adding a table and the schema where the table belonged to is not supported. But
> it didn't report error message when I try to add them in the same statement by
> using 'ALTER PUBLICATION'.
>
> For example:
> postgres=# create publication pub;
> CREATE PUBLICATION
> postgres=# alter publication pub add  all tables in schema s1, table s1.tbl;
> ALTER PUBLICATION
> postgres=# \dRp+
>                               Publication pub
>   Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
> ----------+------------+---------+---------+---------+-----------+----------
>  postgres | f          | t       | t       | t       | t         | f
> Tables:
>     "s1.tbl"
> Tables from schemas:
>     "s1"
>
> It didn't check if table 's1.tbl' is member of schema 's1'.

Modified.

> 6.
> I think if I use 'ALTER PUBLICATION ... SET', both the list of tables and the
> list of all tables in schemas should be reset. The publication should only
> contain the tables and all tables in schemas which user specified. If user only
> specified all tables in schema, and didn't specify tables, the tables which used
> to be part of the publication should be dropped, too. But currently, if I didn't
> specify tables, the list of tables wouldn't be set to empty. Thoughts?
>
> For example:
> postgres=# create publication pub for table s1.tbl;
> CREATE PUBLICATION
> postgres=# alter publication pub set all tables in schema s2;
> ALTER PUBLICATION
> postgres=# \dRp+
>                               Publication pub
>   Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
> ----------+------------+---------+---------+---------+-----------+----------
>  postgres | f          | t       | t       | t       | t         | f
> Tables:
>     "s1.tbl"
> Tables from schemas:
>     "s2"

Modified

I have fixed the comments in the v28 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0OudeDeFN7bSWPro0hgKx%3D1zPgcNFWnvU_G6w3mDPX0Q%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Sep 14, 2021 at 6:31 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> From Sun, Sept 12, 2021 11:13 PM vignesh C <vignesh21@gmail.com> wrote:
> > On Fri, Sep 10, 2021 at 11:21 AM Hou Zhijie <houzj.fnst@fujitsu.com> wrote:
> > > Attach the without-flag version and add comments about the pubobj_name.
> >
> > Thanks for the changes, the suggested changes make the parsing code simpler.
> > I have merged the changes to the main patch. Attached v27 patch has the
> > changes for the same.
>
> Hi,
>
> I have some suggestions for the documents and comments of the new syntax.
>
> IMO, the document could be clearer in the following format.
> ------
> Synopsis
> CREATE PUBLICATION name
>     [ FOR ALL TABLES
>       | FOR publication object [, ... ] ]
>     [ WITH ( publication_parameter [= value] [, ... ] ) ]
>
> where publication object is one of:
>
>     TABLE [ ONLY ] table_name [ * ] [, ... ]
>     ALL TABLES IN SCHEMA { schema_name | CURRENT_SCHEMA } [, ... ]
> ------
>
> Attach a diff(based on v27-*) which change the doc and comments like the
> following.

Thanks for the comments and the changes, I have made a few changes and
merged it into the v28 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0OudeDeFN7bSWPro0hgKx%3D1zPgcNFWnvU_G6w3mDPX0Q%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Tue, Sep 14, 2021 at 2:08 PM vignesh C <vignesh21@gmail.com> wrote:
>
> I have handled this in the patch attached.
>

Few comments:
=============
1.
+ * CREATE PUBLICATION FOR pub_obj [, pub_obj2] [WITH options]
+ *
+ * pub_obj is one of:
+ *
+ * TABLE table [, table2]
..
..
- * ALTER PUBLICATION name ADD TABLE table [, table2]
+ * ALTER PUBLICATION name ADD pub_obj [, pub_obj ...]
  *
- * ALTER PUBLICATION name DROP TABLE table [, table2]
+ * ALTER PUBLICATION name DROP pub_obj [, pub_obj ...]
  *
- * ALTER PUBLICATION name SET TABLE table [, table2]
+ * ALTER PUBLICATION name SET pub_obj [, pub_obj ...]

In all the above places, the object names mentioned in square brackets
are not consistent. I suggest using [, ...] everywhere as that is what
we are using in docs as well.

2.
+/*
+ * Check if the relation schema is member of the schema list.
+ */
+static void
+RelSchemaIsMemberOfSchemaList(List *rels, List *schemaidlist, bool schemacheck)

Can we change the above comment as: "Check if any of the given
relation's schema is a member of the given schema list."?

3.
+ errmsg("cannot add relation \"%s.%s\" to publication",
+ get_namespace_name(relSchemaId),
+ RelationGetRelationName(rel)),
+ errdetail("Table's schema \"%s\" is already part of the publication.",
+ get_namespace_name(relSchemaId)));

This and other parts of error messages in
+RelSchemaIsMemberOfSchemaList are not aligned. I think you can now
run pgindent on your patches that will solve the indentation issues in
the patch.

4.
AlterPublicationSchemas()
{
..
+ /*
+ * If the table option was not specified remove the existing tables
+ * from the publication.
+ */
+ if (!tables)
+ {
+ rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
+ PublicationDropTables(pubform->oid, rels, false, true);
+ }
+
+ /* Identify which schemas should be dropped */
+ delschemas = list_difference_oid(oldschemaids, schemaidlist);
+
+ /* And drop them */
+ PublicationDropSchemas(pubform->oid, delschemas, true);

Here, you have neither locked tables to be dropped nor schemas. I
think both need to be locked as we do for tables in similar code in
AlterPublicationTables(). Can you please test via debugger what
happens if we try to drop without taking lock here and concurrently
try to drop the actual object? It should give some error. If we decide
to lock here then we should be able to pass the list of relations to
PublicationDropTables() instead of Oids which would then obviate the
need for any change to that function.

Similarly don't we need to lock schemas before dropping them in
AlterPublicationTables()?

5.
+/*
+ * Find the ObjectAddress for a publication tables in schema.  The first
+ * element of the object parameter is the schema name, the second is the
+ * publication name.
+ */
+static ObjectAddress
+get_object_address_publication_schema(List *object, bool missing_ok)

The first part of the above comment is not clear. Can we write it as:
"Find the ObjectAddress for a publication schema. .."?

6.
+List *
+GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt)
+{
+ Relation classRel;
+ ScanKeyData key[3];
+ TableScanDesc scan;
+ HeapTuple tuple;
+ List    *result = NIL;
+ int keycount = 0;
+
+ Assert(schemaid != InvalidOid);

Isn't it better to use OidIsValid() in the above assert?

7.
@@ -974,6 +974,7 @@ EventTriggerSupportsObjectType(ObjectType obtype)
  case OBJECT_PROCEDURE:
  case OBJECT_PUBLICATION:
  case OBJECT_PUBLICATION_REL:
+ case OBJECT_PUBLICATION_REL_IN_NAMESPACE:
  case OBJECT_ROUTINE:
  case OBJECT_RULE:
  case OBJECT_SCHEMA:
@@ -1050,6 +1051,7 @@ EventTriggerSupportsObjectClass(ObjectClass objclass)
  case OCLASS_EXTENSION:
  case OCLASS_POLICY:
  case OCLASS_PUBLICATION:
+ case OCLASS_PUBLICATION_NAMESPACE:
  case OCLASS_PUBLICATION_REL:
  case OCLASS_SUBSCRIPTION:
  case OCLASS_TRANSFORM:
@@ -2127,6 +2129,7 @@ stringify_grant_objtype(ObjectType objtype)
  case OBJECT_POLICY:
  case OBJECT_PUBLICATION:
  case OBJECT_PUBLICATION_REL:
+ case OBJECT_PUBLICATION_REL_IN_NAMESPACE:
  case OBJECT_ROLE:
  case OBJECT_RULE:
  case OBJECT_STATISTIC_EXT:
@@ -2209,6 +2212,7 @@ stringify_adefprivs_objtype(ObjectType objtype)
  case OBJECT_POLICY:
  case OBJECT_PUBLICATION:
  case OBJECT_PUBLICATION_REL:
+ case OBJECT_PUBLICATION_REL_IN_NAMESPACE:

What is the reason for using different names for object_class and
object_type? Normally, we use the same. Is it making things clear in
any place?

8.
+ if (stmt->action == DEFELEM_ADD)
+ {
+ List *pubschemas = GetPublicationSchemas(pubid);
+
+ /* check if the relation is member of the schema list specified */
+ RelSchemaIsMemberOfSchemaList(rels, schemaidlist, false);
+
+ /*
+ * Check if the relation is member of the existing schema in the
+ * publication.
+ */
+ RelSchemaIsMemberOfSchemaList(rels, pubschemas, false);

Isn't it better to concat the list of schemas and then check the
membership of relations once?

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Tue, Sep 14, 2021 at 6:38 PM vignesh C <vignesh21@gmail.com> wrote:
>
> I have handled this in the patch attached.
>

Regarding the following function in the v28-0002 patch:

+/*
+ * Check if the relation schema is member of the schema list.
+ */
+static void
+RelSchemaIsMemberOfSchemaList(List *rels, List *schemaidlist, bool schemacheck)

I think this function is not well named or commented, and I don't like
how the "schemacheck" bool parameter determines the type of objects in
the "rels" list.
I would suggest you simply split this function into two separate
functions, corresponding to each of the blocks of the "if-else" within
the for-loop of the existing RelSchemaIsMemberOfSchemaList function.
The "Is" part of the existing "RelSchemaIsMemberOfSchemaList" function
name implies a boolean return value, so seems misleading.
So I think the names of the two functions that I am suggesting should
be "CheckXXXXNotAlreadyInPublication" or something similar.


Regards,
Greg Nancarrow
Fujitsu Australia



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Tuesday, September 14, 2021 4:39 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> I have handled this in the patch attached.

Thanks for updating the patch.
Here are some comments.

1)
+static void
+AlterPublicationSchemas(AlterPublicationStmt *stmt, Relation rel,
...
+        /*
+         * If the table option was not specified remove the existing tables
+         * from the publication.
+         */
+        if (!tables)
+        {
+            rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
+            PublicationDropTables(pubform->oid, rels, false, true);
+        }


It seems not natural to drop tables in AlterPublication*Schemas*,
I think we'd better do it in AlterPublicationTables.

2)
 static void
 AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
 ...
+            /*
+             * If ALL TABLES IN SCHEMA option was not specified remove the
+             * existing schemas from the publication.
+             */
+            List *pubschemas = GetPublicationSchemas(pubid);
+            PublicationDropSchemas(pubform->oid, pubschemas, false);

Same as 1), Is it better to modify the schema list in AlterPublicationSchemas ?


3)
 static void
 AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
...
        /* check if the relation is member of the schema list specified */
        RelSchemaIsMemberOfSchemaList(rels, schemaidlist, false);

IIRC, The check here is to check the specified tables and schemas in the
command. Personally, this seems a common operation which can be placed in
function AlterPublication(). If we move this check to AlterPublication() and if
comment 1) and 2) makes sense to you, then we don't need the new function
parameters in AlterPublicationTables() and AlterPublicationSchemas().


Best regards,
Hou zj



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Sep 15, 2021 at 12:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 14, 2021 at 2:08 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have handled this in the patch attached.
> >
>
> 4.
> AlterPublicationSchemas()
> {
> ..
> + /*
> + * If the table option was not specified remove the existing tables
> + * from the publication.
> + */
> + if (!tables)
> + {
> + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
> + PublicationDropTables(pubform->oid, rels, false, true);
> + }
> +
> + /* Identify which schemas should be dropped */
> + delschemas = list_difference_oid(oldschemaids, schemaidlist);
> +
> + /* And drop them */
> + PublicationDropSchemas(pubform->oid, delschemas, true);
>
> Here, you have neither locked tables to be dropped nor schemas. I
> think both need to be locked as we do for tables in similar code in
> AlterPublicationTables(). Can you please test via debugger what
> happens if we try to drop without taking lock here and concurrently
> try to drop the actual object? It should give some error. If we decide
> to lock here then we should be able to pass the list of relations to
> PublicationDropTables() instead of Oids which would then obviate the
> need for any change to that function.
>
> Similarly don't we need to lock schemas before dropping them in
> AlterPublicationTables()?
>

I think there is one more similar locking problem.
AlterPublicationSchemas()
{
..
+ if (stmt->action == DEFELEM_ADD)
+ {
+ List *rels;
+
+ rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
+ RelSchemaIsMemberOfSchemaList(rels, schemaidlist, true);
...
...
}

Here, we don't have a lock on the relation. So, what if the relation
is concurrently dropped after you get the rel list by
GetPublicationRelations?

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Sep 15, 2021 at 4:45 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Tue, Sep 14, 2021 at 6:38 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have handled this in the patch attached.
> >
>
> Regarding the following function in the v28-0002 patch:
>
> +/*
> + * Check if the relation schema is member of the schema list.
> + */
> +static void
> +RelSchemaIsMemberOfSchemaList(List *rels, List *schemaidlist, bool schemacheck)
>
> I think this function is not well named or commented, and I don't like
> how the "schemacheck" bool parameter determines the type of objects in
> the "rels" list.
>

I think after fixing the comments in my previous email, the rels list
will become the same for this function but surely the extra parameter
is required for giving object-specific errors.

> I would suggest you simply split this function into two separate
> functions, corresponding to each of the blocks of the "if-else" within
> the for-loop of the existing RelSchemaIsMemberOfSchemaList function.
> The "Is" part of the existing "RelSchemaIsMemberOfSchemaList" function
> name implies a boolean return value, so seems misleading.
> So I think the names of the two functions that I am suggesting should
> be "CheckXXXXNotAlreadyInPublication" or something similar.
>

I think if we write individual functions then we need to add new
functions as and when we add new object types like sequences. The
other idea could be to keep a single function like now say
CheckObjSchemaNotAlreadyInPublication and instead of the bool
parameter as the patch has now, we can keep an enum parameter
"add_obj_type" for 'rel', 'schema', 'sequence'. We can either use
exiting enum PublicationObjSpecType or define a new one for the same.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Sep 16, 2021 at 8:59 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Tuesday, September 14, 2021 4:39 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have handled this in the patch attached.
>
> Thanks for updating the patch.
> Here are some comments.
>
> 1)
> +static void
> +AlterPublicationSchemas(AlterPublicationStmt *stmt, Relation rel,
> ...
> +               /*
> +                * If the table option was not specified remove the existing tables
> +                * from the publication.
> +                */
> +               if (!tables)
> +               {
> +                       rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
> +                       PublicationDropTables(pubform->oid, rels, false, true);
> +               }
>
>
> It seems not natural to drop tables in AlterPublication*Schemas*,
> I think we'd better do it in AlterPublicationTables.

I felt keeping the current way keeps it better to avoid additional
checks. Thoughts?

> 2)
>  static void
>  AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
>  ...
> +                       /*
> +                        * If ALL TABLES IN SCHEMA option was not specified remove the
> +                        * existing schemas from the publication.
> +                        */
> +                       List *pubschemas = GetPublicationSchemas(pubid);
> +                       PublicationDropSchemas(pubform->oid, pubschemas, false);
>
> Same as 1), Is it better to modify the schema list in AlterPublicationSchemas ?

This is similar to above.

> 3)
>  static void
>  AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
> ...
>                 /* check if the relation is member of the schema list specified */
>                 RelSchemaIsMemberOfSchemaList(rels, schemaidlist, false);
>
> IIRC, The check here is to check the specified tables and schemas in the
> command. Personally, this seems a common operation which can be placed in
> function AlterPublication(). If we move this check to AlterPublication() and if
> comment 1) and 2) makes sense to you, then we don't need the new function
> parameters in AlterPublicationTables() and AlterPublicationSchemas().

I felt we can keep the checks as is currently, else we will have to
extra checks outside and addition calls for conversion from oid to
Relation like:
  if (stmt->options)
    AlterPublicationOptions(pstate, stmt, rel, tup);
  else
  {
    if (relations)
    {
      if (stmt->action != DEFELEM_DROP)
      {
        List *rels = OpenTableList(relations);

        /* check if relation is member of the schema list specified */
        RelSchemaIsMemberOfSchemaList(rels, schemaidlist, false);
        CloseTableList(rels);
      }

      AlterPublicationTables(stmt, rel, tup, relations,
                   list_length(schemaidlist));
    }
    if (schemaidlist)
      AlterPublicationSchemas(stmt, rel, tup, schemaidlist,
                  list_length(relations));
  }

Thoughts?

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Sep 15, 2021 at 12:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 14, 2021 at 2:08 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have handled this in the patch attached.
> >
>
> Few comments:
> =============
> 1.
> + * CREATE PUBLICATION FOR pub_obj [, pub_obj2] [WITH options]
> + *
> + * pub_obj is one of:
> + *
> + * TABLE table [, table2]
> ..
> ..
> - * ALTER PUBLICATION name ADD TABLE table [, table2]
> + * ALTER PUBLICATION name ADD pub_obj [, pub_obj ...]
>   *
> - * ALTER PUBLICATION name DROP TABLE table [, table2]
> + * ALTER PUBLICATION name DROP pub_obj [, pub_obj ...]
>   *
> - * ALTER PUBLICATION name SET TABLE table [, table2]
> + * ALTER PUBLICATION name SET pub_obj [, pub_obj ...]
>
> In all the above places, the object names mentioned in square brackets
> are not consistent. I suggest using [, ...] everywhere as that is what
> we are using in docs as well.

Modified

> 2.
> +/*
> + * Check if the relation schema is member of the schema list.
> + */
> +static void
> +RelSchemaIsMemberOfSchemaList(List *rels, List *schemaidlist, bool schemacheck)
>
> Can we change the above comment as: "Check if any of the given
> relation's schema is a member of the given schema list."?

Modified

> 3.
> + errmsg("cannot add relation \"%s.%s\" to publication",
> + get_namespace_name(relSchemaId),
> + RelationGetRelationName(rel)),
> + errdetail("Table's schema \"%s\" is already part of the publication.",
> + get_namespace_name(relSchemaId)));
>
> This and other parts of error messages in
> +RelSchemaIsMemberOfSchemaList are not aligned. I think you can now
> run pgindent on your patches that will solve the indentation issues in
> the patch.

Changed by running pgindent.

> 4.
> AlterPublicationSchemas()
> {
> ..
> + /*
> + * If the table option was not specified remove the existing tables
> + * from the publication.
> + */
> + if (!tables)
> + {
> + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
> + PublicationDropTables(pubform->oid, rels, false, true);
> + }
> +
> + /* Identify which schemas should be dropped */
> + delschemas = list_difference_oid(oldschemaids, schemaidlist);
> +
> + /* And drop them */
> + PublicationDropSchemas(pubform->oid, delschemas, true);
>
> Here, you have neither locked tables to be dropped nor schemas. I
> think both need to be locked as we do for tables in similar code in
> AlterPublicationTables(). Can you please test via debugger what
> happens if we try to drop without taking lock here and concurrently
> try to drop the actual object? It should give some error. If we decide
> to lock here then we should be able to pass the list of relations to
> PublicationDropTables() instead of Oids which would then obviate the
> need for any change to that function.
>
> Similarly don't we need to lock schemas before dropping them in
> AlterPublicationTables()?

we will get the following error, if concurrently dropped from another
session during debugging:
postgres=# alter publication pub1 set all tables in schema sch2;
ERROR:  cache lookup failed for publication table 16418
Modified to add locking

> 5.
> +/*
> + * Find the ObjectAddress for a publication tables in schema.  The first
> + * element of the object parameter is the schema name, the second is the
> + * publication name.
> + */
> +static ObjectAddress
> +get_object_address_publication_schema(List *object, bool missing_ok)
>
> The first part of the above comment is not clear. Can we write it as:
> "Find the ObjectAddress for a publication schema. .."?

Modified

> 6.
> +List *
> +GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt)
> +{
> + Relation classRel;
> + ScanKeyData key[3];
> + TableScanDesc scan;
> + HeapTuple tuple;
> + List    *result = NIL;
> + int keycount = 0;
> +
> + Assert(schemaid != InvalidOid);
>
> Isn't it better to use OidIsValid() in the above assert?

Modified

> 7.
> @@ -974,6 +974,7 @@ EventTriggerSupportsObjectType(ObjectType obtype)
>   case OBJECT_PROCEDURE:
>   case OBJECT_PUBLICATION:
>   case OBJECT_PUBLICATION_REL:
> + case OBJECT_PUBLICATION_REL_IN_NAMESPACE:
>   case OBJECT_ROUTINE:
>   case OBJECT_RULE:
>   case OBJECT_SCHEMA:
> @@ -1050,6 +1051,7 @@ EventTriggerSupportsObjectClass(ObjectClass objclass)
>   case OCLASS_EXTENSION:
>   case OCLASS_POLICY:
>   case OCLASS_PUBLICATION:
> + case OCLASS_PUBLICATION_NAMESPACE:
>   case OCLASS_PUBLICATION_REL:
>   case OCLASS_SUBSCRIPTION:
>   case OCLASS_TRANSFORM:
> @@ -2127,6 +2129,7 @@ stringify_grant_objtype(ObjectType objtype)
>   case OBJECT_POLICY:
>   case OBJECT_PUBLICATION:
>   case OBJECT_PUBLICATION_REL:
> + case OBJECT_PUBLICATION_REL_IN_NAMESPACE:
>   case OBJECT_ROLE:
>   case OBJECT_RULE:
>   case OBJECT_STATISTIC_EXT:
> @@ -2209,6 +2212,7 @@ stringify_adefprivs_objtype(ObjectType objtype)
>   case OBJECT_POLICY:
>   case OBJECT_PUBLICATION:
>   case OBJECT_PUBLICATION_REL:
> + case OBJECT_PUBLICATION_REL_IN_NAMESPACE:
>
> What is the reason for using different names for object_class and
> object_type? Normally, we use the same. Is it making things clear in
> any place?

I thought it might be easier to review, I have changed it to the
standard way of naming object_class and object_type. Renamed it to
OBJECT_PUBLICATION_NAMESPACE.

> 8.
> + if (stmt->action == DEFELEM_ADD)
> + {
> + List *pubschemas = GetPublicationSchemas(pubid);
> +
> + /* check if the relation is member of the schema list specified */
> + RelSchemaIsMemberOfSchemaList(rels, schemaidlist, false);
> +
> + /*
> + * Check if the relation is member of the existing schema in the
> + * publication.
> + */
> + RelSchemaIsMemberOfSchemaList(rels, pubschemas, false);
>
> Isn't it better to concat the list of schemas and then check the
> membership of relations once?

Modified.

Attached v29 patch has the fixes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Sep 16, 2021 at 9:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Sep 15, 2021 at 12:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Sep 14, 2021 at 2:08 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > I have handled this in the patch attached.
> > >
> >
> > 4.
> > AlterPublicationSchemas()
> > {
> > ..
> > + /*
> > + * If the table option was not specified remove the existing tables
> > + * from the publication.
> > + */
> > + if (!tables)
> > + {
> > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
> > + PublicationDropTables(pubform->oid, rels, false, true);
> > + }
> > +
> > + /* Identify which schemas should be dropped */
> > + delschemas = list_difference_oid(oldschemaids, schemaidlist);
> > +
> > + /* And drop them */
> > + PublicationDropSchemas(pubform->oid, delschemas, true);
> >
> > Here, you have neither locked tables to be dropped nor schemas. I
> > think both need to be locked as we do for tables in similar code in
> > AlterPublicationTables(). Can you please test via debugger what
> > happens if we try to drop without taking lock here and concurrently
> > try to drop the actual object? It should give some error. If we decide
> > to lock here then we should be able to pass the list of relations to
> > PublicationDropTables() instead of Oids which would then obviate the
> > need for any change to that function.
> >
> > Similarly don't we need to lock schemas before dropping them in
> > AlterPublicationTables()?
> >
>
> I think there is one more similar locking problem.
> AlterPublicationSchemas()
> {
> ..
> + if (stmt->action == DEFELEM_ADD)
> + {
> + List *rels;
> +
> + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
> + RelSchemaIsMemberOfSchemaList(rels, schemaidlist, true);
> ...
> ...
> }
>
> Here, we don't have a lock on the relation. So, what if the relation
> is concurrently dropped after you get the rel list by
> GetPublicationRelations?

This works fine without locking even after concurrent drop, I felt
this works because of MVCC.

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Sep 16, 2021 at 11:24 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Sep 15, 2021 at 4:45 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Tue, Sep 14, 2021 at 6:38 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > I have handled this in the patch attached.
> > >
> >
> > Regarding the following function in the v28-0002 patch:
> >
> > +/*
> > + * Check if the relation schema is member of the schema list.
> > + */
> > +static void
> > +RelSchemaIsMemberOfSchemaList(List *rels, List *schemaidlist, bool schemacheck)
> >
> > I think this function is not well named or commented, and I don't like
> > how the "schemacheck" bool parameter determines the type of objects in
> > the "rels" list.
> >
>
> I think after fixing the comments in my previous email, the rels list
> will become the same for this function but surely the extra parameter
> is required for giving object-specific errors.
>
> > I would suggest you simply split this function into two separate
> > functions, corresponding to each of the blocks of the "if-else" within
> > the for-loop of the existing RelSchemaIsMemberOfSchemaList function.
> > The "Is" part of the existing "RelSchemaIsMemberOfSchemaList" function
> > name implies a boolean return value, so seems misleading.
> > So I think the names of the two functions that I am suggesting should
> > be "CheckXXXXNotAlreadyInPublication" or something similar.
> >
>
> I think if we write individual functions then we need to add new
> functions as and when we add new object types like sequences. The
> other idea could be to keep a single function like now say
> CheckObjSchemaNotAlreadyInPublication and instead of the bool
> parameter as the patch has now, we can keep an enum parameter
> "add_obj_type" for 'rel', 'schema', 'sequence'. We can either use
> exiting enum PublicationObjSpecType or define a new one for the same.

Modified the function name and changed the parameter to
PublicationObjSpecType. The changes are present at the v29 patch
posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm1Wb%3D_HGd85wp2WM%2BfLc-8PSJ824TOZEJ6nDz3akWTidw%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Fri, Sep 17, 2021 at 5:39 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, Sep 15, 2021 at 12:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> > 4.
> > AlterPublicationSchemas()
> > {
> > ..
> > + /*
> > + * If the table option was not specified remove the existing tables
> > + * from the publication.
> > + */
> > + if (!tables)
> > + {
> > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
> > + PublicationDropTables(pubform->oid, rels, false, true);
> > + }
> > +
> > + /* Identify which schemas should be dropped */
> > + delschemas = list_difference_oid(oldschemaids, schemaidlist);
> > +
> > + /* And drop them */
> > + PublicationDropSchemas(pubform->oid, delschemas, true);
> >
> > Here, you have neither locked tables to be dropped nor schemas. I
> > think both need to be locked as we do for tables in similar code in
> > AlterPublicationTables(). Can you please test via debugger what
> > happens if we try to drop without taking lock here and concurrently
> > try to drop the actual object? It should give some error. If we decide
> > to lock here then we should be able to pass the list of relations to
> > PublicationDropTables() instead of Oids which would then obviate the
> > need for any change to that function.
> >
> > Similarly don't we need to lock schemas before dropping them in
> > AlterPublicationTables()?
>
> we will get the following error, if concurrently dropped from another
> session during debugging:
> postgres=# alter publication pub1 set all tables in schema sch2;
> ERROR:  cache lookup failed for publication table 16418
> Modified to add locking
>

But you haven't followed my other suggestion related to
PublicationDropTables(). I don't think after doing this, you need to
pass 'true' as the last parameter to PublicationDropTables. In fact,
you can remove that parameter altogether or in other words, we don't
need any change in PublicationDropTables for this patch. Is there a
reason why we shouldn't make this change?

Few other comments:
===================
1. The ordering of lock acquisition for schema and relation in
AlterPublicationSchemas() and AlterPublicationTables() is opposite
which would generally lead to deadlock but it won't here because we
acquire share lock on the schema. But, I think it may still be better
to keep the locking order the same and it might help us to keep schema
and relation code separate

2. One more thing, I think one can concurrently add-relation for a
particular schema and that particular schema. To protect that
AlterPublication should acquire an exclusive lock similar to how we do
in AlterSubscription.

3.
+ /*
+ * If the table option was not specified remove the existing tables
+ * from the publication.
+ */
+ if (!relsSpecified)
+ {
+ List    *relations = NIL;
+ List    *tables = NIL;
+
+ rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
+ tables = RelationOidsToRangevars(rels);
+ relations = OpenTableList(tables);

One problem with using OpenTableList here is that it might try to lock
inherited children twice. Also, you don't need to first convert to
rangevar for locking relations, you can directly use table_open here.

4.
+ | extended_relation_expr
+ {
+ $$ = makeNode(PublicationObjSpec);
+ $$->object = (Node *)$1;
+ }
+ | CURRENT_SCHEMA
+ {
+ $$ = makeNode(PublicationObjSpec);
+ $$->object = (Node *)makeString("CURRENT_SCHEMA");
+ }
  ;

-publication_for_tables:
- FOR TABLE publication_table_list
+/* This can be either a schema or relation name. */
+pubobj_name:
+ ColId
  {
- $$ = (Node *) $3;
+ $$ = (Node *) makeString($1);
  }
- | FOR ALL TABLES
+ | ColId indirection
  {
- $$ = (Node *) makeInteger(true);
+ $$ = (Node *) makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
  }

In some places, you have given space after (Node *) and at other
places, there is no space. Isn't it better to be consistent?

5.
+/* This can be either a schema or relation name. */
+pubobj_name:

Here, we can modify the comment as "This can be either a schema or
relation name. For relations, the inheritance will be implicit." And
then remove the inheritance related comment from code below:

+ /* inheritance query, implicitly */
+ $$ = makeNode(PublicationObjSpec);
+ $$->object = $1;

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Fri, Sep 17, 2021 at 5:40 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, Sep 16, 2021 at 9:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > I think there is one more similar locking problem.
> > AlterPublicationSchemas()
> > {
> > ..
> > + if (stmt->action == DEFELEM_ADD)
> > + {
> > + List *rels;
> > +
> > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
> > + RelSchemaIsMemberOfSchemaList(rels, schemaidlist, true);
> > ...
> > ...
> > }
> >
> > Here, we don't have a lock on the relation. So, what if the relation
> > is concurrently dropped after you get the rel list by
> > GetPublicationRelations?
>
> This works fine without locking even after concurrent drop, I felt
> this works because of MVCC.
>

Can you share the exact scenario you have tested? I think here it can
give a wrong error because it might access invalid cache entry, so I
think a lock is required here. Also, as said before, this might help
to make the rel list consistent in function
CheckObjSchemaNotAlreadyInPublication().

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Sep 20, 2021 at 3:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Sep 17, 2021 at 5:40 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Thu, Sep 16, 2021 at 9:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > I think there is one more similar locking problem.
> > > AlterPublicationSchemas()
> > > {
> > > ..
> > > + if (stmt->action == DEFELEM_ADD)
> > > + {
> > > + List *rels;
> > > +
> > > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
> > > + RelSchemaIsMemberOfSchemaList(rels, schemaidlist, true);
> > > ...
> > > ...
> > > }
> > >
> > > Here, we don't have a lock on the relation. So, what if the relation
> > > is concurrently dropped after you get the rel list by
> > > GetPublicationRelations?
> >
> > This works fine without locking even after concurrent drop, I felt
> > this works because of MVCC.
> >
>
> Can you share the exact scenario you have tested? I think here it can
> give a wrong error because it might access invalid cache entry, so I
> think a lock is required here. Also, as said before, this might help
> to make the rel list consistent in function
> CheckObjSchemaNotAlreadyInPublication().

This is the scenario I tried:
create schema sch1;
create table sch1.t1 (c1 int);
create publication pub1 for table sch1.t1;
alter publication pub1 add all tables in schema sch1;  -- concurrently
drop table sch1.t1 from another session.

I will add the locking and changing of
CheckObjSchemaNotAlreadyInPublication in the next version.

Regards,
Vignesh



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Fri, Sep 17, 2021 at 10:09 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Attached v29 patch has the fixes for the same.
>

Some minor comments on the v29-0002 patch:

(1)
In get_object_address_publication_schema(), the error message:

+ errmsg("publication tables of schema \"%s\" in publication \"%s\"
does not exist",

isn't grammatically correct. It should probably be:

+ errmsg("publication tables of schema \"%s\" in publication \"%s\" do
not exist",

(2)
getPublicationSchemaInfo() function header.
"string" should be "strings"

BEFORE:
+ * nspname. Both pubname and nspname are palloc'd string which will be freed by
AFTER:
+ * nspname. Both pubname and nspname are palloc'd strings which will
be freed by

(3)
getPublicationSchemaInfo()

In the case of "if (!(*nspname))", the following line should probably
be added before returning false:

   *pubname = NULL;

(4)
GetAllSchemasPublicationRelations() function header

Shouldn't:

+ * Gets the list of all relations published by FOR ALL TABLES IN SCHEMA
+ * publication(s).

actually say:

+ * Gets the list of all relations published by a FOR ALL TABLES IN SCHEMA
+ * publication.

since it is for a specified publication?

(5)
I'm wondering, in CheckObjSchemaNotAlreadyInPublication(), instead of
checking "checkobjtype" each loop iteration, wouldn't it be better to
just use the same for-loop in each IF block?


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Sep 21, 2021 at 9:03 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Sep 17, 2021 at 10:09 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v29 patch has the fixes for the same.
> >
>
> Some minor comments on the v29-0002 patch:
>
> (1)
> In get_object_address_publication_schema(), the error message:
>
> + errmsg("publication tables of schema \"%s\" in publication \"%s\"
> does not exist",
>
> isn't grammatically correct. It should probably be:
>
> + errmsg("publication tables of schema \"%s\" in publication \"%s\" do
> not exist",

"does not exist" is used across the file. Should we keep it like that
to maintain consistency. Thoughts?

> (2)
> getPublicationSchemaInfo() function header.
> "string" should be "strings"
>
> BEFORE:
> + * nspname. Both pubname and nspname are palloc'd string which will be freed by
> AFTER:
> + * nspname. Both pubname and nspname are palloc'd strings which will
> be freed by

I will change it in the next version.

> (3)
> getPublicationSchemaInfo()
>
> In the case of "if (!(*nspname))", the following line should probably
> be added before returning false:
>
>    *pubname = NULL;

In case of failure we return false and don't access it. I felt we
could keep it as it is. Thoughts?

> (4)
> GetAllSchemasPublicationRelations() function header
>
> Shouldn't:
>
> + * Gets the list of all relations published by FOR ALL TABLES IN SCHEMA
> + * publication(s).
>
> actually say:
>
> + * Gets the list of all relations published by a FOR ALL TABLES IN SCHEMA
> + * publication.
>
> since it is for a specified publication?

I will change it in the next version.

> (5)
> I'm wondering, in CheckObjSchemaNotAlreadyInPublication(), instead of
> checking "checkobjtype" each loop iteration, wouldn't it be better to
> just use the same for-loop in each IF block?

I will be changing it to:
static void
CheckObjSchemaNotAlreadyInPublication(List *rels, List *schemaidlist,
                    PublicationObjSpecType checkobjtype)
{
  ListCell   *lc;

  foreach(lc, rels)
  {
    Relation  rel = (Relation) lfirst(lc);
    Oid     relSchemaId = RelationGetNamespace(rel);

    if (list_member_oid(schemaidlist, relSchemaId))
    {
      if (checkobjtype == PUBLICATIONOBJ_REL_IN_SCHEMA)
        ereport(ERROR,
            errcode(ERRCODE_INVALID_PARAMETER_VALUE),
            errmsg("cannot add schema \"%s\" to publication",
                 get_namespace_name(relSchemaId)),
            errdetail("Table \"%s\" in schema \"%s\" is already part
of the publication, adding the same schema is not supported.",
                  RelationGetRelationName(rel),
                  get_namespace_name(relSchemaId)));
      else if (checkobjtype == PUBLICATIONOBJ_TABLE)
        ereport(ERROR,
            errcode(ERRCODE_INVALID_PARAMETER_VALUE),
            errmsg("cannot add relation \"%s.%s\" to publication",
                 get_namespace_name(relSchemaId),
                 RelationGetRelationName(rel)),
            errdetail("Table's schema \"%s\" is already part of the
publication.",
                  get_namespace_name(relSchemaId)));
    }
  }
}
After the change checkobjtype will be checked only once in case of error.

Regards,
Vignesh



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Tue, Sep 21, 2021 at 4:12 PM vignesh C <vignesh21@gmail.com> wrote:
>
> > (1)
> > In get_object_address_publication_schema(), the error message:
> >
> > + errmsg("publication tables of schema \"%s\" in publication \"%s\"
> > does not exist",
> >
> > isn't grammatically correct. It should probably be:
> >
> > + errmsg("publication tables of schema \"%s\" in publication \"%s\" do
> > not exist",
>
> "does not exist" is used across the file. Should we keep it like that
> to maintain consistency. Thoughts?
>

When it's singular, "does not exist" is correct.
I think currently only this case exists in the publication code.
e.g.
"publication \"%s\" does not exist"
"publication relation \"%s\" in publication \"%s\" does not exist"

But "publication tables" is plural, so it needs to say "do not exist"
rather than "does not exist".

> >
> > In the case of "if (!(*nspname))", the following line should probably
> > be added before returning false:
> >
> >    *pubname = NULL;
>
> In case of failure we return false and don't access it. I felt we
> could keep it as it is. Thoughts?
>

OK then, I might be being a bit pedantic.
(I was just thinking, strictly speaking, we probably shouldn't be
writing into the caller's pass-by-reference parameters in the case
false is returned)

> > (5)
> > I'm wondering, in CheckObjSchemaNotAlreadyInPublication(), instead of
> > checking "checkobjtype" each loop iteration, wouldn't it be better to
> > just use the same for-loop in each IF block?
>
> I will be changing it to:
> static void
> CheckObjSchemaNotAlreadyInPublication(List *rels, List *schemaidlist,
>                     PublicationObjSpecType checkobjtype)
> {
>   ListCell   *lc;
>
>   foreach(lc, rels)
>   {
>     Relation  rel = (Relation) lfirst(lc);
>     Oid     relSchemaId = RelationGetNamespace(rel);
>
>     if (list_member_oid(schemaidlist, relSchemaId))
>     {
>       if (checkobjtype == PUBLICATIONOBJ_REL_IN_SCHEMA)
>         ereport(ERROR,
>             errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>             errmsg("cannot add schema \"%s\" to publication",
>                  get_namespace_name(relSchemaId)),
>             errdetail("Table \"%s\" in schema \"%s\" is already part
> of the publication, adding the same schema is not supported.",
>                   RelationGetRelationName(rel),
>                   get_namespace_name(relSchemaId)));
>       else if (checkobjtype == PUBLICATIONOBJ_TABLE)
>         ereport(ERROR,
>             errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>             errmsg("cannot add relation \"%s.%s\" to publication",
>                  get_namespace_name(relSchemaId),
>                  RelationGetRelationName(rel)),
>             errdetail("Table's schema \"%s\" is already part of the
> publication.",
>                   get_namespace_name(relSchemaId)));
>     }
>   }
> }
> After the change checkobjtype will be checked only once in case of error.
>

OK.

One thing related to this code is the following:

i)
postgres=# create publication pub1 for all tables in schema sch1,
table sch1.test;
ERROR:  cannot add relation "sch1.test" to publication
DETAIL:  Table's schema "sch1" is already part of the publication.

ii)
postgres=# create publication pub1 for table sch1.test, all tables in
schema sch1;
ERROR:  cannot add relation "sch1.test" to publication
DETAIL:  Table's schema "sch1" is already part of the publication.

Notice that in case (ii), the same error message is used, but the
order of items to be "added" to the publication is the reverse of case
(i), and really implies the table "sch1.test" was added first, but
this is not reflected by the error message. So it seems slightly odd
to say the schema is already part of the publication, when the table
was actually listed first.
I'm wondering if this can be improved?

One idea I had was the following more generic type of message, but I'm
not 100% happy with the wording:

    DETAIL:  Schema "sch1" and one of its tables can't separately be
part of the publication.


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Sep 20, 2021 at 3:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Sep 17, 2021 at 5:39 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, Sep 15, 2021 at 12:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > 4.
> > > AlterPublicationSchemas()
> > > {
> > > ..
> > > + /*
> > > + * If the table option was not specified remove the existing tables
> > > + * from the publication.
> > > + */
> > > + if (!tables)
> > > + {
> > > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
> > > + PublicationDropTables(pubform->oid, rels, false, true);
> > > + }
> > > +
> > > + /* Identify which schemas should be dropped */
> > > + delschemas = list_difference_oid(oldschemaids, schemaidlist);
> > > +
> > > + /* And drop them */
> > > + PublicationDropSchemas(pubform->oid, delschemas, true);
> > >
> > > Here, you have neither locked tables to be dropped nor schemas. I
> > > think both need to be locked as we do for tables in similar code in
> > > AlterPublicationTables(). Can you please test via debugger what
> > > happens if we try to drop without taking lock here and concurrently
> > > try to drop the actual object? It should give some error. If we decide
> > > to lock here then we should be able to pass the list of relations to
> > > PublicationDropTables() instead of Oids which would then obviate the
> > > need for any change to that function.
> > >
> > > Similarly don't we need to lock schemas before dropping them in
> > > AlterPublicationTables()?
> >
> > we will get the following error, if concurrently dropped from another
> > session during debugging:
> > postgres=# alter publication pub1 set all tables in schema sch2;
> > ERROR:  cache lookup failed for publication table 16418
> > Modified to add locking
> >
>
> But you haven't followed my other suggestion related to
> PublicationDropTables(). I don't think after doing this, you need to
> pass 'true' as the last parameter to PublicationDropTables. In fact,
> you can remove that parameter altogether or in other words, we don't
> need any change in PublicationDropTables for this patch. Is there a
> reason why we shouldn't make this change?

Modified.

> Few other comments:
> ===================
> 1. The ordering of lock acquisition for schema and relation in
> AlterPublicationSchemas() and AlterPublicationTables() is opposite
> which would generally lead to deadlock but it won't here because we
> acquire share lock on the schema. But, I think it may still be better
> to keep the locking order the same and it might help us to keep schema
> and relation code separate

Modified

> 2. One more thing, I think one can concurrently add-relation for a
> particular schema and that particular schema. To protect that
> AlterPublication should acquire an exclusive lock similar to how we do
> in AlterSubscription.

Modified

> 3.
> + /*
> + * If the table option was not specified remove the existing tables
> + * from the publication.
> + */
> + if (!relsSpecified)
> + {
> + List    *relations = NIL;
> + List    *tables = NIL;
> +
> + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
> + tables = RelationOidsToRangevars(rels);
> + relations = OpenTableList(tables);
>
> One problem with using OpenTableList here is that it might try to lock
> inherited children twice. Also, you don't need to first convert to
> rangevar for locking relations, you can directly use table_open here.

Modified

> 4.
> + | extended_relation_expr
> + {
> + $$ = makeNode(PublicationObjSpec);
> + $$->object = (Node *)$1;
> + }
> + | CURRENT_SCHEMA
> + {
> + $$ = makeNode(PublicationObjSpec);
> + $$->object = (Node *)makeString("CURRENT_SCHEMA");
> + }
>   ;
>
> -publication_for_tables:
> - FOR TABLE publication_table_list
> +/* This can be either a schema or relation name. */
> +pubobj_name:
> + ColId
>   {
> - $$ = (Node *) $3;
> + $$ = (Node *) makeString($1);
>   }
> - | FOR ALL TABLES
> + | ColId indirection
>   {
> - $$ = (Node *) makeInteger(true);
> + $$ = (Node *) makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
>   }
>
> In some places, you have given space after (Node *) and at other
> places, there is no space. Isn't it better to be consistent?

I have changed it to "(Node *)" without space in *.y, I did not change
in *.c as I noticed it is used with space like "(Node *) " in other
places of *.c files.

> 5.
> +/* This can be either a schema or relation name. */
> +pubobj_name:
>
> Here, we can modify the comment as "This can be either a schema or
> relation name. For relations, the inheritance will be implicit."

Modified

 And
> then remove the inheritance related comment from code below:
>
> + /* inheritance query, implicitly */
> + $$ = makeNode(PublicationObjSpec);
> + $$->object = $1;

Modified.

Attached v30 patch has the fixes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Sep 20, 2021 at 4:20 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Sep 20, 2021 at 3:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Sep 17, 2021 at 5:40 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Thu, Sep 16, 2021 at 9:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > I think there is one more similar locking problem.
> > > > AlterPublicationSchemas()
> > > > {
> > > > ..
> > > > + if (stmt->action == DEFELEM_ADD)
> > > > + {
> > > > + List *rels;
> > > > +
> > > > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
> > > > + RelSchemaIsMemberOfSchemaList(rels, schemaidlist, true);
> > > > ...
> > > > ...
> > > > }
> > > >
> > > > Here, we don't have a lock on the relation. So, what if the relation
> > > > is concurrently dropped after you get the rel list by
> > > > GetPublicationRelations?
> > >
> > > This works fine without locking even after concurrent drop, I felt
> > > this works because of MVCC.
> > >
> >
> > Can you share the exact scenario you have tested? I think here it can
> > give a wrong error because it might access invalid cache entry, so I
> > think a lock is required here. Also, as said before, this might help
> > to make the rel list consistent in function
> > CheckObjSchemaNotAlreadyInPublication().
>
> This is the scenario I tried:
> create schema sch1;
> create table sch1.t1 (c1 int);
> create publication pub1 for table sch1.t1;
> alter publication pub1 add all tables in schema sch1;  -- concurrently
> drop table sch1.t1 from another session.
>
> I will add the locking and changing of
> CheckObjSchemaNotAlreadyInPublication in the next version.

I have made the changes for the above at the v30 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm3aFtXpkD4m28-ENG9F4faBEVdGNUrEhgKV0pHr2S_C2g%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Sep 21, 2021 at 9:03 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Sep 17, 2021 at 10:09 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v29 patch has the fixes for the same.
> >
>
> Some minor comments on the v29-0002 patch:
>
> (1)
> In get_object_address_publication_schema(), the error message:
>
> + errmsg("publication tables of schema \"%s\" in publication \"%s\"
> does not exist",
>
> isn't grammatically correct. It should probably be:
>
> + errmsg("publication tables of schema \"%s\" in publication \"%s\" do
> not exist",

Modified

> (2)
> getPublicationSchemaInfo() function header.
> "string" should be "strings"
>
> BEFORE:
> + * nspname. Both pubname and nspname are palloc'd string which will be freed by
> AFTER:
> + * nspname. Both pubname and nspname are palloc'd strings which will
> be freed by

Modified

> (3)
> getPublicationSchemaInfo()
>
> In the case of "if (!(*nspname))", the following line should probably
> be added before returning false:
>
>    *pubname = NULL;

I left it as it is, as we don't access it in case of return false.

> (4)
> GetAllSchemasPublicationRelations() function header
>
> Shouldn't:
>
> + * Gets the list of all relations published by FOR ALL TABLES IN SCHEMA
> + * publication(s).
>
> actually say:
>
> + * Gets the list of all relations published by a FOR ALL TABLES IN SCHEMA
> + * publication.
>
> since it is for a specified publication?

Modified

> (5)
> I'm wondering, in CheckObjSchemaNotAlreadyInPublication(), instead of
> checking "checkobjtype" each loop iteration, wouldn't it be better to
> just use the same for-loop in each IF block?

Modified

I have made the changes for the above at the v30 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm3aFtXpkD4m28-ENG9F4faBEVdGNUrEhgKV0pHr2S_C2g%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Wed, Sep 22, 2021 at 4:02 AM vignesh C <vignesh21@gmail.com> wrote:
>
> Attached v30 patch has the fixes for the same.
>

Thanks for all the patch updates.
I have some suggested updates to the v30-0005 documentation patch:

doc/src/sgml/ref/alter_publication.sgml
(1)
I'm thinking it might be better to simplify the description, because
it's a bit wordy and difficult to read with the "all tables in schema"
bits.
Suggested update is below (thoughts?):

BEFORE:
+   The first three variants change which tables and/or all tables in schema are
+   part of the publication.  The <literal>SET</literal> clause will replace
+   the list of tables and/or all tables in schema in the publication with the
+   specified one, the existing tables and all tables in schema that were
+   present in the publication will be removed.  The <literal>ADD</literal>
+   clause will add one or more tables and/or all tables in schema to the
+   publication. The <literal>DROP</literal> clauses will remove one or more
+   tables and/or all tables in schema from the publication.  Note that adding
+   tables and/or all tables in schema to a publication that is already
+   subscribed to will require a <literal>ALTER SUBSCRIPTION ...
REFRESH PUBLICATION</literal>
+   action on the subscribing side in order to become effective.
AFTER:
+   The first three variants change which tables/schemas are
+   part of the publication.  The <literal>SET</literal> clause will replace
+   the list of tables/schemas in the publication with the
+   specified list; the existing tables/schemas that were
+   present in the publication will be removed.  The <literal>ADD</literal>
+   clause will add one or more tables/schemas to the
+   publication. The <literal>DROP</literal> clauses will remove one or more
+   tables/schemas from the publication.  Note that adding
+   tables/schemas to a publication that is already
+   subscribed to will require a <literal>ALTER SUBSCRIPTION ...
REFRESH PUBLICATION</literal>
+   action on the subscribing side in order to become effective.


doc/src/sgml/ref/create_publication.sgml
(2)
I suggest an update similar to the following:

BEFORE:
+      Specifying a table that is part of schema specified in
+      <literal>FOR ALL TABLES IN SCHEMA</literal> option is not supported.
AFTER:
+      Specifying a table that is part of a schema already included in
the publication is not supported.


(3)
I find the following description a little unclear:

+     <para>
+      Specifying a schema along with schema's table specified as part of
+      <literal>FOR TABLE</literal> option is not supported.
+     </para>

Perhaps the following would be better:

+     <para>
+      Specifying a schema that contains a table already included in the
+      publication is not supported.
+     </para>

(4)
Minor fix:
BEFORE:
+   rights on the table.  The <command>FOR ALL TABLES</command> and
+   <command>FOR ALL TABLES IN SCHEMA</command> clause requires the invoking
+   user to be a superuser.
AFTER:
+   rights on the table.  The <command>FOR ALL TABLES</command> and
+   <command>FOR ALL TABLES IN SCHEMA</command> clauses require the invoking
+   user to be a superuser.


Regards,
Greg Nancarrow
Fujitsu Australia



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Wednesday, September 22, 2021 2:02 AM vignesh C <vignesh21@gmail.com> wrote:
> Attached v30 patch has the fixes for the same.

Thanks for updating the patches.

I have one comment.
@@ -474,7 +707,75 @@ AlterPublication(ParseState *pstate, AlterPublicationStmt *stmt)
...
+        if (list_length(relations))
+        {
...
+            /* remove the existing schemas from the publication */
+            PublicationDropSchemas(pubform->oid, delschemas, false);
...
+        }

After more thoughts on it, I still don't think drop all the schemas under " if
(list_length(relations))" is a good idea. I think 1) we'd better keep schema
and relation code separate. 2) if we support other type object(SEQUENCE) I the
future and only SET xx SEQUENCE, I think the above logic won't work because
both relations and schemaidlist will be NIL.

Same as the logic of drop all tables under " if (list_length(schemaidlist))".

Thoughs ?

Best regards,
Hou zj

Re: Added schema level support for publication.

From
Masahiko Sawada
Date:
On Wed, Sep 22, 2021 at 3:02 AM vignesh C <vignesh21@gmail.com> wrote:
>
>
> Attached v30 patch has the fixes for the same.
>

Thank you for updating the patches.

Here are random comments on v30-0002 patch:

+
+                       if (stmt->action == DEFELEM_SET &&
!list_length(schemaidlist))
+                       {
+                               delschemas =
GetPublicationSchemas(pubform->oid);
+                               LockSchemaList(delschemas);
+                       }

I think "list_length(schemaidlist) > 0" would be more readable.

---
This patch introduces some new static functions to publicationcmds.c
but some have function prototypes but some don't. As far as I checked,

ObjectsInPublicationToOids()
CheckObjSchemaNotAlreadyInPublication()
GetAlterPublicationDelRelations()
AlterPublicationSchemas()
CheckPublicationAlterTables()
CheckPublicationAlterSchemas()
LockSchemaList()
OpenReliIdList()
PublicationAddSchemas()
PublicationDropSchemas()

are newly introduced but only four functions:

OpenReliIdList()
LockSchemaList()
PublicationAddSchemas()
PublicationDropSchemas()

have function prototypes. Actually, there already are functions that
don't have their prototype in publicationcmds.c. But it seems better
to clarify what kind of functions don't need to have a prototype at
least in this file.

---
ISTM we can inline the contents of three functions:
GetAlterPublicationDelRelations(), CheckPublicationAlterTables(), and
CheckPublicationAlterSchemas(). These have only one caller and ISTM
makes the readability worse. I think it's not necessary to make
functions for them.

---
+ * This is dispatcher function for AlterPublicationOptions,
+ * AlterPublicationSchemas and AlterPublicationTables.

As this comment mentioned, AlterPublication() calls
AlterPublicationTables() and AlterPublicationSchemas() but this
function also a lot of pre-processing such as building the list and
some checks, depending on stmt->action before calling these two
functions. And those two functions also perform some operation
depending on stmt->action. So ISTM it's better to move those
pre-processing to these two functions and have AlterPublication() just
call these two functions. What do you think?

---
+List *
+GetAllSchemasPublicationRelations(Oid puboid, PublicationPartOpt pub_partopt)

Since this function gets all relations in the schema publication, I
think GetAllSchemaPublicationRelations() would be better as a function
name (removing 's' before 'P').

---
+                       if (!IsA(node, String))
+                               ereport(ERROR,
+                                               errcode(ERRCODE_SYNTAX_ERROR),
+                                               errmsg("invalid schema
name at or near"),
+
parser_errposition(pstate, pubobj->location));

The error message should mention where the invalid schema name is at
or near. Also, In the following example, the error position in the
error message seems not to be where the invalid schemaname s.s is:

postgres(1:47707)=# create publication p for all tables in schema s.s;
ERROR:  invalid schema name at or near
LINE 1: create publication p for all tables in schema s.s;
                                 ^

---
+               if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE)
+               {
+                       if (IsA(node, RangeVar))
+                               *rels = lappend(*rels, (RangeVar *) node);
+                       else if (IsA(node, String))
+                       {
+                               RangeVar   *rel = makeRangeVar(NULL,
strVal(node),
+
                    pubobj->location);
+
+                               *rels = lappend(*rels, rel);
+                       }
+               }
+               else if (pubobj->pubobjtype == PUBLICATIONOBJ_REL_IN_SCHEMA)
+               {
(snip)
+                       /* Filter out duplicates if user specifies
"sch1, sch1" */
+                       *schemas = list_append_unique_oid(*schemas, schemaid);
+               }

Do we need to filter out duplicates also in PUBLICATIONOBJ_TABLE case
since users can specify things like "TABLE tbl, tbl, tbl"?

---
+       if ((action == DEFELEM_ADD || action == DEFELEM_SET) && !superuser())
+               ereport(ERROR,
+                               (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+                                errmsg("must be superuser to add or
set schemas")));

Why do we require the superuser privilege only for ADD and SET but not for DROP?

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Sep 22, 2021 at 8:02 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Wednesday, September 22, 2021 2:02 AM vignesh C <vignesh21@gmail.com> wrote:
> > Attached v30 patch has the fixes for the same.
>
> Thanks for updating the patches.
>
> I have one comment.
> @@ -474,7 +707,75 @@ AlterPublication(ParseState *pstate, AlterPublicationStmt *stmt)
> ...
> +               if (list_length(relations))
> +               {
> ...
> +                       /* remove the existing schemas from the publication */
> +                       PublicationDropSchemas(pubform->oid, delschemas, false);
> ...
> +               }
>
> After more thoughts on it, I still don't think drop all the schemas under " if
> (list_length(relations))" is a good idea. I think 1) we'd better keep schema
> and relation code separate.
>

How do you suggest changing it?

-- 
With Regards,
Amit Kapila.



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Wednesday, September 22, 2021 11:22 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> 
> ---
> +                       if (!IsA(node, String))
> +                               ereport(ERROR,
> +                                               errcode(ERRCODE_SYNTAX_ERROR),
> +                                               errmsg("invalid schema
> name at or near"),
> +
> parser_errposition(pstate, pubobj->location));
> 
> The error message should mention where the invalid schema name is at
> or near. Also, In the following example, the error position in the
> error message seems not to be where the invalid schemaname s.s is:
> 
> postgres(1:47707)=# create publication p for all tables in schema s.s;
> ERROR:  invalid schema name at or near
> LINE 1: create publication p for all tables in schema s.s;
>                                  ^
> 

I noticed this, too. And I think it could be fixed by the following change, thoughts?

--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9681,7 +9681,7 @@ PublicationObjSpec:       TABLE pubobj_expr
                                        {
                                                $$ = $5;
                                                $$->pubobjtype = PUBLICATIONOBJ_REL_IN_SCHEMA;
-                                               $$->location = @1;
+                                               $$->location = @5;
                                        }
                        | pubobj_expr
                                        {


Besides, about this change in tab-complete.c:

+    else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "SCHEMA"))
+        COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+                            " UNION SELECT 'CURRENT_SCHEMA'");

It should be "ALL TABLES IN SCHEMA" not "SCHEMA" at the first line, right?

Regards
Tang

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Tue, Sep 21, 2021 at 11:39 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, Sep 21, 2021 at 9:03 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Fri, Sep 17, 2021 at 10:09 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > Attached v29 patch has the fixes for the same.
> > >
> >
> > Some minor comments on the v29-0002 patch:
> >
> > (1)
> > In get_object_address_publication_schema(), the error message:
> >
> > + errmsg("publication tables of schema \"%s\" in publication \"%s\"
> > does not exist",
> >
> > isn't grammatically correct. It should probably be:
> >
> > + errmsg("publication tables of schema \"%s\" in publication \"%s\" do
> > not exist",
>
> Modified
>

I still see the old message in v30. But I have a different suggestion
for this message. How about changing it to: "publication schema \"%s\"
in publication \"%s\" does not exist"? This will make it similar to
other messages and I don't see the need here to add 'tables' as we
have it in grammar.

-- 
With Regards,
Amit Kapila.



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Wed, Sep 22, 2021 1:29 PMAmit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Sep 22, 2021 at 8:02 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Wednesday, September 22, 2021 2:02 AM vignesh C
> <vignesh21@gmail.com> wrote:
> > > Attached v30 patch has the fixes for the same.
> >
> > Thanks for updating the patches.
> >
> > I have one comment.
> > @@ -474,7 +707,75 @@ AlterPublication(ParseState *pstate,
> > AlterPublicationStmt *stmt) ...
> > +               if (list_length(relations))
> > +               {
> > ...
> > +                       /* remove the existing schemas from the publication
> */
> > +                       PublicationDropSchemas(pubform->oid,
> > + delschemas, false);
> > ...
> > +               }
> >
> > After more thoughts on it, I still don't think drop all the schemas
> > under " if (list_length(relations))" is a good idea. I think 1) we'd
> > better keep schema and relation code separate.
> >
> 
> How do you suggest changing it?

Personally, I think we'd better move the code about changing publication's
tablelist into AlterPublicationTables and the code about changing publication's
schemalist into AlterPublicationSchemas. It's similar to what the v29-patchset
did, the difference is the SET action, I suggest we drop all the tables in
function AlterPublicationTables when user only set schemas and drop all the
schema in AlterPublicationSchemas when user only set tables. In this approach,
we can keep schema and relation code separate and don't need to worry
about the locking order.

Attach a top-up patch which refactor the code like above.
Thoughts ?

Best regards,
Hou zj

 


Attachment

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Wed, Sep 22, 2021 at 9:33 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> >
> > How do you suggest changing it?
>
> Personally, I think we'd better move the code about changing publication's
> tablelist into AlterPublicationTables and the code about changing publication's
> schemalist into AlterPublicationSchemas. It's similar to what the v29-patchset
> did, the difference is the SET action, I suggest we drop all the tables in
> function AlterPublicationTables when user only set schemas and drop all the
> schema in AlterPublicationSchemas when user only set tables. In this approach,
> we can keep schema and relation code separate and don't need to worry
> about the locking order.
>
> Attach a top-up patch which refactor the code like above.
> Thoughts ?
>

Sounds like a good idea.
Is it possible to incorporate the existing
CheckPublicationAlterTables() and CheckPublicationAlterSchemas()
functions into your suggested update?
I think it might tidy up the error-checking a bit.

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Sep 22, 2021 at 5:03 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> Personally, I think we'd better move the code about changing publication's
> tablelist into AlterPublicationTables and the code about changing publication's
> schemalist into AlterPublicationSchemas. It's similar to what the v29-patchset
> did, the difference is the SET action, I suggest we drop all the tables in
> function AlterPublicationTables when user only set schemas and drop all the
> schema in AlterPublicationSchemas when user only set tables. In this approach,
> we can keep schema and relation code separate and don't need to worry
> about the locking order.
>
> Attach a top-up patch which refactor the code like above.
>

Good suggestion. I think it would still be better if we can move the
checks related to superuser and puballtables into a separate function
that gets called before taking a lock on publication.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Sep 22, 2021 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Sep 22, 2021 at 3:02 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> ---
> This patch introduces some new static functions to publicationcmds.c
> but some have function prototypes but some don't. As far as I checked,
>
> ObjectsInPublicationToOids()
> CheckObjSchemaNotAlreadyInPublication()
> GetAlterPublicationDelRelations()
> AlterPublicationSchemas()
> CheckPublicationAlterTables()
> CheckPublicationAlterSchemas()
> LockSchemaList()
> OpenReliIdList()
> PublicationAddSchemas()
> PublicationDropSchemas()
>
> are newly introduced but only four functions:
>
> OpenReliIdList()
> LockSchemaList()
> PublicationAddSchemas()
> PublicationDropSchemas()
>
> have function prototypes. Actually, there already are functions that
> don't have their prototype in publicationcmds.c. But it seems better
> to clarify what kind of functions don't need to have a prototype at
> least in this file.
>

I think if the function is defined after its use then we declare it at
the top. Do you prefer to declare all static functions to allow ease
of usage? Do you have something else in mind?

>
> ---
> +       if ((action == DEFELEM_ADD || action == DEFELEM_SET) && !superuser())
> +               ereport(ERROR,
> +                               (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +                                errmsg("must be superuser to add or
> set schemas")));
>
> Why do we require the superuser privilege only for ADD and SET but not for DROP?
>

For Add/Set of for all tables of Schema is similar to all tables
publication requirement. For Drop, I don't think it is mandatory to
allow only to superuser. The same applies to Alter Publication ...
Drop table case where you don't need to be table owner whereas, for
Add, you need to be. We had a discussion on these points in this
thread. See [1] and some emails prior to it.

[1] - https://www.postgresql.org/message-id/CAA4eK1KqhUBHbcpT92VMPvUUDgGvyOK0ekXOwjNR6L%3DY_bcsGw%40mail.gmail.com

-- 
With Regards,
Amit Kapila.



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
From Thurs, Sep 23, 2021 12:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Sep 22, 2021 at 5:03 PM Hou Zhijie <houzj.fnst@fujitsu.com> wrote:
> >
> > Personally, I think we'd better move the code about changing publication's
> > tablelist into AlterPublicationTables and the code about changing
> publication's
> > schemalist into AlterPublicationSchemas. It's similar to what the
> v29-patchset
> > did, the difference is the SET action, I suggest we drop all the tables in
> > function AlterPublicationTables when user only set schemas and drop all the
> > schema in AlterPublicationSchemas when user only set tables. In this
> approach,
> > we can keep schema and relation code separate and don't need to worry
> > about the locking order.
> >
> > Attach a top-up patch which refactor the code like above.
> >
> 
> Good suggestion. I think it would still be better if we can move the
> checks related to superuser and puballtables into a separate function
> that gets called before taking a lock on publication.

I agreed.

I noticed v30-0001 has been committed with some minor changes, and the V30-0002
patchset need to be rebased accordingly. Attach a rebased version patch set to
make cfbot happy. Also Attach the two top-up patches which refactor the code as
suggested. (top-up patch 1 is to keep schema and table code separate, top-up
patch 2 is to move some cheap check into a function and invoke it before
locking.)

Best regards,
Hou zj


Attachment

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Thur, Sep 23, 2021 11:06 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
> On Wed, Sep 22, 2021 at 9:33 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> >
> > >
> > > How do you suggest changing it?
> >
> > Personally, I think we'd better move the code about changing
> > publication's tablelist into AlterPublicationTables and the code about
> > changing publication's schemalist into AlterPublicationSchemas. It's
> > similar to what the v29-patchset did, the difference is the SET
> > action, I suggest we drop all the tables in function
> > AlterPublicationTables when user only set schemas and drop all the
> > schema in AlterPublicationSchemas when user only set tables. In this
> > approach, we can keep schema and relation code separate and don't need to
> worry about the locking order.
> >
> > Attach a top-up patch which refactor the code like above.
> > Thoughts ?
> >
> 
> Sounds like a good idea.
> Is it possible to incorporate the existing
> CheckPublicationAlterTables() and CheckPublicationAlterSchemas() functions
> into your suggested update?
> I think it might tidy up the error-checking a bit.

I agreed we can put the check about ALL TABLE and superuser into a function
like what the v30-patchset did. But I have some hesitations about the code
related to CheckObjSchemaNotAlreadyInPublication(). Currently, we need to open
and lock the table before invoking the CheckObjxxx function, ISTM we'd better
open the table in function AlterPublicationTables. Maybe we can wait for the
author's(Vignesh) opinion.

Best regards,
Hou zj

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Thu, Sep 23, 2021 at 5:02 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> > Sounds like a good idea.
> > Is it possible to incorporate the existing
> > CheckPublicationAlterTables() and CheckPublicationAlterSchemas() functions
> > into your suggested update?
> > I think it might tidy up the error-checking a bit.
>
> I agreed we can put the check about ALL TABLE and superuser into a function
> like what the v30-patchset did. But I have some hesitations about the code
> related to CheckObjSchemaNotAlreadyInPublication(). Currently, we need to open
> and lock the table before invoking the CheckObjxxx function, ISTM we'd better
> open the table in function AlterPublicationTables. Maybe we can wait for the
> author's(Vignesh) opinion.
>

Yes, I think you're right, the code related to
CheckObjSchemaNotAlreadyInPublication() should be left where it is
(according to your schema refactor patch).

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Sep 22, 2021 at 6:57 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Wed, Sep 22, 2021 at 4:02 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v30 patch has the fixes for the same.
> >
>
> Thanks for all the patch updates.
> I have some suggested updates to the v30-0005 documentation patch:
>
> doc/src/sgml/ref/alter_publication.sgml
> (1)
> I'm thinking it might be better to simplify the description, because
> it's a bit wordy and difficult to read with the "all tables in schema"
> bits.
> Suggested update is below (thoughts?):
>
> BEFORE:
> +   The first three variants change which tables and/or all tables in schema are
> +   part of the publication.  The <literal>SET</literal> clause will replace
> +   the list of tables and/or all tables in schema in the publication with the
> +   specified one, the existing tables and all tables in schema that were
> +   present in the publication will be removed.  The <literal>ADD</literal>
> +   clause will add one or more tables and/or all tables in schema to the
> +   publication. The <literal>DROP</literal> clauses will remove one or more
> +   tables and/or all tables in schema from the publication.  Note that adding
> +   tables and/or all tables in schema to a publication that is already
> +   subscribed to will require a <literal>ALTER SUBSCRIPTION ...
> REFRESH PUBLICATION</literal>
> +   action on the subscribing side in order to become effective.
> AFTER:
> +   The first three variants change which tables/schemas are
> +   part of the publication.  The <literal>SET</literal> clause will replace
> +   the list of tables/schemas in the publication with the
> +   specified list; the existing tables/schemas that were
> +   present in the publication will be removed.  The <literal>ADD</literal>
> +   clause will add one or more tables/schemas to the
> +   publication. The <literal>DROP</literal> clauses will remove one or more
> +   tables/schemas from the publication.  Note that adding
> +   tables/schemas to a publication that is already
> +   subscribed to will require a <literal>ALTER SUBSCRIPTION ...
> REFRESH PUBLICATION</literal>
> +   action on the subscribing side in order to become effective.

Modified

>
> doc/src/sgml/ref/create_publication.sgml
> (2)
> I suggest an update similar to the following:
>
> BEFORE:
> +      Specifying a table that is part of schema specified in
> +      <literal>FOR ALL TABLES IN SCHEMA</literal> option is not supported.
> AFTER:
> +      Specifying a table that is part of a schema already included in
> the publication is not supported.

This doc content is for the following example:
create publication pub1 for all tables in schema sch1, table sch1.t1;
It is  about the schema specified in all tables in schema option along
with table option. I think the existing content is better.

> (3)
> I find the following description a little unclear:
>
> +     <para>
> +      Specifying a schema along with schema's table specified as part of
> +      <literal>FOR TABLE</literal> option is not supported.
> +     </para>
>
> Perhaps the following would be better:
>
> +     <para>
> +      Specifying a schema that contains a table already included in the
> +      publication is not supported.
> +     </para>

Similar to above

> (4)
> Minor fix:
> BEFORE:
> +   rights on the table.  The <command>FOR ALL TABLES</command> and
> +   <command>FOR ALL TABLES IN SCHEMA</command> clause requires the invoking
> +   user to be a superuser.
> AFTER:
> +   rights on the table.  The <command>FOR ALL TABLES</command> and
> +   <command>FOR ALL TABLES IN SCHEMA</command> clauses require the invoking
> +   user to be a superuser.

Modified

Attached v32 patch has the fixes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Sep 21, 2021 at 6:05 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Tue, Sep 21, 2021 at 4:12 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > > (1)
> > > In get_object_address_publication_schema(), the error message:
> > >
> > > + errmsg("publication tables of schema \"%s\" in publication \"%s\"
> > > does not exist",
> > >
> > > isn't grammatically correct. It should probably be:
> > >
> > > + errmsg("publication tables of schema \"%s\" in publication \"%s\" do
> > > not exist",
> >
> > "does not exist" is used across the file. Should we keep it like that
> > to maintain consistency. Thoughts?
> >
>
> When it's singular, "does not exist" is correct.
> I think currently only this case exists in the publication code.
> e.g.
> "publication \"%s\" does not exist"
> "publication relation \"%s\" in publication \"%s\" does not exist"
>
> But "publication tables" is plural, so it needs to say "do not exist"
> rather than "does not exist".
>
> > >
> > > In the case of "if (!(*nspname))", the following line should probably
> > > be added before returning false:
> > >
> > >    *pubname = NULL;
> >
> > In case of failure we return false and don't access it. I felt we
> > could keep it as it is. Thoughts?
> >
>
> OK then, I might be being a bit pedantic.
> (I was just thinking, strictly speaking, we probably shouldn't be
> writing into the caller's pass-by-reference parameters in the case
> false is returned)
>
> > > (5)
> > > I'm wondering, in CheckObjSchemaNotAlreadyInPublication(), instead of
> > > checking "checkobjtype" each loop iteration, wouldn't it be better to
> > > just use the same for-loop in each IF block?
> >
> > I will be changing it to:
> > static void
> > CheckObjSchemaNotAlreadyInPublication(List *rels, List *schemaidlist,
> >                     PublicationObjSpecType checkobjtype)
> > {
> >   ListCell   *lc;
> >
> >   foreach(lc, rels)
> >   {
> >     Relation  rel = (Relation) lfirst(lc);
> >     Oid     relSchemaId = RelationGetNamespace(rel);
> >
> >     if (list_member_oid(schemaidlist, relSchemaId))
> >     {
> >       if (checkobjtype == PUBLICATIONOBJ_REL_IN_SCHEMA)
> >         ereport(ERROR,
> >             errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> >             errmsg("cannot add schema \"%s\" to publication",
> >                  get_namespace_name(relSchemaId)),
> >             errdetail("Table \"%s\" in schema \"%s\" is already part
> > of the publication, adding the same schema is not supported.",
> >                   RelationGetRelationName(rel),
> >                   get_namespace_name(relSchemaId)));
> >       else if (checkobjtype == PUBLICATIONOBJ_TABLE)
> >         ereport(ERROR,
> >             errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> >             errmsg("cannot add relation \"%s.%s\" to publication",
> >                  get_namespace_name(relSchemaId),
> >                  RelationGetRelationName(rel)),
> >             errdetail("Table's schema \"%s\" is already part of the
> > publication.",
> >                   get_namespace_name(relSchemaId)));
> >     }
> >   }
> > }
> > After the change checkobjtype will be checked only once in case of error.
> >
>
> OK.
>
> One thing related to this code is the following:
>
> i)
> postgres=# create publication pub1 for all tables in schema sch1,
> table sch1.test;
> ERROR:  cannot add relation "sch1.test" to publication
> DETAIL:  Table's schema "sch1" is already part of the publication.
>
> ii)
> postgres=# create publication pub1 for table sch1.test, all tables in
> schema sch1;
> ERROR:  cannot add relation "sch1.test" to publication
> DETAIL:  Table's schema "sch1" is already part of the publication.
>
> Notice that in case (ii), the same error message is used, but the
> order of items to be "added" to the publication is the reverse of case
> (i), and really implies the table "sch1.test" was added first, but
> this is not reflected by the error message. So it seems slightly odd
> to say the schema is already part of the publication, when the table
> was actually listed first.
> I'm wondering if this can be improved?
>
> One idea I had was the following more generic type of message, but I'm
> not 100% happy with the wording:
>
>     DETAIL:  Schema "sch1" and one of its tables can't separately be
> part of the publication.

This is common code applicable for the following scenarios:
i) create publication pub1 for all tables in schema sch1, table sch1.test;
ii) create publication pub1 for table sch1.test, all tables in schema sch1;
iii) create publication pub1 for table sch1.test;
alter publication pub1 add all tables in schema sch1;

I have changed it to make it suitable for all the cases.

ereport(ERROR,
errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("cannot add relation \"%s.%s\" to publication",
get_namespace_name(relSchemaId),
RelationGetRelationName(rel)),
errdetail("Table's schema \"%s\" is already part of the publication or
part of the specified schema list.",
get_namespace_name(relSchemaId)));

The v32 patch attached at [1] handles the above.
[1] - https://www.postgresql.org/message-id/CALDaNm1R-xbQvz4LU5OXu3KKwbWOz3uDcT_YjRU6V0R5FZDYDg%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Sep 22, 2021 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Sep 22, 2021 at 3:02 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> >
> > Attached v30 patch has the fixes for the same.
> >
>
> Thank you for updating the patches.
>
> Here are random comments on v30-0002 patch:
>
> +
> +                       if (stmt->action == DEFELEM_SET &&
> !list_length(schemaidlist))
> +                       {
> +                               delschemas =
> GetPublicationSchemas(pubform->oid);
> +                               LockSchemaList(delschemas);
> +                       }
>
> I think "list_length(schemaidlist) > 0" would be more readable.

We have refactored the patch which  has removed these changes.

> ---
> This patch introduces some new static functions to publicationcmds.c
> but some have function prototypes but some don't. As far as I checked,
>
> ObjectsInPublicationToOids()
> CheckObjSchemaNotAlreadyInPublication()
> GetAlterPublicationDelRelations()
> AlterPublicationSchemas()
> CheckPublicationAlterTables()
> CheckPublicationAlterSchemas()
> LockSchemaList()
> OpenReliIdList()
> PublicationAddSchemas()
> PublicationDropSchemas()
>
> are newly introduced but only four functions:
>
> OpenReliIdList()
> LockSchemaList()
> PublicationAddSchemas()
> PublicationDropSchemas()
>
> have function prototypes. Actually, there already are functions that
> don't have their prototype in publicationcmds.c. But it seems better
> to clarify what kind of functions don't need to have a prototype at
> least in this file.

My thoughts are the same as that Amit had replied at [1].

> ---
> ISTM we can inline the contents of three functions:
> GetAlterPublicationDelRelations(), CheckPublicationAlterTables(), and
> CheckPublicationAlterSchemas(). These have only one caller and ISTM
> makes the readability worse. I think it's not necessary to make
> functions for them.

We have refactored the patch which  has removed these changes.

> ---
> + * This is dispatcher function for AlterPublicationOptions,
> + * AlterPublicationSchemas and AlterPublicationTables.
>
> As this comment mentioned, AlterPublication() calls
> AlterPublicationTables() and AlterPublicationSchemas() but this
> function also a lot of pre-processing such as building the list and
> some checks, depending on stmt->action before calling these two
> functions. And those two functions also perform some operation
> depending on stmt->action. So ISTM it's better to move those
> pre-processing to these two functions and have AlterPublication() just
> call these two functions. What do you think?

We have refactored the patch which has removed these changes.

> ---
> +List *
> +GetAllSchemasPublicationRelations(Oid puboid, PublicationPartOpt pub_partopt)
>
> Since this function gets all relations in the schema publication, I
> think GetAllSchemaPublicationRelations() would be better as a function
> name (removing 's' before 'P').

Modified

> ---
> +                       if (!IsA(node, String))
> +                               ereport(ERROR,
> +                                               errcode(ERRCODE_SYNTAX_ERROR),
> +                                               errmsg("invalid schema
> name at or near"),
> +
> parser_errposition(pstate, pubobj->location));
>
> The error message should mention where the invalid schema name is at
> or near. Also, In the following example, the error position in the
> error message seems not to be where the invalid schemaname s.s is:

> postgres(1:47707)=# create publication p for all tables in schema s.s;
> ERROR:  invalid schema name at or near
> LINE 1: create publication p for all tables in schema s.s;
>                                  ^

Modified

> ---
> +               if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE)
> +               {
> +                       if (IsA(node, RangeVar))
> +                               *rels = lappend(*rels, (RangeVar *) node);
> +                       else if (IsA(node, String))
> +                       {
> +                               RangeVar   *rel = makeRangeVar(NULL,
> strVal(node),
> +
>                     pubobj->location);
> +
> +                               *rels = lappend(*rels, rel);
> +                       }
> +               }
> +               else if (pubobj->pubobjtype == PUBLICATIONOBJ_REL_IN_SCHEMA)
> +               {
> (snip)
> +                       /* Filter out duplicates if user specifies
> "sch1, sch1" */
> +                       *schemas = list_append_unique_oid(*schemas, schemaid);
> +               }
>
> Do we need to filter out duplicates also in PUBLICATIONOBJ_TABLE case
> since users can specify things like "TABLE tbl, tbl, tbl"?

Currently the handling of tables is taken care at OpenTableList:
.....
    /*
     * Filter out duplicates if user specifies "foo, foo".
     *
     * Note that this algorithm is known to not be very efficient (O(N^2))
     * but given that it only works on list of tables given to us by user
     * it's deemed acceptable.
     */
    if (list_member_oid(relids, myrelid))
    {
        table_close(rel, ShareUpdateExclusiveLock);
        continue;
    }
.....

> ---
> +       if ((action == DEFELEM_ADD || action == DEFELEM_SET) && !superuser())
> +               ereport(ERROR,
> +                               (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +                                errmsg("must be superuser to add or
> set schemas")));
>
> Why do we require the superuser privilege only for ADD and SET but not for DROP?

Amit had replied with the comments for this at [1].
The v32 patch attached at [2] has the fixes for the above.

[1] - https://www.postgresql.org/message-id/CAA4eK1KmccaVdKFrwKLXhewDt6rSCD2msOoYbWWWxYK5%3DbX5cg%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CALDaNm1R-xbQvz4LU5OXu3KKwbWOz3uDcT_YjRU6V0R5FZDYDg%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Sep 22, 2021 at 11:27 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Wednesday, September 22, 2021 11:22 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > ---
> > +                       if (!IsA(node, String))
> > +                               ereport(ERROR,
> > +                                               errcode(ERRCODE_SYNTAX_ERROR),
> > +                                               errmsg("invalid schema
> > name at or near"),
> > +
> > parser_errposition(pstate, pubobj->location));
> >
> > The error message should mention where the invalid schema name is at
> > or near. Also, In the following example, the error position in the
> > error message seems not to be where the invalid schemaname s.s is:
> >
> > postgres(1:47707)=# create publication p for all tables in schema s.s;
> > ERROR:  invalid schema name at or near
> > LINE 1: create publication p for all tables in schema s.s;
> >                                  ^
> >
>
> I noticed this, too. And I think it could be fixed by the following change, thoughts?

I fixed it by updating the location at pubobj_expr

> Besides, about this change in tab-complete.c:
>
> +       else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "SCHEMA"))
> +               COMPLETE_WITH_QUERY(Query_for_list_of_schemas
> +                                                       " UNION SELECT 'CURRENT_SCHEMA'");
>
> It should be "ALL TABLES IN SCHEMA" not "SCHEMA" at the first line, right?

Modified.

The v32 patch attached at [1] handles the above.
[1] - https://www.postgresql.org/message-id/CALDaNm1R-xbQvz4LU5OXu3KKwbWOz3uDcT_YjRU6V0R5FZDYDg%40mail.gmail.com

Regards.
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Sep 22, 2021 at 11:31 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 21, 2021 at 11:39 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, Sep 21, 2021 at 9:03 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > >
> > > On Fri, Sep 17, 2021 at 10:09 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > Attached v29 patch has the fixes for the same.
> > > >
> > >
> > > Some minor comments on the v29-0002 patch:
> > >
> > > (1)
> > > In get_object_address_publication_schema(), the error message:
> > >
> > > + errmsg("publication tables of schema \"%s\" in publication \"%s\"
> > > does not exist",
> > >
> > > isn't grammatically correct. It should probably be:
> > >
> > > + errmsg("publication tables of schema \"%s\" in publication \"%s\" do
> > > not exist",
> >
> > Modified
> >
>
> I still see the old message in v30. But I have a different suggestion
> for this message. How about changing it to: "publication schema \"%s\"
> in publication \"%s\" does not exist"? This will make it similar to
> other messages and I don't see the need here to add 'tables' as we
> have it in grammar.

Modified

The v32 patch attached at [1] handles the above.
[1] - https://www.postgresql.org/message-id/CALDaNm1R-xbQvz4LU5OXu3KKwbWOz3uDcT_YjRU6V0R5FZDYDg%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Sep 23, 2021 at 12:22 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> From Thurs, Sep 23, 2021 12:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Wed, Sep 22, 2021 at 5:03 PM Hou Zhijie <houzj.fnst@fujitsu.com> wrote:
> > >
> > > Personally, I think we'd better move the code about changing publication's
> > > tablelist into AlterPublicationTables and the code about changing
> > publication's
> > > schemalist into AlterPublicationSchemas. It's similar to what the
> > v29-patchset
> > > did, the difference is the SET action, I suggest we drop all the tables in
> > > function AlterPublicationTables when user only set schemas and drop all the
> > > schema in AlterPublicationSchemas when user only set tables. In this
> > approach,
> > > we can keep schema and relation code separate and don't need to worry
> > > about the locking order.
> > >
> > > Attach a top-up patch which refactor the code like above.
> > >
> >
> > Good suggestion. I think it would still be better if we can move the
> > checks related to superuser and puballtables into a separate function
> > that gets called before taking a lock on publication.
>
> I agreed.
>
> I noticed v30-0001 has been committed with some minor changes, and the V30-0002
> patchset need to be rebased accordingly. Attach a rebased version patch set to
> make cfbot happy. Also Attach the two top-up patches which refactor the code as
> suggested. (top-up patch 1 is to keep schema and table code separate, top-up
> patch 2 is to move some cheap check into a function and invoke it before
> locking.)

Thanks for the patches, the changes simplifies alterpublications code
and handles the drop object in a better way. I have merged it to 0001
patch at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm1R-xbQvz4LU5OXu3KKwbWOz3uDcT_YjRU6V0R5FZDYDg%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
Masahiko Sawada
Date:
On Thu, Sep 23, 2021 at 1:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Sep 22, 2021 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Sep 22, 2021 at 3:02 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > ---
> > This patch introduces some new static functions to publicationcmds.c
> > but some have function prototypes but some don't. As far as I checked,
> >
> > ObjectsInPublicationToOids()
> > CheckObjSchemaNotAlreadyInPublication()
> > GetAlterPublicationDelRelations()
> > AlterPublicationSchemas()
> > CheckPublicationAlterTables()
> > CheckPublicationAlterSchemas()
> > LockSchemaList()
> > OpenReliIdList()
> > PublicationAddSchemas()
> > PublicationDropSchemas()
> >
> > are newly introduced but only four functions:
> >
> > OpenReliIdList()
> > LockSchemaList()
> > PublicationAddSchemas()
> > PublicationDropSchemas()
> >
> > have function prototypes. Actually, there already are functions that
> > don't have their prototype in publicationcmds.c. But it seems better
> > to clarify what kind of functions don't need to have a prototype at
> > least in this file.
> >
>
> I think if the function is defined after its use then we declare it at
> the top. Do you prefer to declare all static functions to allow ease
> of usage? Do you have something else in mind?

I prefer to declare all static functions since if we have a function
prototype we don't need to include the change about the function in
future (unrelated) commits that might add a new function which uses
the function and is defined before their declarations. But it seems to
me that the policy varies per file. For instance, all functions in
vacuumlazy.c have their function prototype but functions in
publicationcmds.c seems not. I'm not going to insist on that so please
ignore this comment.

>
> >
> > ---
> > +       if ((action == DEFELEM_ADD || action == DEFELEM_SET) && !superuser())
> > +               ereport(ERROR,
> > +                               (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> > +                                errmsg("must be superuser to add or
> > set schemas")));
> >
> > Why do we require the superuser privilege only for ADD and SET but not for DROP?
> >
>
> For Add/Set of for all tables of Schema is similar to all tables
> publication requirement. For Drop, I don't think it is mandatory to
> allow only to superuser. The same applies to Alter Publication ...
> Drop table case where you don't need to be table owner whereas, for
> Add, you need to be. We had a discussion on these points in this
> thread. See [1] and some emails prior to it.

Thank you for sharing the link. That makes sense.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Sep 23, 2021 at 12:32 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Thur, Sep 23, 2021 11:06 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > On Wed, Sep 22, 2021 at 9:33 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> > >
> > > >
> > > > How do you suggest changing it?
> > >
> > > Personally, I think we'd better move the code about changing
> > > publication's tablelist into AlterPublicationTables and the code about
> > > changing publication's schemalist into AlterPublicationSchemas. It's
> > > similar to what the v29-patchset did, the difference is the SET
> > > action, I suggest we drop all the tables in function
> > > AlterPublicationTables when user only set schemas and drop all the
> > > schema in AlterPublicationSchemas when user only set tables. In this
> > > approach, we can keep schema and relation code separate and don't need to
> > worry about the locking order.
> > >
> > > Attach a top-up patch which refactor the code like above.
> > > Thoughts ?
> > >
> >
> > Sounds like a good idea.
> > Is it possible to incorporate the existing
> > CheckPublicationAlterTables() and CheckPublicationAlterSchemas() functions
> > into your suggested update?
> > I think it might tidy up the error-checking a bit.
>
> I agreed we can put the check about ALL TABLE and superuser into a function
> like what the v30-patchset did. But I have some hesitations about the code
> related to CheckObjSchemaNotAlreadyInPublication(). Currently, we need to open
> and lock the table before invoking the CheckObjxxx function, ISTM we'd better
> open the table in function AlterPublicationTables. Maybe we can wait for the
> author's(Vignesh) opinion.

I felt keeping the code related to
CheckObjSchemaNotAlreadyInPublication as it is in
AlterPublicationTables and AlterPublicationSchemas is better.

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Sep 24, 2021 at 11:46 AM vignesh C <vignesh21@gmail.com> wrote:
>

Attached v33 patch has the preprocess_pubobj_list review comment fix
suggested by Alvaro at [1]. The
v33-0006-Alternate-grammar-for-ALL-TABLES-IN-SCHEMA.patch patch has
the grammar changes as suggested by Alvaro at [1]. If we agree this is
better, I will merge this into the 0001 patch.
[1] - https://www.postgresql.org/message-id/202109241325.eag5g6mpvoup%40alvherre.pgsql

Regards,
Vignesh

Attachment

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Monday, September 27, 2021 12:32 PM vignesh C <vignesh21@gmail.com> wrote:
> On Fri, Sep 24, 2021 at 11:46 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> 
> Attached v33 patch has the preprocess_pubobj_list review comment fix
> suggested by Alvaro at [1]. The
> v33-0006-Alternate-grammar-for-ALL-TABLES-IN-SCHEMA.patch patch has
> the grammar changes as suggested by Alvaro at [1]. If we agree this is better, I
> will merge this into the 0001 patch.
> [1] - https://www.postgresql.org/message-id/202109241325.eag5g6mpvoup%40alvherre.pgsql

Hi,

The grammar change basically looks good to me. Only one suggestion is that it
will be better to add some more comments in gram.y to describe the rule
PublicationObjSpec. Because it's a new style syntax in postgresql, people might
wonder how the code work and why we choose this design when they first time
see this rule in gram.y.

Maybe something like the following:

+/*
+ * FOR TABLE and FOR ALL TABLES IN SCHEMA specifications
+ *
+ * This rule parses publication object with and without keyword prefix.
+ * 
+ * The actual type of the object without keyword prefix depends on the previous
+ * one with keyword prefix. It will be preprocessed in preprocess_pubobj_list().
+ * 
+ * For the object without keyword prefix, we cannot just use relation_expr here,
+ * because some extended expression in relation_expr cannot be used as a
+ * schemaname and we cannot differentiate it. So, we extract the rules from
+ * relation_expr here.
+ */
PublicationObjSpec:
            TABLE relation_expr
            ...

My words might not be good, but I think it will be better to add some comments
to explain a bit about the code in gram.y. Thoughts ?

Best regards,
Hou zj

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Mon, Sep 27, 2021 at 2:32 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Attached v33 patch has the preprocess_pubobj_list review comment fix
> suggested by Alvaro at [1].

A minor point I noticed in the v33-0002 patch, in the code added to
the listSchemas() function of src/bin/psql/describe.c, shouldn't it
"return false" (not true) if PSQLexec() fails?
Also, since the PQExpBufferData buf is re-used in the added code, it's
handling is a little inconsistent to similar existing code.
See below for suggested update.

Regards,
Greg Nancarrow
Fujitsu Australia


diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 953e1f52cf..1d28809050 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5077,9 +5077,11 @@ listSchemas(const char *pattern, bool verbose,
bool showSystem)
     appendPQExpBufferStr(&buf, "ORDER BY 1;");

     res = PSQLexec(buf.data);
-    termPQExpBuffer(&buf);
     if (!res)
+    {
+        termPQExpBuffer(&buf);
         return false;
+    }

     myopt.nullPrint = NULL;
     myopt.title = _("List of schemas");
@@ -5100,7 +5102,10 @@ listSchemas(const char *pattern, bool verbose,
bool showSystem)
                           pattern);
         result = PSQLexec(buf.data);
         if (!result)
-            return true;
+        {
+            termPQExpBuffer(&buf);
+            return false;
+        }
         else
             pub_schema_tuples = PQntuples(result);

@@ -5132,6 +5137,7 @@ listSchemas(const char *pattern, bool verbose,
bool showSystem)

     printQuery(res, &myopt, pset.queryFout, false, pset.logfile);

+    termPQExpBuffer(&buf);
     PQclear(res);

     /* Free the memory allocated for the footer */



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Mon, Sep 27, 2021 at 2:32 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Attached v33 patch has the preprocess_pubobj_list review comment fix
> suggested by Alvaro at [1].

In the v33-0003 patch, there's a couple of error-case tests that have
comments copied from success-case tests:

+-- should be able to add table to schema publication
    ...
+-- should be able to drop table from schema publication
    ...

These should be changed to something similar to that used for other
error-case tests, like:

+-- fail - can't add a table of the same schema to the schema publication
+-- fail - can't drop a table from the schema publication which isn't
in the publication

Also, for the following error:

    ERROR:  cannot add ... to publication
    DETAIL:  Table's schema "xxxx" is already part of the publication
or part of the specified schema list.

there needs to be a test case to test the "... or part of the
specified schema list" case.

Regards,
Greg Nancarrow
Fujitsu Australia



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Monday, September 27, 2021 1:32 PM, vignesh C <vignesh21@gmail.com> wrote:

>Attached v33 patch has the preprocess_pubobj_list review comment fix
>suggested by Alvaro at [1]. The
>v33-0006-Alternate-grammar-for-ALL-TABLES-IN-SCHEMA.patch patch has
>the grammar changes as suggested by Alvaro at [1]. If we agree this is
>better, I will merge this into the 0001 patch.
>[1] - https://www.postgresql.org/message-id/202109241325.eag5g6mpvoup%40alvherre.pgsql

About the schema patch, I think a schema and a table which belongs to this schema shouldn't be specified at the same
time.
 
But what if someone uses "ALTER TABLE ... SET SCHEMA ..." after "CREATE PUBLICATION"?

For example:

create schema sch1;
create schema sch2;
create table sch2.t (a int);
create publication pub1 for all tables in schema sch1, table sch2.t; alter table sch2.t set schema sch1;

postgres=# \dRp+
                              Publication pub1
  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+------
----------+------------+---------+---------+---------+-----------+----
 postgres | f          | t       | t       | t       | t         | f
Tables:
    "sch1.t"
Tables from schemas:
    "sch1"

Table t has been output twice.
I think this should not be supported, should we do something for this scenario?

Regards
Tang

Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Sep 27, 2021 at 12:15 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Monday, September 27, 2021 12:32 PM vignesh C <vignesh21@gmail.com> wrote:
> > On Fri, Sep 24, 2021 at 11:46 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> >
> > Attached v33 patch has the preprocess_pubobj_list review comment fix
> > suggested by Alvaro at [1]. The
> > v33-0006-Alternate-grammar-for-ALL-TABLES-IN-SCHEMA.patch patch has
> > the grammar changes as suggested by Alvaro at [1]. If we agree this is better, I
> > will merge this into the 0001 patch.
> > [1] - https://www.postgresql.org/message-id/202109241325.eag5g6mpvoup%40alvherre.pgsql
>
> Hi,
>
> The grammar change basically looks good to me. Only one suggestion is that it
> will be better to add some more comments in gram.y to describe the rule
> PublicationObjSpec. Because it's a new style syntax in postgresql, people might
> wonder how the code work and why we choose this design when they first time
> see this rule in gram.y.
>
> Maybe something like the following:
>
> +/*
> + * FOR TABLE and FOR ALL TABLES IN SCHEMA specifications
> + *
> + * This rule parses publication object with and without keyword prefix.
> + *
> + * The actual type of the object without keyword prefix depends on the previous
> + * one with keyword prefix. It will be preprocessed in preprocess_pubobj_list().
> + *
> + * For the object without keyword prefix, we cannot just use relation_expr here,
> + * because some extended expression in relation_expr cannot be used as a
> + * schemaname and we cannot differentiate it. So, we extract the rules from
> + * relation_expr here.
> + */
> PublicationObjSpec:
>                         TABLE relation_expr
>                         ...
>
> My words might not be good, but I think it will be better to add some comments
> to explain a bit about the code in gram.y. Thoughts ?

Modified.

Attached v34 patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Sep 27, 2021 at 2:46 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Sep 27, 2021 at 2:32 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v33 patch has the preprocess_pubobj_list review comment fix
> > suggested by Alvaro at [1].
>
> A minor point I noticed in the v33-0002 patch, in the code added to
> the listSchemas() function of src/bin/psql/describe.c, shouldn't it
> "return false" (not true) if PSQLexec() fails?
> Also, since the PQExpBufferData buf is re-used in the added code, it's
> handling is a little inconsistent to similar existing code.
> See below for suggested update.

Modified
This is handled in the v34 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm2Z9TfuoCf09YGKfwy7F1NwC4iCXJGTaZS%3DchH6VHtadQ%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Sep 27, 2021 at 4:51 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Sep 27, 2021 at 2:32 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v33 patch has the preprocess_pubobj_list review comment fix
> > suggested by Alvaro at [1].
>
> In the v33-0003 patch, there's a couple of error-case tests that have
> comments copied from success-case tests:
>
> +-- should be able to add table to schema publication
>     ...
> +-- should be able to drop table from schema publication
>     ...
>
> These should be changed to something similar to that used for other
> error-case tests, like:
>
> +-- fail - can't add a table of the same schema to the schema publication
> +-- fail - can't drop a table from the schema publication which isn't
> in the publication

Modified

> Also, for the following error:
>
>     ERROR:  cannot add ... to publication
>     DETAIL:  Table's schema "xxxx" is already part of the publication
> or part of the specified schema list.
>
> there needs to be a test case to test the "... or part of the
> specified schema list" case.

Added the test
This is handled in the v34 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm2Z9TfuoCf09YGKfwy7F1NwC4iCXJGTaZS%3DchH6VHtadQ%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Sep 28, 2021 at 4:35 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Monday, September 27, 2021 1:32 PM, vignesh C <vignesh21@gmail.com> wrote:
>
> >Attached v33 patch has the preprocess_pubobj_list review comment fix
> >suggested by Alvaro at [1]. The
> >v33-0006-Alternate-grammar-for-ALL-TABLES-IN-SCHEMA.patch patch has
> >the grammar changes as suggested by Alvaro at [1]. If we agree this is
> >better, I will merge this into the 0001 patch.
> >[1] - https://www.postgresql.org/message-id/202109241325.eag5g6mpvoup%40alvherre.pgsql
>
> About the schema patch, I think a schema and a table which belongs to this schema shouldn't be specified at the same
time.
> But what if someone uses "ALTER TABLE ... SET SCHEMA ..." after "CREATE PUBLICATION"?
>
> For example:
>
> create schema sch1;
> create schema sch2;
> create table sch2.t (a int);
> create publication pub1 for all tables in schema sch1, table sch2.t; alter table sch2.t set schema sch1;
>
> postgres=# \dRp+
>                               Publication pub1
>   Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
> ----------+------------+---------+---------+---------+-----------+------
> ----------+------------+---------+---------+---------+-----------+----
>  postgres | f          | t       | t       | t       | t         | f
> Tables:
>     "sch1.t"
> Tables from schemas:
>     "sch1"
>
> Table t has been output twice.
> I think this should not be supported, should we do something for this scenario?

Yes this should not be supported, we should throw an error in this case.
This is handled in the v34 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm2Z9TfuoCf09YGKfwy7F1NwC4iCXJGTaZS%3DchH6VHtadQ%40mail.gmail.com

Regards,
Vignesh



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Monday, Tuesday, September 28, 2021 10:49 PM, vignesh C <vignesh21@gmail.com> wrote:
> 
> Yes this should not be supported, we should throw an error in this case.
> This is handled in the v34 patch attached at [1].
> [1] - https://www.postgresql.org/message-
> id/CALDaNm2Z9TfuoCf09YGKfwy7F1NwC4iCXJGTaZS%3DchH6VHtadQ%40mail.g
> mail.com
> 

Thanks for fixing it. I confirmed the error can be output as expected.

Here is a problem related to publish_via_partition_root option when using this
patch. With this option on, I think pg_get_publication_tables function gave an
unexcepted result and the subscriber would get dual data during table sync.


For example:
(I used pg_publication_tables view to make it looks clearer)

create schema sch1;
create table sch1.tbl1 (a int) partition by range ( a );
create table sch1.tbl1_part1 partition of sch1.tbl1 for values from (1) to (10);
create table sch1.tbl1_part2 partition of sch1.tbl1 for values from (10) to (20);
create table sch1.tbl1_part3 partition of sch1.tbl1 for values from (20) to (30);
create publication pub for all tables in schema sch1 with(publish_via_partition_root=1);

postgres=# select * from pg_publication_tables where pubname='pub';
 pubname | schemaname | tablename
---------+------------+------------
 pub     | sch1       | tbl1_part1
 pub     | sch1       | tbl1_part2
 pub     | sch1       | tbl1_part3
 pub     | sch1       | tbl1
(4 rows)


It shows both the partitioned table and its leaf partitions. But the result of
FOR ALL TABLES publication couldn't show the leaf partitions.


postgres=# create publication pub_all for all tables with(publish_via_partition_root=1);
CREATE PUBLICATION
postgres=# select * from pg_publication_tables where pubname='pub_all';
 pubname | schemaname | tablename
---------+------------+-----------
 pub_all | sch1       | tbl1
(1 row)


How about make the following change to avoid it? I tried it and it also fixed dual
data issue during table sync.


diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 04e785b192..4e8ccdabc6 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -632,7 +632,8 @@ GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt)
                Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
                Oid                     relid = relForm->oid;

-               if (is_publishable_class(relid, relForm))
+               if (is_publishable_class(relid, relForm) &&
+                       !(relForm->relispartition && pub_partopt == PUBLICATION_PART_ROOT))
                        result = lappend_oid(result, relid);
        }


Regards
Tang

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Tues, Sep 28, 2021 10:46 PM vignesh C <vignesh21@gmail.com> wrote:
> Attached v34 patch has the changes for the same.

Thanks for updating the patch.
Here are a few comments.

1)
+ *        ALL TABLES IN SCHEMA schema [[, ...]

[[ -> [

2)
+    /* ALTER PUBLICATION ... ADD/DROP TABLE/ALL TABLES IN SCHEMA parameters */

The two '/' seems a bit unclear and it doesn't mention the SET case.
Maybe we can write like:

/* parameters used for ALTER PUBLICATION ... ADD/DROP/SET publication objects */

3)
+    /*
+     * Check if setting the relation to a different schema will result in the
+     * publication having schema and same schema's table in the publication.
+     */
+    if (stmt->objectType == OBJECT_TABLE)
+    {
+        ListCell   *lc;
+        List       *schemaPubids = GetSchemaPublications(nspOid);
+        foreach(lc, schemaPubids)
+        {
+            Oid        pubid = lfirst_oid(lc);
+            if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL),
+                                relid))
+                ereport(ERROR,

How about we check this case like the following ?

List       *schemaPubids = GetSchemaPublications(nspOid);
List       *relPubids = GetRelationPublications(RelationGetRelid(rel));
if (list_intersection(schemaPubids, relPubids))
    ereport(ERROR, ...

Best regards,
Hou zj

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Sep 29, 2021 at 9:07 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Tues, Sep 28, 2021 10:46 PM vignesh C <vignesh21@gmail.com> wrote:
> > Attached v34 patch has the changes for the same.
>
> 3)
> +       /*
> +        * Check if setting the relation to a different schema will result in the
> +        * publication having schema and same schema's table in the publication.
> +        */
> +       if (stmt->objectType == OBJECT_TABLE)
> +       {
> +               ListCell   *lc;
> +               List       *schemaPubids = GetSchemaPublications(nspOid);
> +               foreach(lc, schemaPubids)
> +               {
> +                       Oid             pubid = lfirst_oid(lc);
> +                       if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL),
> +                                                               relid))
> +                               ereport(ERROR,
>
> How about we check this case like the following ?
>
> List       *schemaPubids = GetSchemaPublications(nspOid);
> List       *relPubids = GetRelationPublications(RelationGetRelid(rel));
> if (list_intersection(schemaPubids, relPubids))
>         ereport(ERROR, ...
>

Won't this will allow changing one of the partitions for which only
partitioned table is part of the target schema? And then probably we
won't be able to provide the exact publication in the error message if
we followed the above?

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Tue, Sep 28, 2021 at 8:15 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Sep 27, 2021 at 12:15 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
>
> Attached v34 patch has the changes for the same.
>

Few comments on v34-0001-Added-schema-level-support-for-publication
==========================================================
1.
+ * This rule parses publication object with and without keyword prefix.

I think we should write it as: "This rule parses publication objects
with and without keyword prefixes."

2.
+ * For the object without keyword prefix, we cannot just use
relation_expr here,
+ * because some extended expression in relation_expr cannot be used as a

/expression/expressions

3.
+/*
+ * Process pubobjspec_list to check for errors in any of the objects and
+ * convert PUBLICATIONOBJ_CONTINUATION into appropriate PublicationObjSpecType
+ * type.

4.
+ /*
+ * Check if setting the relation to a different schema will result in the
+ * publication having schema and same schema's table in the publication.
+ */
+ if (stmt->objectType == OBJECT_TABLE)
+ {
+ ListCell   *lc;
+ List    *schemaPubids = GetSchemaPublications(nspOid);
+ foreach(lc, schemaPubids)
+ {
+ Oid pubid = lfirst_oid(lc);
+ if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL),
+ relid))
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot move table \"%s\" to schema \"%s\"",
+    RelationGetRelationName(rel), stmt->newschema),
+ errdetail("Altering table will result in having schema \"%s\" and
same schema's table \"%s\" in the publication \"%s\" which is not
supported.",
+   stmt->newschema,
+   RelationGetRelationName(rel),
+   get_publication_name(pubid, false)));
+ }
+ }

Let's slightly change the comment as: "Check that setting the relation
to a different schema won't result in the publication having schema
and the same schema's table." and errdetail as: "The schema \"%s\" and
same schema's table \"%s\" cannot be part of the same publication
\"%s\"."

Maybe it is better to specify that this will disallow the partition table case.

5.
ObjectsInPublicationToOids()
{
..
+ pubobj = (PublicationObjSpec *) lfirst(cell);
+ if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE)

It is better to keep an empty line between above two lines.

6.
List    *schemaPubids = GetSchemaPublications(nspOid);
foreach(lc, schemaPubids)
..
Oid pubid = lfirst_oid(lc);
if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL),

Add an empty line between each of the above two lines.

7.
+ /*
+ * Schema lock is held until the publication is altered to prevent
+ * concurrent schema deletion. No need to unlock the schemas, the locks
+ * will be released automatically at the end of alter publication command.
+ */
+ LockSchemaList(schemaidlist);

I think it is better to add a similar comment at other places where
this function is called. And we can shorten the comment atop
LockSchemaList to something like: "The schemas specified in the schema
list are locked in AccessShareLock mode in order to prevent concurrent
schema deletion."

8. In CreatePublication(), the check if (stmt->for_all_tables) can be
the first check and then in else if we can process tables and schemas.

9.
AlterPublication()
{
..
+ /* Lock the publication so nobody else can do anything with it. */
+ LockDatabaseObject(PublicationRelationId, pubform->oid, 0,
+    AccessExclusiveLock);

I think it is better to say why we need this lock. So, can we change
the comment to something like: "Lock the publication so nobody else
can do anything with it. This prevents concurrent alter to add
table(s) that were already going to become part of the publication by
adding corresponding schema(s) via this command and similarly it will
prevent the concurrent addition of schema(s) for which there is any
corresponding table being added by this command."

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Wed, Sep 29, 2021 at 3:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> 4.
> + /*
> + * Check if setting the relation to a different schema will result in the
> + * publication having schema and same schema's table in the publication.
> + */
> + if (stmt->objectType == OBJECT_TABLE)
> + {
> + ListCell   *lc;
> + List    *schemaPubids = GetSchemaPublications(nspOid);
> + foreach(lc, schemaPubids)
> + {
> + Oid pubid = lfirst_oid(lc);
> + if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL),
> + relid))
> + ereport(ERROR,
> + errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("cannot move table \"%s\" to schema \"%s\"",
> +    RelationGetRelationName(rel), stmt->newschema),
> + errdetail("Altering table will result in having schema \"%s\" and
> same schema's table \"%s\" in the publication \"%s\" which is not
> supported.",
> +   stmt->newschema,
> +   RelationGetRelationName(rel),
> +   get_publication_name(pubid, false)));
> + }
> + }
>
> Let's slightly change the comment as: "Check that setting the relation
> to a different schema won't result in the publication having schema
> and the same schema's table." and errdetail as: "The schema \"%s\" and
> same schema's table \"%s\" cannot be part of the same publication
> \"%s\"."
>

Since this code is in AlterTableNamespace() and the relation being
checked may or may not be part of a publication, I'd use "a
publication" instead of "the publication" in the comment.
Also, I'd say that we're doing the check because the mentioned
combination is not supported.

i.e. "Check that setting the relation to a different schema won't
result in a publication having both a schema and the same schema's
table, as this is not supported."

Regards,
Greg Nancarrow
Fujitsu Australia



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Wed, Sep 29, 2021 12:34 PM Amit Kapila <amit.kapila16@gmail.com>
> On Wed, Sep 29, 2021 at 9:07 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Tues, Sep 28, 2021 10:46 PM vignesh C <vignesh21@gmail.com> wrote:
> > > Attached v34 patch has the changes for the same.
> >
> > 3)
> > +       /*
> > +        * Check if setting the relation to a different schema will result in the
> > +        * publication having schema and same schema's table in the
> publication.
> > +        */
> > +       if (stmt->objectType == OBJECT_TABLE)
> > +       {
> > +               ListCell   *lc;
> > +               List       *schemaPubids =
> GetSchemaPublications(nspOid);
> > +               foreach(lc, schemaPubids)
> > +               {
> > +                       Oid             pubid = lfirst_oid(lc);
> > +                       if (list_member_oid(GetPublicationRelations(pubid,
> PUBLICATION_PART_ALL),
> > +                                                               relid))
> > +                               ereport(ERROR,
> >
> > How about we check this case like the following ?
> >
> > List       *schemaPubids = GetSchemaPublications(nspOid);
> > List       *relPubids = GetRelationPublications(RelationGetRelid(rel));
> > if (list_intersection(schemaPubids, relPubids))
> >         ereport(ERROR, ...
> >
> 
> Won't this will allow changing one of the partitions for which only partitioned
> table is part of the target schema?

I think it still disallow changing partition's schema to the published one.
I tested with the following SQLs.
-----
create schema sch1;
create schema sch2;
create schema sch3;

create table sch1.tbl1 (a int) partition by range ( a );
create table sch2.tbl1_part1 partition of sch1.tbl1 for values from (1) to (101);
create table sch3.tbl1_part2 partition of sch1.tbl1 for values from (101) to (200);
create publication pub for ALL TABLES IN schema sch1, TABLE sch2.tbl1_part1;
alter table sch2.tbl1_part1 set schema sch1;
---* It will report an error here *
-----

Did I miss something ?

Best regards,
Hou zj


Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Sep 29, 2021 at 11:59 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Wed, Sep 29, 2021 12:34 PM Amit Kapila <amit.kapila16@gmail.com>
> > On Wed, Sep 29, 2021 at 9:07 AM houzj.fnst@fujitsu.com
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > > On Tues, Sep 28, 2021 10:46 PM vignesh C <vignesh21@gmail.com> wrote:
> > > > Attached v34 patch has the changes for the same.
> > >
> > > 3)
> > > +       /*
> > > +        * Check if setting the relation to a different schema will result in the
> > > +        * publication having schema and same schema's table in the
> > publication.
> > > +        */
> > > +       if (stmt->objectType == OBJECT_TABLE)
> > > +       {
> > > +               ListCell   *lc;
> > > +               List       *schemaPubids =
> > GetSchemaPublications(nspOid);
> > > +               foreach(lc, schemaPubids)
> > > +               {
> > > +                       Oid             pubid = lfirst_oid(lc);
> > > +                       if (list_member_oid(GetPublicationRelations(pubid,
> > PUBLICATION_PART_ALL),
> > > +                                                               relid))
> > > +                               ereport(ERROR,
> > >
> > > How about we check this case like the following ?
> > >
> > > List       *schemaPubids = GetSchemaPublications(nspOid);
> > > List       *relPubids = GetRelationPublications(RelationGetRelid(rel));
> > > if (list_intersection(schemaPubids, relPubids))
> > >         ereport(ERROR, ...
> > >
> >
> > Won't this will allow changing one of the partitions for which only partitioned
> > table is part of the target schema?
>
> I think it still disallow changing partition's schema to the published one.
> I tested with the following SQLs.
> -----
> create schema sch1;
> create schema sch2;
> create schema sch3;
>
> create table sch1.tbl1 (a int) partition by range ( a );
> create table sch2.tbl1_part1 partition of sch1.tbl1 for values from (1) to (101);
> create table sch3.tbl1_part2 partition of sch1.tbl1 for values from (101) to (200);
> create publication pub for ALL TABLES IN schema sch1, TABLE sch2.tbl1_part1;
> alter table sch2.tbl1_part1 set schema sch1;
> ---* It will report an error here *
> -----
>

Use all steps before "create publication" and then try below. These
will give an error with the patch proposed but if I change it to what
you are proposing then it won't give an error.
create publication pub for ALL TABLES IN schema sch2, Table sch1.tbl1;
alter table sch3.tbl1_part2 set schema sch2;

But now again thinking about it, I am not sure if we really want to
give error in this case. What do you think? Also, if we use
list_intersection trick, then how will we tell the publication due to
which this problem has occurred, or do you think we should leave that
as an exercise for the user?

-- 
With Regards,
Amit Kapila.



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Wed, Sep 29, 2021 5:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Sep 29, 2021 at 11:59 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Wed, Sep 29, 2021 12:34 PM Amit Kapila <amit.kapila16@gmail.com>
> > > On Wed, Sep 29, 2021 at 9:07 AM houzj.fnst@fujitsu.com
> > > <houzj.fnst@fujitsu.com> wrote:
> > > >
> > > > On Tues, Sep 28, 2021 10:46 PM vignesh C <vignesh21@gmail.com>
> wrote:
> > > > > Attached v34 patch has the changes for the same.
> > > > How about we check this case like the following ?
> > > >
> > > > List       *schemaPubids = GetSchemaPublications(nspOid);
> > > > List       *relPubids = GetRelationPublications(RelationGetRelid(rel));
> > > > if (list_intersection(schemaPubids, relPubids))
> > > >         ereport(ERROR, ...
> > > >
> > >
> > > Won't this will allow changing one of the partitions for which only
> > > partitioned table is part of the target schema?
> >
> > I think it still disallow changing partition's schema to the published one.
> > I tested with the following SQLs.
> > -----
> > create schema sch1;
> > create schema sch2;
> > create schema sch3;
> >
> > create table sch1.tbl1 (a int) partition by range ( a ); create table
> > sch2.tbl1_part1 partition of sch1.tbl1 for values from (1) to (101);
> > create table sch3.tbl1_part2 partition of sch1.tbl1 for values from
> > (101) to (200); create publication pub for ALL TABLES IN schema sch1,
> > TABLE sch2.tbl1_part1; alter table sch2.tbl1_part1 set schema sch1;
> > ---* It will report an error here *
> > -----
> >
> 
> Use all steps before "create publication" and then try below. These will give an
> error with the patch proposed but if I change it to what you are proposing then
> it won't give an error.
> create publication pub for ALL TABLES IN schema sch2, Table sch1.tbl1; alter
> table sch3.tbl1_part2 set schema sch2;
> 
> But now again thinking about it, I am not sure if we really want to give error in
> this case. What do you think?

Personally, I think we can allow the above case.

Because if user specify the partitioned table in the publication like above,
they cannot drop the partition separately. And the partitioned table is the
actual one in pg_publication_rel. So, I think allowing this case seems won't
make people feel confused.

Besides, in the current patch, we have allowed similar case in CREATE/ALTER
PUBLICATION cases. In this SQL: "create publication pub for ALL TABLES IN
schema sch2, Table sch1.tbl1;", one of the partitions ' sch2.tbl1_part1' is
from schema 'sch2' which is published. It might be better to make the behavior
consistent.

> Also, if we use list_intersection trick, then how will
> we tell the publication due to which this problem has occurred, or do you think
> we should leave that as an exercise for the user?

I thought list_intersection will return a puboids list in which the puboid exists in both input list.
We can choose the first puboid and output it in the error message which seems the same as
the current patch.

But I noticed list_intersection doesn't support T_OidList, so we might need to search the puboid
Manually. Like:

    foreach(cell, relPubids)
    {
        if (list_member_oid(schemaPubids, lfirst_oid(cell)))
                ereport(ERROR,
                        errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                        errmsg("cannot move table \"%s\" to schema \"%s\"",
                               RelationGetRelationName(rel), stmt->newschema),
                        errdetail("Altering table will result in having schema \"%s\" and same schema's table \"%s\" in
thepublication \"%s\" which is not supported.",
 
                                  stmt->newschema,
                                  RelationGetRelationName(rel),
                                  get_publication_name(lfirst_oid(cell), false)));
    }

Best regards,
Hou zj

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Sep 29, 2021 at 5:48 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Wed, Sep 29, 2021 5:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Wed, Sep 29, 2021 at 11:59 AM houzj.fnst@fujitsu.com
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > > On Wed, Sep 29, 2021 12:34 PM Amit Kapila <amit.kapila16@gmail.com>
> > > > On Wed, Sep 29, 2021 at 9:07 AM houzj.fnst@fujitsu.com
> > > > <houzj.fnst@fujitsu.com> wrote:
> > > > >
> > > > > On Tues, Sep 28, 2021 10:46 PM vignesh C <vignesh21@gmail.com>
> > wrote:
> > > > > > Attached v34 patch has the changes for the same.
> > > > > How about we check this case like the following ?
> > > > >
> > > > > List       *schemaPubids = GetSchemaPublications(nspOid);
> > > > > List       *relPubids = GetRelationPublications(RelationGetRelid(rel));
> > > > > if (list_intersection(schemaPubids, relPubids))
> > > > >         ereport(ERROR, ...
> > > > >
> > > >
> > > > Won't this will allow changing one of the partitions for which only
> > > > partitioned table is part of the target schema?
> > >
> > > I think it still disallow changing partition's schema to the published one.
> > > I tested with the following SQLs.
> > > -----
> > > create schema sch1;
> > > create schema sch2;
> > > create schema sch3;
> > >
> > > create table sch1.tbl1 (a int) partition by range ( a ); create table
> > > sch2.tbl1_part1 partition of sch1.tbl1 for values from (1) to (101);
> > > create table sch3.tbl1_part2 partition of sch1.tbl1 for values from
> > > (101) to (200); create publication pub for ALL TABLES IN schema sch1,
> > > TABLE sch2.tbl1_part1; alter table sch2.tbl1_part1 set schema sch1;
> > > ---* It will report an error here *
> > > -----
> > >
> >
> > Use all steps before "create publication" and then try below. These will give an
> > error with the patch proposed but if I change it to what you are proposing then
> > it won't give an error.
> > create publication pub for ALL TABLES IN schema sch2, Table sch1.tbl1; alter
> > table sch3.tbl1_part2 set schema sch2;
> >
> > But now again thinking about it, I am not sure if we really want to give error in
> > this case. What do you think?
>
> Personally, I think we can allow the above case.
>
> Because if user specify the partitioned table in the publication like above,
> they cannot drop the partition separately. And the partitioned table is the
> actual one in pg_publication_rel. So, I think allowing this case seems won't
> make people feel confused.
>

Yeah, I also thought on similar lines. So, let's allow this case.

>
> > Also, if we use list_intersection trick, then how will
> > we tell the publication due to which this problem has occurred, or do you think
> > we should leave that as an exercise for the user?
>
> I thought list_intersection will return a puboids list in which the puboid exists in both input list.
> We can choose the first puboid and output it in the error message which seems the same as
> the current patch.
>
> But I noticed list_intersection doesn't support T_OidList, so we might need to search the puboid
> Manually. Like:
>
>         foreach(cell, relPubids)
>         {
>                 if (list_member_oid(schemaPubids, lfirst_oid(cell)))
>                                 ereport(ERROR,
>                                                 errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                                                 errmsg("cannot move table \"%s\" to schema \"%s\"",
>                                                            RelationGetRelationName(rel), stmt->newschema),
>                                                 errdetail("Altering table will result in having schema \"%s\" and
sameschema's table \"%s\" in the publication \"%s\" which is not supported.",
 
>                                                                   stmt->newschema,
>                                                                   RelationGetRelationName(rel),
>                                                                   get_publication_name(lfirst_oid(cell), false)));
>         }
>

Looks good to me.


-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Sep 29, 2021 at 8:47 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Monday, Tuesday, September 28, 2021 10:49 PM, vignesh C <vignesh21@gmail.com> wrote:
> >
> > Yes this should not be supported, we should throw an error in this case.
> > This is handled in the v34 patch attached at [1].
> > [1] - https://www.postgresql.org/message-
> > id/CALDaNm2Z9TfuoCf09YGKfwy7F1NwC4iCXJGTaZS%3DchH6VHtadQ%40mail.g
> > mail.com
> >
>
> Thanks for fixing it. I confirmed the error can be output as expected.
>
> Here is a problem related to publish_via_partition_root option when using this
> patch. With this option on, I think pg_get_publication_tables function gave an
> unexcepted result and the subscriber would get dual data during table sync.
>
>
> For example:
> (I used pg_publication_tables view to make it looks clearer)
>
> create schema sch1;
> create table sch1.tbl1 (a int) partition by range ( a );
> create table sch1.tbl1_part1 partition of sch1.tbl1 for values from (1) to (10);
> create table sch1.tbl1_part2 partition of sch1.tbl1 for values from (10) to (20);
> create table sch1.tbl1_part3 partition of sch1.tbl1 for values from (20) to (30);
> create publication pub for all tables in schema sch1 with(publish_via_partition_root=1);
>
> postgres=# select * from pg_publication_tables where pubname='pub';
>  pubname | schemaname | tablename
> ---------+------------+------------
>  pub     | sch1       | tbl1_part1
>  pub     | sch1       | tbl1_part2
>  pub     | sch1       | tbl1_part3
>  pub     | sch1       | tbl1
> (4 rows)
>
>
> It shows both the partitioned table and its leaf partitions. But the result of
> FOR ALL TABLES publication couldn't show the leaf partitions.
>
>
> postgres=# create publication pub_all for all tables with(publish_via_partition_root=1);
> CREATE PUBLICATION
> postgres=# select * from pg_publication_tables where pubname='pub_all';
>  pubname | schemaname | tablename
> ---------+------------+-----------
>  pub_all | sch1       | tbl1
> (1 row)
>
>
> How about make the following change to avoid it? I tried it and it also fixed dual
> data issue during table sync.
>
>
> diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
> index 04e785b192..4e8ccdabc6 100644
> --- a/src/backend/catalog/pg_publication.c
> +++ b/src/backend/catalog/pg_publication.c
> @@ -632,7 +632,8 @@ GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt)
>                 Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
>                 Oid                     relid = relForm->oid;
>
> -               if (is_publishable_class(relid, relForm))
> +               if (is_publishable_class(relid, relForm) &&
> +                       !(relForm->relispartition && pub_partopt == PUBLICATION_PART_ROOT))
>                         result = lappend_oid(result, relid);
>         }

The suggested change works, I have modified it in the attached patch.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Sep 29, 2021 at 9:07 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Tues, Sep 28, 2021 10:46 PM vignesh C <vignesh21@gmail.com> wrote:
> > Attached v34 patch has the changes for the same.
>
> Thanks for updating the patch.
> Here are a few comments.
>
> 1)
> + *             ALL TABLES IN SCHEMA schema [[, ...]
>
> [[ -> [

Modified

> 2)
> +       /* ALTER PUBLICATION ... ADD/DROP TABLE/ALL TABLES IN SCHEMA parameters */
>
> The two '/' seems a bit unclear and it doesn't mention the SET case.
> Maybe we can write like:
>
> /* parameters used for ALTER PUBLICATION ... ADD/DROP/SET publication objects */

Modified

> 3)
> +       /*
> +        * Check if setting the relation to a different schema will result in the
> +        * publication having schema and same schema's table in the publication.
> +        */
> +       if (stmt->objectType == OBJECT_TABLE)
> +       {
> +               ListCell   *lc;
> +               List       *schemaPubids = GetSchemaPublications(nspOid);
> +               foreach(lc, schemaPubids)
> +               {
> +                       Oid             pubid = lfirst_oid(lc);
> +                       if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL),
> +                                                               relid))
> +                               ereport(ERROR,
>
> How about we check this case like the following ?
>
> List       *schemaPubids = GetSchemaPublications(nspOid);
> List       *relPubids = GetRelationPublications(RelationGetRelid(rel));
> if (list_intersection(schemaPubids, relPubids))
>         ereport(ERROR, ...

Modified it with slight changes without using list_intersection.

These comments are handled in the v35 version patch attached at [1]
[1] - https://www.postgresql.org/message-id/CALDaNm2yJOEPCqR%3DgTMEwveJujH9c9_z4LhKmk2T3vZH7T1DLQ%40mail.gmail.com

Regards,
VIgnesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Sep 29, 2021 at 10:46 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 28, 2021 at 8:15 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Mon, Sep 27, 2021 at 12:15 PM houzj.fnst@fujitsu.com
> > <houzj.fnst@fujitsu.com> wrote:
> >
> > Attached v34 patch has the changes for the same.
> >
>
> Few comments on v34-0001-Added-schema-level-support-for-publication
> ==========================================================
> 1.
> + * This rule parses publication object with and without keyword prefix.
>
> I think we should write it as: "This rule parses publication objects
> with and without keyword prefixes."

Modified

> 2.
> + * For the object without keyword prefix, we cannot just use
> relation_expr here,
> + * because some extended expression in relation_expr cannot be used as a
>
> /expression/expressions

Modified

> 3.
> +/*
> + * Process pubobjspec_list to check for errors in any of the objects and
> + * convert PUBLICATIONOBJ_CONTINUATION into appropriate PublicationObjSpecType
> + * type.

Modified to remove type as discussed offline.

> 4.
> + /*
> + * Check if setting the relation to a different schema will result in the
> + * publication having schema and same schema's table in the publication.
> + */
> + if (stmt->objectType == OBJECT_TABLE)
> + {
> + ListCell   *lc;
> + List    *schemaPubids = GetSchemaPublications(nspOid);
> + foreach(lc, schemaPubids)
> + {
> + Oid pubid = lfirst_oid(lc);
> + if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL),
> + relid))
> + ereport(ERROR,
> + errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("cannot move table \"%s\" to schema \"%s\"",
> +    RelationGetRelationName(rel), stmt->newschema),
> + errdetail("Altering table will result in having schema \"%s\" and
> same schema's table \"%s\" in the publication \"%s\" which is not
> supported.",
> +   stmt->newschema,
> +   RelationGetRelationName(rel),
> +   get_publication_name(pubid, false)));
> + }
> + }
>
> Let's slightly change the comment as: "Check that setting the relation
> to a different schema won't result in the publication having schema
> and the same schema's table." and errdetail as: "The schema \"%s\" and
> same schema's table \"%s\" cannot be part of the same publication
> \"%s\"."
>
> Maybe it is better to specify that this will disallow the partition table case.

Modified, I did not add the above as we are allowing it.

> 5.
> ObjectsInPublicationToOids()
> {
> ..
> + pubobj = (PublicationObjSpec *) lfirst(cell);
> + if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE)
>
> It is better to keep an empty line between above two lines.

Modified

> 6.
> List    *schemaPubids = GetSchemaPublications(nspOid);
> foreach(lc, schemaPubids)
> ..
> Oid pubid = lfirst_oid(lc);
> if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL),
>
> Add an empty line between each of the above two lines.

Modified

> 7.
> + /*
> + * Schema lock is held until the publication is altered to prevent
> + * concurrent schema deletion. No need to unlock the schemas, the locks
> + * will be released automatically at the end of alter publication command.
> + */
> + LockSchemaList(schemaidlist);
>
> I think it is better to add a similar comment at other places where
> this function is called. And we can shorten the comment atop
> LockSchemaList to something like: "The schemas specified in the schema
> list are locked in AccessShareLock mode in order to prevent concurrent
> schema deletion."

Modified

> 8. In CreatePublication(), the check if (stmt->for_all_tables) can be
> the first check and then in else if we can process tables and schemas.

Modified

> 9.
> AlterPublication()
> {
> ..
> + /* Lock the publication so nobody else can do anything with it. */
> + LockDatabaseObject(PublicationRelationId, pubform->oid, 0,
> +    AccessExclusiveLock);
>
> I think it is better to say why we need this lock. So, can we change
> the comment to something like: "Lock the publication so nobody else
> can do anything with it. This prevents concurrent alter to add
> table(s) that were already going to become part of the publication by
> adding corresponding schema(s) via this command and similarly it will
> prevent the concurrent addition of schema(s) for which there is any
> corresponding table being added by this command."

Modified

These comments are handled in the v35 version patch attached at [1]
[1] - https://www.postgresql.org/message-id/CALDaNm2yJOEPCqR%3DgTMEwveJujH9c9_z4LhKmk2T3vZH7T1DLQ%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Sep 29, 2021 at 11:49 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Wed, Sep 29, 2021 at 3:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > 4.
> > + /*
> > + * Check if setting the relation to a different schema will result in the
> > + * publication having schema and same schema's table in the publication.
> > + */
> > + if (stmt->objectType == OBJECT_TABLE)
> > + {
> > + ListCell   *lc;
> > + List    *schemaPubids = GetSchemaPublications(nspOid);
> > + foreach(lc, schemaPubids)
> > + {
> > + Oid pubid = lfirst_oid(lc);
> > + if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL),
> > + relid))
> > + ereport(ERROR,
> > + errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> > + errmsg("cannot move table \"%s\" to schema \"%s\"",
> > +    RelationGetRelationName(rel), stmt->newschema),
> > + errdetail("Altering table will result in having schema \"%s\" and
> > same schema's table \"%s\" in the publication \"%s\" which is not
> > supported.",
> > +   stmt->newschema,
> > +   RelationGetRelationName(rel),
> > +   get_publication_name(pubid, false)));
> > + }
> > + }
> >
> > Let's slightly change the comment as: "Check that setting the relation
> > to a different schema won't result in the publication having schema
> > and the same schema's table." and errdetail as: "The schema \"%s\" and
> > same schema's table \"%s\" cannot be part of the same publication
> > \"%s\"."
> >
>
> Since this code is in AlterTableNamespace() and the relation being
> checked may or may not be part of a publication, I'd use "a
> publication" instead of "the publication" in the comment.
> Also, I'd say that we're doing the check because the mentioned
> combination is not supported.
>
> i.e. "Check that setting the relation to a different schema won't
> result in a publication having both a schema and the same schema's
> table, as this is not supported."

Thanks for the comment, I have handled it in the v35 version patch
attached at [1]
[1] - https://www.postgresql.org/message-id/CALDaNm2yJOEPCqR%3DgTMEwveJujH9c9_z4LhKmk2T3vZH7T1DLQ%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Thu, Sep 30, 2021 at 3:39 PM vignesh C <vignesh21@gmail.com> wrote:
>
> The suggested change works, I have modified it in the attached patch.
>

I have reviewed the latest version and made a number of changes to the
0001 patch. The changes are in v1-0001-Changes-by-Amit. It includes
(a) Changed preprocess_pubobj_list() to make the code easy to
understand, (b) the handling of few variables was missing in equal
function, (c) the ordering of functions, and a few parameters were not
matching .c and .h files, (d) added/edited multiple comments and other
cosmetic changes.

Apart from that, I have few other comments:
1. It seems you have started using unique list variants in
GetPubPartitionOptionRelations because one of its caller
GetSchemaPublicationRelations need it. I think the unique variants are
costlier, so isn't it better to use it where it is required? I think
it would be good to use in GetPubPartitionOptionRelations, if most of
its caller requires the same.

2. In GetSchemaPublicationRelations(), I think we need to perform a
second scan using RELKIND_PARTITIONED_TABLE only if we
publish_via_root (aka pub_partopt is PUBLICATION_PART_ROOT). This is
what we are doing in GetAllTablesPublicationRelations. Is there a
reason to be different here?

3.
@@ -538,7 +788,7 @@ RemovePublicationById(Oid pubid)
  if (!HeapTupleIsValid(tup))
  elog(ERROR, "cache lookup failed for publication %u", pubid);

- pubform = (Form_pg_publication)GETSTRUCT(tup);
+ pubform = (Form_pg_publication) GETSTRUCT(tup);

We don't need the above change for this patch. I think this may be due
pgindent but we can do this separately rather than as part of this
patch.

-- 
With Regards,
Amit Kapila.

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Sat, Oct 2, 2021 at 1:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 30, 2021 at 3:39 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > The suggested change works, I have modified it in the attached patch.
> >
>
> I have reviewed the latest version and made a number of changes to the
> 0001 patch. The changes are in v1-0001-Changes-by-Amit. It includes
> (a) Changed preprocess_pubobj_list() to make the code easy to
> understand, (b) the handling of few variables was missing in equal
> function, (c) the ordering of functions, and a few parameters were not
> matching .c and .h files, (d) added/edited multiple comments and other
> cosmetic changes.

I have merged these changes into the main patch.

> Apart from that, I have few other comments:
> 1. It seems you have started using unique list variants in
> GetPubPartitionOptionRelations because one of its caller
> GetSchemaPublicationRelations need it. I think the unique variants are
> costlier, so isn't it better to use it where it is required? I think
> it would be good to use in GetPubPartitionOptionRelations, if most of
> its caller requires the same.

I have removed unique list changes from GetPubPartitionOptionRelations
and handled it in GetSchemaPublicationRelations.

> 2. In GetSchemaPublicationRelations(), I think we need to perform a
> second scan using RELKIND_PARTITIONED_TABLE only if we
> publish_via_root (aka pub_partopt is PUBLICATION_PART_ROOT). This is
> what we are doing in GetAllTablesPublicationRelations. Is there a
> reason to be different here?

In the first table scan we are getting all the ordinary tables present
in the schema. In the second table scan we will get all the
partitioned table present in the schema and the relations will be
added based on pub_partopt. I felt if we have the check we will not
get the relations in the following case:
create schema sch1;
create schema sch2;
create table sch1.p (a int) partition by list (a);
create table sch2.c1 partition of sch1.p for values in (1);

> 3.
> @@ -538,7 +788,7 @@ RemovePublicationById(Oid pubid)
>   if (!HeapTupleIsValid(tup))
>   elog(ERROR, "cache lookup failed for publication %u", pubid);
>
> - pubform = (Form_pg_publication)GETSTRUCT(tup);
> + pubform = (Form_pg_publication) GETSTRUCT(tup);
>
> We don't need the above change for this patch. I think this may be due
> pgindent but we can do this separately rather than as part of this
> patch.

Removed this change.

Attached v36 patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Sun, Oct 3, 2021 at 11:25 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Sat, Oct 2, 2021 at 1:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> > 2. In GetSchemaPublicationRelations(), I think we need to perform a
> > second scan using RELKIND_PARTITIONED_TABLE only if we
> > publish_via_root (aka pub_partopt is PUBLICATION_PART_ROOT). This is
> > what we are doing in GetAllTablesPublicationRelations. Is there a
> > reason to be different here?
>
> In the first table scan we are getting all the ordinary tables present
> in the schema. In the second table scan we will get all the
> partitioned table present in the schema and the relations will be
> added based on pub_partopt. I felt if we have the check we will not
> get the relations in the following case:
> create schema sch1;
> create schema sch2;
> create table sch1.p (a int) partition by list (a);
> create table sch2.c1 partition of sch1.p for values in (1);
>

But we don't need to get the partitioned tables for the invalidations,
see the corresponding case for tables. So, I am not sure why you have
used two scans to the system table for such scenarios?

Few additional comments on
v36-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN:
=========================================================================
1.
@@ -3961,21 +3965,25 @@ getPublications(Archive *fout, int *numPublications)
  appendPQExpBuffer(query,
    "SELECT p.tableoid, p.oid, p.pubname, "
    "(%s p.pubowner) AS rolname, "
-   "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete,
p.pubtruncate, p.pubviaroot "
+   "p.puballtables, p.pubinsert, p.pubupdate, "
+   "p.pubdelete, p.pubtruncate, p.pubviaroot "
    "FROM pg_publication p",
    username_subquery);
  else if (fout->remoteVersion >= 110000)
  appendPQExpBuffer(query,
    "SELECT p.tableoid, p.oid, p.pubname, "
    "(%s p.pubowner) AS rolname, "
-   "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete,
p.pubtruncate, false AS pubviaroot "
+   "p.puballtables, p.pubinsert, p.pubupdate, "
+   "p.pubdelete, p.pubtruncate, false AS pubviaroot "
    "FROM pg_publication p",
    username_subquery);
  else
  appendPQExpBuffer(query,
    "SELECT p.tableoid, p.oid, p.pubname, "
    "(%s p.pubowner) AS rolname, "
-   "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, false AS
pubtruncate, false AS pubviaroot "
+   "p.puballtables, p.pubinsert, p.pubupdate, "
+   "p.pubdelete, false AS pubtruncate, "
+   "false AS pubviaroot "
    "FROM pg_publication p",
    username_subquery);

Is there a reason to change this part of the code?

2.
@@ -257,6 +257,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
  pg_log_info("reading publication membership");
  getPublicationTables(fout, tblinfo, numTables);

+ pg_log_info("reading publication tables in schemas");
+ getPublicationNamespaces(fout, nspinfo, numNamespaces);

I think for the above change, the first should be changed to "reading
publication membership of tables" and the second one should be changed
to "reading publication membership of schemas".

3. The function names getPublicationNamespaces and
dumpPublicationSchema are not in sync. Let's name the second one as
dumpPublicationNamespace.

4. It is not clear to me why the patch has introduced a new component
type object DUMP_COMPONENT_PUBSCHEMA. In particular, in the below code
if we are already setting DUMP_COMPONENT_ALL, how the additional
setting of DUMP_COMPONENT_PUBSCHEMA helps?

@@ -1631,9 +1631,13 @@ selectDumpableNamespace(NamespaceInfo *nsinfo,
Archive *fout)
  if (nsinfo->nspowner == ROLE_PG_DATABASE_OWNER)
  nsinfo->dobj.dump &= ~DUMP_COMPONENT_DEFINITION;
  nsinfo->dobj.dump_contains = DUMP_COMPONENT_ALL;
+ nsinfo->dobj.dump |= DUMP_COMPONENT_PUBSCHEMA;
  }
  else
+ {
  nsinfo->dobj.dump_contains = nsinfo->dobj.dump = DUMP_COMPONENT_ALL;
+ nsinfo->dobj.dump |= DUMP_COMPONENT_PUBSCHEMA;
+ }

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Mon, Oct 4, 2021 at 4:55 AM vignesh C <vignesh21@gmail.com> wrote:
>
> Attached v36 patch has the changes for the same.
>

I have some comments on the v36-0001 patch:

src/backend/commands/publicationcmds.c

(1)
GetPublicationSchemas()

+ /* Find all publications associated with the schema */
+ pubschsrel = table_open(PublicationNamespaceRelationId, AccessShareLock);

I think the comment is not correct. It should say:

+ /* Find all schemas associated with the publication */

(2)
AlterPublicationSchemas

I think that a comment should be added for the following lines,
something like the comment used in the similar check in
AlterPublicationTables():

+ if (!schemaidlist && stmt->action != DEFELEM_SET)
+ return;

(3)
CheckAlterPublication

Minor comment fix suggested:

BEFORE:
+ * Check if relations and schemas can be in given publication and throws
AFTER:
+ * Check if relations and schemas can be in a given publication and throw

(4)
LockSchemaList()

Suggest re-word of comment, to match imperative comment style used
elsewhere in this code.

BEFORE:
+ * The schemas specified in the schema list are locked in AccessShareLock mode
AFTER:
+ * Lock the schemas specified in the schema list in AccessShareLock mode


src/backend/commands/tablecmds.c

(5)

Code has been added to prevent a table being set (via ALTER TABLE) to
UNLOGGED if it is part of a publication, but I found that I could
still add all tables of a schema having a table that is UNLOGGED:

postgres=# create schema sch;
CREATE SCHEMA
postgres=# create unlogged table sch.test(i int);
CREATE TABLE
postgres=# create publication pub for table sch.test;
ERROR:  cannot add relation "test" to publication
DETAIL:  Temporary and unlogged relations cannot be replicated.
postgres=# create publication pub for all tables in schema sch;
CREATE PUBLICATION


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Tue, Oct 5, 2021 at 6:57 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> (5)
>
> Code has been added to prevent a table being set (via ALTER TABLE) to
> UNLOGGED if it is part of a publication, but I found that I could
> still add all tables of a schema having a table that is UNLOGGED:
>
> postgres=# create schema sch;
> CREATE SCHEMA
> postgres=# create unlogged table sch.test(i int);
> CREATE TABLE
> postgres=# create publication pub for table sch.test;
> ERROR:  cannot add relation "test" to publication
> DETAIL:  Temporary and unlogged relations cannot be replicated.
> postgres=# create publication pub for all tables in schema sch;
> CREATE PUBLICATION
>

What about when you use "create publication pub for all tables;"? I
think that also works, now on similar lines shouldn't the behavior of
"all tables in schema" publication be the same? I mean if we want we
can detect and give an error but is it advisable to give an error if
there are just one or few tables in schema that are unlogged?

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Tue, Oct 5, 2021 at 3:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > Code has been added to prevent a table being set (via ALTER TABLE) to
> > UNLOGGED if it is part of a publication, but I found that I could
> > still add all tables of a schema having a table that is UNLOGGED:
> >
> > postgres=# create schema sch;
> > CREATE SCHEMA
> > postgres=# create unlogged table sch.test(i int);
> > CREATE TABLE
> > postgres=# create publication pub for table sch.test;
> > ERROR:  cannot add relation "test" to publication
> > DETAIL:  Temporary and unlogged relations cannot be replicated.
> > postgres=# create publication pub for all tables in schema sch;
> > CREATE PUBLICATION
> >
>
> What about when you use "create publication pub for all tables;"? I
> think that also works, now on similar lines shouldn't the behavior of
> "all tables in schema" publication be the same? I mean if we want we
> can detect and give an error but is it advisable to give an error if
> there are just one or few tables in schema that are unlogged?
>

OK, it seems that for the ALL TABLES case, there is no such error
check, and it just silently skips replication of any
temporary/unlogged tables. This is intentional, right?
I couldn't see any documentation specifically related to this, so I
think perhaps it should be updated to describe this behaviour. At the
moment, the existing documentation just states FOR TABLE that
"Temporary tables, unlogged tables, foreign tables, materialized
views, and regular views cannot be part of a publication".
Yes, I agree that ALL TABLES IN SCHEMA should behave the same as the
ALL TABLES case.
Problem is, shouldn't setting UNLOGGED on a table only then be
disallowed if that table was publicised using FOR TABLE?

With the patch applied:

postgres=# create publication pub3 for all tables in schema sch;
CREATE PUBLICATION
postgres=# create table sch.test3(i int);
CREATE TABLE
postgres=# alter table sch.test3 set unlogged;
ERROR:  cannot change table "test3" to unlogged because it is part of
a publication
DETAIL:  Unlogged relations cannot be replicated.


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Tue, Oct 5, 2021 at 4:40 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
>At the
> moment, the existing documentation just states FOR TABLE that
> "Temporary tables, unlogged tables, foreign tables, materialized
> views, and regular views cannot be part of a publication".

Oh, I see that in the v36-0004 doc update patch, it has added the
following for CREATE PUBLICATION ... FOR ALL TABLES IN SCHEMA:
"Only persistent base tables and partitioned tables present in the
schema will be included as part of the publication.  Temporary tables,
unlogged tables, foreign tables, materialized views, and regular views
from the schema will not be part of the publication."


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Tue, Oct 5, 2021 at 11:10 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Tue, Oct 5, 2021 at 3:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > Code has been added to prevent a table being set (via ALTER TABLE) to
> > > UNLOGGED if it is part of a publication, but I found that I could
> > > still add all tables of a schema having a table that is UNLOGGED:
> > >
> > > postgres=# create schema sch;
> > > CREATE SCHEMA
> > > postgres=# create unlogged table sch.test(i int);
> > > CREATE TABLE
> > > postgres=# create publication pub for table sch.test;
> > > ERROR:  cannot add relation "test" to publication
> > > DETAIL:  Temporary and unlogged relations cannot be replicated.
> > > postgres=# create publication pub for all tables in schema sch;
> > > CREATE PUBLICATION
> > >
> >
> > What about when you use "create publication pub for all tables;"? I
> > think that also works, now on similar lines shouldn't the behavior of
> > "all tables in schema" publication be the same? I mean if we want we
> > can detect and give an error but is it advisable to give an error if
> > there are just one or few tables in schema that are unlogged?
> >
>
>
..
> Yes, I agree that ALL TABLES IN SCHEMA should behave the same as the
> ALL TABLES case.
> Problem is, shouldn't setting UNLOGGED on a table only then be
> disallowed if that table was publicised using FOR TABLE?
>

Right, I also think so. Let us see what Vignesh or others think on this matter.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Oct 5, 2021 at 4:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 5, 2021 at 11:10 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Tue, Oct 5, 2021 at 3:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > > Code has been added to prevent a table being set (via ALTER TABLE) to
> > > > UNLOGGED if it is part of a publication, but I found that I could
> > > > still add all tables of a schema having a table that is UNLOGGED:
> > > >
> > > > postgres=# create schema sch;
> > > > CREATE SCHEMA
> > > > postgres=# create unlogged table sch.test(i int);
> > > > CREATE TABLE
> > > > postgres=# create publication pub for table sch.test;
> > > > ERROR:  cannot add relation "test" to publication
> > > > DETAIL:  Temporary and unlogged relations cannot be replicated.
> > > > postgres=# create publication pub for all tables in schema sch;
> > > > CREATE PUBLICATION
> > > >
> > >
> > > What about when you use "create publication pub for all tables;"? I
> > > think that also works, now on similar lines shouldn't the behavior of
> > > "all tables in schema" publication be the same? I mean if we want we
> > > can detect and give an error but is it advisable to give an error if
> > > there are just one or few tables in schema that are unlogged?
> > >
> >
> >
> ..
> > Yes, I agree that ALL TABLES IN SCHEMA should behave the same as the
> > ALL TABLES case.
> > Problem is, shouldn't setting UNLOGGED on a table only then be
> > disallowed if that table was publicised using FOR TABLE?
> >
>
> Right, I also think so. Let us see what Vignesh or others think on this matter.

Even I felt ALL TABLES IN SCHEMA should behave the same way as the ALL
TABLES case. I will keep the create publication behavior as it is i.e.
to allow even if unlogged tables are present and change the below
alter table behavior which was throwing error to be successful to keep
it similar to ALL TABLES publication:
alter table sch.test3 set unlogged;
ERROR:  cannot change table "test3" to unlogged because it is part of
a publication
DETAIL:  Unlogged relations cannot be replicated.

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Oct 4, 2021 at 5:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Oct 3, 2021 at 11:25 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Sat, Oct 2, 2021 at 1:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > 2. In GetSchemaPublicationRelations(), I think we need to perform a
> > > second scan using RELKIND_PARTITIONED_TABLE only if we
> > > publish_via_root (aka pub_partopt is PUBLICATION_PART_ROOT). This is
> > > what we are doing in GetAllTablesPublicationRelations. Is there a
> > > reason to be different here?
> >
> > In the first table scan we are getting all the ordinary tables present
> > in the schema. In the second table scan we will get all the
> > partitioned table present in the schema and the relations will be
> > added based on pub_partopt. I felt if we have the check we will not
> > get the relations in the following case:
> > create schema sch1;
> > create schema sch2;
> > create table sch1.p (a int) partition by list (a);
> > create table sch2.c1 partition of sch1.p for values in (1);
> >
>
> But we don't need to get the partitioned tables for the invalidations,
> see the corresponding case for tables. So, I am not sure why you have
> used two scans to the system table for such scenarios?

The second loop is required to get the 'p' relkind relations like in
the below case:
create schema sch1;
create schema sch2;
create table sch1.p1 (a int) partition by list (a);
create table sch2.c1 partition of sch1.p1 for values in (1);
create table sch2.p2(a int) partition by list (a);
create table sch1.c2 partition of sch2.p2 for values in (1);
create publication pub1 for all tables in schema sch2;
The first loop will give us sch2.c1 relation and the second loop will
get sch2.p2, sch1.c2, this is required for schema publication as the
schema can have both r relkind and p relkind tables and all of them
need to be invalidated. This is not required in case of table
publication as we can get the required relations from that particular
table.

> Few additional comments on
> v36-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN:
> =========================================================================
> 1.
> @@ -3961,21 +3965,25 @@ getPublications(Archive *fout, int *numPublications)
>   appendPQExpBuffer(query,
>     "SELECT p.tableoid, p.oid, p.pubname, "
>     "(%s p.pubowner) AS rolname, "
> -   "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete,
> p.pubtruncate, p.pubviaroot "
> +   "p.puballtables, p.pubinsert, p.pubupdate, "
> +   "p.pubdelete, p.pubtruncate, p.pubviaroot "
>     "FROM pg_publication p",
>     username_subquery);
>   else if (fout->remoteVersion >= 110000)
>   appendPQExpBuffer(query,
>     "SELECT p.tableoid, p.oid, p.pubname, "
>     "(%s p.pubowner) AS rolname, "
> -   "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete,
> p.pubtruncate, false AS pubviaroot "
> +   "p.puballtables, p.pubinsert, p.pubupdate, "
> +   "p.pubdelete, p.pubtruncate, false AS pubviaroot "
>     "FROM pg_publication p",
>     username_subquery);
>   else
>   appendPQExpBuffer(query,
>     "SELECT p.tableoid, p.oid, p.pubname, "
>     "(%s p.pubowner) AS rolname, "
> -   "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, false AS
> pubtruncate, false AS pubviaroot "
> +   "p.puballtables, p.pubinsert, p.pubupdate, "
> +   "p.pubdelete, false AS pubtruncate, "
> +   "false AS pubviaroot "
>     "FROM pg_publication p",
>     username_subquery);
>
> Is there a reason to change this part of the code?

It is not required, I have removed it.

> 2.
> @@ -257,6 +257,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
>   pg_log_info("reading publication membership");
>   getPublicationTables(fout, tblinfo, numTables);
>
> + pg_log_info("reading publication tables in schemas");
> + getPublicationNamespaces(fout, nspinfo, numNamespaces);
>
> I think for the above change, the first should be changed to "reading
> publication membership of tables" and the second one should be changed
> to "reading publication membership of schemas".

Modified

> 3. The function names getPublicationNamespaces and
> dumpPublicationSchema are not in sync. Let's name the second one as
> dumpPublicationNamespace.

Modified

> 4. It is not clear to me why the patch has introduced a new component
> type object DUMP_COMPONENT_PUBSCHEMA. In particular, in the below code
> if we are already setting DUMP_COMPONENT_ALL, how the additional
> setting of DUMP_COMPONENT_PUBSCHEMA helps?
>
> @@ -1631,9 +1631,13 @@ selectDumpableNamespace(NamespaceInfo *nsinfo,
> Archive *fout)
>   if (nsinfo->nspowner == ROLE_PG_DATABASE_OWNER)
>   nsinfo->dobj.dump &= ~DUMP_COMPONENT_DEFINITION;
>   nsinfo->dobj.dump_contains = DUMP_COMPONENT_ALL;
> + nsinfo->dobj.dump |= DUMP_COMPONENT_PUBSCHEMA;
>   }
>   else
> + {
>   nsinfo->dobj.dump_contains = nsinfo->dobj.dump = DUMP_COMPONENT_ALL;
> + nsinfo->dobj.dump |= DUMP_COMPONENT_PUBSCHEMA;
> + }

I have removed DUMP_COMPONENT_PUBSCHEMA and used DUMP_COMPONENT_NONE
to identify the publications that should be dumped.

Attached v37 patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Oct 5, 2021 at 6:57 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Oct 4, 2021 at 4:55 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v36 patch has the changes for the same.
> >
>
> I have some comments on the v36-0001 patch:
>
> src/backend/commands/publicationcmds.c
>
> (1)
> GetPublicationSchemas()
>
> + /* Find all publications associated with the schema */
> + pubschsrel = table_open(PublicationNamespaceRelationId, AccessShareLock);
>
> I think the comment is not correct. It should say:
>
> + /* Find all schemas associated with the publication */

Modified

> (2)
> AlterPublicationSchemas
>
> I think that a comment should be added for the following lines,
> something like the comment used in the similar check in
> AlterPublicationTables():
>
> + if (!schemaidlist && stmt->action != DEFELEM_SET)
> + return;

Modified

> (3)
> CheckAlterPublication
>
> Minor comment fix suggested:
>
> BEFORE:
> + * Check if relations and schemas can be in given publication and throws
> AFTER:
> + * Check if relations and schemas can be in a given publication and throw

Modified

> (4)
> LockSchemaList()
>
> Suggest re-word of comment, to match imperative comment style used
> elsewhere in this code.
>
> BEFORE:
> + * The schemas specified in the schema list are locked in AccessShareLock mode
> AFTER:
> + * Lock the schemas specified in the schema list in AccessShareLock mode

Modified

>
> src/backend/commands/tablecmds.c
>
> (5)
>
> Code has been added to prevent a table being set (via ALTER TABLE) to
> UNLOGGED if it is part of a publication, but I found that I could
> still add all tables of a schema having a table that is UNLOGGED:
>
> postgres=# create schema sch;
> CREATE SCHEMA
> postgres=# create unlogged table sch.test(i int);
> CREATE TABLE
> postgres=# create publication pub for table sch.test;
> ERROR:  cannot add relation "test" to publication
> DETAIL:  Temporary and unlogged relations cannot be replicated.
> postgres=# create publication pub for all tables in schema sch;
> CREATE PUBLICATION

I have changed the alter table behavior to allow setting it to an
unlogged table to keep the behavior similar to "ALL TABLES"
publication. I have kept the create publication behavior as it is, it
will be similar to "ALL TABLES" publication i.e to allow create
publication even if there are unlogged tables present.

These comments are handled in the v37 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0ON%3D012jGC3oquSVVWTWXhHG0q8yOyRROVGFR9PjWa-g%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Wed, Oct 6, 2021 at 4:42 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Attached v37 patch has the changes for the same.
>

A small issue I noticed is that using "\dS" in PSQL shows UNLOGGED
tables as belonging to a publication, if the table belongs to a schema
that was added to the publication using ALL TABLES IN SCHEMA (yet
doesn't show as part of an ALL TABLES publication).
Since publication of UNLOGGED tables is silently skipped in the case
of ALL TABLES and ALL TABLES IN SCHEMA, it shouldn't show as belonging
to the publication, right?

test=# \dRp+ pub2
                            Publication pub2
 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
-------+------------+---------+---------+---------+-----------+----------
 gregn | t          | t       | t       | t       | t         | f
(1 row)

test=# \dRp+ pub
                             Publication pub
 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
-------+------------+---------+---------+---------+-----------+----------
 gregn | f          | t       | t       | t       | t         | t
Tables from schemas:
    "sch1"

test=# \dS sch1.test2
            Unlogged table "sch1.test2"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 i      | integer |           |          |
Publications:
    "pub"


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Oct 6, 2021 at 11:12 AM vignesh C <vignesh21@gmail.com> wrote:
>
> Attached v37 patch has the changes for the same.
>

Few comments:
==============
v37-0001-Added-schema-level-support-for-publication
1.
+ *
+ * The first scan will get all the 'r' relkind tables for the specified schema,
+ * iterate the 'r' relkind tables and prepare a list of:
+ * 1) non partition table if pub_partopt is PUBLICATION_PART_ROOT
+ * 2) partition table and non partition table if pub_partopt is
+ *    PUBLICATION_PART_LEAF.
+ *
+ * The second scan will get all the 'p'' relkind tables for the  specified
+ * schema, iterate the 'p' relkind tables and prepare a list of:
+ * 1) partition table's child relations if pub_partopt is PUBLICATION_PART_LEAF
+ * 2) partition table if pub_partopt is PUBLICATION_PART_ROOT.

I think these comments are redundant and not sure if they are
completely correct. We don't need these as the actual code explains
these conditions better. The earlier part of these comments is
sufficient.

v37-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN
2.
+ * selectDumpablePublicationObject: policy-setting subroutine
+ * Mark a publication as to be dumped or not
  *
- * Publication tables have schemas, but those are ignored in decision making,
+ * Publications have schemas, but those are ignored in decision making,
  * because publications are only dumped when we are dumping everything.
  */

Change the above comment lines:
a. "Mark a publication as to be dumped or not" to "Mark a publication
object as to be dumped or not".

b. "Publications have schemas, but those are ignored in decision
making, .." to "A publication can have schemas and tables which have
schemas, but those are ignored in decision making, .."

3.
+/*
+ * dumpPublicationNamespace
+ *   dump the definition of the given publication tables in schema mapping
+ */

Can we change the comment to: "dump the definition of the given
publication schema mapping"? IT is easier to read and understand.

4.
+/*
+ * The PublicationSchemaInfo struct is used to represent publication tables
+ * in schema mapping.
+ */
+typedef struct _PublicationSchemaInfo
+{
+ DumpableObject dobj;
+ NamespaceInfo *pubschema;
+ PublicationInfo *publication;
+} PublicationSchemaInfo;

Can we change the comment similar to the comment change in point 3?
Also, let's keep PublicationInfo * before NamespaceInfo * just to be
consistent with the existing structure PublicationRelInfo?

5.
+ printfPQExpBuffer(&buf,
+   "SELECT p.pubname\n"
+   "FROM pg_catalog.pg_publication p\n"
+   " JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
+   " JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid AND
pc.oid = '%s'\n"

I think this part of the query can be improved in multiple ways: (a)
pc.oid = '%s' should be part of WHERE clause not join condition, (b)
for pubname, no need to use alias name, it can be directly referred as
pubname, (c) you can check if the relation is publishable. So, the
formed query would look like:

SELECT p.pubname FROM pg_catalog.pg_publication p JOIN
pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid JOIN
pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid  WHERE pc.oid =
'%s' and pg_catalog.pg_relation_is_publishable('%s')

6.
listSchemas()
{
..
+ if (pub_schema_tuples > 0)
+ {
+ /*
+ * Allocate memory for footers. Size of footers will be 1 (for
+ * storing "Publications:" string) + Schema count +  1 (for
+ * storing NULL).
+ */
+ footers = (char **) palloc((1 + pub_schema_tuples + 1) * sizeof(char *));
+ footers[0] = pstrdup(_("Publications:"));
+
+ /* Might be an empty set - that's ok */
+ for (i = 0; i < pub_schema_tuples; i++)
+ {
+ printfPQExpBuffer(&buf, "    \"%s\"",
+   PQgetvalue(result, i, 0));
+
+ footers[i + 1] = pstrdup(buf.data);
+ }
+
+ footers[i + 1] = NULL;
+ myopt.footers = footers;
+ }
..
}

Is there a reason of not using printTableAddFooter() here similar to
how we use it to print Publications in describeOneTableDetails()?

7.
describePublications()
{
..
+ /* Get the schemas for the specified publication */
+ printfPQExpBuffer(&buf,
+   "SELECT n.nspname\n"
+   "FROM pg_catalog.pg_namespace n,\n"
+   "     pg_catalog.pg_publication_namespace pn\n"
+   "WHERE n.oid = pn.pnnspid\n"
+   "  AND pn.pnpubid = '%s'\n"
+   "ORDER BY 1", pubid);
+ if (!addFooterToPublicationDesc(&buf, "Tables from schemas:",
+ true, &cont))
+ goto error_return;
..
}

Shouldn't we try to get schemas only when pset.sversion >= 150000?

8.
+addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
+    bool singlecol, printTableContent *cont)
+{
..
+ termPQExpBuffer(buf);
..
}

It seems this buffer is freed at the caller's site, if so, no need to
free it here.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Thu, Oct 7, 2021 at 5:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Oct 6, 2021 at 11:12 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v37 patch has the changes for the same.
> >
>
> Few comments:
> ==============
>

Few more comments:
====================
v37-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN
1.
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL",
"TABLES", "IN", "SCHEMA"))
+ COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+ " UNION SELECT 'CURRENT_SCHEMA' "
+ "UNION SELECT 'WITH ('");

What is the need to display WITH here? It will be displayed after
Schema name with the below rule:
+ else if (Matches("CREATE", "PUBLICATION", MatchAny,  "FOR", "ALL",
"TABLES", "IN", "SCHEMA", MatchAny))
+ COMPLETE_WITH("WITH (");

2. It seems tab-completion happens incorrectly for the below case:
create publication pub for all tables in schema s1,

If I press the tab after above, it completes with below which is wrong
because it will lead to incorrect syntax.

create publication pub for all tables in schema s1, WITH (

v37-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication
3.
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
..
+ 'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test' => {
+ create_order => 51,
+ create_sql =>
+   'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;',
+ regexp => qr/^
+ \QALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;\E
+ /xm,
+ like   => { %full_runs, section_post_data => 1, },
+ unlike => { exclude_dump_test_schema => 1, },

In this test, won't it exclude the schema dump_test because of unlike?
If so, then we don't have coverage for "ALL Tables In Schema" except
for public schema?

4.
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
..
+-- fail - can't add schema to for all tables publication
+ALTER PUBLICATION testpub_foralltables ADD ALL TABLES IN SCHEMA pub_test;

In the above and all similar comments, it is better to either quote
'for all tables' or write in CAPS FOR ALL TABLE or both 'FOR ALL
TABLE'.

5.
+\dRp+ testpub1_forschema
+                               Publication testpub1_forschema
+          Owner           | All tables | Inserts | Updates | Deletes
| Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f          | t       | t       | t
| t         | f
+Tables from schemas:
+    "pub_test1"
+
+SELECT p.pubname FROM pg_catalog.pg_publication p,
pg_catalog.pg_namespace n, pg_catalog.pg_publication_namespace pn
WHERE n.oid = pn.pnnspid AND p.oid = pn.pnpubid AND n.nspname =
'pub_test1' ORDER BY 1;
+      pubname
+--------------------
+ testpub1_forschema
+(1 row)

I don't think in the above and similar tests, we need to separately
check the presence of publication via Select query, if we have tested
it via psql command. Let's try to keep the meaningful tests.

6.
+INSERT INTO pub_test1.tbl VALUES(1, 'test');
+-- fail
+UPDATE pub_test1.tbl SET id = 2;
+ERROR:  cannot update table "tbl" because it does not have a replica
identity and publishes updates
+HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
+ALTER PUBLICATION testpub1_forschema DROP ALL TABLES IN SCHEMA pub_test1;
+-- success
+UPDATE pub_test1.tbl SET id = 2;
+ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1;
+-- fail
+UPDATE pub_test1.tbl SET id = 2;
+ERROR:  cannot update table "tbl" because it does not have a replica
identity and publishes updates
+HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
+ALTER PUBLICATION testpub1_forschema DROP ALL TABLES IN SCHEMA pub_test1;
+-- success
+UPDATE pub_test1.tbl SET id = 2;
+ALTER PUBLICATION testpub1_forschema ADD ALL TABLES IN SCHEMA pub_test1;
+-- fail
+UPDATE pub_test1.tbl SET id = 2;
+ERROR:  cannot update table "tbl" because it does not have a replica
identity and publishes updates
+HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.

I think here we don't need to test both SET and ADD variants, one of
them is sufficient.

7.
+-- verify invalidation of partition table having partition on different schema

I think this comment is not very clear to me. Can we change it to:
"verify invalidation of partition table having parent and child tables
in different schema"?

8.
+-- verify invalidation of schema having partition parent table and
partition child table

Similarly, let's change this to: "verify invalidation of partition
tables for schema publication that has parent and child tables of
different partition hierarchies". Keep comments line boundary as 80
chars, that way they look readable.

9.
+++ b/src/test/subscription/t/025_rep_changes_for_schema.pl
..
+# Basic logical replication test

Let's change this comment to: "Logical replication tests for schema
publications"


-- 
With Regards,
Amit Kapila.



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
> On Friday, October 8, 2021 7:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> v37-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication
> 3.
> --- a/src/bin/pg_dump/t/002_pg_dump.pl
> +++ b/src/bin/pg_dump/t/002_pg_dump.pl
> ..
> + 'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test' => {
> + create_order => 51,
> + create_sql =>
> +   'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;',
> + regexp => qr/^
> + \QALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;\E
> + /xm,
> + like   => { %full_runs, section_post_data => 1, },
> + unlike => { exclude_dump_test_schema => 1, },
> 
> In this test, won't it exclude the schema dump_test because of unlike?
> If so, then we don't have coverage for "ALL Tables In Schema" except
> for public schema?
> 

Yes, the unlike case will exclude the schema dump_test, but I think schema dump_test could be
dumped in like case.
I checked the log file src/bin/pg_dump/tmp_check/log/regress_log_002_pg_dump and
saw some cases were described as "should dump ALTER PUBLICATION pub3 ADD ALL
TABLES IN SCHEMA dump_test". I think in these cases schema dump_test would be
dumped.


Besides, a small comment on tab-complete.c:

    else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL"))
-        COMPLETE_WITH("TABLES");
-    else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES")
-             || Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+        COMPLETE_WITH("TABLES", "TABLE IN SCHEMA");


COMPLETE_WITH("TABLES", "TABLE IN SCHEMA");
->
COMPLETE_WITH("TABLES", "TABLES IN SCHEMA");


Regards
Tang

Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Oct 7, 2021 at 12:51 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Wed, Oct 6, 2021 at 4:42 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v37 patch has the changes for the same.
> >
>
> A small issue I noticed is that using "\dS" in PSQL shows UNLOGGED
> tables as belonging to a publication, if the table belongs to a schema
> that was added to the publication using ALL TABLES IN SCHEMA (yet
> doesn't show as part of an ALL TABLES publication).
> Since publication of UNLOGGED tables is silently skipped in the case
> of ALL TABLES and ALL TABLES IN SCHEMA, it shouldn't show as belonging
> to the publication, right?
>
> test=# \dRp+ pub2
>                             Publication pub2
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> -------+------------+---------+---------+---------+-----------+----------
>  gregn | t          | t       | t       | t       | t         | f
> (1 row)
>
> test=# \dRp+ pub
>                              Publication pub
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> -------+------------+---------+---------+---------+-----------+----------
>  gregn | f          | t       | t       | t       | t         | t
> Tables from schemas:
>     "sch1"
>
> test=# \dS sch1.test2
>             Unlogged table "sch1.test2"
>  Column |  Type   | Collation | Nullable | Default
> --------+---------+-----------+----------+---------
>  i      | integer |           |          |
> Publications:
>     "pub"

Thanks for reporting the issue. The attached patch has the fix for the issue.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Oct 7, 2021 at 5:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Oct 6, 2021 at 11:12 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v37 patch has the changes for the same.
> >
>
> Few comments:
> ==============
> v37-0001-Added-schema-level-support-for-publication
> 1.
> + *
> + * The first scan will get all the 'r' relkind tables for the specified schema,
> + * iterate the 'r' relkind tables and prepare a list of:
> + * 1) non partition table if pub_partopt is PUBLICATION_PART_ROOT
> + * 2) partition table and non partition table if pub_partopt is
> + *    PUBLICATION_PART_LEAF.
> + *
> + * The second scan will get all the 'p'' relkind tables for the  specified
> + * schema, iterate the 'p' relkind tables and prepare a list of:
> + * 1) partition table's child relations if pub_partopt is PUBLICATION_PART_LEAF
> + * 2) partition table if pub_partopt is PUBLICATION_PART_ROOT.
>
> I think these comments are redundant and not sure if they are
> completely correct. We don't need these as the actual code explains
> these conditions better. The earlier part of these comments is
> sufficient.

Removed it.

> v37-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN
> 2.
> + * selectDumpablePublicationObject: policy-setting subroutine
> + * Mark a publication as to be dumped or not
>   *
> - * Publication tables have schemas, but those are ignored in decision making,
> + * Publications have schemas, but those are ignored in decision making,
>   * because publications are only dumped when we are dumping everything.
>   */
>
> Change the above comment lines:
> a. "Mark a publication as to be dumped or not" to "Mark a publication
> object as to be dumped or not".
>
> b. "Publications have schemas, but those are ignored in decision
> making, .." to "A publication can have schemas and tables which have
> schemas, but those are ignored in decision making, .."

Modified

> 3.
> +/*
> + * dumpPublicationNamespace
> + *   dump the definition of the given publication tables in schema mapping
> + */
>
> Can we change the comment to: "dump the definition of the given
> publication schema mapping"? IT is easier to read and understand.
>
> 4.
> +/*
> + * The PublicationSchemaInfo struct is used to represent publication tables
> + * in schema mapping.
> + */
> +typedef struct _PublicationSchemaInfo
> +{
> + DumpableObject dobj;
> + NamespaceInfo *pubschema;
> + PublicationInfo *publication;
> +} PublicationSchemaInfo;
>
> Can we change the comment similar to the comment change in point 3?
> Also, let's keep PublicationInfo * before NamespaceInfo * just to be
> consistent with the existing structure PublicationRelInfo?
>
> 5.
> + printfPQExpBuffer(&buf,
> +   "SELECT p.pubname\n"
> +   "FROM pg_catalog.pg_publication p\n"
> +   " JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
> +   " JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid AND
> pc.oid = '%s'\n"
>
> I think this part of the query can be improved in multiple ways: (a)
> pc.oid = '%s' should be part of WHERE clause not join condition, (b)
> for pubname, no need to use alias name, it can be directly referred as
> pubname, (c) you can check if the relation is publishable. So, the
> formed query would look like:
>
> SELECT p.pubname FROM pg_catalog.pg_publication p JOIN
> pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid JOIN
> pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid  WHERE pc.oid =
> '%s' and pg_catalog.pg_relation_is_publishable('%s')

Modified

> 6.
> listSchemas()
> {
> ..
> + if (pub_schema_tuples > 0)
> + {
> + /*
> + * Allocate memory for footers. Size of footers will be 1 (for
> + * storing "Publications:" string) + Schema count +  1 (for
> + * storing NULL).
> + */
> + footers = (char **) palloc((1 + pub_schema_tuples + 1) * sizeof(char *));
> + footers[0] = pstrdup(_("Publications:"));
> +
> + /* Might be an empty set - that's ok */
> + for (i = 0; i < pub_schema_tuples; i++)
> + {
> + printfPQExpBuffer(&buf, "    \"%s\"",
> +   PQgetvalue(result, i, 0));
> +
> + footers[i + 1] = pstrdup(buf.data);
> + }
> +
> + footers[i + 1] = NULL;
> + myopt.footers = footers;
> + }
> ..
> }
>
> Is there a reason of not using printTableAddFooter() here similar to
> how we use it to print Publications in describeOneTableDetails()?

There are 2 ways to print table in psql:
1) call printTableInit, printTableAddHeader, printTableAddCell,
printTableAddFooter, printTable & printTableCleanup to print the table
2) prepare the table contents and call printQuery to print the
table(which will take care of handling all of the above)
describeOneTableDetails uses 1st method
listSchemas uses 2nd method, in case of this method since table is not
initialized we cannot use printTableAddFooter. we have to prepare the
footers and set the footers.

> 7.
> describePublications()
> {
> ..
> + /* Get the schemas for the specified publication */
> + printfPQExpBuffer(&buf,
> +   "SELECT n.nspname\n"
> +   "FROM pg_catalog.pg_namespace n,\n"
> +   "     pg_catalog.pg_publication_namespace pn\n"
> +   "WHERE n.oid = pn.pnnspid\n"
> +   "  AND pn.pnpubid = '%s'\n"
> +   "ORDER BY 1", pubid);
> + if (!addFooterToPublicationDesc(&buf, "Tables from schemas:",
> + true, &cont))
> + goto error_return;
> ..
> }
>
> Shouldn't we try to get schemas only when pset.sversion >= 150000?

Modified

> 8.
> +addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
> +    bool singlecol, printTableContent *cont)
> +{
> ..
> + termPQExpBuffer(buf);
> ..
> }
>
> It seems this buffer is freed at the caller's site, if so, no need to
> free it here.
>

Modified

These comments are fixed in the v38 patch attached.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Oct 8, 2021 at 4:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Oct 7, 2021 at 5:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Oct 6, 2021 at 11:12 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > Attached v37 patch has the changes for the same.
> > >
> >
> > Few comments:
> > ==============
> >
>
> Few more comments:
> ====================
> v37-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN
> 1.
> + else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL",
> "TABLES", "IN", "SCHEMA"))
> + COMPLETE_WITH_QUERY(Query_for_list_of_schemas
> + " UNION SELECT 'CURRENT_SCHEMA' "
> + "UNION SELECT 'WITH ('");
>
> What is the need to display WITH here? It will be displayed after
> Schema name with the below rule:
> + else if (Matches("CREATE", "PUBLICATION", MatchAny,  "FOR", "ALL",
> "TABLES", "IN", "SCHEMA", MatchAny))
> + COMPLETE_WITH("WITH (");

Removed it.

> 2. It seems tab-completion happens incorrectly for the below case:
> create publication pub for all tables in schema s1,
>
> If I press the tab after above, it completes with below which is wrong
> because it will lead to incorrect syntax.
>
> create publication pub for all tables in schema s1, WITH (

Modified

> v37-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication
> 3.
> --- a/src/bin/pg_dump/t/002_pg_dump.pl
> +++ b/src/bin/pg_dump/t/002_pg_dump.pl
> ..
> + 'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test' => {
> + create_order => 51,
> + create_sql =>
> +   'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;',
> + regexp => qr/^
> + \QALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;\E
> + /xm,
> + like   => { %full_runs, section_post_data => 1, },
> + unlike => { exclude_dump_test_schema => 1, },
>
> In this test, won't it exclude the schema dump_test because of unlike?
> If so, then we don't have coverage for "ALL Tables In Schema" except
> for public schema?

I noticed that the tap test framework runs pg_dump tests with various
combinations, these tests will be covered in the like tests as Tang
suggested at [1].  In the exclude_dump_test_schema since this sql will
not be present,  exclude_dump_test_schema should be added in the
unlike option, as the validation will fail for exclude-schema option
which will be run with the following command:
pg_dump --no-sync
--file=/home/vignesh/postgres/src/bin/pg_dump/tmp_check/tmp_test_4i8F/exclude_dump_test_schema.sql
--exclude-schema=dump_test postgres

> 4.
> --- a/src/test/regress/expected/publication.out
> +++ b/src/test/regress/expected/publication.out
> ..
> +-- fail - can't add schema to for all tables publication
> +ALTER PUBLICATION testpub_foralltables ADD ALL TABLES IN SCHEMA pub_test;
>
> In the above and all similar comments, it is better to either quote
> 'for all tables' or write in CAPS FOR ALL TABLE or both 'FOR ALL
> TABLE'.

Modified

> 5.
> +\dRp+ testpub1_forschema
> +                               Publication testpub1_forschema
> +          Owner           | All tables | Inserts | Updates | Deletes
> | Truncates | Via root
> +--------------------------+------------+---------+---------+---------+-----------+----------
> + regress_publication_user | f          | t       | t       | t
> | t         | f
> +Tables from schemas:
> +    "pub_test1"
> +
> +SELECT p.pubname FROM pg_catalog.pg_publication p,
> pg_catalog.pg_namespace n, pg_catalog.pg_publication_namespace pn
> WHERE n.oid = pn.pnnspid AND p.oid = pn.pnpubid AND n.nspname =
> 'pub_test1' ORDER BY 1;
> +      pubname
> +--------------------
> + testpub1_forschema
> +(1 row)
>
> I don't think in the above and similar tests, we need to separately
> check the presence of publication via Select query, if we have tested
> it via psql command. Let's try to keep the meaningful tests.

Removed it.

> 6.
> +INSERT INTO pub_test1.tbl VALUES(1, 'test');
> +-- fail
> +UPDATE pub_test1.tbl SET id = 2;
> +ERROR:  cannot update table "tbl" because it does not have a replica
> identity and publishes updates
> +HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
> +ALTER PUBLICATION testpub1_forschema DROP ALL TABLES IN SCHEMA pub_test1;
> +-- success
> +UPDATE pub_test1.tbl SET id = 2;
> +ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1;
> +-- fail
> +UPDATE pub_test1.tbl SET id = 2;
> +ERROR:  cannot update table "tbl" because it does not have a replica
> identity and publishes updates
> +HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
> +ALTER PUBLICATION testpub1_forschema DROP ALL TABLES IN SCHEMA pub_test1;
> +-- success
> +UPDATE pub_test1.tbl SET id = 2;
> +ALTER PUBLICATION testpub1_forschema ADD ALL TABLES IN SCHEMA pub_test1;
> +-- fail
> +UPDATE pub_test1.tbl SET id = 2;
> +ERROR:  cannot update table "tbl" because it does not have a replica
> identity and publishes updates
> +HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
>
> I think here we don't need to test both SET and ADD variants, one of
> them is sufficient.

Removed it

> 7.
> +-- verify invalidation of partition table having partition on different schema
>
> I think this comment is not very clear to me. Can we change it to:
> "verify invalidation of partition table having parent and child tables
> in different schema"?

Modified

> 8.
> +-- verify invalidation of schema having partition parent table and
> partition child table
>
> Similarly, let's change this to: "verify invalidation of partition
> tables for schema publication that has parent and child tables of
> different partition hierarchies". Keep comments line boundary as 80
> chars, that way they look readable.

Modified

> 9.
> +++ b/src/test/subscription/t/025_rep_changes_for_schema.pl
> ..
> +# Basic logical replication test
>
> Let's change this comment to: "Logical replication tests for schema
> publications"

Modified

These comments are handled in the v38 patch attached at [2].
[1] -
https://www.postgresql.org/message-id/OS0PR01MB6113A715FB3B85907458F4E7FBB59%40OS0PR01MB6113.jpnprd01.prod.outlook.com
[2] - https://www.postgresql.org/message-id/CALDaNm1TP9S0dif2QWoEUcCtNDop1xJ6Rj1xnu2vS92%3Dj9ahYw%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Oct 11, 2021 at 7:46 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> > On Friday, October 8, 2021 7:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > v37-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication
> > 3.
> > --- a/src/bin/pg_dump/t/002_pg_dump.pl
> > +++ b/src/bin/pg_dump/t/002_pg_dump.pl
> > ..
> > + 'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test' => {
> > + create_order => 51,
> > + create_sql =>
> > +   'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;',
> > + regexp => qr/^
> > + \QALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;\E
> > + /xm,
> > + like   => { %full_runs, section_post_data => 1, },
> > + unlike => { exclude_dump_test_schema => 1, },
> >
> > In this test, won't it exclude the schema dump_test because of unlike?
> > If so, then we don't have coverage for "ALL Tables In Schema" except
> > for public schema?
> >
>
> Yes, the unlike case will exclude the schema dump_test, but I think schema dump_test could be
> dumped in like case.
> I checked the log file src/bin/pg_dump/tmp_check/log/regress_log_002_pg_dump and
> saw some cases were described as "should dump ALTER PUBLICATION pub3 ADD ALL
> TABLES IN SCHEMA dump_test". I think in these cases schema dump_test would be
> dumped.

I agree

> Besides, a small comment on tab-complete.c:
>
>         else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL"))
> -               COMPLETE_WITH("TABLES");
> -       else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES")
> -                        || Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
> +               COMPLETE_WITH("TABLES", "TABLE IN SCHEMA");
>
>
> COMPLETE_WITH("TABLES", "TABLE IN SCHEMA");
> ->
> COMPLETE_WITH("TABLES", "TABLES IN SCHEMA");

Modified.
This issue is fixed in the v38 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm1TP9S0dif2QWoEUcCtNDop1xJ6Rj1xnu2vS92%3Dj9ahYw%40mail.gmail.com

Regards,
Vignesh



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Monday, October 11, 2021 2:39 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> These comments are fixed in the v38 patch attached.

Thanks for updating the patches.
Here are a few comments on the v38-0004-Doc patch.

1.
+  <para>
+   Adding/Setting a table that is part of schema specified in
+   <literal>ALL TABLES IN SCHEMA</literal>, adding/setting a schema to a
+   publication along with same schema's table specified with
+   <literal>TABLE</literal>, adding/setting a schema to a publication that
+   already has a table that is part of specified schema or adding/setting a
+   table to a publication that already has a table's schema as part of
+   specified schema is not supported.

ISTM we can remove the description "adding/setting a schema to a publication
along with same schema's table specified with <literal>TABLE</literal>",
because it seems the same as the first mentioned case "Adding/Setting a table
that is part of schema specified in <literal>ALL TABLES IN SCHEMA</literal>"

2.

+</programlisting></para>
+
+  <para>
+   Add some schemas to the publication:
+<programlisting>
+ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing_june, sales_june;
+</programlisting>
+  </para>
+
+  <para>
+   Add some tables and schemas to the publication:
...
+
+  <para>
+   Drop some schemas from the publication:
...
+  <para>
+   Set some schemas to the publication:
+<programlisting>
+ALTER PUBLICATION production_publication SET ALL TABLES IN SCHEMA production_september, production_october;

Personally, I think we don't need the example about DROP and SET here.
The example of ADD seems sufficient.

3.
+</programlisting>
+  </para>
+
+  <para>
+   Create a publication that publishes all changes for all the tables present in
+   the schema "production":
+<programlisting>
+CREATE PUBLICATION production_publication FOR ALL TABLES IN SCHEMA production;
+</programlisting>
+  </para>
...
+  <para>
+   Create a publication that publishes all changes for all the tables present in
+   the schemas "marketing" and "sales":
+<programlisting>
+CREATE PUBLICATION sales_publication FOR ALL TABLES IN SCHEMA marketing, sales;

I think the example for publishing all the tables in schemas "marketing" and
"sales" is sufficient, the example for pulishing signal schema seems can be
removed.

Best regards,
Hou zj


Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Mon, Oct 11, 2021 at 5:39 PM vignesh C <vignesh21@gmail.com> wrote:
>
> These comments are fixed in the v38 patch attached.
>

Thanks for the updates.
I noticed that these patches don't apply on the latest source (last
seemed to apply cleanly on HEAD as at about October 6).

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Oct 11, 2021 at 1:21 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Oct 11, 2021 at 5:39 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > These comments are fixed in the v38 patch attached.
> >
>
> Thanks for the updates.
> I noticed that these patches don't apply on the latest source (last
> seemed to apply cleanly on HEAD as at about October 6).

I was not able to apply v37 patches, but I was able to apply the v38
version of patches on top of HEAD.

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Oct 11, 2021 at 12:50 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Monday, October 11, 2021 2:39 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > These comments are fixed in the v38 patch attached.
>
> Thanks for updating the patches.
> Here are a few comments on the v38-0004-Doc patch.
>
> 1.
> +  <para>
> +   Adding/Setting a table that is part of schema specified in
> +   <literal>ALL TABLES IN SCHEMA</literal>, adding/setting a schema to a
> +   publication along with same schema's table specified with
> +   <literal>TABLE</literal>, adding/setting a schema to a publication that
> +   already has a table that is part of specified schema or adding/setting a
> +   table to a publication that already has a table's schema as part of
> +   specified schema is not supported.
>
> ISTM we can remove the description "adding/setting a schema to a publication
> along with same schema's table specified with <literal>TABLE</literal>",
> because it seems the same as the first mentioned case "Adding/Setting a table
> that is part of schema specified in <literal>ALL TABLES IN SCHEMA</literal>"

Modified

> 2.
>
> +</programlisting></para>
> +
> +  <para>
> +   Add some schemas to the publication:
> +<programlisting>
> +ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing_june, sales_june;
> +</programlisting>
> +  </para>
> +
> +  <para>
> +   Add some tables and schemas to the publication:
> ...
> +
> +  <para>
> +   Drop some schemas from the publication:
> ...
> +  <para>
> +   Set some schemas to the publication:
> +<programlisting>
> +ALTER PUBLICATION production_publication SET ALL TABLES IN SCHEMA production_september, production_october;
>
> Personally, I think we don't need the example about DROP and SET here.
> The example of ADD seems sufficient.

Modified

> 3.
> +</programlisting>
> +  </para>
> +
> +  <para>
> +   Create a publication that publishes all changes for all the tables present in
> +   the schema "production":
> +<programlisting>
> +CREATE PUBLICATION production_publication FOR ALL TABLES IN SCHEMA production;
> +</programlisting>
> +  </para>
> ...
> +  <para>
> +   Create a publication that publishes all changes for all the tables present in
> +   the schemas "marketing" and "sales":
> +<programlisting>
> +CREATE PUBLICATION sales_publication FOR ALL TABLES IN SCHEMA marketing, sales;
>
> I think the example for publishing all the tables in schemas "marketing" and
> "sales" is sufficient, the example for pulishing signal schema seems can be
> removed.

Modified

The attached v39 patch has the fixes for the above issues.

Regards,
Vignesh

Attachment

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Monday, October 11, 2021 11:02 PM vignesh C <vignesh21@gmail.com> wrote:
> The attached v39 patch has the fixes for the above issues.

Thanks for the updates.
I have a few minor suggestions about the testcases in the v39-0003-Test patch.

1)
+-- alter publication drop CURRENT_SCHEMA
+ALTER PUBLICATION testpub1_forschema DROP ALL TABLES IN SCHEMA CURRENT_SCHEMA;
+\dRp+ testpub1_forschema

Since we already tested CURRENT_SCHEMA in various CREATE PUBLICATION cases, maybe
we don't need to test it again in SET/DROP/ADD cases.

2)
+-- alter publication set schema
+ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1;
+\dRp+ testpub1_forschema
+
+-- alter publication set multiple schema
+ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1, pub_test2;
+\dRp+ testpub1_forschema
+

I think the multiple schemas testcase is sufficient, maybe we can remove the
single schema case.


3)
+
+-- alter publication set it with the same schema
+ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1, pub_test2;
+\dRp+ testpub1_forschema

ISTM, we didn't have some special code path for this case, maybe we can remove
this testcase.


Best regards,
Hou zj


Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Oct 12, 2021 at 12:24 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Monday, October 11, 2021 11:02 PM vignesh C <vignesh21@gmail.com> wrote:
> > The attached v39 patch has the fixes for the above issues.
>
> Thanks for the updates.
> I have a few minor suggestions about the testcases in the v39-0003-Test patch.
>
> 1)
> +-- alter publication drop CURRENT_SCHEMA
> +ALTER PUBLICATION testpub1_forschema DROP ALL TABLES IN SCHEMA CURRENT_SCHEMA;
> +\dRp+ testpub1_forschema
>
> Since we already tested CURRENT_SCHEMA in various CREATE PUBLICATION cases, maybe
> we don't need to test it again in SET/DROP/ADD cases.

Modified

> 2)
> +-- alter publication set schema
> +ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1;
> +\dRp+ testpub1_forschema
> +
> +-- alter publication set multiple schema
> +ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1, pub_test2;
> +\dRp+ testpub1_forschema
> +
>
> I think the multiple schemas testcase is sufficient, maybe we can remove the
> single schema case.

Modified

> 3)
> +
> +-- alter publication set it with the same schema
> +ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1, pub_test2;
> +\dRp+ testpub1_forschema
>
> ISTM, we didn't have some special code path for this case, maybe we can remove
> this testcase.

Modified

Attached v40 patch has the fix for the above comments.

Regards,
Vignesh

Attachment

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Tuesday, October 12, 2021 9:15 PM vignesh C <vignesh21@gmail.com>
> Attached v40 patch has the fix for the above comments.

Thanks for the update, I have some minor issues about partition related behavior.

1)

Tang tested and discussed this issue with me.
The testcase is:
We publish a schema and there is a partition in the published schema. If
publish_via_partition_root is on and the partition's parent table is not in the
published schema, neither the change on the partition nor the parent table will
not be published.

But if we publish by FOR TABLE partition and set publish_via_partition_root to
on, the change on the partition will be published. So, I think it'd be better to
publish the change on partition for FOR ALL TABLES IN SCHEMA case if its parent table
is not published in the same publication.

It seems we should pass publication oid to the GetSchemaPublicationRelations()
and add some check like the following:

        if (is_publishable_class(relid, relForm) &&
             !(relForm->relispartition && pub_partopt == PUBLICATION_PART_ROOT))
             result = lappend_oid(result, relid);
+        if (relForm->relispartition && pub_partopt == PUBLICATION_PART_ROOT)
+        {
+            bool skip = false;
+            List *ancestors = get_partition_ancestors(relid);
+            List *schemas = GetPublicationSchemas(pubid);
+            ListCell   *lc;
+            foreach(lc, ancestors)
+            {
+                if (list_member_oid(schemas, get_rel_namespace(lfirst_oid(lc))))
+                {
+                    skip = true;
+                    break;
+                }
+            }
+            if (!skip)
+                result = lappend_oid(result, relid);
+        }



2)
+    /*
+     * It is quite possible that some of the partitions are in a different
+     * schema than the parent table, so we need to get such partitions
+     * separately.
+     */
+    scan = table_beginscan_catalog(classRel, keycount, key);
+    while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
+    {
+        Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
+
+        if (is_publishable_class(relForm->oid, relForm))
+        {
+            List       *partitionrels = NIL;
+
+            partitionrels = GetPubPartitionOptionRelations(partitionrels,
+                                                           pub_partopt,
+                                                           relForm->oid);

I think a partitioned table could also be a partition which should not be
appended to the list. I think we should also filter these cases here by same
check in 1).
Thoughts ?

Best regards,
Hou zj

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Wed, Oct 13, 2021 at 12:15 AM vignesh C <vignesh21@gmail.com> wrote:
>
> Attached v40 patch has the fix for the above comments.
>

[Maybe this has some overlap with what Hou-san reported, and I have
not tested this against his proposed fixes]

If partitions belong to a different schema than the parent partitioned
table, then the current patch implementation allows the partitions to
(optionally) be explicitly added to a publication that includes the
parent partitioned table (and for the most part, it doesn't seem to
make any difference to the publication behavior). Should this be
allowed?

e.g.

CREATE SCHEMA sch;
CREATE SCHEMA sch1;
CREATE TABLE sch.sale (sale_date date not null, country_code text,
product_sku text, units integer) PARTITION BY RANGE (sale_date);
CREATE TABLE sch1.sale_201901 PARTITION OF sch.sale FOR VALUES FROM
('2019-01-01') TO ('2019-02-01');
CREATE TABLE sch1.sale_201902 PARTITION OF sch.sale FOR VALUES FROM
('2019-02-01') TO ('2019-03-01');

postgres=# CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch, TABLE
sch1.sale_201901, TABLE sch1.sale_201902;
CREATE PUBLICATION
postgres=# \dRp+
                             Publication pub
 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
-------+------------+---------+---------+---------+-----------+----------
 gregn | f          | t       | t       | t       | t         | f
Tables:
    "sch1.sale_201901"
    "sch1.sale_201902"
Tables from schemas:
    "sch"


Also, I found the following scenario where the data is double-published:

(1) PUB:  CREATE PUBLICATION pub FOR TABLE sch1.sale_201901, TABLE
sch1.sale_201902 WITH (publish_via_partition_root=true);
(2) SUB:  CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres
host=localhost port=5432' PUBLICATION pub;
(3) PUB:  INSERT INTO sch.sale VALUES('2019-01-01', 'AU', 'cpu', 5),
('2019-01-02', 'AU', 'disk', 8);
(4) SUB:  SELECT * FROM sch.sale;
(5) PUB:  ALTER PUBLICATION pub ADD ALL TABLES IN SCHEMA sch;
(6) SUB:  ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
(7) SUB:  SELECT * FROM sch.sale;

sale_date  | country_code | product_sku | units
------------+--------------+-------------+-------
 2019-01-01 | AU           | cpu         |     5
 2019-01-02 | AU           | disk        |     8
 2019-01-01 | AU           | cpu         |     5
 2019-01-02 | AU           | disk        |     8


Regards,
Greg Nancarrow
Fujitsu Australia



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Wednesday, October 13, 2021 4:10 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> Also, I found the following scenario where the data is double-published:
> 
> (1) PUB:  CREATE PUBLICATION pub FOR TABLE sch1.sale_201901, TABLE
> sch1.sale_201902 WITH (publish_via_partition_root=true);
> (2) SUB:  CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres
> host=localhost port=5432' PUBLICATION pub;
> (3) PUB:  INSERT INTO sch.sale VALUES('2019-01-01', 'AU', 'cpu', 5),
> ('2019-01-02', 'AU', 'disk', 8);
> (4) SUB:  SELECT * FROM sch.sale;
> (5) PUB:  ALTER PUBLICATION pub ADD ALL TABLES IN SCHEMA sch;
> (6) SUB:  ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
> (7) SUB:  SELECT * FROM sch.sale;
> 
> sale_date  | country_code | product_sku | units
> ------------+--------------+-------------+-------
>  2019-01-01 | AU           | cpu         |     5
>  2019-01-02 | AU           | disk        |     8
>  2019-01-01 | AU           | cpu         |     5
>  2019-01-02 | AU           | disk        |     8
> 
> 

I changed your test step in (5) and used "ADD TABLE" command as below:
ALTER PUBLICATION pub ADD TABLE sch.sale;

I could get the same result on HEAD.
So I think it's not a problem related to this patch.
Maybe we can post this issue in a new thread.

Regards
Tang

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Oct 13, 2021 at 1:40 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Wed, Oct 13, 2021 at 12:15 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v40 patch has the fix for the above comments.
> >
>
> [Maybe this has some overlap with what Hou-san reported, and I have
> not tested this against his proposed fixes]
>
> If partitions belong to a different schema than the parent partitioned
> table, then the current patch implementation allows the partitions to
> (optionally) be explicitly added to a publication that includes the
> parent partitioned table (and for the most part, it doesn't seem to
> make any difference to the publication behavior). Should this be
> allowed?
>
> e.g.
>
> CREATE SCHEMA sch;
> CREATE SCHEMA sch1;
> CREATE TABLE sch.sale (sale_date date not null, country_code text,
> product_sku text, units integer) PARTITION BY RANGE (sale_date);
> CREATE TABLE sch1.sale_201901 PARTITION OF sch.sale FOR VALUES FROM
> ('2019-01-01') TO ('2019-02-01');
> CREATE TABLE sch1.sale_201902 PARTITION OF sch.sale FOR VALUES FROM
> ('2019-02-01') TO ('2019-03-01');
>
> postgres=# CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch, TABLE
> sch1.sale_201901, TABLE sch1.sale_201902;
> CREATE PUBLICATION
> postgres=# \dRp+
>                              Publication pub
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> -------+------------+---------+---------+---------+-----------+----------
>  gregn | f          | t       | t       | t       | t         | f
> Tables:
>     "sch1.sale_201901"
>     "sch1.sale_201902"
> Tables from schemas:
>     "sch"
>

I don't see any problem with this. Do you have a specific problem in
mind due to this?

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Thu, Oct 14, 2021 at 9:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > If partitions belong to a different schema than the parent partitioned
> > table, then the current patch implementation allows the partitions to
> > (optionally) be explicitly added to a publication that includes the
> > parent partitioned table (and for the most part, it doesn't seem to
> > make any difference to the publication behavior). Should this be
> > allowed?
> >
> > e.g.
> >
> > CREATE SCHEMA sch;
> > CREATE SCHEMA sch1;
> > CREATE TABLE sch.sale (sale_date date not null, country_code text,
> > product_sku text, units integer) PARTITION BY RANGE (sale_date);
> > CREATE TABLE sch1.sale_201901 PARTITION OF sch.sale FOR VALUES FROM
> > ('2019-01-01') TO ('2019-02-01');
> > CREATE TABLE sch1.sale_201902 PARTITION OF sch.sale FOR VALUES FROM
> > ('2019-02-01') TO ('2019-03-01');
> >
> > postgres=# CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch, TABLE
> > sch1.sale_201901, TABLE sch1.sale_201902;
> > CREATE PUBLICATION
> > postgres=# \dRp+
> >                              Publication pub
> >  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> > -------+------------+---------+---------+---------+-----------+----------
> >  gregn | f          | t       | t       | t       | t         | f
> > Tables:
> >     "sch1.sale_201901"
> >     "sch1.sale_201902"
> > Tables from schemas:
> >     "sch"
> >
>
> I don't see any problem with this. Do you have a specific problem in
> mind due to this?
>

I'm not sure if it's a problem as such, really just a query from me as
to whether it should be allowed to also (redundantly) add partitions
to the publication, in addition to the partitioned table, since the
current documentation says: "When a partitioned table is added to a
publication, all of its existing and future partitions are implicitly
considered to be part of the publication".
I guess it should be allowed, as I find I can do it in the current
implementation just with TABLE.

Regards,
Greg Nancarrow
Fujitsu Australia



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Wednesday, October 13, 2021 10:49 AM Hou, Zhijie wrote:
> On Tuesday, October 12, 2021 9:15 PM vignesh C <vignesh21@gmail.com>
> > Attached v40 patch has the fix for the above comments.
> 
> Thanks for the update, I have some minor issues about partition related
> behavior.
> 
> 1)
> 
> Tang tested and discussed this issue with me.
> The testcase is:
> We publish a schema and there is a partition in the published schema. If
> publish_via_partition_root is on and the partition's parent table is not in the
> published schema, neither the change on the partition nor the parent table will
> not be published.
> 
> But if we publish by FOR TABLE partition and set publish_via_partition_root to
> on, the change on the partition will be published. So, I think it'd be better to
> publish the change on partition for FOR ALL TABLES IN SCHEMA case if its parent
> table
> is not published in the same publication.
> 
> It seems we should pass publication oid to the GetSchemaPublicationRelations()
> and add some check like the following:
> 2)
> I think a partitioned table could also be a partition which should not be
> appended to the list. I think we should also filter these cases here by same
> check in 1).

After some offline discussion with Vignesh and Amit.
I found my proposed fix can be improved because it brings some overhead to
functions which could be invoked many times for the same publication. Now, I
think it'd be better to add this check in GetAllSchemaPublicationRelations and
GetPublicationRelations.

Besides, I found we misunderstood the flag PUBLICATION_PART_ROOT it means:
"ROOT: only the table explicitly mentioned in the publication" We cannot use it
as a flag to judge whether do the partition filtering, I think we need to pass
the actual pubviaroot flag.

Based on the V40 patchset, attaching the Top-up patch which try to fix the
partition issue in a cleaner way.

Best regards,
Hou zj


Attachment

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Fri, Oct 15, 2021 at 6:45 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Thu, Oct 14, 2021 at 9:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > If partitions belong to a different schema than the parent partitioned
> > > table, then the current patch implementation allows the partitions to
> > > (optionally) be explicitly added to a publication that includes the
> > > parent partitioned table (and for the most part, it doesn't seem to
> > > make any difference to the publication behavior). Should this be
> > > allowed?
> > >
> > > e.g.
> > >
> > > CREATE SCHEMA sch;
> > > CREATE SCHEMA sch1;
> > > CREATE TABLE sch.sale (sale_date date not null, country_code text,
> > > product_sku text, units integer) PARTITION BY RANGE (sale_date);
> > > CREATE TABLE sch1.sale_201901 PARTITION OF sch.sale FOR VALUES FROM
> > > ('2019-01-01') TO ('2019-02-01');
> > > CREATE TABLE sch1.sale_201902 PARTITION OF sch.sale FOR VALUES FROM
> > > ('2019-02-01') TO ('2019-03-01');
> > >
> > > postgres=# CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch, TABLE
> > > sch1.sale_201901, TABLE sch1.sale_201902;
> > > CREATE PUBLICATION
> > > postgres=# \dRp+
> > >                              Publication pub
> > >  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> > > -------+------------+---------+---------+---------+-----------+----------
> > >  gregn | f          | t       | t       | t       | t         | f
> > > Tables:
> > >     "sch1.sale_201901"
> > >     "sch1.sale_201902"
> > > Tables from schemas:
> > >     "sch"
> > >
> >
> > I don't see any problem with this. Do you have a specific problem in
> > mind due to this?
> >
>
> I'm not sure if it's a problem as such, really just a query from me as
> to whether it should be allowed to also (redundantly) add partitions
> to the publication, in addition to the partitioned table, since the
> current documentation says: "When a partitioned table is added to a
> publication, all of its existing and future partitions are implicitly
> considered to be part of the publication".
> I guess it should be allowed, as I find I can do it in the current
> implementation just with TABLE.
>

I have also checked the "For Table" case and it behaves similar to
what the patch has for schema. So, I think it is better to retain the
current behavior of patch.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Sat, Oct 16, 2021 at 4:57 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> Besides, I found we misunderstood the flag PUBLICATION_PART_ROOT it means:
> "ROOT: only the table explicitly mentioned in the publication" We cannot use it
> as a flag to judge whether do the partition filtering, I think we need to pass
> the actual pubviaroot flag.
>

I agree, PUBLICATION_PART_ROOT can't be used to determine whether to
do partition filtering, the "pubviaroot" flag is needed.

> Based on the V40 patchset, attaching the Top-up patch which try to fix the
> partition issue in a cleaner way.
>

A minor thing, in your "top-up patch", the test code added to
publication.sql, you need to remove the last "DROP TABLE
sch2.tbl1_part1;". It causes an error because the table doesn't exist
and it seems to have been erroneously copied from the previous test
case.

Regards,
Greg Nancarrow
Fujitsu Australia



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Saturday, October 16, 2021 1:57 PM houzj.fnst@fujitsu.com wrote:
> Based on the V40 patchset, attaching the Top-up patch which try to fix the
> partition issue in a cleaner way.

Attach the new version patch set which merge the partition fix into it.
Besides, instead of introducing new function and parameter, just add the
partition filter in pg_get_publication_tables which makes the code cleaner.

Only 0001 and 0003 was changed.

Best regards,
Hou zj


Attachment

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Monday, October 18, 2021 12:04 PM Greg Nancarrow wrote:
> On Sat, Oct 16, 2021 at 4:57 PM houzj.fnst@fujitsu.com wrote:
> > Based on the V40 patchset, attaching the Top-up patch which try to fix
> > the partition issue in a cleaner way.
> >
> 
> A minor thing, in your "top-up patch", the test code added to publication.sql,
> you need to remove the last "DROP TABLE sch2.tbl1_part1;". It causes an error
> because the table doesn't exist and it seems to have been erroneously copied
> from the previous test case.

Thanks for the comment.
I have removed the last "DROP TABLE sch2.tbl1_part1;" in V41 patch set.

Best regards,
Hou zj

Re: Added schema level support for publication.

From
Masahiko Sawada
Date:
Hi,

On Mon, Oct 18, 2021 at 3:14 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Saturday, October 16, 2021 1:57 PM houzj.fnst@fujitsu.com wrote:
> > Based on the V40 patchset, attaching the Top-up patch which try to fix the
> > partition issue in a cleaner way.
>
> Attach the new version patch set which merge the partition fix into it.
> Besides, instead of introducing new function and parameter, just add the
> partition filter in pg_get_publication_tables which makes the code cleaner.
>
> Only 0001 and 0003 was changed.

I've reviewed 0001 and 0002 patch and here are comments:

0001 patch:

+/*
+ * Get the list of publishable relation oids for a specified schema.
+ *
+ * Schema will be having both ordinary('r') relkind tables and partitioned('p')
+ * relkind tables, so two rounds of scan are required.
+ */
+List *
+GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt)
+{
+        Relation       classRel;
+        ScanKeyData key[3];
+        TableScanDesc scan;

I think it's enough to have key[2], not key[3].

BTW, this function does the table scan on pg_class twice in order to
get OIDs of both normal tables and partitioned tables. But can't we do
that by the single table scan? I think we can set a scan key for
relnamespace, and check relkind inside a scan loop.

---
+                ObjectsInPublicationToOids(stmt->pubobjects, pstate,
&relations,
+
&schemaidlist);
+
+                if (list_length(relations) > 0)
+                {
+                        List      *rels;
+
+                        rels = OpenTableList(relations);
+                        CheckObjSchemaNotAlreadyInPublication(rels,
schemaidlist,
+
PUBLICATIONOBJ_TABLE);
+                        PublicationAddTables(puboid, rels, true, NULL);
+                        CloseTableList(rels);
+                }
+
+                if (list_length(schemaidlist) > 0)
+                {
+                        /* FOR ALL TABLES IN SCHEMA requires superuser */
+                        if (!superuser())
+                                ereport(ERROR,
+
errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+                                                errmsg("must be
superuser to create FOR ALL TABLES IN SCHEMA publication"));
+

Perhaps we can do a superuser check before handling "relations"? If
the user doesn't have the permission, we don't need to do anything for
relations.

0002 patch:

postgres(1:13619)=# create publication pub for all TABLES in schema
CURRENT_SCHEMA      pg_catalog          public              s2
information_schema  pg_toast            s1

Since pg_catalog and pg_toast cannot be added to the schema
publication can we exclude them from the completion list?

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Oct 18, 2021 at 2:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Hi,
>
> On Mon, Oct 18, 2021 at 3:14 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Saturday, October 16, 2021 1:57 PM houzj.fnst@fujitsu.com wrote:
> > > Based on the V40 patchset, attaching the Top-up patch which try to fix the
> > > partition issue in a cleaner way.
> >
> > Attach the new version patch set which merge the partition fix into it.
> > Besides, instead of introducing new function and parameter, just add the
> > partition filter in pg_get_publication_tables which makes the code cleaner.
> >
> > Only 0001 and 0003 was changed.
>
> I've reviewed 0001 and 0002 patch and here are comments:
>
> 0001 patch:
>
> +/*
> + * Get the list of publishable relation oids for a specified schema.
> + *
> + * Schema will be having both ordinary('r') relkind tables and partitioned('p')
> + * relkind tables, so two rounds of scan are required.
> + */
> +List *
> +GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt)
> +{
> +        Relation       classRel;
> +        ScanKeyData key[3];
> +        TableScanDesc scan;
>
> I think it's enough to have key[2], not key[3].

Modified

> BTW, this function does the table scan on pg_class twice in order to
> get OIDs of both normal tables and partitioned tables. But can't we do
> that by the single table scan? I think we can set a scan key for
> relnamespace, and check relkind inside a scan loop.

Modified

> ---
> +                ObjectsInPublicationToOids(stmt->pubobjects, pstate,
> &relations,
> +
> &schemaidlist);
> +
> +                if (list_length(relations) > 0)
> +                {
> +                        List      *rels;
> +
> +                        rels = OpenTableList(relations);
> +                        CheckObjSchemaNotAlreadyInPublication(rels,
> schemaidlist,
> +
> PUBLICATIONOBJ_TABLE);
> +                        PublicationAddTables(puboid, rels, true, NULL);
> +                        CloseTableList(rels);
> +                }
> +
> +                if (list_length(schemaidlist) > 0)
> +                {
> +                        /* FOR ALL TABLES IN SCHEMA requires superuser */
> +                        if (!superuser())
> +                                ereport(ERROR,
> +
> errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +                                                errmsg("must be
> superuser to create FOR ALL TABLES IN SCHEMA publication"));
> +
>
> Perhaps we can do a superuser check before handling "relations"? If
> the user doesn't have the permission, we don't need to do anything for
> relations.

Modified

> 0002 patch:
>
> postgres(1:13619)=# create publication pub for all TABLES in schema
> CURRENT_SCHEMA      pg_catalog          public              s2
> information_schema  pg_toast            s1
>
> Since pg_catalog and pg_toast cannot be added to the schema
> publication can we exclude them from the completion list?

Modified

Thanks for the comments, the attached v42 patch has the fixes for the same.

Regards,
Vignesh

Attachment

RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Monday, October 18, 2021 8:23 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> Thanks for the comments, the attached v42 patch has the fixes for the same.

Thanks for your new patch.

I tried your patch and found that the permission check for superuser didn't work.

For example:
postgres=# create role r1;
CREATE ROLE
postgres=# grant all privileges on database postgres to r1;
GRANT
postgres=# set role r1;
SET
postgres=> create schema s1;
CREATE SCHEMA
postgres=> create publication pub for all tables in schema s1;
CREATE PUBLICATION

Role r1 is not superuser, but this role could create publication for all tables in schema
successfully, I think it is related the following change. List schemaidlist was
not assigned yet. I think we should check it later.

@@ -165,6 +265,12 @@ CreatePublication(ParseState *pstate, CreatePublicationStmt *stmt)
                 (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
                  errmsg("must be superuser to create FOR ALL TABLES publication")));
 
+    /* FOR ALL TABLES IN SCHEMA requires superuser */
+    if (list_length(schemaidlist) > 0 && !superuser())
+        ereport(ERROR,
+                errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+                errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication"));
+
     rel = table_open(PublicationRelationId, RowExclusiveLock);
 
     /* Check if name is used */

Regards
Tang

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Tue, Oct 19, 2021 at 9:15 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Monday, October 18, 2021 8:23 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comments, the attached v42 patch has the fixes for the same.
>
> Thanks for your new patch.
>
> I tried your patch and found that the permission check for superuser didn't work.
>
> For example:
> postgres=# create role r1;
> CREATE ROLE
> postgres=# grant all privileges on database postgres to r1;
> GRANT
> postgres=# set role r1;
> SET
> postgres=> create schema s1;
> CREATE SCHEMA
> postgres=> create publication pub for all tables in schema s1;
> CREATE PUBLICATION
>
> Role r1 is not superuser, but this role could create publication for all tables in schema
> successfully, I think it is related the following change. List schemaidlist was
> not assigned yet. I think we should check it later.
>

It seems this got broken in yesterday's patch version. Do you think it
is a good idea to add a test for this case?

-- 
With Regards,
Amit Kapila.



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Tuesday, October 19, 2021 12:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> On Tue, Oct 19, 2021 at 9:15 AM tanghy.fnst@fujitsu.com
> <tanghy.fnst@fujitsu.com> wrote:
> >
> > On Monday, October 18, 2021 8:23 PM vignesh C <vignesh21@gmail.com>
> wrote:
> > >
> > > Thanks for the comments, the attached v42 patch has the fixes for the same.
> >
> > Thanks for your new patch.
> >
> > I tried your patch and found that the permission check for superuser didn't work.
> >
> > For example:
> > postgres=# create role r1;
> > CREATE ROLE
> > postgres=# grant all privileges on database postgres to r1;
> > GRANT
> > postgres=# set role r1;
> > SET
> > postgres=> create schema s1;
> > CREATE SCHEMA
> > postgres=> create publication pub for all tables in schema s1;
> > CREATE PUBLICATION
> >
> > Role r1 is not superuser, but this role could create publication for all tables in
> schema
> > successfully, I think it is related the following change. List schemaidlist was
> > not assigned yet. I think we should check it later.
> >
> 
> It seems this got broken in yesterday's patch version. Do you think it
> is a good idea to add a test for this case?
> 

Agreed. Thanks for your suggestion.

I tried to add this test to publication.sql, a patch diff file for this test is attached.

Regards
Tang

Attachment

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Mon, Oct 18, 2021 at 5:53 PM vignesh C <vignesh21@gmail.com> wrote:
>

Few comments on latest set of patches:
===============================
1.
+/*
+ * Filter out the partitions whose parent tables was also specified in
+ * the publication.
+ */
+static List *
+filter_out_partitions(List *relids)

Can we name this function as filter_partitions()?

2.
+ /*
+ * If the publication publishes partition changes via their
+ * respective root partitioned tables, we must exclude partitions
+ * in favor of including the root partitioned tables. Otherwise,
+ * the function could return both the child and parent tables which
+ * could cause the data of child table double-published in
+ * subscriber side.
+ */

Let's slightly change the last part of the line in the above comment
as: "... which could cause data of the child table to be
double-published on the subscriber side."

3.
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
..
..
@@ -38,7 +40,6 @@
 #include "utils/builtins.h"
 #include "utils/catcache.h"
 #include "utils/fmgroids.h"
-#include "utils/inval.h"
 #include "utils/lsyscache.h"

Does this change belong to this patch? If not, maybe you can submit a
separate patch for this. A similar change is present in
publicationcmds.c as well, not sure if that is required as well.

4.
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
...
...
+#include "nodes/makefuncs.h"

Do we need to include this file? I am able to compile without
including this file.

v42-0003-Add-tests-for-the-schema-publication-feature-of-
5.
+-- pg_publication_tables
+SET client_min_messages = 'ERROR';
+CREATE SCHEMA sch1;
+CREATE SCHEMA sch2;
+CREATE TABLE sch1.tbl1 (a int) PARTITION BY RANGE(a);
+CREATE TABLE sch2.tbl1_part1 PARTITION OF sch1.tbl1 FOR VALUES FROM
(1) to (10);
+CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch2 WITH
(PUBLISH_VIA_PARTITION_ROOT=1);
+SELECT * FROM pg_publication_tables;
+
+DROP PUBLICATION pub;
+CREATE PUBLICATION pub FOR TABLE sch2.tbl1_part1 WITH
(PUBLISH_VIA_PARTITION_ROOT=1);
+SELECT * FROM pg_publication_tables;

Can we expand the above comment on the lines of: "Test the list of
partitions published"?

v42-0004-Add-documentation-for-the-schema-publication-fea
6.
+     <row>
+      <entry><link
linkend="catalog-pg-publication-namespace"><structname>pg_publication_namespace</structname></link></entry>
+      <entry>schema to publication mapping</entry>
+     </row>
+
      <row>
       <entry><link
linkend="catalog-pg-publication-rel"><structname>pg_publication_rel</structname></link></entry>
       <entry>relation to publication mapping</entry>
@@ -6238,6 +6243,67 @@ SCRAM-SHA-256$<replaceable><iteration
count></replaceable>:<replaceable>&l
   </table>
  </sect1>

+ <sect1 id="catalog-pg-publication-namespace">
+  <title><structname>pg_publication_namespace</structname></title>

At one place, the new catalog is placed after pg_publication_rel and
at another place, it is before it. Shouldn't it be before in both
places as we have a place as per naming order?

7.
The
+   <literal>ADD</literal> clause will add one or more tables/schemas to the
+   publication. The <literal>DROP</literal> clauses will remove one or more
+   tables/schemas from the publication.

Isn't it better to write the above as one line: "The
<literal>ADD</literal> and <literal>DROP</literal> clauses will add
and remove one or more tables/schemas from the publication."?

8.
+  <para>
+   Add some schemas to the publication:
+<programlisting>
+ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA
marketing_june, sales_june;
+</programlisting>
+  </para>

Can we change schema names to just marketing and sales? Also, let's
change the description as:"Add schemas
<structname>marketing</structname> and <structname>sales</structname>
to the publication <structname>sales_publication</structname>?

9.
+    [ FOR ALL TABLES
+      | FOR <replaceable class="parameter">publication
object</replaceable> [, ... ] ]
     [ WITH ( <replaceable
class="parameter">publication_parameter</replaceable> [= <replaceable
class="parameter">value</replaceable>] [, ... ] ) ]
+
+<phrase>where <replaceable class="parameter">publication
object</replaceable> is one of:</phrase>

Similar to Alter Publication, here also we should use
publication_object instead of publication object.

10.
+  <para>
+   Create a publication that publishes all changes for tables "users" and
+   "departments" and that publishes all changes for all the tables present in
+   the schema "production":
+<programlisting>
+CREATE PUBLICATION production_publication FOR TABLE users,
departments, ALL TABLES IN SCHEMA production;
+</programlisting>
+  </para>
+
+  <para>
+   Create a publication that publishes all changes for all the tables
present in
+   the schemas "marketing" and "sales":

It is better to use <structname> before and </structname> after schema
names in above descriptions.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Oct 19, 2021 at 9:15 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Monday, October 18, 2021 8:23 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comments, the attached v42 patch has the fixes for the same.
>
> Thanks for your new patch.
>
> I tried your patch and found that the permission check for superuser didn't work.
>
> For example:
> postgres=# create role r1;
> CREATE ROLE
> postgres=# grant all privileges on database postgres to r1;
> GRANT
> postgres=# set role r1;
> SET
> postgres=> create schema s1;
> CREATE SCHEMA
> postgres=> create publication pub for all tables in schema s1;
> CREATE PUBLICATION
>
> Role r1 is not superuser, but this role could create publication for all tables in schema
> successfully, I think it is related the following change. List schemaidlist was
> not assigned yet. I think we should check it later.
>
> @@ -165,6 +265,12 @@ CreatePublication(ParseState *pstate, CreatePublicationStmt *stmt)
>                                 (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
>                                  errmsg("must be superuser to create FOR ALL TABLES publication")));
>
> +       /* FOR ALL TABLES IN SCHEMA requires superuser */
> +       if (list_length(schemaidlist) > 0 && !superuser())
> +               ereport(ERROR,
> +                               errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +                               errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication"));
> +
>         rel = table_open(PublicationRelationId, RowExclusiveLock);
>
>         /* Check if name is used */

This issue got induced in the v42 version, attached v43 patch has the
fixes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Oct 19, 2021 at 11:23 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Tuesday, October 19, 2021 12:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Oct 19, 2021 at 9:15 AM tanghy.fnst@fujitsu.com
> > <tanghy.fnst@fujitsu.com> wrote:
> > >
> > > On Monday, October 18, 2021 8:23 PM vignesh C <vignesh21@gmail.com>
> > wrote:
> > > >
> > > > Thanks for the comments, the attached v42 patch has the fixes for the same.
> > >
> > > Thanks for your new patch.
> > >
> > > I tried your patch and found that the permission check for superuser didn't work.
> > >
> > > For example:
> > > postgres=# create role r1;
> > > CREATE ROLE
> > > postgres=# grant all privileges on database postgres to r1;
> > > GRANT
> > > postgres=# set role r1;
> > > SET
> > > postgres=> create schema s1;
> > > CREATE SCHEMA
> > > postgres=> create publication pub for all tables in schema s1;
> > > CREATE PUBLICATION
> > >
> > > Role r1 is not superuser, but this role could create publication for all tables in
> > schema
> > > successfully, I think it is related the following change. List schemaidlist was
> > > not assigned yet. I think we should check it later.
> > >
> >
> > It seems this got broken in yesterday's patch version. Do you think it
> > is a good idea to add a test for this case?
> >
>
> Agreed. Thanks for your suggestion.
>
> I tried to add this test to publication.sql, a patch diff file for this test is attached.

Thanks for the test case, I have merged it to the v43 version patch
attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm2pJ49wAv%3DgEZrAP5%3D_apAzv_rgK3zjX-wfwCY%2BWWfT9w%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Oct 19, 2021 at 4:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Oct 18, 2021 at 5:53 PM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> Few comments on latest set of patches:
> ===============================
> 1.
> +/*
> + * Filter out the partitions whose parent tables was also specified in
> + * the publication.
> + */
> +static List *
> +filter_out_partitions(List *relids)
>
> Can we name this function as filter_partitions()?

Modified

> 2.
> + /*
> + * If the publication publishes partition changes via their
> + * respective root partitioned tables, we must exclude partitions
> + * in favor of including the root partitioned tables. Otherwise,
> + * the function could return both the child and parent tables which
> + * could cause the data of child table double-published in
> + * subscriber side.
> + */
>
> Let's slightly change the last part of the line in the above comment
> as: "... which could cause data of the child table to be
> double-published on the subscriber side."

Modified

> 3.
> --- a/src/backend/catalog/pg_publication.c
> +++ b/src/backend/catalog/pg_publication.c
> ..
> ..
> @@ -38,7 +40,6 @@
>  #include "utils/builtins.h"
>  #include "utils/catcache.h"
>  #include "utils/fmgroids.h"
> -#include "utils/inval.h"
>  #include "utils/lsyscache.h"
>
> Does this change belong to this patch? If not, maybe you can submit a
> separate patch for this. A similar change is present in
> publicationcmds.c as well, not sure if that is required as well.

I have removed these changes from this patch, I will post a patch for
this separately later.

> 4.
> --- a/src/backend/commands/publicationcmds.c
> +++ b/src/backend/commands/publicationcmds.c
> ...
> ...
> +#include "nodes/makefuncs.h"
>
> Do we need to include this file? I am able to compile without
> including this file.

Modified

> v42-0003-Add-tests-for-the-schema-publication-feature-of-
> 5.
> +-- pg_publication_tables
> +SET client_min_messages = 'ERROR';
> +CREATE SCHEMA sch1;
> +CREATE SCHEMA sch2;
> +CREATE TABLE sch1.tbl1 (a int) PARTITION BY RANGE(a);
> +CREATE TABLE sch2.tbl1_part1 PARTITION OF sch1.tbl1 FOR VALUES FROM
> (1) to (10);
> +CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch2 WITH
> (PUBLISH_VIA_PARTITION_ROOT=1);
> +SELECT * FROM pg_publication_tables;
> +
> +DROP PUBLICATION pub;
> +CREATE PUBLICATION pub FOR TABLE sch2.tbl1_part1 WITH
> (PUBLISH_VIA_PARTITION_ROOT=1);
> +SELECT * FROM pg_publication_tables;
>
> Can we expand the above comment on the lines of: "Test the list of
> partitions published"?

Modified

> v42-0004-Add-documentation-for-the-schema-publication-fea
> 6.
> +     <row>
> +      <entry><link
> linkend="catalog-pg-publication-namespace"><structname>pg_publication_namespace</structname></link></entry>
> +      <entry>schema to publication mapping</entry>
> +     </row>
> +
>       <row>
>        <entry><link
> linkend="catalog-pg-publication-rel"><structname>pg_publication_rel</structname></link></entry>
>        <entry>relation to publication mapping</entry>
> @@ -6238,6 +6243,67 @@ SCRAM-SHA-256$<replaceable><iteration
> count></replaceable>:<replaceable>&l
>    </table>
>   </sect1>
>
> + <sect1 id="catalog-pg-publication-namespace">
> +  <title><structname>pg_publication_namespace</structname></title>
>
> At one place, the new catalog is placed after pg_publication_rel and
> at another place, it is before it. Shouldn't it be before in both
> places as we have a place as per naming order?

Modified

> 7.
> The
> +   <literal>ADD</literal> clause will add one or more tables/schemas to the
> +   publication. The <literal>DROP</literal> clauses will remove one or more
> +   tables/schemas from the publication.
>
> Isn't it better to write the above as one line: "The
> <literal>ADD</literal> and <literal>DROP</literal> clauses will add
> and remove one or more tables/schemas from the publication."?

Modified

> 8.
> +  <para>
> +   Add some schemas to the publication:
> +<programlisting>
> +ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA
> marketing_june, sales_june;
> +</programlisting>
> +  </para>
>
> Can we change schema names to just marketing and sales? Also, let's
> change the description as:"Add schemas
> <structname>marketing</structname> and <structname>sales</structname>
> to the publication <structname>sales_publication</structname>?

Modified

> 9.
> +    [ FOR ALL TABLES
> +      | FOR <replaceable class="parameter">publication
> object</replaceable> [, ... ] ]
>      [ WITH ( <replaceable
> class="parameter">publication_parameter</replaceable> [= <replaceable
> class="parameter">value</replaceable>] [, ... ] ) ]
> +
> +<phrase>where <replaceable class="parameter">publication
> object</replaceable> is one of:</phrase>
>
> Similar to Alter Publication, here also we should use
> publication_object instead of publication object.

Modified

> 10.
> +  <para>
> +   Create a publication that publishes all changes for tables "users" and
> +   "departments" and that publishes all changes for all the tables present in
> +   the schema "production":
> +<programlisting>
> +CREATE PUBLICATION production_publication FOR TABLE users,
> departments, ALL TABLES IN SCHEMA production;
> +</programlisting>
> +  </para>
> +
> +  <para>
> +   Create a publication that publishes all changes for all the tables
> present in
> +   the schemas "marketing" and "sales":
>
> It is better to use <structname> before and </structname> after schema
> names in above descriptions.

Modified

Attached v43 patch has the fixes for the same.

Regards,
Vignesh

Attachment

RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Tuesday, October 19, 2021 11:42 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> This issue got induced in the v42 version, attached v43 patch has the
> fixes for the same.
> 

Thanks for your new patch. I confirmed that this issue has be fixed.

All regression tests passed. 
I also tested V43 in some other scenarios and found no issue.
So the v43 patch LGTM.

Regards
Tang

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Tue, Oct 19, 2021 at 9:42 PM vignesh C <vignesh21@gmail.com> wrote:
>

Thanks for the modified patch. I have a few more comments and suggestions:

As the thread [1] is still not concluded, I suggest we fix the
duplicate data case only when schemas are involved by slightly
tweaking the code as per attached. This is just to give you an idea
about what I have in mind, if you find a better solution then feel
free to let me know.

Few additional minor comments:
1.
+-- Test the list of partitions published

Shall we change the above comment as: "Test the list of partitions
published with or without 'PUBLISH_VIA_PARTITION_ROOT' parameter"?

2. psql documentation for \dRp[+] needs to be modified.

\dRp
Before:
" .. If + is appended to the command name, the tables associated with
each publication are shown as well."
After:
" .. If + is appended to the command name, the tables and schemas
associated with each publication are shown as well.

Apart from the above, I think we should merge the first four patches
as there doesn't seem to be any big problems pending. We can keep
still keep tests added by 025_rep_changes_for_schema.pl as a separate
patch as there might be some timing-dependent tests in that file.

[1] -
https://www.postgresql.org/message-id/OS0PR01MB57167F45D481F78CDC5986F794B99%40OS0PR01MB5716.jpnprd01.prod.outlook.com

-- 
With Regards,
Amit Kapila.

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Oct 20, 2021 at 12:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 19, 2021 at 9:42 PM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> Thanks for the modified patch. I have a few more comments and suggestions:
>
> As the thread [1] is still not concluded, I suggest we fix the
> duplicate data case only when schemas are involved by slightly
> tweaking the code as per attached. This is just to give you an idea
> about what I have in mind, if you find a better solution then feel
> free to let me know.
>
> Few additional minor comments:
> 1.
> +-- Test the list of partitions published
>
> Shall we change the above comment as: "Test the list of partitions
> published with or without 'PUBLISH_VIA_PARTITION_ROOT' parameter"?

Modified

> 2. psql documentation for \dRp[+] needs to be modified.
>
> \dRp
> Before:
> " .. If + is appended to the command name, the tables associated with
> each publication are shown as well."
> After:
> " .. If + is appended to the command name, the tables and schemas
> associated with each publication are shown as well.

Modified

> Apart from the above, I think we should merge the first four patches
> as there doesn't seem to be any big problems pending. We can keep
> still keep tests added by 025_rep_changes_for_schema.pl as a separate
> patch as there might be some timing-dependent tests in that file.

This version of patch retains the changes related to
PublicationRelInfo, I will handle the merging of the patches in the
next version so that this version of patch change related to
PublicationRelInfo can be reviewed easily.

Attached v44 patch as the fixes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Thu, Oct 21, 2021 at 3:25 AM vignesh C <vignesh21@gmail.com> wrote:
>
> Attached v44 patch as the fixes for the same.
>

Regarding the documentation, I think some minor updates are needed in:
 doc/src/sgml/logical-replication.sgml.
For example, currently it says:
   Publications may currently only contain tables. Objects must be
added explicitly, except when a publication is created for
<literal>ALL TABLES</literal>.
There is also some security-related information in this file which may
need updating for ALL TABLES IN SCHEMA.

Also, I'm not sure the documentation updates in the patches clearly
define how partitions relate to ALL TABLES IN SCHEMA.
For example, if a partitioned table belongs to a different schema to
that of a child partition that belongs to a schema specified for ALL
TABLES IN SCHEMA, is it implicitly included in the publication or not
included?

Regards,
Greg Nancarrow
Fujitsu Australia



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Thurs, Oct 21, 2021 12:25 AM vignesh C <vignesh21@gmail.com> wrote:
> This version of patch retains the changes related to PublicationRelInfo, I will
> handle the merging of the patches in the next version so that this version of
> patch change related to PublicationRelInfo can be reviewed easily.

Thanks for the patches,
I think the change related to PublicationRelInfo looks good.

Best regards,
Hou zj

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Thu, Oct 21, 2021 at 3:25 AM vignesh C <vignesh21@gmail.com> wrote:
>
> Attached v44 patch as the fixes for the same.
>

In the v44-0001 patch, I have some doubts about the condition guarding
the following code in pg_get_publication_tables():

+ if (schemarelids)
+ {
+    /*
+     * If the publication publishes partition changes via their
+     * respective root partitioned tables, we must exclude
+     * partitions in favor of including the root partitioned
+     * tables. Otherwise, the function could return both the child
+     * and parent tables which could cause data of the child table
+     * to be double-published on the subscriber side.
+     *
+     * XXX As of now, we do this when a publication has associated
+     * schema or for all tables publication. See
+     * GetAllTablesPublicationRelations().
+     */
+    tables = list_concat_unique_oid(relids, schemarelids);
+    if (publication->pubviaroot)
+       tables = filter_partitions(tables);
+ }

Shouldn't a partition be filtered out only if publication->pubviaroot
and the partition belongs to a schema (i.e. ALL TABLES IN SCHEMA)
included in the publication?
The current code seems to filter out partitions of partitioned tables
included in the publication if ANY schemas are included as part of the
publication (e.g. which could be a schema not including any
partitioned tables or partitions).

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Oct 21, 2021 at 4:41 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Thu, Oct 21, 2021 at 3:25 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v44 patch as the fixes for the same.
> >
>
> Regarding the documentation, I think some minor updates are needed in:
>  doc/src/sgml/logical-replication.sgml.
> For example, currently it says:
>    Publications may currently only contain tables. Objects must be
> added explicitly, except when a publication is created for
> <literal>ALL TABLES</literal>.
> There is also some security-related information in this file which may
> need updating for ALL TABLES IN SCHEMA.

Modified

> Also, I'm not sure the documentation updates in the patches clearly
> define how partitions relate to ALL TABLES IN SCHEMA.
> For example, if a partitioned table belongs to a different schema to
> that of a child partition that belongs to a schema specified for ALL
> TABLES IN SCHEMA, is it implicitly included in the publication or not
> included?

Added

Thanks for the comments, the attached v45 patch has the fix for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Oct 21, 2021 at 3:29 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Thu, Oct 21, 2021 at 3:25 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attached v44 patch as the fixes for the same.
> >
>
> In the v44-0001 patch, I have some doubts about the condition guarding
> the following code in pg_get_publication_tables():
>
> + if (schemarelids)
> + {
> +    /*
> +     * If the publication publishes partition changes via their
> +     * respective root partitioned tables, we must exclude
> +     * partitions in favor of including the root partitioned
> +     * tables. Otherwise, the function could return both the child
> +     * and parent tables which could cause data of the child table
> +     * to be double-published on the subscriber side.
> +     *
> +     * XXX As of now, we do this when a publication has associated
> +     * schema or for all tables publication. See
> +     * GetAllTablesPublicationRelations().
> +     */
> +    tables = list_concat_unique_oid(relids, schemarelids);
> +    if (publication->pubviaroot)
> +       tables = filter_partitions(tables);
> + }
>
> Shouldn't a partition be filtered out only if publication->pubviaroot
> and the partition belongs to a schema (i.e. ALL TABLES IN SCHEMA)
> included in the publication?
> The current code seems to filter out partitions of partitioned tables
> included in the publication if ANY schemas are included as part of the
> publication (e.g. which could be a schema not including any
> partitioned tables or partitions).

I could reproduce the issue by using the following test:
--- Setup
create schema sch1;
create schema sch2;
create table sch1.tbl1 (a int) partition by range (a);
create table sch2.tbl1_part1 partition of sch1.tbl1 for values from (1) to (10);
create table sch2.tbl1_part2 partition of sch1.tbl1 for values from
(10) to (20);
create schema sch3;
create table sch3.t1(c1 int);

--- Publication
create publication pub1 for all tables in schema sch3, table
sch1.tbl1, table sch2.tbl1_part1 with ( publish_via_partition_root
=on);
insert into sch1.tbl1 values(1);
insert into sch1.tbl1 values(11);
insert into sch3.t1 values(1);

---- Subscription
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost
port=5432' PUBLICATION pub1;

The patch posted at [1] has the fix for the same.
[1] - https://www.postgresql.org/message-id/CALDaNm1onqBEr0WE_e7%3DCNw3bURfrGRmbMjX31d-nx3FGLS10A%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Fri, Oct 22, 2021 at 12:19 AM vignesh C <vignesh21@gmail.com> wrote:
>
> I could reproduce the issue by using the following test:
> --- Setup
> create schema sch1;
> create schema sch2;
> create table sch1.tbl1 (a int) partition by range (a);
> create table sch2.tbl1_part1 partition of sch1.tbl1 for values from (1) to (10);
> create table sch2.tbl1_part2 partition of sch1.tbl1 for values from
> (10) to (20);
> create schema sch3;
> create table sch3.t1(c1 int);
>
> --- Publication
> create publication pub1 for all tables in schema sch3, table
> sch1.tbl1, table sch2.tbl1_part1 with ( publish_via_partition_root
> =on);
> insert into sch1.tbl1 values(1);
> insert into sch1.tbl1 values(11);
> insert into sch3.t1 values(1);
>
> ---- Subscription
> CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost
> port=5432' PUBLICATION pub1;
>
> The patch posted at [1] has the fix for the same.
> [1] - https://www.postgresql.org/message-id/CALDaNm1onqBEr0WE_e7%3DCNw3bURfrGRmbMjX31d-nx3FGLS10A%40mail.gmail.com
>

Thanks for addressing this.
I checked the updated code and did some more testing and the fix LGTM.

I was also previously concerned about what the behavior should be when
only including just the partitions of a partitioned table in a
publication using ALL TABLES IN SCHEMA and having
publish_via_partition_root=true. It seems to implicitly include the
partitioned table in the publication. But I did some testing and found
that this is the current behavior when only the partitions are
individually included in a publication using TABLE, so it seems to be
OK.

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Thu, Oct 21, 2021 at 6:47 PM vignesh C <vignesh21@gmail.com> wrote:
>
>
> Thanks for the comments, the attached v45 patch has the fix for the same.
>

The first patch is mostly looking good to me apart from the below
minor comments:

1.
+  <para>
+   The catalog <structname>pg_publication_namespace</structname> contains the
+   mapping between schemas and publications in the database.  This is a
+   many-to-many mapping.

There are extra spaces after mapping at the end which are not required.

2.
+   <literal>CREATE</literal> privilege on the database.  Also, the new owner
+   of a <literal>FOR ALL TABLES</literal> publication must be a superuser.

I think we can modify the second line as: "Also, the new owner of a
<literal>FOR ALL TABLES</literal> or <literal>FOR ALL TABLES IN
SCHEMA</literal> publication must be a superuser.

3.
/table's schema as part of specified schema is not supported./table's
schema as part of the specified schema is not supported.

4.
+  <para>
+   Create a publication that publishes all changes for tables
+   <structname>users</structname>, <structname>departments</structname> and
+   that publishes all changes for all the tables present in the schema
+   <structname>production</structname>:

I don't think '...that publishes...' is required twice in the above sentence.

5.
+static List *OpenReliIdList(List *relids);
 static List *OpenTableList(List *tables);
 static void CloseTableList(List *rels);
 static void PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
  AlterPublicationStmt *stmt);
 static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
+static void LockSchemaList(List *schemalist);
+static void PublicationAddSchemas(Oid pubid, List *schemas, bool if_not_exists,
+   AlterPublicationStmt *stmt);
+static void PublicationDropSchemas(Oid pubid, List *schemas, bool missing_ok);

Keep the later definitions also in this order. I suggest move
LockSchemaList() just after CloseTableList() both in declaration and
definition.

6.
+/*
+ * Convert the PublicationObjSpecType list into schema oid list and rangevar
+ * list.
+ */

I think you need to say publication table instead of rangevar in the
above comment.

7.
+ /*
+ * It is quite possible that for the SET case user has not specified any
+ * schema in which case we need to remove all the existing schemas.
+ */

/schema/schemas

8.
+/*
+ * Open relations specified by a RangeVar list.

/RangeVar/PublicationTable

9.
+static bool
+_equalPublicationObject(const PublicationObjSpec *a,
+ const PublicationObjSpec *b)
+{
+ COMPARE_SCALAR_FIELD(pubobjtype);
+ COMPARE_STRING_FIELD(name);
+ COMPARE_NODE_FIELD(pubtable);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+}
+

Let's define this immediately before _equalPublicationTable as all
publication functions are defined there. Also, make the handling of
T_PublicationObjSpec before T_PublicationTable in equal() function as
that is the way nodes are defined.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Fri, Oct 22, 2021 at 12:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> I was also previously concerned about what the behavior should be when
> only including just the partitions of a partitioned table in a
> publication using ALL TABLES IN SCHEMA and having
> publish_via_partition_root=true. It seems to implicitly include the
> partitioned table in the publication. But I did some testing and found
> that this is the current behavior when only the partitions are
> individually included in a publication using TABLE, so it seems to be
> OK.
>

Thinking some more about this, it still may still be confusing to the
user if not explicitly stated in the ALL TABLES IN SCHEMA case.
How about adding some additional explanation to the end of the
following paragraph:

+     <para>
+      When a partitioned table is published via schema level publication, all
+      of its existing and future partitions irrespective of it being from the
+      publication schema or not are implicitly considered to be part of the
+      publication. So, even operations that are performed directly on a
+      partition are also published via publications that its ancestors are
+      part of.
+     </para>

Something like:

Similarly, if a partition is published via schema level publication
and publish_via_partition_root=true, the parent partitioned table is
implicitly considered to be part of the publication, irrespective of
it being from the publication schema or not.


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
Masahiko Sawada
Date:
On Fri, Oct 22, 2021 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Oct 21, 2021 at 6:47 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> >
> > Thanks for the comments, the attached v45 patch has the fix for the same.
> >
>
> The first patch is mostly looking good to me apart from the below
> minor comments:

Let me share other minor comments on v45-0001 patch:

>
> 1.
> +  <para>
> +   The catalog <structname>pg_publication_namespace</structname> contains the
> +   mapping between schemas and publications in the database.  This is a
> +   many-to-many mapping.
>
> There are extra spaces after mapping at the end which are not required.

+   <literal>ADD</literal> and <literal>DROP</literal>  clauses will add and
+   remove one or more tables/schemas from the publication.  Note that adding
+   tables/schemas to a publication that is already subscribed to will require a

There is also an extra space after "adding".

-    [ FOR TABLE [ ONLY ] <replaceable
class="parameter">table_name</replaceable> [ * ] [, ...]
-      | FOR ALL TABLES ]
+    [ FOR ALL TABLES
+      | FOR <replaceable
class="parameter">publication_object</replaceable> [, ... ] ]

Similarly, after "TABLES".

+
+     <para>
+      Specifying a table that is part of a schema specified by
+      <literal>FOR ALL TABLES IN SCHEMA</literal> is not supported.
+     </para>

And, after "by".

---

+static void
+AlterPublicationSchemas(AlterPublicationStmt *stmt,
+                                                HeapTuple tup, List
*schemaidlist)
+{
(snip)
+                PublicationAddSchemas(pubform->oid, schemaidlist, true, stmt);
+        }
+
+        return;
+}

The "return" at the end of the function is not necessary.

---
+                        if (pubobj->name)
+                                pubobj->pubobjtype =
PUBLICATIONOBJ_REL_IN_SCHEMA;
+                        else if (!pubobj->name && !pubobj->pubtable)
+                                pubobj->pubobjtype = PUBLICATIONOBJ_CURRSCHEMA;
+                        else if (!pubobj->name)
+                                ereport(ERROR,
+                                                errcode(ERRCODE_SYNTAX_ERROR),
+                                                errmsg("invalid
schema name at or near"),
+
parser_errposition(pubobj->location));

I think it's better to change the last "else if" to just "else".

---
+
+                        if (schemarelids)
+                        {
+                                /*
+                                 * If the publication publishes
partition changes via their
+                                 * respective root partitioned
tables, we must exclude
+                                 * partitions in favor of including
the root partitioned
+                                 * tables. Otherwise, the function
could return both the child
+                                 * and parent tables which could
cause data of the child table
+                                 * to be double-published on the
subscriber side.
+                                 *
+                                 * XXX As of now, we do this when a
publication has associated
+                                 * schema or for all tables publication. See
+                                 * GetAllTablesPublicationRelations().
+                                 */
+                                tables =
list_concat_unique_oid(relids, schemarelids);
+                                if (publication->pubviaroot)
+                                        tables =
filter_partitions(tables, schemarelids);
+                        }
+                        else
+                                tables = relids;
+
+                }

There is an extra newline after "table = relids;".

The rest looks good to me.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Oct 22, 2021 at 10:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Oct 21, 2021 at 6:47 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> >
> > Thanks for the comments, the attached v45 patch has the fix for the same.
> >
>
> The first patch is mostly looking good to me apart from the below
> minor comments:
>
> 1.
> +  <para>
> +   The catalog <structname>pg_publication_namespace</structname> contains the
> +   mapping between schemas and publications in the database.  This is a
> +   many-to-many mapping.
>
> There are extra spaces after mapping at the end which are not required.

Modified

> 2.
> +   <literal>CREATE</literal> privilege on the database.  Also, the new owner
> +   of a <literal>FOR ALL TABLES</literal> publication must be a superuser.
>
> I think we can modify the second line as: "Also, the new owner of a
> <literal>FOR ALL TABLES</literal> or <literal>FOR ALL TABLES IN
> SCHEMA</literal> publication must be a superuser.

Modified

> 3.
> /table's schema as part of specified schema is not supported./table's
> schema as part of the specified schema is not supported.

Modified

> 4.
> +  <para>
> +   Create a publication that publishes all changes for tables
> +   <structname>users</structname>, <structname>departments</structname> and
> +   that publishes all changes for all the tables present in the schema
> +   <structname>production</structname>:
>
> I don't think '...that publishes...' is required twice in the above sentence.

Modified

> 5.
> +static List *OpenReliIdList(List *relids);
>  static List *OpenTableList(List *tables);
>  static void CloseTableList(List *rels);
>  static void PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
>   AlterPublicationStmt *stmt);
>  static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
> +static void LockSchemaList(List *schemalist);
> +static void PublicationAddSchemas(Oid pubid, List *schemas, bool if_not_exists,
> +   AlterPublicationStmt *stmt);
> +static void PublicationDropSchemas(Oid pubid, List *schemas, bool missing_ok);
>
> Keep the later definitions also in this order. I suggest move
> LockSchemaList() just after CloseTableList() both in declaration and
> definition.

Modified

> 6.
> +/*
> + * Convert the PublicationObjSpecType list into schema oid list and rangevar
> + * list.
> + */
>
> I think you need to say publication table instead of rangevar in the
> above comment.

Modified

> 7.
> + /*
> + * It is quite possible that for the SET case user has not specified any
> + * schema in which case we need to remove all the existing schemas.
> + */
>
> /schema/schemas

Modified

> 8.
> +/*
> + * Open relations specified by a RangeVar list.
>
> /RangeVar/PublicationTable

Modified

> 9.
> +static bool
> +_equalPublicationObject(const PublicationObjSpec *a,
> + const PublicationObjSpec *b)
> +{
> + COMPARE_SCALAR_FIELD(pubobjtype);
> + COMPARE_STRING_FIELD(name);
> + COMPARE_NODE_FIELD(pubtable);
> + COMPARE_LOCATION_FIELD(location);
> +
> + return true;
> +}
> +
>
> Let's define this immediately before _equalPublicationTable as all
> publication functions are defined there. Also, make the handling of
> T_PublicationObjSpec before T_PublicationTable in equal() function as
> that is the way nodes are defined.

Modified

Thanks for the comments, attached v46 patch has the fix for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Oct 22, 2021 at 11:59 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Oct 22, 2021 at 12:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > I was also previously concerned about what the behavior should be when
> > only including just the partitions of a partitioned table in a
> > publication using ALL TABLES IN SCHEMA and having
> > publish_via_partition_root=true. It seems to implicitly include the
> > partitioned table in the publication. But I did some testing and found
> > that this is the current behavior when only the partitions are
> > individually included in a publication using TABLE, so it seems to be
> > OK.
> >
>
> Thinking some more about this, it still may still be confusing to the
> user if not explicitly stated in the ALL TABLES IN SCHEMA case.
> How about adding some additional explanation to the end of the
> following paragraph:
>
> +     <para>
> +      When a partitioned table is published via schema level publication, all
> +      of its existing and future partitions irrespective of it being from the
> +      publication schema or not are implicitly considered to be part of the
> +      publication. So, even operations that are performed directly on a
> +      partition are also published via publications that its ancestors are
> +      part of.
> +     </para>
>
> Something like:
>
> Similarly, if a partition is published via schema level publication
> and publish_via_partition_root=true, the parent partitioned table is
> implicitly considered to be part of the publication, irrespective of
> it being from the publication schema or not.

I have made this change in the v46 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm3kBrMO5EyEgK_TyOrBuw%2BRvdJ2mJfpWb5e8FbuKg2cQw%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Oct 22, 2021 at 1:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Oct 22, 2021 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Oct 21, 2021 at 6:47 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > >
> > > Thanks for the comments, the attached v45 patch has the fix for the same.
> > >
> >
> > The first patch is mostly looking good to me apart from the below
> > minor comments:
>
> Let me share other minor comments on v45-0001 patch:
>
> >
> > 1.
> > +  <para>
> > +   The catalog <structname>pg_publication_namespace</structname> contains the
> > +   mapping between schemas and publications in the database.  This is a
> > +   many-to-many mapping.
> >
> > There are extra spaces after mapping at the end which are not required.

Modified

> +   <literal>ADD</literal> and <literal>DROP</literal>  clauses will add and
> +   remove one or more tables/schemas from the publication.  Note that adding
> +   tables/schemas to a publication that is already subscribed to will require a
>
> There is also an extra space after "adding".

Modified

> -    [ FOR TABLE [ ONLY ] <replaceable
> class="parameter">table_name</replaceable> [ * ] [, ...]
> -      | FOR ALL TABLES ]
> +    [ FOR ALL TABLES
> +      | FOR <replaceable
> class="parameter">publication_object</replaceable> [, ... ] ]
>
> Similarly, after "TABLES".

Modified

> +
> +     <para>
> +      Specifying a table that is part of a schema specified by
> +      <literal>FOR ALL TABLES IN SCHEMA</literal> is not supported.
> +     </para>
>
> And, after "by".

Modified

> ---
>
> +static void
> +AlterPublicationSchemas(AlterPublicationStmt *stmt,
> +                                                HeapTuple tup, List
> *schemaidlist)
> +{
> (snip)
> +                PublicationAddSchemas(pubform->oid, schemaidlist, true, stmt);
> +        }
> +
> +        return;
> +}
>
> The "return" at the end of the function is not necessary.

Modified

> ---
> +                        if (pubobj->name)
> +                                pubobj->pubobjtype =
> PUBLICATIONOBJ_REL_IN_SCHEMA;
> +                        else if (!pubobj->name && !pubobj->pubtable)
> +                                pubobj->pubobjtype = PUBLICATIONOBJ_CURRSCHEMA;
> +                        else if (!pubobj->name)
> +                                ereport(ERROR,
> +                                                errcode(ERRCODE_SYNTAX_ERROR),
> +                                                errmsg("invalid
> schema name at or near"),
> +
> parser_errposition(pubobj->location));
>
> I think it's better to change the last "else if" to just "else".

Modified

> ---
> +
> +                        if (schemarelids)
> +                        {
> +                                /*
> +                                 * If the publication publishes
> partition changes via their
> +                                 * respective root partitioned
> tables, we must exclude
> +                                 * partitions in favor of including
> the root partitioned
> +                                 * tables. Otherwise, the function
> could return both the child
> +                                 * and parent tables which could
> cause data of the child table
> +                                 * to be double-published on the
> subscriber side.
> +                                 *
> +                                 * XXX As of now, we do this when a
> publication has associated
> +                                 * schema or for all tables publication. See
> +                                 * GetAllTablesPublicationRelations().
> +                                 */
> +                                tables =
> list_concat_unique_oid(relids, schemarelids);
> +                                if (publication->pubviaroot)
> +                                        tables =
> filter_partitions(tables, schemarelids);
> +                        }
> +                        else
> +                                tables = relids;
> +
> +                }
>
> There is an extra newline after "table = relids;".

Removed it

I have made this change in the v46 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm3kBrMO5EyEgK_TyOrBuw%2BRvdJ2mJfpWb5e8FbuKg2cQw%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Fri, Oct 22, 2021 at 11:59 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Oct 22, 2021 at 12:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > I was also previously concerned about what the behavior should be when
> > only including just the partitions of a partitioned table in a
> > publication using ALL TABLES IN SCHEMA and having
> > publish_via_partition_root=true. It seems to implicitly include the
> > partitioned table in the publication. But I did some testing and found
> > that this is the current behavior when only the partitions are
> > individually included in a publication using TABLE, so it seems to be
> > OK.
> >
>
> Thinking some more about this, it still may still be confusing to the
> user if not explicitly stated in the ALL TABLES IN SCHEMA case.
> How about adding some additional explanation to the end of the
> following paragraph:
>
> +     <para>
> +      When a partitioned table is published via schema level publication, all
> +      of its existing and future partitions irrespective of it being from the
> +      publication schema or not are implicitly considered to be part of the
> +      publication. So, even operations that are performed directly on a
> +      partition are also published via publications that its ancestors are
> +      part of.
> +     </para>
>
> Something like:
>
> Similarly, if a partition is published via schema level publication
> and publish_via_partition_root=true, the parent partitioned table is
> implicitly considered to be part of the publication, irrespective of
> it being from the publication schema or not.
>

I don't think we need to explicitly mention this in docs as this is
quite similar to what is happening for the "For Table" case as well
and it seems to be clarified by "publish_via_partition_root"
definition in Create Publication docs [1].

[1] - https://www.postgresql.org/docs/devel/sql-createpublication.html

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Fri, Oct 22, 2021 at 8:56 PM vignesh C <vignesh21@gmail.com> wrote:
>

I am getting a compilation error in the latest patch on HEAD. I think
was relying on some variable removed by a recent commit
92316a4582a5714d4e494aaf90360860e7fec37a. While looking at that
compilation error, I observed that we don't need the second and third
parameters in pg_dump.c/getPublicationNamespaces() as those are not
getting used.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, Oct 25, 2021 at 10:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Oct 22, 2021 at 8:56 PM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> I am getting a compilation error in the latest patch on HEAD. I think
> was relying on some variable removed by a recent commit
> 92316a4582a5714d4e494aaf90360860e7fec37a. While looking at that
> compilation error, I observed that we don't need the second and third
> parameters in pg_dump.c/getPublicationNamespaces() as those are not
> getting used.

I have fixed this in the v47 version attached.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Mon, Oct 25, 2021 at 1:11 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Oct 25, 2021 at 10:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Oct 22, 2021 at 8:56 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> >
> > I am getting a compilation error in the latest patch on HEAD. I think
> > was relying on some variable removed by a recent commit
> > 92316a4582a5714d4e494aaf90360860e7fec37a. While looking at that
> > compilation error, I observed that we don't need the second and third
> > parameters in pg_dump.c/getPublicationNamespaces() as those are not
> > getting used.
>
> I have fixed this in the v47 version attached.
>

Thanks, the first patch in the series "Allow publishing the tables of
schema." looks good to me. Unless there are more
comments/bugs/objections, I am planning to commit it in a day or so.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Mon, Oct 25, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Oct 25, 2021 at 1:11 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have fixed this in the v47 version attached.
> >
>
> Thanks, the first patch in the series "Allow publishing the tables of
> schema." looks good to me. Unless there are more
> comments/bugs/objections, I am planning to commit it in a day or so.
>

Yesterday, I have pushed the first patch. Feel free to submit the
remaining patches.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Oct 28, 2021 at 8:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Oct 25, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Oct 25, 2021 at 1:11 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > I have fixed this in the v47 version attached.
> > >
> >
> > Thanks, the first patch in the series "Allow publishing the tables of
> > schema." looks good to me. Unless there are more
> > comments/bugs/objections, I am planning to commit it in a day or so.
> >
>
> Yesterday, I have pushed the first patch. Feel free to submit the
> remaining patches.

Thanks for committing the patch, please find the remaining patches attached.
Thanks Hou Zhijie and Greg Nancarrow for sharing a few comments
offline, I have fixed those in the attached patch.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Thu, Oct 28, 2021 at 3:25 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for committing the patch, please find the remaining patches attached.

A few comments on the v48-0002 patch:

(1) The quoting of TABLE/SCHEMA looks a bit odd in the patch comment

(2) src/backend/catalog/system_views.sq
ON should be capitalized in the following line:
+    JOIN pg_catalog.pg_namespace N on N.oid = S.pnnspid

(3) Some basic tests should be added for this in the publication tests.


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Thu, Oct 28, 2021 at 9:55 AM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for committing the patch, please find the remaining patches attached.
> Thanks Hou Zhijie and Greg Nancarrow for sharing a few comments
> offline, I have fixed those in the attached patch.
>

Pushed the first test case patch. About
v48-0002-Add-new-pg_publication_objects-view-to-display-T, I think it
doesn't display anything for "for all tables" publication. Instead of
selecting from pg_publication_rel, you can use the existing view
pg_publication_tables to solve that problem.

Having said that, I am not completely sure about the value of this new
view pg_publication_objects which displays all objects of
publications. I see that users might want to see all the objects that
the publication publishes and when we include other objects like
sequences it might be more helpful.

Sawada-San, others, what do you think? Is it really useful to have such a view?

One point to think is if we introduce such a view then how it should
behave for schema objects? Do we need to display only schemas or
additionally all the tables in the schema as well? If you follow the
above suggestion of mine then I think it will display both schemas
published and tables in that schema that will be considered for
publishing.

-- 
With Regards,
Amit Kapila.



RE: Added schema level support for publication.

From
"tanghy.fnst@fujitsu.com"
Date:
On Friday, October 29, 2021 12:35 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> On Thu, Oct 28, 2021 at 9:55 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for committing the patch, please find the remaining patches attached.
> > Thanks Hou Zhijie and Greg Nancarrow for sharing a few comments
> > offline, I have fixed those in the attached patch.
> >
> 
> Pushed the first test case patch. About
> v48-0002-Add-new-pg_publication_objects-view-to-display-T, I think it
> doesn't display anything for "for all tables" publication. Instead of
> selecting from pg_publication_rel, you can use the existing view
> pg_publication_tables to solve that problem.
> 
> Having said that, I am not completely sure about the value of this new
> view pg_publication_objects which displays all objects of
> publications. I see that users might want to see all the objects that
> the publication publishes and when we include other objects like
> sequences it might be more helpful.
> 
> Sawada-San, others, what do you think? Is it really useful to have such a view?
> 
> One point to think is if we introduce such a view then how it should
> behave for schema objects? Do we need to display only schemas or
> additionally all the tables in the schema as well? If you follow the
> above suggestion of mine then I think it will display both schemas
> published and tables in that schema that will be considered for
> publishing.
> 

Personally, if I want to see ALL the published objects in a publication, I would use
'\dRp+' command. I think there might not be too much value to have this view.

Regards
Tang

Re: Added schema level support for publication.

From
Tomas Vondra
Date:
Hi,

On 10/28/21 04:41, Amit Kapila wrote:
> On Mon, Oct 25, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Mon, Oct 25, 2021 at 1:11 PM vignesh C <vignesh21@gmail.com> wrote:
>>>
>>> I have fixed this in the v47 version attached.
>>>
>>
>> Thanks, the first patch in the series "Allow publishing the tables of
>> schema." looks good to me. Unless there are more
>> comments/bugs/objections, I am planning to commit it in a day or so.
>>
> 
> Yesterday, I have pushed the first patch. Feel free to submit the
> remaining patches.
> 

I haven't been following this thread recently, but while rebasing the 
sequence decoding patch I noticed this adds

     PUBLICATIONOBJ_TABLE,            /* Table type */
     PUBLICATIONOBJ_REL_IN_SCHEMA,    /* Relations in schema type */

Shouldn't it be PUBLICATIONOBJ_TABLE_IN_SCHEMA, or why does it use rel 
instead of table?

I'm asking because the sequence decoding patch mimics ALTER PUBLICATION 
options for sequences, including ALL SEQUENCES IN SCHEMA etc. and this 
seems ambiguous. The same issue applies to PUBLICATIONOBJ_CURRSCHEMA, 
which does not specify the object type.

I wonder if it'd be better to just separate the schema and object type 
specification, instead of mashing it into a single constant. Otherwise 
we'll end up with (M x N) combinations, which seems silly.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Fri, Oct 29, 2021 at 3:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Sawada-San, others, what do you think? Is it really useful to have such a view?
>
> One point to think is if we introduce such a view then how it should
> behave for schema objects? Do we need to display only schemas or
> additionally all the tables in the schema as well? If you follow the
> above suggestion of mine then I think it will display both schemas
> published and tables in that schema that will be considered for
> publishing.
>

I find the proposed view useful for processing the publication
structure and members in SQL, without having to piece the information
together from the other pg_publication_* tables.
Personally I don't think it is necessary to additionally display all
tables in the schema (that information can be retrieved by pg_tables
or information_schema.tables, if needed).


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
Masahiko Sawada
Date:
On Fri, Oct 29, 2021 at 1:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Oct 28, 2021 at 9:55 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for committing the patch, please find the remaining patches attached.
> > Thanks Hou Zhijie and Greg Nancarrow for sharing a few comments
> > offline, I have fixed those in the attached patch.
> >
>
> Pushed the first test case patch. About
> v48-0002-Add-new-pg_publication_objects-view-to-display-T, I think it
> doesn't display anything for "for all tables" publication. Instead of
> selecting from pg_publication_rel, you can use the existing view
> pg_publication_tables to solve that problem.
>
> Having said that, I am not completely sure about the value of this new
> view pg_publication_objects which displays all objects of
> publications. I see that users might want to see all the objects that
> the publication publishes and when we include other objects like
> sequences it might be more helpful.
>
> Sawada-San, others, what do you think? Is it really useful to have such a view?

I haven't followed the discussion on pg_publication_objects view but
what is the primary use case of this view? If it's to list all tables
published in a publication (e.g, "select * from pg_publication_objects
where pubname = 'pub1'), pg_publication_objects view lacks the
information of FOR ALL TABLES publications. And probably we can use
pg_publication_tables instead. On the other hand, if it's to list all
tables published in FOR ALL TABLES IN SCHEMA publications (e.g.,
"select * from pg_publication_object where objtype = 'schema'), the
view doesn't show tables published in such publications.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Added schema level support for publication.

From
Greg Nancarrow
Date:
On Mon, Nov 1, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I haven't followed the discussion on pg_publication_objects view but
> what is the primary use case of this view? If it's to list all tables
> published in a publication (e.g, "select * from pg_publication_objects
> where pubname = 'pub1'), pg_publication_objects view lacks the
> information of FOR ALL TABLES publications. And probably we can use
> pg_publication_tables instead. On the other hand, if it's to list all
> tables published in FOR ALL TABLES IN SCHEMA publications (e.g.,
> "select * from pg_publication_object where objtype = 'schema'), the
> view doesn't show tables published in such publications.
>

I think that Amit originally suggested to have a view that provides
information about the objects in each publication (like table, tables
in schema, sequence ...).
So it currently is at the granularity level of the objects that are
actually added to the publication (TABLE t, ALL TABLES IN SCHEMA s)
I agree that the view is currently missing ALL TABLES publication
information, but I think it could be easily added.
Also, currently for the "objtype" column, the type "schema" does not
seem specific enough; maybe that should be instead named
"all-tables-in-schema" or similar.


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Mon, Nov 1, 2021 at 2:48 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 10/28/21 04:41, Amit Kapila wrote:
> > On Mon, Oct 25, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>
> >> On Mon, Oct 25, 2021 at 1:11 PM vignesh C <vignesh21@gmail.com> wrote:
> >>>
> >>> I have fixed this in the v47 version attached.
> >>>
> >>
> >> Thanks, the first patch in the series "Allow publishing the tables of
> >> schema." looks good to me. Unless there are more
> >> comments/bugs/objections, I am planning to commit it in a day or so.
> >>
> >
> > Yesterday, I have pushed the first patch. Feel free to submit the
> > remaining patches.
> >
>
> I haven't been following this thread recently, but while rebasing the
> sequence decoding patch I noticed this adds
>
>      PUBLICATIONOBJ_TABLE,            /* Table type */
>      PUBLICATIONOBJ_REL_IN_SCHEMA,    /* Relations in schema type */
>
> Shouldn't it be PUBLICATIONOBJ_TABLE_IN_SCHEMA, or why does it use rel
> instead of table?
>

Yeah, it should be PUBLICATIONOBJ_TABLE_IN_SCHEMA considering we have
to add other objects like sequence.

> I'm asking because the sequence decoding patch mimics ALTER PUBLICATION
> options for sequences, including ALL SEQUENCES IN SCHEMA etc. and this
> seems ambiguous. The same issue applies to PUBLICATIONOBJ_CURRSCHEMA,
> which does not specify the object type.
>

I think we should name it PUBLICATIONOBJ_TABLE_CURRSCHEMA. Does that make sense?

> I wonder if it'd be better to just separate the schema and object type
> specification, instead of mashing it into a single constant.
>

Do you mean to say the syntax on the lines of Create Publication For
Table t1, t2 Schema s1, s2;? If so, then originally the patch had the
syntax on those lines but Tom pointed out that the meaning of such a
syntax can change over a period of time and that can break apps [1]. I
think the current syntax gives a lot of flexibility to users and we
have some precedent for it as well.

[1] - https://www.postgresql.org/message-id/155565.1628954580%40sss.pgh.pa.us

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Mon, Nov 1, 2021 at 2:38 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Nov 1, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I haven't followed the discussion on pg_publication_objects view but
> > what is the primary use case of this view? If it's to list all tables
> > published in a publication (e.g, "select * from pg_publication_objects
> > where pubname = 'pub1'), pg_publication_objects view lacks the
> > information of FOR ALL TABLES publications. And probably we can use
> > pg_publication_tables instead. On the other hand, if it's to list all
> > tables published in FOR ALL TABLES IN SCHEMA publications (e.g.,
> > "select * from pg_publication_object where objtype = 'schema'), the
> > view doesn't show tables published in such publications.
> >

Both the problems mentioned can be fixed if we follow the change
suggested by me in one of the emails above [1].

>
> I think that Amit originally suggested to have a view that provides
> information about the objects in each publication (like table, tables
> in schema, sequence ...).
>

Right and I think as you also mentioned in your previous email it can
save the effort of users if they want to know all the objects
published via a publication. I am just not sure if it is worth adding
such a view or we leave it to users to find that information via
querying individual views or system tables for objects.

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BL2-6JQ174sVfE3_K%3DmjTKJ2A8-z%2B_pExDHhqdBJvb5Q%40mail.gmail.com

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Tomas Vondra
Date:

On 11/1/21 11:18, Amit Kapila wrote:
> On Mon, Nov 1, 2021 at 2:48 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> On 10/28/21 04:41, Amit Kapila wrote:
>>> On Mon, Oct 25, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>>
>>>> On Mon, Oct 25, 2021 at 1:11 PM vignesh C <vignesh21@gmail.com> wrote:
>>>>>
>>>>> I have fixed this in the v47 version attached.
>>>>>
>>>>
>>>> Thanks, the first patch in the series "Allow publishing the tables of
>>>> schema." looks good to me. Unless there are more
>>>> comments/bugs/objections, I am planning to commit it in a day or so.
>>>>
>>>
>>> Yesterday, I have pushed the first patch. Feel free to submit the
>>> remaining patches.
>>>
>>
>> I haven't been following this thread recently, but while rebasing the
>> sequence decoding patch I noticed this adds
>>
>>       PUBLICATIONOBJ_TABLE,            /* Table type */
>>       PUBLICATIONOBJ_REL_IN_SCHEMA,    /* Relations in schema type */
>>
>> Shouldn't it be PUBLICATIONOBJ_TABLE_IN_SCHEMA, or why does it use rel
>> instead of table?
>>
> 
> Yeah, it should be PUBLICATIONOBJ_TABLE_IN_SCHEMA considering we have
> to add other objects like sequence.
> 
>> I'm asking because the sequence decoding patch mimics ALTER PUBLICATION
>> options for sequences, including ALL SEQUENCES IN SCHEMA etc. and this
>> seems ambiguous. The same issue applies to PUBLICATIONOBJ_CURRSCHEMA,
>> which does not specify the object type.
>>
> 
> I think we should name it PUBLICATIONOBJ_TABLE_CURRSCHEMA. Does that make sense?
> 
>> I wonder if it'd be better to just separate the schema and object type
>> specification, instead of mashing it into a single constant.
>>
> 
> Do you mean to say the syntax on the lines of Create Publication For
> Table t1, t2 Schema s1, s2;? If so, then originally the patch had the
> syntax on those lines but Tom pointed out that the meaning of such a
> syntax can change over a period of time and that can break apps [1]. I
> think the current syntax gives a lot of flexibility to users and we
> have some precedent for it as well.
> 

No, I'm not talking about the syntax at all - I'm talking about how we 
represent it. PUBLICATIONOBJ_TABLE_CURRSCHEMA mixes the object type and 
schema in the same constant, so I am wondering if we should just split 
that into two pieces - one determining the schema, one determining the 
object type. So PublicationObjSpec would have two fields instead of just 
pubobjtype.

The advantage would be we wouldn't need a whole lot of new constants for 
each object type - adding sequences pretty much means adding

     PUBLICATIONOBJ_SEQUENCE
     PUBLICATIONOBJ_SEQUENCE_IN_SCHEMA
     PUBLICATIONOBJ_SEQUENCE_CURRSCHEMA

and after splitting we'd need just the first one. But maybe it's not 
that bad, though. We don't expect all that many object types in 
publications, I guess.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Added schema level support for publication.

From
Masahiko Sawada
Date:
On Mon, Nov 1, 2021 at 7:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Nov 1, 2021 at 2:38 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Mon, Nov 1, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I haven't followed the discussion on pg_publication_objects view but
> > > what is the primary use case of this view? If it's to list all tables
> > > published in a publication (e.g, "select * from pg_publication_objects
> > > where pubname = 'pub1'), pg_publication_objects view lacks the
> > > information of FOR ALL TABLES publications. And probably we can use
> > > pg_publication_tables instead. On the other hand, if it's to list all
> > > tables published in FOR ALL TABLES IN SCHEMA publications (e.g.,
> > > "select * from pg_publication_object where objtype = 'schema'), the
> > > view doesn't show tables published in such publications.
> > >
>
> Both the problems mentioned can be fixed if we follow the change
> suggested by me in one of the emails above [1].
>
> >
> > I think that Amit originally suggested to have a view that provides
> > information about the objects in each publication (like table, tables
> > in schema, sequence ...).
> >
>
> Right and I think as you also mentioned in your previous email it can
> save the effort of users if they want to know all the objects
> published via a publication.

Thank you for the explanation. Given we already have
pg_publication_tables view, if pg_publication_objects view also shows
all tables published in FOR ALL TABLES publications or FOR ALL TABLES
IN SCHEMA publications, there is essentially not much difference
between pg_publication_tables and pg_publication_objects except for
objtype column. Right? If so it'd be better to have one row for each
FOR ALL TABLES publication and FOR ALL TABLES IN SCHEMA publication
with objtype = 'database' or 'schema' etc, instead of individual
tables.

> I am just not sure if it is worth adding
> such a view or we leave it to users to find that information via
> querying individual views or system tables for objects.

I've not looked at the patch for logical replication of sequences but
the view becomes more useful once we support the new type of
replication object? If so, we can consider this view again after the
patch gets committed.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Tue, Nov 2, 2021 at 6:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> > I am just not sure if it is worth adding
> > such a view or we leave it to users to find that information via
> > querying individual views or system tables for objects.
>
> I've not looked at the patch for logical replication of sequences but
> the view becomes more useful once we support the new type of
> replication object? If so, we can consider this view again after the
> patch gets committed.
>

+1.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Mon, Nov 1, 2021 at 5:52 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 11/1/21 11:18, Amit Kapila wrote:
> > On Mon, Nov 1, 2021 at 2:48 AM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> >> I wonder if it'd be better to just separate the schema and object type
> >> specification, instead of mashing it into a single constant.
> >>
> >
> > Do you mean to say the syntax on the lines of Create Publication For
> > Table t1, t2 Schema s1, s2;? If so, then originally the patch had the
> > syntax on those lines but Tom pointed out that the meaning of such a
> > syntax can change over a period of time and that can break apps [1]. I
> > think the current syntax gives a lot of flexibility to users and we
> > have some precedent for it as well.
> >
>
> No, I'm not talking about the syntax at all - I'm talking about how we
> represent it. PUBLICATIONOBJ_TABLE_CURRSCHEMA mixes the object type and
> schema in the same constant, so I am wondering if we should just split
> that into two pieces - one determining the schema, one determining the
> object type. So PublicationObjSpec would have two fields instead of just
> pubobjtype.
>
> The advantage would be we wouldn't need a whole lot of new constants for
> each object type - adding sequences pretty much means adding
>
>      PUBLICATIONOBJ_SEQUENCE
>      PUBLICATIONOBJ_SEQUENCE_IN_SCHEMA
>      PUBLICATIONOBJ_SEQUENCE_CURRSCHEMA
>
> and after splitting we'd need just the first one.
>

I see your point but OTOH, I think it will lead to additional checks
in post-processing functions like ObjectsInPublicationToOids() as we
have to always check both object type and schema to make decisions.

> But maybe it's not
> that bad, though. We don't expect all that many object types in
> publications, I guess.
>

Yeah, that is also true. So maybe at this, we can just rename the few
types as suggested by you and we can look at it later if we anytime
have more number of objects to add.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Tomas Vondra
Date:

On 11/2/21 11:37 AM, Amit Kapila wrote:
> On Mon, Nov 1, 2021 at 5:52 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> On 11/1/21 11:18, Amit Kapila wrote:
>>> On Mon, Nov 1, 2021 at 2:48 AM Tomas Vondra
>>> <tomas.vondra@enterprisedb.com> wrote:
>>>> I wonder if it'd be better to just separate the schema and object type
>>>> specification, instead of mashing it into a single constant.
>>>>
>>>
>>> Do you mean to say the syntax on the lines of Create Publication For
>>> Table t1, t2 Schema s1, s2;? If so, then originally the patch had the
>>> syntax on those lines but Tom pointed out that the meaning of such a
>>> syntax can change over a period of time and that can break apps [1]. I
>>> think the current syntax gives a lot of flexibility to users and we
>>> have some precedent for it as well.
>>>
>>
>> No, I'm not talking about the syntax at all - I'm talking about how we
>> represent it. PUBLICATIONOBJ_TABLE_CURRSCHEMA mixes the object type and
>> schema in the same constant, so I am wondering if we should just split
>> that into two pieces - one determining the schema, one determining the
>> object type. So PublicationObjSpec would have two fields instead of just
>> pubobjtype.
>>
>> The advantage would be we wouldn't need a whole lot of new constants for
>> each object type - adding sequences pretty much means adding
>>
>>      PUBLICATIONOBJ_SEQUENCE
>>      PUBLICATIONOBJ_SEQUENCE_IN_SCHEMA
>>      PUBLICATIONOBJ_SEQUENCE_CURRSCHEMA
>>
>> and after splitting we'd need just the first one.
>>
> 
> I see your point but OTOH, I think it will lead to additional checks
> in post-processing functions like ObjectsInPublicationToOids() as we
> have to always check both object type and schema to make decisions.
> 

True.

>> But maybe it's not
>> that bad, though. We don't expect all that many object types in
>> publications, I guess.
>>
> 
> Yeah, that is also true. So maybe at this, we can just rename the few
> types as suggested by you and we can look at it later if we anytime
> have more number of objects to add.
> 

+1

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Tue, Nov 2, 2021 at 8:13 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
>
> >
> > Yeah, that is also true. So maybe at this, we can just rename the few
> > types as suggested by you and we can look at it later if we anytime
> > have more number of objects to add.
> >
>
> +1
>

Apart from what you have pointed above, we are using
"DO_PUBLICATION_REL_IN_SCHEMA" in pg_dump. I think we should replace
that as well with "DO_PUBLICATION_TABLE_IN_SCHEMA".

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Nov 3, 2021 at 8:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Nov 2, 2021 at 8:13 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
> >
> >
> > >
> > > Yeah, that is also true. So maybe at this, we can just rename the few
> > > types as suggested by you and we can look at it later if we anytime
> > > have more number of objects to add.
> > >
> >
> > +1
> >
>
> Apart from what you have pointed above, we are using
> "DO_PUBLICATION_REL_IN_SCHEMA" in pg_dump. I think we should replace
> that as well with "DO_PUBLICATION_TABLE_IN_SCHEMA".

Thanks for the comments, the attached patch has the changes for the same.

Regards,
Vignesh

Attachment

RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Wed, Nov 3, 2021 12:25 PM vignesh C <vignesh21@gmail.com> wrote:
> On Wed, Nov 3, 2021 at 8:30 AM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> >
> > On Tue, Nov 2, 2021 at 8:13 PM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> > >
> > >
> > > >
> > > > Yeah, that is also true. So maybe at this, we can just rename the
> > > > few types as suggested by you and we can look at it later if we
> > > > anytime have more number of objects to add.
> > > >
> > >
> > > +1
> > >
> >
> > Apart from what you have pointed above, we are using
> > "DO_PUBLICATION_REL_IN_SCHEMA" in pg_dump. I think we should replace
> > that as well with "DO_PUBLICATION_TABLE_IN_SCHEMA".
> 
> Thanks for the comments, the attached patch has the changes for the same.
> 

Thanks for the patch.
I have only one minor comment:

+    PUBLICATIONOBJ_TABLE_IN_SCHEMA, /* Relations in schema type */

I think the ' Relations' in the comments also need to be changed to 'tables'.

The other part of the patch looks good to me.

Best regards,
Hou zj



Re: Added schema level support for publication.

From
Peter Smith
Date:
While you are changing these, maybe also change:

Before: PUBLICATIONOBJ..._CURRSCHEMA
After: PUBLICATIONOBJ..._CUR_SCHEMA

Which seems more readable to me.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Nov 3, 2021 at 11:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> While you are changing these, maybe also change:
>
> Before: PUBLICATIONOBJ..._CURRSCHEMA
> After: PUBLICATIONOBJ..._CUR_SCHEMA
>
> Which seems more readable to me.

Thanks for the comment, the attached patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, Nov 3, 2021 at 11:07 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Wed, Nov 3, 2021 12:25 PM vignesh C <vignesh21@gmail.com> wrote:
> > On Wed, Nov 3, 2021 at 8:30 AM Amit Kapila <amit.kapila16@gmail.com>
> > wrote:
> > >
> > > On Tue, Nov 2, 2021 at 8:13 PM Tomas Vondra
> > > <tomas.vondra@enterprisedb.com> wrote:
> > > >
> > > >
> > > > >
> > > > > Yeah, that is also true. So maybe at this, we can just rename the
> > > > > few types as suggested by you and we can look at it later if we
> > > > > anytime have more number of objects to add.
> > > > >
> > > >
> > > > +1
> > > >
> > >
> > > Apart from what you have pointed above, we are using
> > > "DO_PUBLICATION_REL_IN_SCHEMA" in pg_dump. I think we should replace
> > > that as well with "DO_PUBLICATION_TABLE_IN_SCHEMA".
> >
> > Thanks for the comments, the attached patch has the changes for the same.
> >
>
> Thanks for the patch.
> I have only one minor comment:
>
> +       PUBLICATIONOBJ_TABLE_IN_SCHEMA, /* Relations in schema type */
>
> I think the ' Relations' in the comments also need to be changed to 'tables'.
>
> The other part of the patch looks good to me.

Thanks for the comment, the patch at [1] has the changes for the same.
[1] -
https://www.postgresql.org/message-id/CALDaNm3g4ZaJ8h%3D16_A%2BytbyPUdJPaAz94YzQLqkD%3DyPu%2BVPwA%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Wed, Nov 3, 2021 at 11:55 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, Nov 3, 2021 at 11:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > While you are changing these, maybe also change:
> >
> > Before: PUBLICATIONOBJ..._CURRSCHEMA
> > After: PUBLICATIONOBJ..._CUR_SCHEMA
> >
> > Which seems more readable to me.
>
> Thanks for the comment, the attached patch has the changes for the same.
>

LGTM. I'll push this tomorrow unless there are more comments or suggestions.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
Peter Smith
Date:
FYI - I found a small problem with one of the new PublicationObjSpec
parser error messages that was introduced by the recent schema
publication commit [1].

The error message text is assuming that the error originates from
CREATE PUBLICATION, but actually that same error can also come from
ALTER PUBLICATION.

For example,

e.g.1) Here the error came from CREATE PUBLICATION, so the message
text looks OK.

test_pub=# CREATE PUBLICATION p1 FOR t1;
2021-11-04 10:50:17.208 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
SCHEMA should be specified before the table/schema name(s) at
character 27
2021-11-04 10:50:17.208 AEDT [738] STATEMENT:  CREATE PUBLICATION p1 FOR t1;
ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
the table/schema name(s)
LINE 1: CREATE PUBLICATION p1 FOR t1;
                                  ^
~~

e.g.2) Here the error came from ALTER PUBLICATION, so the message text
is not OK because the ALTER syntax [2] does not even have a FOR
keyword.

test_pub=# ALTER PUBLICATION p1 SET t1;
2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
SCHEMA should be specified before the table/schema name(s) at
character 26
2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1 SET t1;
ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
the table/schema name(s)
LINE 1: ALTER PUBLICATION p1 SET t1;
                                 ^
------
[1] https://github.com/postgres/postgres/commit/5a2832465fd8984d089e8c44c094e6900d987fcd
[2] https://www.postgresql.org/docs/devel/sql-alterpublication.html

Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
On Thurs, Nov 4, 2021 8:12 AM Peter Smith <smithpb2250@gmail.com> wrote:
> FYI - I found a small problem with one of the new PublicationObjSpec parser
> error messages that was introduced by the recent schema publication commit
> [1].
> 
> The error message text is assuming that the error originates from CREATE
> PUBLICATION, but actually that same error can also come from ALTER
> PUBLICATION.
> e.g.2) Here the error came from ALTER PUBLICATION, so the message text is
> not OK because the ALTER syntax [2] does not even have a FOR keyword.
> 
> test_pub=# ALTER PUBLICATION p1 SET t1;
> 2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> SCHEMA should be specified before the table/schema name(s) at character 26
> 2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1 SET
> t1;
> ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> the table/schema name(s) LINE 1: ALTER PUBLICATION p1 SET t1;

I think it might be better to report " TABLE/ALL TABLES IN SCHEMA should be specified before ...".

Best regards,
Hou zj

Re: Added schema level support for publication.

From
Peter Smith
Date:
On Thu, Nov 4, 2021 at 11:55 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Thurs, Nov 4, 2021 8:12 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > FYI - I found a small problem with one of the new PublicationObjSpec parser
> > error messages that was introduced by the recent schema publication commit
> > [1].
> >
> > The error message text is assuming that the error originates from CREATE
> > PUBLICATION, but actually that same error can also come from ALTER
> > PUBLICATION.
> > e.g.2) Here the error came from ALTER PUBLICATION, so the message text is
> > not OK because the ALTER syntax [2] does not even have a FOR keyword.
> >
> > test_pub=# ALTER PUBLICATION p1 SET t1;
> > 2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> > SCHEMA should be specified before the table/schema name(s) at character 26
> > 2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1 SET
> > t1;
> > ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> > the table/schema name(s) LINE 1: ALTER PUBLICATION p1 SET t1;
>
> I think it might be better to report " TABLE/ALL TABLES IN SCHEMA should be specified before ...".
>

+1 - Yes, I also thought the fix should just be some simple/generic
change of the wording like your suggestion (rather than a different
message for both cases).

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Added schema level support for publication.

From
vignesh C
Date:
On Thu, Nov 4, 2021 at 5:41 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> FYI - I found a small problem with one of the new PublicationObjSpec
> parser error messages that was introduced by the recent schema
> publication commit [1].
>
> The error message text is assuming that the error originates from
> CREATE PUBLICATION, but actually that same error can also come from
> ALTER PUBLICATION.
>
> For example,
>
> e.g.1) Here the error came from CREATE PUBLICATION, so the message
> text looks OK.
>
> test_pub=# CREATE PUBLICATION p1 FOR t1;
> 2021-11-04 10:50:17.208 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> SCHEMA should be specified before the table/schema name(s) at
> character 27
> 2021-11-04 10:50:17.208 AEDT [738] STATEMENT:  CREATE PUBLICATION p1 FOR t1;
> ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> the table/schema name(s)
> LINE 1: CREATE PUBLICATION p1 FOR t1;
>                                   ^
> ~~
>
> e.g.2) Here the error came from ALTER PUBLICATION, so the message text
> is not OK because the ALTER syntax [2] does not even have a FOR
> keyword.
>
> test_pub=# ALTER PUBLICATION p1 SET t1;
> 2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> SCHEMA should be specified before the table/schema name(s) at
> character 26
> 2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1 SET t1;
> ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> the table/schema name(s)
> LINE 1: ALTER PUBLICATION p1 SET t1;
>                                  ^

Thanks for the comment, I changed the error message to remove the FOR
keyword. The attached patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: Added schema level support for publication.

From
Peter Smith
Date:
On Thu, Nov 4, 2021 at 3:24 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, Nov 4, 2021 at 5:41 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > FYI - I found a small problem with one of the new PublicationObjSpec
> > parser error messages that was introduced by the recent schema
> > publication commit [1].
> >
> > The error message text is assuming that the error originates from
> > CREATE PUBLICATION, but actually that same error can also come from
> > ALTER PUBLICATION.
> >
> > For example,
> >
> > e.g.1) Here the error came from CREATE PUBLICATION, so the message
> > text looks OK.
> >
> > test_pub=# CREATE PUBLICATION p1 FOR t1;
> > 2021-11-04 10:50:17.208 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> > SCHEMA should be specified before the table/schema name(s) at
> > character 27
> > 2021-11-04 10:50:17.208 AEDT [738] STATEMENT:  CREATE PUBLICATION p1 FOR t1;
> > ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> > the table/schema name(s)
> > LINE 1: CREATE PUBLICATION p1 FOR t1;
> >                                   ^
> > ~~
> >
> > e.g.2) Here the error came from ALTER PUBLICATION, so the message text
> > is not OK because the ALTER syntax [2] does not even have a FOR
> > keyword.
> >
> > test_pub=# ALTER PUBLICATION p1 SET t1;
> > 2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> > SCHEMA should be specified before the table/schema name(s) at
> > character 26
> > 2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1 SET t1;
> > ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> > the table/schema name(s)
> > LINE 1: ALTER PUBLICATION p1 SET t1;
> >                                  ^
>
> Thanks for the comment, I changed the error message to remove the FOR
> keyword. The attached patch has the changes for the same.
>

LGTM.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Added schema level support for publication.

From
"houzj.fnst@fujitsu.com"
Date:
> On Thu, Nov 4, 2021 at 3:24 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Thu, Nov 4, 2021 at 5:41 AM Peter Smith <smithpb2250@gmail.com>
> wrote:
> > >
> > > FYI - I found a small problem with one of the new PublicationObjSpec
> > > parser error messages that was introduced by the recent schema
> > > publication commit [1].
> > >
> > > The error message text is assuming that the error originates from
> > > CREATE PUBLICATION, but actually that same error can also come from
> > > ALTER PUBLICATION.
> > >
> > > For example,
> > >
> > > e.g.1) Here the error came from CREATE PUBLICATION, so the message
> > > text looks OK.
> > >
> > > test_pub=# CREATE PUBLICATION p1 FOR t1;
> > > 2021-11-04 10:50:17.208 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES
> > > IN SCHEMA should be specified before the table/schema name(s) at
> > > character 27
> > > 2021-11-04 10:50:17.208 AEDT [738] STATEMENT:  CREATE PUBLICATION p1
> > > FOR t1;
> > > ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified
> > > before the table/schema name(s) LINE 1: CREATE PUBLICATION p1 FOR
> > > t1;
> > >                                   ^
> > > ~~
> > >
> > > e.g.2) Here the error came from ALTER PUBLICATION, so the message
> > > text is not OK because the ALTER syntax [2] does not even have a FOR
> > > keyword.
> > >
> > > test_pub=# ALTER PUBLICATION p1 SET t1;
> > > 2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES
> > > IN SCHEMA should be specified before the table/schema name(s) at
> > > character 26
> > > 2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1
> > > SET t1;
> > > ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified
> > > before the table/schema name(s) LINE 1: ALTER PUBLICATION p1 SET t1;
> > >                                  ^
> >
> > Thanks for the comment, I changed the error message to remove the FOR
> > keyword. The attached patch has the changes for the same.
> >
> 
> LGTM.

+1

Best regards,
Hou zj

Re: Added schema level support for publication.

From
Peter Smith
Date:
FYI -  I spotted a trivial SQL mistake (?) of the schema publication patch [1].

See the file describe.c, function describeOneTableDetails.
The new SQL has a 3rd UNION that looks like:

...
"UNION\n"
"SELECT pubname\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
oid, oid, oid, oid);

Notice that there is a table alias "p" but it is never used. It seems
to me like it is just an accidental omission. I think it should be
written like -

BEFORE:
"WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
AFTER:
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"

Doing this will make it consistent with the SQL of the nearby "else"
case which uses the same alias as expected.

------
[1] https://github.com/postgres/postgres/commit/5a2832465fd8984d089e8c44c094e6900d987fcd

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Tue, Nov 9, 2021 at 7:20 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> FYI -  I spotted a trivial SQL mistake (?) of the schema publication patch [1].
>
> See the file describe.c, function describeOneTableDetails.
> The new SQL has a 3rd UNION that looks like:
>
> ...
> "UNION\n"
> "SELECT pubname\n"
> "FROM pg_catalog.pg_publication p\n"
> "WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> "ORDER BY 1;",
> oid, oid, oid, oid);
>
> Notice that there is a table alias "p" but it is never used. It seems
> to me like it is just an accidental omission. I think it should be
> written like -
>
> BEFORE:
> "WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> AFTER:
> "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
>
> Doing this will make it consistent with the SQL of the nearby "else"
> case which uses the same alias as expected.
>

The above makes sense to me. So, pushed a fix for this along with
Vignesh's patch to fix other comments related to this work.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Tue, Nov 9, 2021 at 2:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Nov 9, 2021 at 7:20 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > FYI -  I spotted a trivial SQL mistake (?) of the schema publication patch [1].
> >
> > See the file describe.c, function describeOneTableDetails.
> > The new SQL has a 3rd UNION that looks like:
> >
> > ...
> > "UNION\n"
> > "SELECT pubname\n"
> > "FROM pg_catalog.pg_publication p\n"
> > "WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> > "ORDER BY 1;",
> > oid, oid, oid, oid);
> >
> > Notice that there is a table alias "p" but it is never used. It seems
> > to me like it is just an accidental omission. I think it should be
> > written like -
> >
> > BEFORE:
> > "WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> > AFTER:
> > "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> >
> > Doing this will make it consistent with the SQL of the nearby "else"
> > case which uses the same alias as expected.
> >
>
> The above makes sense to me. So, pushed a fix for this along with
> Vignesh's patch to fix other comments related to this work.

Thanks for committing the patch. I have changed the status of the
Commitfest entry for this patch to Committed. Adding of the view can
be handled once the Sequence patch is committed.

Regards,
Vignesh



Re: Added schema level support for publication.

From
Alvaro Herrera
Date:
I just noticed that this (commit 5a2832465fd8) added a separate catalog
to store schemas which are part of a publication, side-by-side with the
catalog to store relations which are part of a publication.  This seems
a strange way to represent publication membership: in order to find out
what objects are members of a publication, you have to scan both
pg_publication_rel and pg_publication_namespace.  Wouldn't it make more
sense to have a single catalog for both things, maybe something like

pg_publication_object
oid        OID    -- unique key (for pg_depend)
prpubid        OID    -- of pg_publication
prrelid        OID    -- OID of relation, or 0 if not a relation
prnspid        OID    -- OID of namespace, or 0 if not a namespace

which seems more natural to me, and pollutes the system less with weird
syscaches, etc.

What do you think?

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Fri, Dec 10, 2021 at 6:24 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> I just noticed that this (commit 5a2832465fd8) added a separate catalog
> to store schemas which are part of a publication, side-by-side with the
> catalog to store relations which are part of a publication.  This seems
> a strange way to represent publication membership: in order to find out
> what objects are members of a publication, you have to scan both
> pg_publication_rel and pg_publication_namespace.  Wouldn't it make more
> sense to have a single catalog for both things, maybe something like
>
> pg_publication_object
> oid             OID     -- unique key (for pg_depend)
> prpubid         OID     -- of pg_publication
> prrelid         OID     -- OID of relation, or 0 if not a relation
> prnspid         OID     -- OID of namespace, or 0 if not a namespace
>
> which seems more natural to me, and pollutes the system less with weird
> syscaches, etc.
>

It will unnecessarily increase the size per row for FOR TABLE
publication both because of the additional column and additional index
(schema_id, pub_id) and vice versa. I think future projects (like
row_filter, column_filter, etc) will make this impact much bigger as
new columns for those won't be required for schema publications.
Basically, as soon as we start to store additional properties for
different objects, storing different objects together would start
becoming more and more worse. This will also in turn increase the scan
cost where we need only schema or rel publication access like where
ever we are using PUBLICATIONNAMESPACEMAP. I think it will increase
the cost of scanning table publications as its corresponding index
will be larger.

-- 
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Fri, Dec 10, 2021 at 6:24 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> I just noticed that this (commit 5a2832465fd8) added a separate catalog
> to store schemas which are part of a publication, side-by-side with the
> catalog to store relations which are part of a publication.  This seems
> a strange way to represent publication membership: in order to find out
> what objects are members of a publication, you have to scan both
> pg_publication_rel and pg_publication_namespace.  Wouldn't it make more
> sense to have a single catalog for both things, maybe something like
>
> pg_publication_object
> oid             OID     -- unique key (for pg_depend)
> prpubid         OID     -- of pg_publication
> prrelid         OID     -- OID of relation, or 0 if not a relation
> prnspid         OID     -- OID of namespace, or 0 if not a namespace
>
> which seems more natural to me, and pollutes the system less with weird
> syscaches, etc.
>
> What do you think?

I felt the existing tables are better normalized than the proposed one.

Regards,
Vignesh



Re: Added schema level support for publication.

From
Tom Lane
Date:
[ reviving one aspect of an old thread ]

vignesh C <vignesh21@gmail.com> writes:
> On Mon, Jul 19, 2021 at 9:32 AM tanghy.fnst@fujitsu.com <
> tanghy.fnst@fujitsu.com> wrote:
>> I tested your v12 patch and found a problem in the following case.
>>
>> Step 1:
>> postgres=# create schema s1;
>> CREATE SCHEMA
>> postgres=# create table s1.t1 (a int);
>> CREATE TABLE
>> postgres=# create publication pub_t for table s1.t1;
>> CREATE PUBLICATION
>> postgres=# create publication pub_s for schema s1;
>> CREATE PUBLICATION
>>
>> Step 2:
>> pg_dump -N s1
>>
>> I dumped and excluded schema s1, pg_dump generated the following SQL:
>> -------------------------------
>> ALTER PUBLICATION pub_s ADD SCHEMA s1;
>>
>> I think it was not expected because SQL like "ALTER PUBLICATION pub_t ADD
> TABLE s1.t1" was not generated in my case. Thoughts?

> Thanks for reporting this issue, this issue is fixed in the v13 patch

I suppose this exchange is what led to this bit in
getPublicationNamespaces:

        /*
         * We always dump publication namespaces unless the corresponding
         * namespace is excluded from the dump.
         */
        if (nspinfo->dobj.dump == DUMP_COMPONENT_NONE)
            continue;

I'd like to push back against this on three separate grounds:


1. The behavior this produces is extremely non-obvious and not
adequately explained by the comment, which makes one wonder how
much of it was intended.  For example:

* The public schema will be included if listed in FOR ALL TABLES IN,
even though it's not dumped explicitly in the dump, because its dump
mask includes other bits besides DUMP_COMPONENT_DEFINITION.  OK, that
was probably intentional, but you wouldn't know it from the comment.

* Schemas created within extensions will be included if listed in FOR
ALL TABLES IN, even though they're not dumped explicitly in the dump.
This seems like a quite accidental by-product of the fact that
checkExtensionMembership will set DUMP_COMPONENT_ACL on extension
member objects, thus making their dump mask not NONE.  If this
behavior was intentional, it needs a less-fragile implementation.

* The information_schema will NOT be included, even if it was listed in
FOR ALL TABLES IN.  Admittedly, information_schema doesn't normally
contain any tables that'd be useful to publish.  But still, this seems
like randomly ignoring the user's intent.


2. The complaint was that if a schema is excluded from the dump
by --exclude-schema, then it should not get included in the
publication either.  I think this is at best highly debatable:
arguably it amounts to breaking the publication.  It seems
analogous to deciding that if a function is excluded from the
dump, while a view using the function is included, we should
silently adjust the view by removing the output columns or
WHERE clauses that use the function.  I'm pretty sure that
nobody would think that was sane.  Perhaps there's a case for
excluding the view as a whole, but we don't do that.  Besides, the
corresponding behavior would be to exclude the whole publication,
not silently modify its definition.


3. The corresponding test for individual tables listed in
a publication is coded differently:

        /*
         * Ignore publication membership of tables whose definitions are not
         * to be dumped.
         */
        if (!(tbinfo->dobj.dump & DUMP_COMPONENT_DEFINITION))
            continue;

This is considerably easier to understand the effects of than a test
on the whole dump mask: it will list the table if we intend to emit
CREATE TABLE, and not otherwise, regardless of side issues like ACLs.
But why is it different from the code for schemas?


So I think that this was just wrongly thought through.  My
preference would be to either delete the above-quoted bit in
getPublicationNamespaces entirely, or make it look like the
test in getPublicationTables.  Or maybe we should delete
both of these tests, on the grounds that redefining the
contents of the publication is far outside pg_dump's charter.

BTW, the discussion that caused me to notice this is at [1].
I'd come to the conclusion that doing something on the basis of
"dobj->dump == DUMP_COMPONENT_NONE" is probably an anti-pattern.

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/CAKNkYnwXFBf136%3Du9UqUxFUVagevLQJ%3DzGd5BsLhCsatDvQsKQ%40mail.gmail.com



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Sat, Dec 14, 2024 at 5:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I suppose this exchange is what led to this bit in
> getPublicationNamespaces:
>
>         /*
>          * We always dump publication namespaces unless the corresponding
>          * namespace is excluded from the dump.
>          */
>         if (nspinfo->dobj.dump == DUMP_COMPONENT_NONE)
>             continue;
>
> I'd like to push back against this on three separate grounds:
>
>
> 1. The behavior this produces is extremely non-obvious and not
> adequately explained by the comment, which makes one wonder how
> much of it was intended.  For example:
>
> * The public schema will be included if listed in FOR ALL TABLES IN,
> even though it's not dumped explicitly in the dump, because its dump
> mask includes other bits besides DUMP_COMPONENT_DEFINITION.  OK, that
> was probably intentional, but you wouldn't know it from the comment.
>
> * Schemas created within extensions will be included if listed in FOR
> ALL TABLES IN, even though they're not dumped explicitly in the dump.
> This seems like a quite accidental by-product of the fact that
> checkExtensionMembership will set DUMP_COMPONENT_ACL on extension
> member objects, thus making their dump mask not NONE.  If this
> behavior was intentional, it needs a less-fragile implementation.
>
> * The information_schema will NOT be included, even if it was listed in
> FOR ALL TABLES IN.  Admittedly, information_schema doesn't normally
> contain any tables that'd be useful to publish.  But still, this seems
> like randomly ignoring the user's intent.
>
>
> 2. The complaint was that if a schema is excluded from the dump
> by --exclude-schema, then it should not get included in the
> publication either.  I think this is at best highly debatable:
> arguably it amounts to breaking the publication.  It seems
> analogous to deciding that if a function is excluded from the
> dump, while a view using the function is included, we should
> silently adjust the view by removing the output columns or
> WHERE clauses that use the function.  I'm pretty sure that
> nobody would think that was sane.  Perhaps there's a case for
> excluding the view as a whole, but we don't do that.  Besides, the
> corresponding behavior would be to exclude the whole publication,
> not silently modify its definition.
>
>
> 3. The corresponding test for individual tables listed in
> a publication is coded differently:
>
>         /*
>          * Ignore publication membership of tables whose definitions are not
>          * to be dumped.
>          */
>         if (!(tbinfo->dobj.dump & DUMP_COMPONENT_DEFINITION))
>             continue;
>
> This is considerably easier to understand the effects of than a test
> on the whole dump mask: it will list the table if we intend to emit
> CREATE TABLE, and not otherwise, regardless of side issues like ACLs.
> But why is it different from the code for schemas?
>
>
> So I think that this was just wrongly thought through.  My
> preference would be to either delete the above-quoted bit in
> getPublicationNamespaces entirely, or make it look like the
> test in getPublicationTables.  Or maybe we should delete
> both of these tests, on the grounds that redefining the
> contents of the publication is far outside pg_dump's charter.
>

I see a merit in your second suggestion which is to delete these tests
in getPublicationTables() and getPublicationNamespaces() because we
follow similar behavior in the somewhat related subscription case as
well. When a subscription points to a set of publications and we use
'--no-publications' option in pg_dump, it still dumps the
subscription. I tried it with the following test:

Publisher:
postgres=# create schema s1;
CREATE SCHEMA
postgres=# create table t1(c1 int);
CREATE TABLE
postgres=# create publication pub1 for table t1;
CREATE PUBLICATION
postgres=# create publication pub2 for tables in schema s1;
CREATE PUBLICATION

Subscriber:
postgres=# create table t1 (c1 int);
CREATE TABLE
postgres=# create publication pub_on_sub_1 for table t1;
CREATE PUBLICATION
postgres=# create subscription sub1 connection 'dbname = postgres'
publication pub1, pub2;
NOTICE:  created replication slot "sub1" on publisher
CREATE SUBSCRIPTION

Now when I performed the dump with '--no-publications' option on the
subscriber node, it didn't include publications which is expected but
did include a subscription definition pointing to the publications as
defined originally.

--
With Regards,
Amit Kapila.



Re: Added schema level support for publication.

From
vignesh C
Date:
On Sat, 14 Dec 2024 at 05:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> [ reviving one aspect of an old thread ]
>
> vignesh C <vignesh21@gmail.com> writes:
> > On Mon, Jul 19, 2021 at 9:32 AM tanghy.fnst@fujitsu.com <
> > tanghy.fnst@fujitsu.com> wrote:
> >> I tested your v12 patch and found a problem in the following case.
> >>
> >> Step 1:
> >> postgres=# create schema s1;
> >> CREATE SCHEMA
> >> postgres=# create table s1.t1 (a int);
> >> CREATE TABLE
> >> postgres=# create publication pub_t for table s1.t1;
> >> CREATE PUBLICATION
> >> postgres=# create publication pub_s for schema s1;
> >> CREATE PUBLICATION
> >>
> >> Step 2:
> >> pg_dump -N s1
> >>
> >> I dumped and excluded schema s1, pg_dump generated the following SQL:
> >> -------------------------------
> >> ALTER PUBLICATION pub_s ADD SCHEMA s1;
> >>
> >> I think it was not expected because SQL like "ALTER PUBLICATION pub_t ADD
> > TABLE s1.t1" was not generated in my case. Thoughts?
>
> > Thanks for reporting this issue, this issue is fixed in the v13 patch
>
> I suppose this exchange is what led to this bit in
> getPublicationNamespaces:
>
>         /*
>          * We always dump publication namespaces unless the corresponding
>          * namespace is excluded from the dump.
>          */
>         if (nspinfo->dobj.dump == DUMP_COMPONENT_NONE)
>             continue;
>
> I'd like to push back against this on three separate grounds:
>
>
> 1. The behavior this produces is extremely non-obvious and not
> adequately explained by the comment, which makes one wonder how
> much of it was intended.  For example:
>
> * The public schema will be included if listed in FOR ALL TABLES IN,
> even though it's not dumped explicitly in the dump, because its dump
> mask includes other bits besides DUMP_COMPONENT_DEFINITION.  OK, that
> was probably intentional, but you wouldn't know it from the comment.
>
> * Schemas created within extensions will be included if listed in FOR
> ALL TABLES IN, even though they're not dumped explicitly in the dump.
> This seems like a quite accidental by-product of the fact that
> checkExtensionMembership will set DUMP_COMPONENT_ACL on extension
> member objects, thus making their dump mask not NONE.  If this
> behavior was intentional, it needs a less-fragile implementation.
>
> * The information_schema will NOT be included, even if it was listed in
> FOR ALL TABLES IN.  Admittedly, information_schema doesn't normally
> contain any tables that'd be useful to publish.  But still, this seems
> like randomly ignoring the user's intent.
>
>
> 2. The complaint was that if a schema is excluded from the dump
> by --exclude-schema, then it should not get included in the
> publication either.  I think this is at best highly debatable:
> arguably it amounts to breaking the publication.  It seems
> analogous to deciding that if a function is excluded from the
> dump, while a view using the function is included, we should
> silently adjust the view by removing the output columns or
> WHERE clauses that use the function.  I'm pretty sure that
> nobody would think that was sane.  Perhaps there's a case for
> excluding the view as a whole, but we don't do that.  Besides, the
> corresponding behavior would be to exclude the whole publication,
> not silently modify its definition.
>
>
> 3. The corresponding test for individual tables listed in
> a publication is coded differently:
>
>         /*
>          * Ignore publication membership of tables whose definitions are not
>          * to be dumped.
>          */
>         if (!(tbinfo->dobj.dump & DUMP_COMPONENT_DEFINITION))
>             continue;
>
> This is considerably easier to understand the effects of than a test
> on the whole dump mask: it will list the table if we intend to emit
> CREATE TABLE, and not otherwise, regardless of side issues like ACLs.
> But why is it different from the code for schemas?
>
>
> So I think that this was just wrongly thought through.  My
> preference would be to either delete the above-quoted bit in
> getPublicationNamespaces entirely, or make it look like the
> test in getPublicationTables.  Or maybe we should delete
> both of these tests, on the grounds that redefining the
> contents of the publication is far outside pg_dump's charter.

We cannot keep the code identical for getPublicationNamespaces and
getPublicationTables because selectDumpableNamespace performs special
handling for the public schema. Specifically, it unsets
DUMP_COMPONENT_DEFINITION for the public namespace, which prevents the
inclusion of 'TABLES IN SCHEMA public' in the publication. That is the
reason we did not keep the code similar to getPublicationTables.
I prefer the other approach to remove both the checks in
getPublicationTables() and getPublicationNamespaces() which also makes
it consistent with the other case that Amit mentioned at [1].

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BZYanA51c9NzKM31AqJSw-j0-edGz91%2BVh-nsoKdzKfQ%40mail.gmail.com

Regards,
Vignesh



Re: Added schema level support for publication.

From
Artur Zakirov
Date:
On Mon, 16 Dec 2024 at 12:05, vignesh C <vignesh21@gmail.com> wrote:
> I prefer the other approach to remove both the checks in
> getPublicationTables() and getPublicationNamespaces() which also makes
> it consistent with the other case that Amit mentioned at [1].

If I understand your suggestion correctly I think this will break the
"--exclude-schema" option of pg_dump. That change will dump all
mappings between publications and schemas for publications which are
dumped.

That solves the issue with special schemas, but restore will fail if
some schemas were explicitly excluded. pg_dump will include in the
dump ALTER PUBLICATION <pub> ADD TABLES IN SCHEMA <schema> even for
those schemas which are not created during restore.

I think getPublicationNamespaces() should handle both special schemas
and excluded schemas, which can make the code a bit more complex
though.

-- 
Kind regards,
Artur



Re: Added schema level support for publication.

From
vignesh C
Date:
On Mon, 16 Dec 2024 at 17:21, Artur Zakirov <zaartur@gmail.com> wrote:
>
> On Mon, 16 Dec 2024 at 12:05, vignesh C <vignesh21@gmail.com> wrote:
> > I prefer the other approach to remove both the checks in
> > getPublicationTables() and getPublicationNamespaces() which also makes
> > it consistent with the other case that Amit mentioned at [1].
>
> If I understand your suggestion correctly I think this will break the
> "--exclude-schema" option of pg_dump. That change will dump all
> mappings between publications and schemas for publications which are
> dumped.
>
> That solves the issue with special schemas, but restore will fail if
> some schemas were explicitly excluded. pg_dump will include in the
> dump ALTER PUBLICATION <pub> ADD TABLES IN SCHEMA <schema> even for
> those schemas which are not created during restore.

This is already the case in the existing implementation, so users
should not be surprised by the proposed change.
This can be reproduced with the following steps:
-- Create schema and user defined function in schema sch2
create schema sch2;
CREATE FUNCTION sch2.add1(integer, integer)
RETURNS integer
LANGUAGE sql IMMUTABLE STRICT
AS $_$select $1 + $2;$_$;

-- Create a view which references user defined function of a different schema
create schema sch1;
CREATE TABLE sch1.t1 (c1 integer, c2 integer);
CREATE VIEW sch1.v1 AS SELECT c1 FROM sch1.t1 WHERE (sch2.add1(c1, c2) >= 10);

-- Exclude schema sch2 which has the user defined function while dumping
./pg_dump -d postgres -N sch2

You will notice that the schema sch2 and the user defined function in
schema sch2 will not be dumped.

Regards,
Vignesh



Re: Added schema level support for publication.

From
Artur Zakirov
Date:
On Tue, 17 Dec 2024 at 10:43, vignesh C <vignesh21@gmail.com> wrote:
> > If I understand your suggestion correctly I think this will break the
> > "--exclude-schema" option of pg_dump. That change will dump all
> > mappings between publications and schemas for publications which are
> > dumped.
> >
> > That solves the issue with special schemas, but restore will fail if
> > some schemas were explicitly excluded. pg_dump will include in the
> > dump ALTER PUBLICATION <pub> ADD TABLES IN SCHEMA <schema> even for
> > those schemas which are not created during restore.
>
> This is already the case in the existing implementation, so users
> should not be surprised by the proposed change.

Currently the behavior isn't the same as the proposed change.

Sorry, I might have been not clear when I described what might be
wrong with this. Here is the example with the proposed patch [1].

Create necessary objects to test:

    create schema nsp;
    create publication pub for tables in schema nsp;

If you run pg_dump excluding the schema "nsp":

    pg_dump -d postgres -U postgres -f backup --exclude-schema=nsp

In the resulting file "backup" you will have:

    ...
    ALTER PUBLICATION pub ADD TABLES IN SCHEMA nsp;
    ...

which you won't have on the current master. And I think this is not
what users might expect and it can break some of the scenarios because
during restore they will have an error:

    ERROR:  schema "nsp" does not exist


1. https://www.postgresql.org/message-id/CALDaNm1TQqBC5ZP5BsNf2LKVu1kEJNJn2spFwbAtyLn1FoAFGQ%40mail.gmail.com

-- 
Kind regards,
Artur



Re: Added schema level support for publication.

From
vignesh C
Date:
On Wed, 18 Dec 2024 at 16:34, Artur Zakirov <zaartur@gmail.com> wrote:
>
> On Tue, 17 Dec 2024 at 10:43, vignesh C <vignesh21@gmail.com> wrote:
> > > If I understand your suggestion correctly I think this will break the
> > > "--exclude-schema" option of pg_dump. That change will dump all
> > > mappings between publications and schemas for publications which are
> > > dumped.
> > >
> > > That solves the issue with special schemas, but restore will fail if
> > > some schemas were explicitly excluded. pg_dump will include in the
> > > dump ALTER PUBLICATION <pub> ADD TABLES IN SCHEMA <schema> even for
> > > those schemas which are not created during restore.
> >
> > This is already the case in the existing implementation, so users
> > should not be surprised by the proposed change.
>
> Currently the behavior isn't the same as the proposed change.
>
> Sorry, I might have been not clear when I described what might be
> wrong with this. Here is the example with the proposed patch [1].
>
> Create necessary objects to test:
>
>     create schema nsp;
>     create publication pub for tables in schema nsp;
>
> If you run pg_dump excluding the schema "nsp":
>
>     pg_dump -d postgres -U postgres -f backup --exclude-schema=nsp
>
> In the resulting file "backup" you will have:
>
>     ...
>     ALTER PUBLICATION pub ADD TABLES IN SCHEMA nsp;
>     ...
>
> which you won't have on the current master. And I think this is not
> what users might expect and it can break some of the scenarios because
> during restore they will have an error:
>
>     ERROR:  schema "nsp" does not exist

Yes, this is done intentionally in the proposed patch to keep it
consistent with other scenarios in HEAD.
For example, consider the following case:
-- Create schema and user defined function in schema sch2
create schema sch2;
CREATE FUNCTION sch2.add1(integer, integer)
RETURNS integer
LANGUAGE sql IMMUTABLE STRICT
AS $_$select $1 + $2;$_$;

-- Create a view which references user defined function of a different schema
create schema sch1;
CREATE TABLE sch1.t1 (c1 integer, c2 integer);
CREATE VIEW sch1.v1 AS SELECT c1 FROM sch1.t1 WHERE (sch2.add1(c1, c2) >= 10);

-- Exclude schema sch2 which has the user defined function while dumping
./pg_dump -d postgres -Fc -f dump1 -N sch2

The dump file has the reference to sch2.add1 even though sch2 schema
was excluded, dump will not have the user defined functions defined in
schema sch2:
CREATE VIEW sch1.v1 AS
 SELECT c1
   FROM sch1.t1
  WHERE (sch2.add1(c1, c2) >= 10);

Restore using the above dump that was generated will fail with the below error:
./pg_restore -d test1 dump1
pg_restore: error: could not execute query: ERROR:  schema "sch2" does not exist
LINE 4:   WHERE (sch2.add1(c1, c2) >= 10);
                 ^
Command was: CREATE VIEW sch1.v1 AS
 SELECT c1
   FROM sch1.t1
  WHERE (sch2.add1(c1, c2) >= 10);

The proposed patch is in similar lines.

Regards,
Vignesh



Re: Added schema level support for publication.

From
Amit Kapila
Date:
On Thu, Dec 19, 2024 at 12:02 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, 18 Dec 2024 at 16:34, Artur Zakirov <zaartur@gmail.com> wrote:
> >
> > On Tue, 17 Dec 2024 at 10:43, vignesh C <vignesh21@gmail.com> wrote:
> > > > If I understand your suggestion correctly I think this will break the
> > > > "--exclude-schema" option of pg_dump. That change will dump all
> > > > mappings between publications and schemas for publications which are
> > > > dumped.
> > > >
> > > > That solves the issue with special schemas, but restore will fail if
> > > > some schemas were explicitly excluded. pg_dump will include in the
> > > > dump ALTER PUBLICATION <pub> ADD TABLES IN SCHEMA <schema> even for
> > > > those schemas which are not created during restore.
> > >
> > > This is already the case in the existing implementation, so users
> > > should not be surprised by the proposed change.
> >
> > Currently the behavior isn't the same as the proposed change.
> >
> > Sorry, I might have been not clear when I described what might be
> > wrong with this. Here is the example with the proposed patch [1].
> >
> > Create necessary objects to test:
> >
> >     create schema nsp;
> >     create publication pub for tables in schema nsp;
> >
> > If you run pg_dump excluding the schema "nsp":
> >
> >     pg_dump -d postgres -U postgres -f backup --exclude-schema=nsp
> >
> > In the resulting file "backup" you will have:
> >
> >     ...
> >     ALTER PUBLICATION pub ADD TABLES IN SCHEMA nsp;
> >     ...
> >
> > which you won't have on the current master. And I think this is not
> > what users might expect and it can break some of the scenarios because
> > during restore they will have an error:
> >
> >     ERROR:  schema "nsp" does not exist
>
> Yes, this is done intentionally in the proposed patch to keep it
> consistent with other scenarios in HEAD.
> For example, consider the following case:
> -- Create schema and user defined function in schema sch2
> create schema sch2;
> CREATE FUNCTION sch2.add1(integer, integer)
> RETURNS integer
> LANGUAGE sql IMMUTABLE STRICT
> AS $_$select $1 + $2;$_$;
>
> -- Create a view which references user defined function of a different schema
> create schema sch1;
> CREATE TABLE sch1.t1 (c1 integer, c2 integer);
> CREATE VIEW sch1.v1 AS SELECT c1 FROM sch1.t1 WHERE (sch2.add1(c1, c2) >= 10);
>
> -- Exclude schema sch2 which has the user defined function while dumping
> ./pg_dump -d postgres -Fc -f dump1 -N sch2
>
> The dump file has the reference to sch2.add1 even though sch2 schema
> was excluded, dump will not have the user defined functions defined in
> schema sch2:
> CREATE VIEW sch1.v1 AS
>  SELECT c1
>    FROM sch1.t1
>   WHERE (sch2.add1(c1, c2) >= 10);
>
> Restore using the above dump that was generated will fail with the below error:
> ./pg_restore -d test1 dump1
> pg_restore: error: could not execute query: ERROR:  schema "sch2" does not exist
> LINE 4:   WHERE (sch2.add1(c1, c2) >= 10);
>                  ^
> Command was: CREATE VIEW sch1.v1 AS
>  SELECT c1
>    FROM sch1.t1
>   WHERE (sch2.add1(c1, c2) >= 10);
>
> The proposed patch is in similar lines.
>

I agree with the proposed patch and this behavior. It is on the lines
of what Tom proposed as one of the ways to address the issue raised in
his initial email. Also, it follows the existing behavior in cases
like view<->function dependency as shown by you in this email, and to
some extent subscription<->publication dependency as shown in email
[1].

Let's wait till the beginning of the next CF to see if there are other
suggestions or any arguments against this proposed change.

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BZYanA51c9NzKM31AqJSw-j0-edGz91%2BVh-nsoKdzKfQ%40mail.gmail.com
--
With Regards,
Amit Kapila.