Thread: Added schema level support for publication.
Hi, This feature adds schema option while creating publication. Users will be able to specify one or more schemas while creating publication, when the user specifies schema option, then the data changes for the tables present in the schema specified by the user will be replicated to the subscriber. Few examples have been listed below: Create a publication that publishes all changes for all the tables present in production schema: CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production; Create a publication that publishes all changes for all the tables present in marketing and sales schemas: CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales; Add some schemas to the publication: ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june; Drop some schema from the publication: ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july; Attached is a POC patch for the same. I felt this feature would be quite useful. Thoughts? Regards, Vignesh EnterpriseDB: http://www.enterprisedb.com
Attachment
On Thu, Jan 7, 2021 at 10:03 PM vignesh C <vignesh21@gmail.com> wrote: > > This feature adds schema option while creating publication. Users will > be able to specify one or more schemas while creating publication, > when the user specifies schema option, then the data changes for the > tables present in the schema specified by the user will be replicated > to the subscriber. Few examples have been listed below: > > Create a publication that publishes all changes for all the tables > present in production schema: > CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production; > > Create a publication that publishes all changes for all the tables > present in marketing and sales schemas: > CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales; > > Add some schemas to the publication: > ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june; > > Drop some schema from the publication: > ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july; > > Attached is a POC patch for the same. I felt this feature would be quite useful. > What do we do if the user Drops the schema? Do we automatically remove it from the publication? I see some use of such a feature but you haven't described the use case or how did you arrive at the conclusion that it would be quite useful? -- With Regards, Amit Kapila.
On Fri, Jan 8, 2021 at 4:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Jan 7, 2021 at 10:03 PM vignesh C <vignesh21@gmail.com> wrote: > > > > This feature adds schema option while creating publication. Users will > > be able to specify one or more schemas while creating publication, > > when the user specifies schema option, then the data changes for the > > tables present in the schema specified by the user will be replicated > > to the subscriber. Few examples have been listed below: > > > > Create a publication that publishes all changes for all the tables > > present in production schema: > > CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production; > > > > Create a publication that publishes all changes for all the tables > > present in marketing and sales schemas: > > CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales; > > > > Add some schemas to the publication: > > ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june; > > > > Drop some schema from the publication: > > ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july; > > > > Attached is a POC patch for the same. I felt this feature would be quite useful. > > > > What do we do if the user Drops the schema? Do we automatically remove > it from the publication? > I have not yet handled this scenario yet, I will handle this and adding of tests in the next patch. > I see some use of such a feature but you haven't described the use > case or how did you arrive at the conclusion that it would be quite > useful? > Currently there are a couple of options "FOR All TABLES" and "FOR TABLE" when a user creates a publication, 1) either to subscribe to the changes of all the tables or 2) subscribe to a few tables. There is no option for users to subscribe to relations present in the schemas. User has to manually identify the list of tables present in the schema and specify the list of tables in that schema using the "FOR TABLE" option. Similarly if a user wants to subscribe to n number of schemas, the user has to do this for the required schemas, this is a tedious process. This feature helps the user to take care of this internally using schema option. Regards, Vignesh EnterpriseDB: http://www.enterprisedb.com
On Sat, Jan 9, 2021 at 5:21 PM vignesh C <vignesh21@gmail.com> wrote: > > What do we do if the user Drops the schema? Do we automatically remove > > it from the publication? > > > I have not yet handled this scenario yet, I will handle this and > adding of tests in the next patch. > > > I see some use of such a feature but you haven't described the use > > case or how did you arrive at the conclusion that it would be quite > > useful? > > > Currently there are a couple of options "FOR All TABLES" and "FOR > TABLE" when a user creates a publication, 1) either to subscribe to > the changes of all the tables or 2) subscribe to a few tables. There > is no option for users to subscribe to relations present in the > schemas. User has to manually identify the list of tables present in > the schema and specify the list of tables in that schema using the > "FOR TABLE" option. Similarly if a user wants to subscribe to n number > of schemas, the user has to do this for the required schemas, this is > a tedious process. This feature helps the user to take care of this > internally using schema option. I think this feature can be useful, in case a user has a lot of tables to publish inside a schema. Having said that, I wonder if this feature mandates users to create the same schema with same permissions/authorizations manually on the subscriber, because logical replication doesn't propagate any ddl's so are the schema or schema changes? Or is it that the list of tables from the publisher can go into a different schema on the subscriber? Since the schema can have other objects such as data types, functions, operators, I'm sure with your feature, non-table objects will be skipped. As Amit pointed out earlier, the behaviour when schema dropped, I think we should also consider when schema is altered, say altered to a different name, maybe we should change that in the publication too. In general, what happens if we have some temporary tables or foreign tables inside the schema, will they be allowed to send the data to subscribers? And, with this feature, since there can be many huge tables inside a schema, the initial table sync phase of the replication can take a while. Say a user has created a publication for a schema with hundreds of tables in it, at some point later, can he stop replicating a single or some tables from that schema? IMO, it's better to have the syntax - CREATE PUBLICATION production_publication FOR ALL TABLES IN SCHEMA production - just added IN between for all tables and schema. Say a user has a schema with 121 tables in it, and wants to replicate only 120 or 199 or even lesser tables out of it, so can we have some skip option to the new syntax, something like below? CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production WITH skip = marketing, accounts, sales; --> meaning is, replicate all the tables in the schema production except marketing, accounts, sales tables. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > I think this feature can be useful, in case a user has a lot of tables > to publish inside a schema. Having said that, I wonder if this feature > mandates users to create the same schema with same > permissions/authorizations manually on the subscriber, because logical > replication doesn't propagate any ddl's so are the schema or schema > changes? Or is it that the list of tables from the publisher can go > into a different schema on the subscriber? > > Since the schema can have other objects such as data types, functions, > operators, I'm sure with your feature, non-table objects will be > skipped. > > As Amit pointed out earlier, the behaviour when schema dropped, I > think we should also consider when schema is altered, say altered to a > different name, maybe we should change that in the publication too. > > In general, what happens if we have some temporary tables or foreign > tables inside the schema, will they be allowed to send the data to > subscribers? > > And, with this feature, since there can be many huge tables inside a > schema, the initial table sync phase of the replication can take a > while. > > Say a user has created a publication for a schema with hundreds of > tables in it, at some point later, can he stop replicating a single or > some tables from that schema? > > IMO, it's better to have the syntax - CREATE PUBLICATION > production_publication FOR ALL TABLES IN SCHEMA production - just > added IN between for all tables and schema. > > Say a user has a schema with 121 tables in it, and wants to replicate > only 120 or 199 or even lesser tables out of it, so can we have some > skip option to the new syntax, something like below? > CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA > production WITH skip = marketing, accounts, sales; --> meaning is, > replicate all the tables in the schema production except marketing, > accounts, sales tables. One more point - if the publication is created for a schema with no or some initial tables, will all the future tables that may get added to the schema will be replicated too? With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
On Sat, Jan 9, 2021 at 8:14 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > I think this feature can be useful, in case a user has a lot of tables > > to publish inside a schema. Having said that, I wonder if this feature > > mandates users to create the same schema with same > > permissions/authorizations manually on the subscriber, because logical > > replication doesn't propagate any ddl's so are the schema or schema > > changes? Or is it that the list of tables from the publisher can go > > into a different schema on the subscriber? > > > > Since the schema can have other objects such as data types, functions, > > operators, I'm sure with your feature, non-table objects will be > > skipped. > > > > As Amit pointed out earlier, the behaviour when schema dropped, I > > think we should also consider when schema is altered, say altered to a > > different name, maybe we should change that in the publication too. > > > > In general, what happens if we have some temporary tables or foreign > > tables inside the schema, will they be allowed to send the data to > > subscribers? > > > > And, with this feature, since there can be many huge tables inside a > > schema, the initial table sync phase of the replication can take a > > while. > > > > Say a user has created a publication for a schema with hundreds of > > tables in it, at some point later, can he stop replicating a single or > > some tables from that schema? > > > > IMO, it's better to have the syntax - CREATE PUBLICATION > > production_publication FOR ALL TABLES IN SCHEMA production - just > > added IN between for all tables and schema. > > > > Say a user has a schema with 121 tables in it, and wants to replicate > > only 120 or 199 or even lesser tables out of it, so can we have some > > skip option to the new syntax, something like below? > > CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA > > production WITH skip = marketing, accounts, sales; --> meaning is, > > replicate all the tables in the schema production except marketing, > > accounts, sales tables. > > One more point - if the publication is created for a schema with no or > some initial tables, will all the future tables that may get added to > the schema will be replicated too? > I expect this should be the behavior. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Thanks for your comments Bharath, please find my opinion below. On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > I think this feature can be useful, in case a user has a lot of tables > to publish inside a schema. Having said that, I wonder if this feature > mandates users to create the same schema with same > permissions/authorizations manually on the subscriber, because logical > replication doesn't propagate any ddl's so are the schema or schema > changes? Or is it that the list of tables from the publisher can go > into a different schema on the subscriber? > DDL's will not be propagated to the subscriber. Users have to create the schema & tables in the subscriber. No change in Permissions/authorizations handling, it will be the same as the existing behavior for relations. > Since the schema can have other objects such as data types, functions, > operators, I'm sure with your feature, non-table objects will be > skipped. > Yes, only table data will be sent to subscribers, non-table objects will be skipped. > As Amit pointed out earlier, the behaviour when schema dropped, I > think we should also consider when schema is altered, say altered to a > different name, maybe we should change that in the publication too. > I agree that when schema is altered the renamed schema should be reflected in the publication. > In general, what happens if we have some temporary tables or foreign > tables inside the schema, will they be allowed to send the data to > subscribers? > Temporary tables & foreign tables will not be added to the publications. > And, with this feature, since there can be many huge tables inside a > schema, the initial table sync phase of the replication can take a > while. > Yes this is required. > Say a user has created a publication for a schema with hundreds of > tables in it, at some point later, can he stop replicating a single or > some tables from that schema? > There is no provision for this currently. > IMO, it's better to have the syntax - CREATE PUBLICATION > production_publication FOR ALL TABLES IN SCHEMA production - just > added IN between for all tables and schema. > I'm ok with the proposed syntax, I would like others' opinion too before making the change. > Say a user has a schema with 121 tables in it, and wants to replicate > only 120 or 199 or even lesser tables out of it, so can we have some > skip option to the new syntax, something like below? > CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA > production WITH skip = marketing, accounts, sales; --> meaning is, > replicate all the tables in the schema production except marketing, > accounts, sales tables. > Yes this is a good use case, will include this change. Thanks for the comments, I will handle the comments and post a patch for this. Regards, Vignesh EnterpriseDB: http://www.enterprisedb.com
On Sat, Jan 9, 2021 at 8:14 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > One more point - if the publication is created for a schema with no or > some initial tables, will all the future tables that may get added to > the schema will be replicated too? > I agree on this, when a relation is added to the schema it should be added to the publication. Regards, Vignesh EnterpriseDB: http://www.enterprisedb.com
On Sun, Jan 10, 2021 at 11:21 PM vignesh C <vignesh21@gmail.com> wrote: > On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > I think this feature can be useful, in case a user has a lot of tables > > to publish inside a schema. Having said that, I wonder if this feature > > mandates users to create the same schema with same > > permissions/authorizations manually on the subscriber, because logical > > replication doesn't propagate any ddl's so are the schema or schema > > changes? Or is it that the list of tables from the publisher can go > > into a different schema on the subscriber? > > > > DDL's will not be propagated to the subscriber. Users have to create > the schema & tables in the subscriber. No change in > Permissions/authorizations handling, it will be the same as the > existing behavior for relations. Looks like the existing behaviour already requires users to create the schema on the subscriber when publishing the tables from that schema. Otherwise, an error is thrown on the subscriber [1]. [1] on publisher: CREATE SCHEMA myschema; CREATE TABLE myschema.t1(a1 int, b1 int); INSERT INTO myschema.t1_myschema SELECT i, i+10 FROM generate_series(1,10) i; CREATE PUBLICATION testpub FOR TABLE myschema.t1; on subscriber: postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost dbname=postgres user=bharath port=5432' PUBLICATION testpub; ERROR: schema "myschema" does not exist CREATE SCHEMA myschema; CREATE TABLE myschema.t1(a1 int, b1 int); postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost dbname=postgres user=bharath port=5432' PUBLICATION testpub; NOTICE: created replication slot "testsub" on publisher CREATE SUBSCRIPTION > > Since the schema can have other objects such as data types, functions, > > operators, I'm sure with your feature, non-table objects will be > > skipped. > > > > Yes, only table data will be sent to subscribers, non-table objects > will be skipped. Looks like the existing CREATE PUBLICATION FOR ALL TABLES, which is for all the tables in the database, does this i.e. skips non-table objects and temporary tables, foreign tables and so on. So, your feature also can behave the same way, but within the scope of the given schema/s. > > As Amit pointed out earlier, the behaviour when schema dropped, I > > think we should also consider when schema is altered, say altered to a > > different name, maybe we should change that in the publication too. > > > > I agree that when schema is altered the renamed schema should be > reflected in the publication. I think, it's not only making sure that the publisher side has the new altered schema, but also the subscriber needs those alters. Having said that, since these alters come under DDL changes and in logical replication we don't publish the scheme changes to the subscriber, we may not need to anything extra for informing the schema alters to the subscriber from the publisher, the users might have to do the same schema alter on the subscriber and then a ALTER SUBSCRIPTION testsub REFRESH PUBLICATION; should work for them? If this understanding is correct, then we should document this. > > In general, what happens if we have some temporary tables or foreign > > tables inside the schema, will they be allowed to send the data to > > subscribers? > > > > Temporary tables & foreign tables will not be added to the publications. Yes the existing logical replication framework doesn't allow replication of temporary, unlogged, foreign tables and other non-table relations such as materialized views, indexes etc [1]. The CREATE PUBLICATION statement either fails in check_publication_add_relation or before that. CREATE PUBLICATION testpub FOR TABLE tab1, throwing the error if the single table tab1 is any of the above restricted tables, seems fine. But, if there's a list of tables with CREATE PUBLICATION testpub FOR TABLE normal_tab1, temp_tab2, normal_tab3, foreign_tab4, unlogged_tab5, normal_tab6, normal_tab7 ......; This query fails on first encounter of the restricted table, say at temp_tab2. Whereas, CREATE PUBLICATION testpub FOR ALL TABLES; would skip the restricted tables and continue to add the accepted tables into the publication within the database. IMHO, if there's a list of tables specified with FOR TABLE, then instead of throwing an error in case of any restricted table, we can issue a warning and continue with the addition of accepted tables into the publication. If done, this behaviour will be in sync with FOR ALL TABLES; Thoughts? If okay, I can work on a patch. Related to error messages: when foreign table is specified in CREATE PUBLICATION statement, then "ERROR: "f1" is not a table", is thrown [1], how about the error message "ERROR: foerign table "f1" cannot be replicated". In general, it would be good if we could have the error messages something like in [2] instead of the existing [1]. Thoughts? If okay, I can work on a patch. [1] t1 is a temporary table: postgres=# CREATE PUBLICATION testpub FOR TABLE t1; ERROR: table "t1" cannot be replicated DETAIL: Temporary and unlogged relations cannot be replicated. t1 is an unlogged table: postgres=# CREATE PUBLICATION testpub FOR TABLE t1; ERROR: table "t1" cannot be replicated DETAIL: Temporary and unlogged relations cannot be replicated. f1 is a foreign table: postgres=# CREATE PUBLICATION testpub FOR TABLE f1; ERROR: "f1" is not a table DETAIL: Only tables can be added to publications. mv1 is a mat view: postgres=# CREATE PUBLICATION testpub FOR TABLE mv1; ERROR: "mv1" is not a table idx1 is an index: postgres=# CREATE PUBLICATION testpub FOR TABLE idx1; ERROR: "idx1" is an index [2] t1 is a temporary table: postgres=# CREATE PUBLICATION testpub FOR TABLE t1; ERROR: temporary table "t1" cannot be replicated DETAIL: Temporary, unlogged and foreign relations cannot be replicated. t1 is an unlogged table: postgres=# CREATE PUBLICATION testpub FOR TABLE t1; ERROR: unlogged table "t1" cannot be replicated DETAIL: Temporary, unlogged and foreign relations cannot be replicated. f1 is a foreign table: postgres=# CREATE PUBLICATION testpub FOR TABLE f1; ERROR: foreign table "f1" cannot be replicated DETAIL: Temporary, unlogged and foreign relations cannot be replicated. > > Say a user has created a publication for a schema with hundreds of > > tables in it, at some point later, can he stop replicating a single or > > some tables from that schema? > > > > There is no provision for this currently. The documentation [1] says, we can ALTER PUBLICATION testpub DROP TABLE t1; which removes the table from the list of published tables, but looks like it requires ALTER SUBSCRIPTION testsub REFRESH PUBLICATION; for the changes to become effective on the subscriber. I have done some testing for this case: 1) created publication for table t1, see \d+ t1, the associated publication is visible in the output 2) created subscription on the subscriber, initial available data from the publisher for table t1 is received 3) insert into table t1 on the publisher 4) inserted data in (3) is received in the subscriber table t1 5) alter publication to drop the table t1 on the publisher, see \d+ t1, there will not be any associated publication in the output 6) execute alter subscription refresh publication on the subscriber, with the expectation that it should not receive the data from the publisher for the table t1 since it's dropped from the publication in (5) 7) insert into table t1 on the publisher 8) still the newly inserted data in (7) from the publisher, will be received into the table t1 in the subscriber IIUC, the behaviour of ALTER PUBLICATION DROP TABLE from the docs and the above use case, it looks like a bug to me. If I'm wrong, can someone correct me? Thoughts? [1] - https://www.postgresql.org/docs/devel/sql-alterpublication.html > > IMO, it's better to have the syntax - CREATE PUBLICATION > > production_publication FOR ALL TABLES IN SCHEMA production - just > > added IN between for all tables and schema. > > > > I'm ok with the proposed syntax, I would like others' opinion too > before making the change. Thanks. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
On Mon, 11 Jan 2021 at 14:15, Bharath Rupireddy wrote: > On Sun, Jan 10, 2021 at 11:21 PM vignesh C <vignesh21@gmail.com> wrote: >> On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy >> <bharath.rupireddyforpostgres@gmail.com> wrote: >> > I think this feature can be useful, in case a user has a lot of tables >> > to publish inside a schema. Having said that, I wonder if this feature >> > mandates users to create the same schema with same >> > permissions/authorizations manually on the subscriber, because logical >> > replication doesn't propagate any ddl's so are the schema or schema >> > changes? Or is it that the list of tables from the publisher can go >> > into a different schema on the subscriber? >> > >> >> DDL's will not be propagated to the subscriber. Users have to create >> the schema & tables in the subscriber. No change in >> Permissions/authorizations handling, it will be the same as the >> existing behavior for relations. > > Looks like the existing behaviour already requires users to create the > schema on the subscriber when publishing the tables from that schema. > Otherwise, an error is thrown on the subscriber [1]. > > [1] on publisher: > CREATE SCHEMA myschema; > CREATE TABLE myschema.t1(a1 int, b1 int); > INSERT INTO myschema.t1_myschema SELECT i, i+10 FROM generate_series(1,10) i; > CREATE PUBLICATION testpub FOR TABLE myschema.t1; > > on subscriber: > postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost > dbname=postgres user=bharath port=5432' PUBLICATION testpub; > ERROR: schema "myschema" does not exist > CREATE SCHEMA myschema; > CREATE TABLE myschema.t1(a1 int, b1 int); > postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost > dbname=postgres user=bharath port=5432' PUBLICATION testpub; > NOTICE: created replication slot "testsub" on publisher > CREATE SUBSCRIPTION > >> > Since the schema can have other objects such as data types, functions, >> > operators, I'm sure with your feature, non-table objects will be >> > skipped. >> > >> >> Yes, only table data will be sent to subscribers, non-table objects >> will be skipped. > > Looks like the existing CREATE PUBLICATION FOR ALL TABLES, which is > for all the tables in the database, does this i.e. skips non-table > objects and temporary tables, foreign tables and so on. So, your > feature also can behave the same way, but within the scope of the > given schema/s. > >> > As Amit pointed out earlier, the behaviour when schema dropped, I >> > think we should also consider when schema is altered, say altered to a >> > different name, maybe we should change that in the publication too. >> > >> >> I agree that when schema is altered the renamed schema should be >> reflected in the publication. > > I think, it's not only making sure that the publisher side has the new > altered schema, but also the subscriber needs those alters. Having > said that, since these alters come under DDL changes and in logical > replication we don't publish the scheme changes to the subscriber, we > may not need to anything extra for informing the schema alters to the > subscriber from the publisher, the users might have to do the same > schema alter on the subscriber and then a ALTER SUBSCRIPTION testsub > REFRESH PUBLICATION; should work for them? If this understanding is > correct, then we should document this. > >> > In general, what happens if we have some temporary tables or foreign >> > tables inside the schema, will they be allowed to send the data to >> > subscribers? >> > >> >> Temporary tables & foreign tables will not be added to the publications. > > Yes the existing logical replication framework doesn't allow > replication of temporary, unlogged, foreign tables and other non-table > relations such as materialized views, indexes etc [1]. The CREATE > PUBLICATION statement either fails in check_publication_add_relation > or before that. > > CREATE PUBLICATION testpub FOR TABLE tab1, throwing the error if the > single table tab1 is any of the above restricted tables, seems fine. > But, if there's a list of tables with CREATE PUBLICATION testpub FOR > TABLE normal_tab1, temp_tab2, normal_tab3, foreign_tab4, > unlogged_tab5, normal_tab6, normal_tab7 ......; This query fails on > first encounter of the restricted table, say at temp_tab2. Whereas, > CREATE PUBLICATION testpub FOR ALL TABLES; would skip the restricted > tables and continue to add the accepted tables into the publication > within the database. > > IMHO, if there's a list of tables specified with FOR TABLE, then > instead of throwing an error in case of any restricted table, we can > issue a warning and continue with the addition of accepted tables into > the publication. If done, this behaviour will be in sync with FOR ALL > TABLES; > > Thoughts? If okay, I can work on a patch. > > Related to error messages: when foreign table is specified in CREATE > PUBLICATION statement, then "ERROR: "f1" is not a table", is thrown > [1], how about the error message "ERROR: foerign table "f1" cannot be > replicated". In general, it would be good if we could have the error > messages something like in [2] instead of the existing [1]. > > Thoughts? If okay, I can work on a patch. > > [1] > t1 is a temporary table: > postgres=# CREATE PUBLICATION testpub FOR TABLE t1; > ERROR: table "t1" cannot be replicated > DETAIL: Temporary and unlogged relations cannot be replicated. > > t1 is an unlogged table: > postgres=# CREATE PUBLICATION testpub FOR TABLE t1; > ERROR: table "t1" cannot be replicated > DETAIL: Temporary and unlogged relations cannot be replicated. > > f1 is a foreign table: > postgres=# CREATE PUBLICATION testpub FOR TABLE f1; > ERROR: "f1" is not a table > DETAIL: Only tables can be added to publications. > > mv1 is a mat view: > postgres=# CREATE PUBLICATION testpub FOR TABLE mv1; > ERROR: "mv1" is not a table > > idx1 is an index: > postgres=# CREATE PUBLICATION testpub FOR TABLE idx1; > ERROR: "idx1" is an index > > [2] > t1 is a temporary table: > postgres=# CREATE PUBLICATION testpub FOR TABLE t1; > ERROR: temporary table "t1" cannot be replicated > DETAIL: Temporary, unlogged and foreign relations cannot be replicated. > > t1 is an unlogged table: > postgres=# CREATE PUBLICATION testpub FOR TABLE t1; > ERROR: unlogged table "t1" cannot be replicated > DETAIL: Temporary, unlogged and foreign relations cannot be replicated. > > f1 is a foreign table: > postgres=# CREATE PUBLICATION testpub FOR TABLE f1; > ERROR: foreign table "f1" cannot be replicated > DETAIL: Temporary, unlogged and foreign relations cannot be replicated. > >> > Say a user has created a publication for a schema with hundreds of >> > tables in it, at some point later, can he stop replicating a single or >> > some tables from that schema? >> > >> >> There is no provision for this currently. > > The documentation [1] says, we can ALTER PUBLICATION testpub DROP > TABLE t1; which removes the table from the list of published tables, > but looks like it requires ALTER SUBSCRIPTION testsub REFRESH > PUBLICATION; for the changes to become effective on the subscriber. I > have done some testing for this case: > 1) created publication for table t1, see \d+ t1, the associated > publication is visible in the output > 2) created subscription on the subscriber, initial available data from > the publisher for table t1 is received > 3) insert into table t1 on the publisher > 4) inserted data in (3) is received in the subscriber table t1 > 5) alter publication to drop the table t1 on the publisher, see \d+ > t1, there will not be any associated publication in the output > 6) execute alter subscription refresh publication on the subscriber, > with the expectation that it should not receive the data from the > publisher for the table t1 since it's dropped from the publication in > (5) > 7) insert into table t1 on the publisher > 8) still the newly inserted data in (7) from the publisher, will be > received into the table t1 in the subscriber > > IIUC, the behaviour of ALTER PUBLICATION DROP TABLE from the docs and > the above use case, it looks like a bug to me. If I'm wrong, can > someone correct me? > Yes, if we modify the publication, we should refresh the subscription on each subscriber. It looks strange for me, especially for partitioned tables [1]. > Thoughts? > Can we trace the different between publication and subscription, and auto-refresh subscription on subscriber? [1] https://www.postgresql.org/message-id/flat/1D6DCFD2-0F44-4A18-BF67-17C2697B1631%40hotmail.com -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
On Mon, Jan 11, 2021 at 1:29 PM japin <japinli@hotmail.com> wrote: > >> > Say a user has created a publication for a schema with hundreds of > >> > tables in it, at some point later, can he stop replicating a single or > >> > some tables from that schema? > >> > > >> > >> There is no provision for this currently. > > > > The documentation [1] says, we can ALTER PUBLICATION testpub DROP > > TABLE t1; which removes the table from the list of published tables, > > but looks like it requires ALTER SUBSCRIPTION testsub REFRESH > > PUBLICATION; for the changes to become effective on the subscriber. I > > have done some testing for this case: > > 1) created publication for table t1, see \d+ t1, the associated > > publication is visible in the output > > 2) created subscription on the subscriber, initial available data from > > the publisher for table t1 is received > > 3) insert into table t1 on the publisher > > 4) inserted data in (3) is received in the subscriber table t1 > > 5) alter publication to drop the table t1 on the publisher, see \d+ > > t1, there will not be any associated publication in the output > > 6) execute alter subscription refresh publication on the subscriber, > > with the expectation that it should not receive the data from the > > publisher for the table t1 since it's dropped from the publication in > > (5) > > 7) insert into table t1 on the publisher > > 8) still the newly inserted data in (7) from the publisher, will be > > received into the table t1 in the subscriber > > > > IIUC, the behaviour of ALTER PUBLICATION DROP TABLE from the docs and > > the above use case, it looks like a bug to me. If I'm wrong, can > > someone correct me? > > > > Yes, if we modify the publication, we should refresh the subscription on > each subscriber. It looks strange for me, especially for partitioned > tables [1]. > > > Thoughts? > > > > Can we trace the different between publication and subscription, and > auto-refresh subscription on subscriber? > > [1] > https://www.postgresql.org/message-id/flat/1D6DCFD2-0F44-4A18-BF67-17C2697B1631%40hotmail.com As Amit stated in your thread [1], DDLs like creation of the new tables or partitions, schema changes etc. on the publisher can not be replicated automatically by the logical replication framework to the subscriber. Users have to perform those DDLs on the subscribers by themselves. If your point is to at least issue the ALTER SUBSCRIPTION testsub REFRESH PUBLICATION; from the publication whenever the publication is altered i.e. added or dropped tables, IMO, we cannot do this, because running this command on the subscriber only makes sense, after user runs the same DDLs (which were run on the publisher) also on the subscriber. To illustrate this: 1) create a new table or partition on the publisher and add it to publisher, note that the same table has not yet been created on the subscriber 2) imagine the publisher issuing an auto refresh command to all the subscribers, then, no point in that right, because the new table or the partition is not yet created on all the subscribers. So, IMO, we can not have an auto refresh mechanism, until we have the feature to replicate the DDL changes to all the subscribers. What I stated in my earlier mail [1] is that even though we drop the table from the publication in the publisher and run a refresh publication on the subscriber, still the data is being replicated from the publisher to the subscriber table. I just wanted to know whether this is the expected behaviour or what exactly means. a user running ALTER PUBLICATION mypub DROP TABLE mytable; [1] - https://www.postgresql.org/message-id/CALj2ACWAxO3vSToT0o5nXL%3Drz5cNx90zaV-at%3DcvM14Tag4%3DcQ%40mail.gmail.com With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
On Jan 11, 2021, at 5:06 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:On Mon, Jan 11, 2021 at 1:29 PM japin <japinli@hotmail.com> wrote:Say a user has created a publication for a schema with hundreds of
tables in it, at some point later, can he stop replicating a single or
some tables from that schema?
There is no provision for this currently.
The documentation [1] says, we can ALTER PUBLICATION testpub DROP
TABLE t1; which removes the table from the list of published tables,
but looks like it requires ALTER SUBSCRIPTION testsub REFRESH
PUBLICATION; for the changes to become effective on the subscriber. I
have done some testing for this case:
1) created publication for table t1, see \d+ t1, the associated
publication is visible in the output
2) created subscription on the subscriber, initial available data from
the publisher for table t1 is received
3) insert into table t1 on the publisher
4) inserted data in (3) is received in the subscriber table t1
5) alter publication to drop the table t1 on the publisher, see \d+
t1, there will not be any associated publication in the output
6) execute alter subscription refresh publication on the subscriber,
with the expectation that it should not receive the data from the
publisher for the table t1 since it's dropped from the publication in
(5)
7) insert into table t1 on the publisher
8) still the newly inserted data in (7) from the publisher, will be
received into the table t1 in the subscriber
IIUC, the behaviour of ALTER PUBLICATION DROP TABLE from the docs and
the above use case, it looks like a bug to me. If I'm wrong, can
someone correct me?
Yes, if we modify the publication, we should refresh the subscription on
each subscriber. It looks strange for me, especially for partitioned
tables [1].Thoughts?
Can we trace the different between publication and subscription, and
auto-refresh subscription on subscriber?
[1]
https://www.postgresql.org/message-id/flat/1D6DCFD2-0F44-4A18-BF67-17C2697B1631%40hotmail.com
As Amit stated in your thread [1], DDLs like creation of the new
tables or partitions, schema changes etc. on the publisher can not be
replicated automatically by the logical replication framework to the
subscriber. Users have to perform those DDLs on the subscribers by
themselves.
Thanks for clarification.If your point is to at least issue the ALTER SUBSCRIPTION testsub
REFRESH PUBLICATION; from the publication whenever the publication is
altered i.e. added or dropped tables, IMO, we cannot do this, because
running this command on the subscriber only makes sense, after user
runs the same DDLs (which were run on the publisher) also on the
subscriber. To illustrate this:
1) create a new table or partition on the publisher and add it to
publisher, note that the same table has not yet been created on the
subscriber
2) imagine the publisher issuing an auto refresh command to all the
subscribers, then, no point in that right, because the new table or
the partition is not yet created on all the subscribers.
So, IMO, we can not have an auto refresh mechanism, until we have the
feature to replicate the DDL changes to all the subscribers.
Sorry, I misunderstood. After the test (ce6a71fa530). I found that if we do not insert dataWhat I stated in my earlier mail [1] is that even though we drop the
table from the publication in the publisher and run a refresh
publication on the subscriber, still the data is being replicated from
the publisher to the subscriber table. I just wanted to know whether
this is the expected behaviour or what exactly means. a user running
ALTER PUBLICATION mypub DROP TABLE mytable;
[1] - https://www.postgresql.org/message-id/CALj2ACWAxO3vSToT0o5nXL%3Drz5cNx90zaV-at%3DcvM14Tag4%3DcQ%40mail.gmail.com
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
On Mon, Jan 11, 2021 at 4:29 PM Li Japin <japinli@hotmail.com> wrote: > > > On Jan 11, 2021, at 5:06 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Mon, Jan 11, 2021 at 1:29 PM japin <japinli@hotmail.com> wrote: > > Sorry, I misunderstood. After the test (ce6a71fa530). I found that if we do not insert data > between step (5) and (6), it will not ship the new records, however, if we insert > data between step (5) and (6), it will ship the new records. > > .. > It might be a bug. > Can you check pg_publication_rel and pg_subscription_rel? Also, this is not related to the feature proposed in this thread, so it is better to start a new thread to conclude whether this is a bug or not. -- With Regards, Amit Kapila.
On Mon, Jan 11, 2021 at 5:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Jan 11, 2021 at 4:29 PM Li Japin <japinli@hotmail.com> wrote: > > > > > > On Jan 11, 2021, at 5:06 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Mon, Jan 11, 2021 at 1:29 PM japin <japinli@hotmail.com> wrote: > > > > Sorry, I misunderstood. After the test (ce6a71fa530). I found that if we do not insert data > > between step (5) and (6), it will not ship the new records, however, if we insert > > data between step (5) and (6), it will ship the new records. > > > > > .. > > It might be a bug. > > > > Can you check pg_publication_rel and pg_subscription_rel? Also, this > is not related to the feature proposed in this thread, so it is better > to start a new thread to conclude whether this is a bug or not. Thanks Amit, sure I will verify and start a new thread. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
On Mon, Jan 11, 2021 at 5:28 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Mon, Jan 11, 2021 at 5:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Jan 11, 2021 at 4:29 PM Li Japin <japinli@hotmail.com> wrote: > > > > > > > > > On Jan 11, 2021, at 5:06 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > On Mon, Jan 11, 2021 at 1:29 PM japin <japinli@hotmail.com> wrote: > > > > > > Sorry, I misunderstood. After the test (ce6a71fa530). I found that if we do not insert data > > > between step (5) and (6), it will not ship the new records, however, if we insert > > > data between step (5) and (6), it will ship the new records. > > > > > > > > .. > > > It might be a bug. > > > > > > > Can you check pg_publication_rel and pg_subscription_rel? Also, this > > is not related to the feature proposed in this thread, so it is better > > to start a new thread to conclude whether this is a bug or not. > > Thanks Amit, sure I will verify and start a new thread. I started a new thread [1] for this, please have a look. [1] - https://www.postgresql.org/message-id/CALj2ACV%2B0UFpcZs5czYgBpujM9p0Hg1qdOZai_43OU7bqHU_xw%40mail.gmail.com With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
Hi,
This feature adds schema option while creating publication. Users will
be able to specify one or more schemas while creating publication,
when the user specifies schema option, then the data changes for the
tables present in the schema specified by the user will be replicated
to the subscriber. Few examples have been listed below:
Create a publication that publishes all changes for all the tables
present in production schema:
CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;
Create a publication that publishes all changes for all the tables
present in marketing and sales schemas:
CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales;
Add some schemas to the publication:
ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june;
Drop some schema from the publication:
ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july;
On Mon, Jan 11, 2021 at 11:45 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Sun, Jan 10, 2021 at 11:21 PM vignesh C <vignesh21@gmail.com> wrote: > > On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > I think this feature can be useful, in case a user has a lot of tables > > > to publish inside a schema. Having said that, I wonder if this feature > > > mandates users to create the same schema with same > > > permissions/authorizations manually on the subscriber, because logical > > > replication doesn't propagate any ddl's so are the schema or schema > > > changes? Or is it that the list of tables from the publisher can go > > > into a different schema on the subscriber? > > > > > > > DDL's will not be propagated to the subscriber. Users have to create > > the schema & tables in the subscriber. No change in > > Permissions/authorizations handling, it will be the same as the > > existing behavior for relations. > > Looks like the existing behaviour already requires users to create the > schema on the subscriber when publishing the tables from that schema. > Otherwise, an error is thrown on the subscriber [1]. > > [1] on publisher: > CREATE SCHEMA myschema; > CREATE TABLE myschema.t1(a1 int, b1 int); > INSERT INTO myschema.t1_myschema SELECT i, i+10 FROM generate_series(1,10) i; > CREATE PUBLICATION testpub FOR TABLE myschema.t1; > > on subscriber: > postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost > dbname=postgres user=bharath port=5432' PUBLICATION testpub; > ERROR: schema "myschema" does not exist > CREATE SCHEMA myschema; > CREATE TABLE myschema.t1(a1 int, b1 int); > postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost > dbname=postgres user=bharath port=5432' PUBLICATION testpub; > NOTICE: created replication slot "testsub" on publisher > CREATE SUBSCRIPTION > Yes this feature will also have the same behavior, DDL creation should be taken care of by DBA similar to how it is handled may be using pg_dump or use sql scripts/statements to update. > > > Since the schema can have other objects such as data types, functions, > > > operators, I'm sure with your feature, non-table objects will be > > > skipped. > > > > > > > Yes, only table data will be sent to subscribers, non-table objects > > will be skipped. > > Looks like the existing CREATE PUBLICATION FOR ALL TABLES, which is > for all the tables in the database, does this i.e. skips non-table > objects and temporary tables, foreign tables and so on. So, your > feature also can behave the same way, but within the scope of the > given schema/s. > Yes, it will support only normal tables. Non table objects, foreign tables & temporary tables will not be supported. > > > As Amit pointed out earlier, the behaviour when schema dropped, I > > > think we should also consider when schema is altered, say altered to a > > > different name, maybe we should change that in the publication too. > > > > > > > I agree that when schema is altered the renamed schema should be > > reflected in the publication. > > I think, it's not only making sure that the publisher side has the new > altered schema, but also the subscriber needs those alters. Having > said that, since these alters come under DDL changes and in logical > replication we don't publish the scheme changes to the subscriber, we > may not need to anything extra for informing the schema alters to the > subscriber from the publisher, the users might have to do the same > schema alter on the subscriber and then a ALTER SUBSCRIPTION testsub > REFRESH PUBLICATION; should work for them? If this understanding is > correct, then we should document this. > Yes, alter schema changes will be reflected in the publication, the corresponding change needs to be done by the user on the subscriber side. Once a user does ALTER SUBSCRIPTION testsub REFRESH PUBLICATION, the new altered schema changes will be reflected in the subscriber. I will update the documentation that user need to take care for subscription refresh. > > > In general, what happens if we have some temporary tables or foreign > > > tables inside the schema, will they be allowed to send the data to > > > subscribers? > > > > > > > Temporary tables & foreign tables will not be added to the publications. > > Yes the existing logical replication framework doesn't allow > replication of temporary, unlogged, foreign tables and other non-table > relations such as materialized views, indexes etc [1]. The CREATE > PUBLICATION statement either fails in check_publication_add_relation > or before that. > > CREATE PUBLICATION testpub FOR TABLE tab1, throwing the error if the > single table tab1 is any of the above restricted tables, seems fine. > But, if there's a list of tables with CREATE PUBLICATION testpub FOR > TABLE normal_tab1, temp_tab2, normal_tab3, foreign_tab4, > unlogged_tab5, normal_tab6, normal_tab7 ......; This query fails on > first encounter of the restricted table, say at temp_tab2. Whereas, > CREATE PUBLICATION testpub FOR ALL TABLES; would skip the restricted > tables and continue to add the accepted tables into the publication > within the database. > > IMHO, if there's a list of tables specified with FOR TABLE, then > instead of throwing an error in case of any restricted table, we can > issue a warning and continue with the addition of accepted tables into > the publication. If done, this behaviour will be in sync with FOR ALL > TABLES; > > Thoughts? If okay, I can work on a patch. > I feel we can start a new thread for this to seek opinion if this base change is required and reach consensus. > Related to error messages: when foreign table is specified in CREATE > PUBLICATION statement, then "ERROR: "f1" is not a table", is thrown > [1], how about the error message "ERROR: foerign table "f1" cannot be > replicated". In general, it would be good if we could have the error > messages something like in [2] instead of the existing [1]. > > Thoughts? If okay, I can work on a patch. > I feel we can start a new thread for this to seek opinion and reach consensus. > [1] > t1 is a temporary table: > postgres=# CREATE PUBLICATION testpub FOR TABLE t1; > ERROR: table "t1" cannot be replicated > DETAIL: Temporary and unlogged relations cannot be replicated. > > t1 is an unlogged table: > postgres=# CREATE PUBLICATION testpub FOR TABLE t1; > ERROR: table "t1" cannot be replicated > DETAIL: Temporary and unlogged relations cannot be replicated. > > f1 is a foreign table: > postgres=# CREATE PUBLICATION testpub FOR TABLE f1; > ERROR: "f1" is not a table > DETAIL: Only tables can be added to publications. > > mv1 is a mat view: > postgres=# CREATE PUBLICATION testpub FOR TABLE mv1; > ERROR: "mv1" is not a table > > idx1 is an index: > postgres=# CREATE PUBLICATION testpub FOR TABLE idx1; > ERROR: "idx1" is an index > > [2] > t1 is a temporary table: > postgres=# CREATE PUBLICATION testpub FOR TABLE t1; > ERROR: temporary table "t1" cannot be replicated > DETAIL: Temporary, unlogged and foreign relations cannot be replicated. > > t1 is an unlogged table: > postgres=# CREATE PUBLICATION testpub FOR TABLE t1; > ERROR: unlogged table "t1" cannot be replicated > DETAIL: Temporary, unlogged and foreign relations cannot be replicated. > > f1 is a foreign table: > postgres=# CREATE PUBLICATION testpub FOR TABLE f1; > ERROR: foreign table "f1" cannot be replicated > DETAIL: Temporary, unlogged and foreign relations cannot be replicated. > > > > Say a user has created a publication for a schema with hundreds of > > > tables in it, at some point later, can he stop replicating a single or > > > some tables from that schema? > > > > > > > There is no provision for this currently. > > The documentation [1] says, we can ALTER PUBLICATION testpub DROP > TABLE t1; which removes the table from the list of published tables, > but looks like it requires ALTER SUBSCRIPTION testsub REFRESH > PUBLICATION; for the changes to become effective on the subscriber. I > have done some testing for this case: > 1) created publication for table t1, see \d+ t1, the associated > publication is visible in the output > 2) created subscription on the subscriber, initial available data from > the publisher for table t1 is received > 3) insert into table t1 on the publisher > 4) inserted data in (3) is received in the subscriber table t1 > 5) alter publication to drop the table t1 on the publisher, see \d+ > t1, there will not be any associated publication in the output > 6) execute alter subscription refresh publication on the subscriber, > with the expectation that it should not receive the data from the > publisher for the table t1 since it's dropped from the publication in > (5) > 7) insert into table t1 on the publisher > 8) still the newly inserted data in (7) from the publisher, will be > received into the table t1 in the subscriber > > IIUC, the behaviour of ALTER PUBLICATION DROP TABLE from the docs and > the above use case, it looks like a bug to me. If I'm wrong, can > someone correct me? > > Thoughts? I think you started a new thread for this, let's conclude on this there. Regards, Vignesh EnterpriseDB: http://www.enterprisedb.com
On Tue, Jan 12, 2021 at 5:16 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
>
> Hi Vignesh,
>
> I had a look at the patch, please consider following comments.
>
> On Thu, Jan 7, 2021 at 10:03 PM vignesh C <vignesh21@gmail.com> wrote:
>>
>> Hi,
>>
>> This feature adds schema option while creating publication. Users will
>> be able to specify one or more schemas while creating publication,
>> when the user specifies schema option, then the data changes for the
>> tables present in the schema specified by the user will be replicated
>> to the subscriber. Few examples have been listed below:
>>
>> Create a publication that publishes all changes for all the tables
>> present in production schema:
>> CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;
>>
> Should it be FOR TABLES IN SCHEMA instead of FOR ALL TABLES SCHEMA?
>
For adding tables into publication we have syntax like:
CREATE PUBLICATION mypub FOR TABLE tbl1, tbl2;
For all tables we have syntax like:
CREATE PUBLICATION mypub FOR ALL TABLES;
Initial syntax that I proposed was:
CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production;
I feel the below syntax is better, as it is consistent with others:
CREATE PUBLICATION mypub FOR SCHEMA sch1, sch2;
>>
>> Create a publication that publishes all changes for all the tables
>> present in marketing and sales schemas:
>> CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales;
>>
>> Add some schemas to the publication:
>> ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june;
>>
> As per current implementation this command fails even if one of the schemas does not
> exist. I think this is counterintuitive, it should throw a warning and continue adding the rest.
>
We have the similar behavior in case of adding non-existent table while creating a publication:
CREATE PUBLICATION mypub3 FOR TABLE non_existent_table;
ERROR: relation "non_existent_table" does not exist
I feel we can keep the behavior similarly to maintain the consistency.
>>
>> Drop some schema from the publication:
>> ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july;
>>
> Same for drop schema, if one of these schemas does not exist in publication,
> the entire DROP operation is aborted.
We have similar behavior in case of dropping non-existent table while altering publication
alter publication mypub5 drop table test1,testx;
ERROR: relation "testx" does not exist
I feel we can keep the behavior similarly to maintain the consistency.
Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
On Sat, Jan 9, 2021 at 5:21 PM vignesh C <vignesh21@gmail.com> wrote: > > On Fri, Jan 8, 2021 at 4:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Jan 7, 2021 at 10:03 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > This feature adds schema option while creating publication. Users will > > > be able to specify one or more schemas while creating publication, > > > when the user specifies schema option, then the data changes for the > > > tables present in the schema specified by the user will be replicated > > > to the subscriber. Few examples have been listed below: > > > > > > Create a publication that publishes all changes for all the tables > > > present in production schema: > > > CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA production; > > > > > > Create a publication that publishes all changes for all the tables > > > present in marketing and sales schemas: > > > CREATE PUBLICATION sales_publication FOR ALL TABLES SCHEMA marketing, sales; > > > > > > Add some schemas to the publication: > > > ALTER PUBLICATION sales_publication ADD SCHEMA marketing_june, sales_june; > > > > > > Drop some schema from the publication: > > > ALTER PUBLICATION production_quarterly_publication DROP SCHEMA production_july; > > > > > > Attached is a POC patch for the same. I felt this feature would be quite useful. > > > > > > > What do we do if the user Drops the schema? Do we automatically remove > > it from the publication? > > > I have not yet handled this scenario yet, I will handle this and > adding of tests in the next patch. > I have handled the above scenario(drop schema should automatically remove the schema entry from publication schema relation) & addition of tests in the new v2 patch attached. Thoughts? Regards, Vignesh EnterpriseDB: http://www.enterprisedb.com
Attachment
>
>
> As Amit pointed out earlier, the behaviour when schema dropped, I
> think we should also consider when schema is altered, say altered to a
> different name, maybe we should change that in the publication too.
>
This scenario is handled now in the patch posted at [1].
>
> Say a user has a schema with 121 tables in it, and wants to replicate
> only 120 or 199 or even lesser tables out of it, so can we have some
> skip option to the new syntax, something like below?
> CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA
> production WITH skip = marketing, accounts, sales; --> meaning is,
> replicate all the tables in the schema production except marketing,
> accounts, sales tables.
I have not yet handled this, I'm working on this and will try post a patch for this in the next version.
[1] - https://www.postgresql.org/message-id/CALDaNm02%3Dk8K_ZSN7_dyVHyMTW4B5hOaeo2PzdWG%3Da7GtLH0oA%40mail.gmail.com
Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
I have handled the above scenario(drop schema should automatically
remove the schema entry from publication schema relation) & addition
of tests in the new v2 patch attached.
Thoughts?
+ of schemas in the publication with the specified one. The <literal>ADD
Thanks Rahila for your comments. Please find my thoughts below: On Wed, Jan 20, 2021 at 6:27 PM Rahila Syed <rahilasyed90@gmail.com> wrote: > > Hi Vignesh, > >> >> I have handled the above scenario(drop schema should automatically >> remove the schema entry from publication schema relation) & addition >> of tests in the new v2 patch attached. >> Thoughts? > > > Please see some initial comments: > > 1. I think there should be more tests to show that the schema data is actually replicated > to the subscriber. Currently, I am not seeing the data being replicated when I use FOR SCHEMA. > I will fix this issue and include more tests in my next version of the patch. > 2. How does replication behave when a table is added or removed from a subscribed schema > using ALTER TABLE SET SCHEMA? > I would like to keep the behavior similar to the table behavior. I will post more details for this along with my next version of the patch. > 3. Can we have a default schema like a public or current schema that gets replicated in case the user didn't > specify one, this can be handy to replicate current schema tables. > It looks like a good use case, I will check on the feasibility of this and try to implement this. > 4. + The fourth, fifth and sixth variants change which schemas are part of the > + publication. The <literal>SET TABLE</literal> clause will replace the list > + of schemas in the publication with the specified one. The <literal>ADD > > There is a typo above s/SET TABLE/SET SCHEMA I will fix this in the next version of the patch. Regards, Vignesh EnterpriseDB: http://www.enterprisedb.com
On Fri, Jan 22, 2021 at 10:01 AM vignesh C <vignesh21@gmail.com> wrote: > > Thanks Rahila for your comments. Please find my thoughts below: > > On Wed, Jan 20, 2021 at 6:27 PM Rahila Syed <rahilasyed90@gmail.com> wrote: > > > > Hi Vignesh, > > > >> > >> I have handled the above scenario(drop schema should automatically > >> remove the schema entry from publication schema relation) & addition > >> of tests in the new v2 patch attached. > >> Thoughts? > > > > > > Please see some initial comments: > > > > 1. I think there should be more tests to show that the schema data is actually replicated > > to the subscriber. Currently, I am not seeing the data being replicated when I use FOR SCHEMA. > > > I will fix this issue and include more tests in my next version of the patch. Modified to handle this and also added a few more tests. > > 2. How does replication behave when a table is added or removed from a subscribed schema > > using ALTER TABLE SET SCHEMA? > > > I would like to keep the behavior similar to the table behavior. I > will post more details for this along with my next version of the > patch. > If a table is set to a different schema, after the schema change table data will not be sent to the subscriber. When a new table is added to the published schema, the table data will be sent by the publisher, subscriber will not apply the changes. If the change needs to be reflected, subscriber's publication should be refreshed using "alter subscription mysub1 refresh publication". This relation will be reflected in the subscriber relation when the subscriber's publication is refreshed. If a table is dropped, there is no impact on subscriber, This relation will be present in pg_subscriber_rel after refreshing subscriber publication. > > 3. Can we have a default schema like a public or current schema that gets replicated in case the user didn't > > specify one, this can be handy to replicate current schema tables. > > > It looks like a good use case, I will check on the feasibility of this > and try to implement this. This can be done, I will handle this later. > > 4. + The fourth, fifth and sixth variants change which schemas are part of the > > + publication. The <literal>SET TABLE</literal> clause will replace the list > > + of schemas in the publication with the specified one. The <literal>ADD > > > > There is a typo above s/SET TABLE/SET SCHEMA > I will fix this in the next version of the patch. Modified it. I have separated the tests and documentation into a separate patch to make review easier. Attached v3 patch with the fixes. Thoughts? Regards, Vignesh
Attachment
On Sun, Jan 31, 2021 at 11:32 PM vignesh C <vignesh21@gmail.com> wrote: > > On Fri, Jan 22, 2021 at 10:01 AM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks Rahila for your comments. Please find my thoughts below: > > > > On Wed, Jan 20, 2021 at 6:27 PM Rahila Syed <rahilasyed90@gmail.com> wrote: > > > > > > Hi Vignesh, > > > > > >> > > >> I have handled the above scenario(drop schema should automatically > > >> remove the schema entry from publication schema relation) & addition > > >> of tests in the new v2 patch attached. > > >> Thoughts? > > > > > > > > > Please see some initial comments: > > > > > > 1. I think there should be more tests to show that the schema data is actually replicated > > > to the subscriber. Currently, I am not seeing the data being replicated when I use FOR SCHEMA. > > > > > I will fix this issue and include more tests in my next version of the patch. > > Modified to handle this and also added a few more tests. > > > > 2. How does replication behave when a table is added or removed from a subscribed schema > > > using ALTER TABLE SET SCHEMA? > > > > > I would like to keep the behavior similar to the table behavior. I > > will post more details for this along with my next version of the > > patch. > > > > If a table is set to a different schema, after the schema change table > data will not be sent to the subscriber. > When a new table is added to the published schema, the table data will > be sent by the publisher, subscriber will not apply the changes. If > the change needs to be reflected, subscriber's publication should be > refreshed using "alter subscription mysub1 refresh publication". This > relation will be reflected in the subscriber relation when the > subscriber's publication is refreshed. > If a table is dropped, there is no impact on subscriber, This relation > will be present in pg_subscriber_rel after refreshing subscriber > publication. > > > > 3. Can we have a default schema like a public or current schema that gets replicated in case the user didn't > > > specify one, this can be handy to replicate current schema tables. > > > > > It looks like a good use case, I will check on the feasibility of this > > and try to implement this. > > This can be done, I will handle this later. > > > > 4. + The fourth, fifth and sixth variants change which schemas are part of the > > > + publication. The <literal>SET TABLE</literal> clause will replace the list > > > + of schemas in the publication with the specified one. The <literal>ADD > > > > > > There is a typo above s/SET TABLE/SET SCHEMA > > I will fix this in the next version of the patch. > > Modified it. > I have separated the tests and documentation into a separate patch to > make review easier. Attached v3 patch with the fixes. > Thoughts? The earlier patch does not apply on the head. The v4 patch attached has the following changes: a) Rebased it on head. b) Removed pubschemas, pubtables columns and replaced it with pubtype in pg_publication table. c) List the schemas in describe publication. d) List the publication in list schemas. e) Add support for "FOR SCHEMA CURRENT_SCHEMA". f) Tab completion for "FOR SCHEMA" in create publication and alter publication. g) Included the newly added structure type to typedefs.lst Regards, Vignesh
Attachment
On Monday, May 24, 2021 at 8:31 PM vignesh C <vignesh21@gmail.com> wrote: > The earlier patch does not apply on the head. The v4 patch attached > has the following changes: > a) Rebased it on head. b) Removed pubschemas, pubtables columns and > replaced it with pubtype in pg_publication table. c) List the schemas > in describe publication. d) List the publication in list schemas. e) > Add support for "FOR SCHEMA CURRENT_SCHEMA". f) Tab completion for > "FOR SCHEMA" in create publication and alter publication. g) Included > the newly added structure type to typedefs.lst Thanks for your patch. I ran "make check-world" after applying your patch but it failed on my machine. I saw the following log: -------------- parallel group (2 tests): subscription publication publication ... FAILED 108 ms subscription ... ok 87 ms diff -U3 /home/fnst/data/postgresql_schema/postgresql/src/test/regress/expected/publication.out /home/fnst/data/postgresql_schema/postgresql/src/test/regress/results/publication.out --- /home/fnst/data/postgresql_schema/postgresql/src/test/regress/expected/publication.out 2021-05-25 15:44:52.261683712+0800 +++ /home/fnst/data/postgresql_schema/postgresql/src/test/regress/results/publication.out 2021-05-25 15:48:41.393672595+0800 @@ -359,10 +359,10 @@ "public" \dn public; - List of schemas - Name | Owner ---------+--------- - public | vignesh +List of schemas + Name | Owner +--------+------- + public | fnst Publications: "testpub3_forschema" -------------- I think the owner of CURRENT_SCHEMA should not be written into publication.out because the result is related to the user. Maybe we can use "ALTER SCHEMA public OWNER TO owner" to change its default owner before this test case. Thoughts? Regards Tang
On Tue, May 25, 2021 at 2:47 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Monday, May 24, 2021 at 8:31 PM vignesh C <vignesh21@gmail.com> wrote: > > The earlier patch does not apply on the head. The v4 patch attached > > has the following changes: > > a) Rebased it on head. b) Removed pubschemas, pubtables columns and > > replaced it with pubtype in pg_publication table. c) List the schemas > > in describe publication. d) List the publication in list schemas. e) > > Add support for "FOR SCHEMA CURRENT_SCHEMA". f) Tab completion for > > "FOR SCHEMA" in create publication and alter publication. g) Included > > the newly added structure type to typedefs.lst > > Thanks for your patch. > > I ran "make check-world" after applying your patch but it failed on my machine. I saw the following log: > -------------- > parallel group (2 tests): subscription publication > publication ... FAILED 108 ms > subscription ... ok 87 ms > > > diff -U3 /home/fnst/data/postgresql_schema/postgresql/src/test/regress/expected/publication.out /home/fnst/data/postgresql_schema/postgresql/src/test/regress/results/publication.out > --- /home/fnst/data/postgresql_schema/postgresql/src/test/regress/expected/publication.out 2021-05-25 15:44:52.261683712+0800 > +++ /home/fnst/data/postgresql_schema/postgresql/src/test/regress/results/publication.out 2021-05-25 15:48:41.393672595+0800 > @@ -359,10 +359,10 @@ > "public" > > \dn public; > - List of schemas > - Name | Owner > ---------+--------- > - public | vignesh > +List of schemas > + Name | Owner > +--------+------- > + public | fnst > Publications: > "testpub3_forschema" > -------------- > > I think the owner of CURRENT_SCHEMA should not be written into publication.out because the result is related to the user. > Maybe we can use "ALTER SCHEMA public OWNER TO owner" to change its default owner before this test case. Thoughts? Thanks for identifying and reporting this issue. I have \dn with the equivalent query to display only the publication name. The updated patch has the fix for the same. Regards, Vignesh
Attachment
On Sat, Jun 5, 2021 at 7:02 PM vignesh C <vignesh21@gmail.com> wrote: > Thanks for identifying and reporting this issue. I have \dn with the > equivalent query to display only the publication name. The updated > patch has the fix for the same. Currently, FOR ALL TABLES is there to add all the tables(existing and future) in the current database in which the publication is created. I wonder before providing FOR SCHEMA capability, we better target FOR DATABASE first, something like CREATE PUBLICATION ... FOR DATABASE foo, bar, baz, qux; Of course users with the proper permissions on the specified databases can add them to the publication. This can help to add all the tables in other databases as well. Then, the CREATE PUBLICATION ... FOR SCHEMA foo, bar, baz, qux; makes more sense. Because, my understanding is that: database is a collection of tables, schema is a collection of databases. I may be wrong here, but it's just a thought. What do you think? With Regards, Bharath Rupireddy.
On Sat, Jun 5, 2021 at 7:02 PM vignesh C <vignesh21@gmail.com> wrote:
> Thanks for identifying and reporting this issue. I have \dn with the
> equivalent query to display only the publication name. The updated
> patch has the fix for the same.
Currently, FOR ALL TABLES is there to add all the tables(existing and
future) in the current database in which the publication is created. I
wonder before providing FOR SCHEMA capability, we better target FOR
DATABASE first, something like CREATE PUBLICATION ... FOR DATABASE
foo, bar, baz, qux; Of course users with the proper permissions on the
specified databases can add them to the publication. This can help to
add all the tables in other databases as well. Then, the CREATE
PUBLICATION ... FOR SCHEMA foo, bar, baz, qux; makes more sense.
Because, my understanding is that: database is a collection of tables,
schema is a collection of databases. I may be wrong here, but it's
just a thought. What do you think?
On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > And, with this feature, since there can be many huge tables inside a > schema, the initial table sync phase of the replication can take a > while. > > Say a user has created a publication for a schema with hundreds of > tables in it, at some point later, can he stop replicating a single or > some tables from that schema? > Isn't this applies to FOR ALL TABLES syntax as well where the user might want to exclude one or a few tables? I am not sure if it is a good idea to deal with this as part of this patch. -- With Regards, Amit Kapila.
On Sat, Jun 5, 2021 at 11:32 PM vignesh C <vignesh21@gmail.com> wrote: > Thanks for identifying and reporting this issue. I have \dn with the > equivalent query to display only the publication name. The updated > patch has the fix for the same. > The patch no longer applies, I think a recent change to tab-complete has broken it. regards, Ajin Cherian Fujitsu Australia
On Wed, Jun 16, 2021 at 5:12 PM Ajin Cherian <itsajin@gmail.com> wrote: > > On Sat, Jun 5, 2021 at 11:32 PM vignesh C <vignesh21@gmail.com> wrote: > > > Thanks for identifying and reporting this issue. I have \dn with the > > equivalent query to display only the publication name. The updated > > patch has the fix for the same. > > > > The patch no longer applies, I think a recent change to tab-complete > has broken it. Thanks for reporting it, the attached patch is a rebased version of the patch with few review comment fixes, I will reply with the comment fixes to the respective mails. Regards, Vignesh
Attachment
>
> On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > And, with this feature, since there can be many huge tables inside a
> > schema, the initial table sync phase of the replication can take a
> > while.
> >
> > Say a user has created a publication for a schema with hundreds of
> > tables in it, at some point later, can he stop replicating a single or
> > some tables from that schema?
> >
>
> Isn't this applies to FOR ALL TABLES syntax as well where the user
> might want to exclude one or a few tables? I am not sure if it is a
> good idea to deal with this as part of this patch.
Yes, this applies to "FOR ALL TABLES" publications also, i had worked for this, the patch for the same is attached at [1]. I had created a separate patch for this so that it is easier for reviewing.
[1] - https://www.postgresql.org/message-id/CALDaNm10g2h29a-oFHsadk-Du6RDhnVQT_vfTGqR82DsjxQLqQ%40mail.gmail.com
>
> Hi Vignesh,
>
>>
>> I have handled the above scenario(drop schema should automatically
>> remove the schema entry from publication schema relation) & addition
>> of tests in the new v2 patch attached.
>> Thoughts?
>
>
> Please see some initial comments:
>
> 1. I think there should be more tests to show that the schema data is actually replicated
> to the subscriber. Currently, I am not seeing the data being replicated when I use FOR SCHEMA.
>
> 2. How does replication behave when a table is added or removed from a subscribed schema
> using ALTER TABLE SET SCHEMA?
>
> 3. Can we have a default schema like a public or current schema that gets replicated in case the user didn't
> specify one, this can be handy to replicate current schema tables.
I felt supporting a syntax like below will be useful for supporting current schema:
create publication testpub for schema schema_name
or
create publication testpub for schema CURRENT_SCHEMA
Let the user specify CURRENT_SCHEMA explicitly if required instead of not specifying anything.
I have implemented similar lines in the V6 patch at [1]. Thoughts?
[1] - https://www.postgresql.org/message-id/CALDaNm10g2h29a-oFHsadk-Du6RDhnVQT_vfTGqR82DsjxQLqQ%40mail.gmail.com
Regards,
Vignesh
On Thu, Jun 17, 2021 at 12:41 AM vignesh C <vignesh21@gmail.com> wrote: > Thanks for reporting it, the attached patch is a rebased version of > the patch with few review comment fixes, I will reply with the comment > fixes to the respective mails. > I've applied the patch, it applies cleand and ran "make check" and tests run fine. Some comments for patch 1: In the commit message, some grammar mistakes: "Changes was done in pg_dump to handle pubtype updation in pg_publication table while the database gets upgraded." -------------- change to -- Changes were done in pg_dump to handle pubtype updation in pg_publication table while the database gets upgraded. ============= Prototypes present in pg_publication.h was moved to publicationcmds.h so that minimal datastructures ... ----------------- change to -- Prototypes present in pg_publication.h were moved to publicationcmds.h so that minimal datastructures .. ======================== In patch 1: In getObjectDescription() + if (!nspname) + { + pfree(pubname); + pfree(nspname); + ReleaseSysCache(tup); Why free nspname if it is NULL? Same comment in getObjectIdentityParts() ============================ In GetAllTablesPublicationRelations() + ScanKeyData key[2]; TableScanDesc scan; HeapTuple tuple; List *result = NIL; + int keycount = 0; classRel = table_open(RelationRelationId, AccessShareLock); - ScanKeyInit(&key[0], + ScanKeyInit(&key[keycount++], Here you have init key[1], but the code below in the same function inits key[0]. I am not sure if this will affect that code below. if (pubviaroot) { ScanKeyInit(&key[0], Anum_pg_class_relkind, BTEqualStrategyNumber, F_CHAREQ, CharGetDatum(RELKIND_PARTITIONED_TABLE)); ================================= in UpdatePublicationTypeTupleValue(): + tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls, + replaces); + + /* Update the catalog. */ + CatalogTupleUpdate(rel, &tup->t_self, tup); Not sure if this tup needs to be freed or if the memory context will take care of it. ===================== regards, Ajin Cherian Fujitsu Australia
Thanks for the comments. On Fri, Jun 18, 2021 at 5:25 PM Ajin Cherian <itsajin@gmail.com> wrote: > > On Thu, Jun 17, 2021 at 12:41 AM vignesh C <vignesh21@gmail.com> wrote: > > > Thanks for reporting it, the attached patch is a rebased version of > > the patch with few review comment fixes, I will reply with the comment > > fixes to the respective mails. > > > > I've applied the patch, it applies cleand and ran "make check" and > tests run fine. > > Some comments for patch 1: > > In the commit message, some grammar mistakes: > > "Changes was done in > pg_dump to handle pubtype updation in pg_publication table while the database > gets upgraded." > > -------------- change to -- > > Changes were done in > pg_dump to handle pubtype updation in pg_publication table while the database > gets upgraded. > I will modify this. > ============= > > Prototypes present in pg_publication.h was moved to publicationcmds.h so > that minimal datastructures ... > > ----------------- change to -- > > Prototypes present in pg_publication.h were moved to publicationcmds.h so > that minimal datastructures .. > > ======================== I will modify this. > > In patch 1: > > In getObjectDescription() > > + if (!nspname) > + { > + pfree(pubname); > + pfree(nspname); > + ReleaseSysCache(tup); > > Why free nspname if it is NULL? > > Same comment in getObjectIdentityParts() I will modify this. > ============================ > > In GetAllTablesPublicationRelations() > > + ScanKeyData key[2]; > TableScanDesc scan; > HeapTuple tuple; > List *result = NIL; > + int keycount = 0; > > classRel = table_open(RelationRelationId, AccessShareLock); > > - ScanKeyInit(&key[0], > + ScanKeyInit(&key[keycount++], > > Here you have init key[1], but the code below in the same function > inits key[0]. I am not sure if this will affect that code below. > > if (pubviaroot) > { > ScanKeyInit(&key[0], > Anum_pg_class_relkind, > BTEqualStrategyNumber, F_CHAREQ, > CharGetDatum(RELKIND_PARTITIONED_TABLE)); I felt this is ok as we specify the keycount to be 1, so only the key[0] will be used. Thoughts? ScanKeyInit(&key[0], Anum_pg_class_relkind, BTEqualStrategyNumber, F_CHAREQ, CharGetDatum(RELKIND_PARTITIONED_TABLE)); scan = table_beginscan_catalog(classRel, 1, key); > ================================= > > in UpdatePublicationTypeTupleValue(): > > + tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls, > + replaces); > + > + /* Update the catalog. */ > + CatalogTupleUpdate(rel, &tup->t_self, tup); > > Not sure if this tup needs to be freed or if the memory context will > take care of it. I felt this is ok, as the cleanup is handled in the caller function "AlterPublication", thoughts? /* Cleanup. */ heap_freetuple(tup); table_close(rel, RowExclusiveLock); I will post an update patch for the fixes shortly. Regards, Vignesh
On Mon, Jun 21, 2021 at 3:16 PM vignesh C <vignesh21@gmail.com> wrote: > I felt this is ok as we specify the keycount to be 1, so only the > key[0] will be used. Thoughts? > ScanKeyInit(&key[0], > Anum_pg_class_relkind, > BTEqualStrategyNumber, F_CHAREQ, > CharGetDatum(RELKIND_PARTITIONED_TABLE)); > > scan = table_beginscan_catalog(classRel, 1, key); > It maybe fine, just doesn't look correct when you look at the function as a whole. > > ================================= > > > > in UpdatePublicationTypeTupleValue(): > > > > + tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls, > > + replaces); > > + > > + /* Update the catalog. */ > > + CatalogTupleUpdate(rel, &tup->t_self, tup); > > > > Not sure if this tup needs to be freed or if the memory context will > > take care of it. > > I felt this is ok, as the cleanup is handled in the caller function > "AlterPublication", thoughts? > /* Cleanup. */ > heap_freetuple(tup); > table_close(rel, RowExclusiveLock); that should be fine. regards, Ajin Cherian Fujitsu Australia
On Tue, Jun 22, 2021 at 6:57 AM Ajin Cherian <itsajin@gmail.com> wrote: > > On Mon, Jun 21, 2021 at 3:16 PM vignesh C <vignesh21@gmail.com> wrote: > > > > I felt this is ok as we specify the keycount to be 1, so only the > > key[0] will be used. Thoughts? > > ScanKeyInit(&key[0], > > Anum_pg_class_relkind, > > BTEqualStrategyNumber, F_CHAREQ, > > CharGetDatum(RELKIND_PARTITIONED_TABLE)); > > > > scan = table_beginscan_catalog(classRel, 1, key); > > > > It maybe fine, just doesn't look correct when you look at the function > as a whole. I have added a local variable for this to avoid confusion. Updated patch has the fix for this, this also includes the fixes for the other comments you had given. I have removed the skip table patches to keep the focus on the main patch, once this patch gets into committable shape, I will focus on the skip table patch. Regards, Vignesh
Attachment
On Tue, Jun 22, 2021 at 9:45 AM vignesh C <vignesh21@gmail.com> wrote: > I have removed the skip table patches to keep the focus on the main > patch, once this patch gets into committable shape, I will focus on > the skip table patch. IMO it's a good idea to start a new thread for the "skip table" feature so that we can discuss it separately. If required you can specify in that thread that the idea of the "skip table" can be applied to the "add schema level support" feature. With Regards, Bharath Rupireddy.
On Tue, Jun 22, 2021 at 2:15 PM vignesh C <vignesh21@gmail.com> wrote: > > > Updated patch has the fix for this, this also includes the fixes for > the other comments you had given. > I have removed the skip table patches to keep the focus on the main > patch, once this patch gets into committable shape, I will focus on > the skip table patch. > I have the following initial comments on the v7 patches: v7-0001 (1) The patch comment is pretty rough and needs work. I suggest something like the following: This patch adds schema-level support for publication. A new schema option allows one or more schemas to be specified, whose tables are selected by the publisher for sending the data to the subscriber. pg_publication maintains information about the publication. Previously, the "puballtables" bool column was used to indicate if the publication was the "FOR ALL TABLES" type (if true) or the "FOR TABLE" type (if false). With the introduction of the "FOR SCHEMA" publication type, it is not easy to determine the publication type. Therefore, a new column "pubtype" has been added to the pg_publication relation to indicate the publication type. There was the possibility of avoiding addition of this new column, but that would require checking puballtables of pg_publication and checking pg_publication_rel for table type publication and then checking pg_publication_schema for schema type publication. Instead, I preferred to add the "pubtype" column, which makes things easier, and also will help support new options in the future. A new system table "pg_publication_schema" has been added, to maintain the schemas that the user wants to publish through the publication. The schema/publication/publication_schema dependency was created to handle the corresponding renaming/removal of schemas to the publication/publication_schema when the schema is renamed/dropped. The Decoder identifies if the relation is part of the publication and replicates it to the subscriber. Changes were made to pg_dump to handle pubtype updation in the pg_publication table when the database is upgraded. Prototypes present in pg_publication.h have been moved to publicationcmds.h so that minimal data structures are exported to pg_dump and psql clients, as the rest of the information need not be exported. CATALOG_VERSION_NO needs to be updated while committing, as this feature involves a catalog change. TODO: version checks for psql/pg_dump need to be changed from 140000 to 150000 once the ongoing release is completed. (2) src/backend/catalog/objectaddress.c getObjectIdentityParts(), case OCLASS_PUBLICATION_SCHEMA Shouldn't pubname/nspname be pfree()d, if objargs/objname are NULL? (3) src/backend/catalog/pg_publication.c (i) GetAllTablesPublications BEFORE: + * Gets list of relations published. AFTER: + * Gets the list of relations published. There are several other cases of newly-added "Gets list of ..." comments. (ii) GetAllTablesPublicationRelations BEFORE: + * Gets list of all relation published by FOR SCHEMA publication(s). AFTER: + * Gets the list of all relations published by FOR SCHEMA publication(s). (4) src/backend/commands/publicationcmds.c Missing function header for function "UpdatePublicationTypeTupleValue". (5) src/backend/parser/gram.y Violation of function header comment format: BEFORE: +/* makeSchemaSpec + * Create a SchemaSpec with the given type + */ AFTER: +/* + * makeSchemaSpec + * Create a SchemaSpec with the given type + */ (6) src/bin/pg_dump/pg_dump.c The following code is within a loop that processes schemas. I think that (in the comment) "Clean up and return" should instead say "Clean up and process the next schema" Also, should say "Schema is not a member". + /* + * Schema is not member of any publications. Clean up and return. + */ + PQclear(res); + continue; (7) src/bin/psql/describe.c Missing function header for function "addFooterToPublicationDesc". v7-0002 (1) doc/src/sgml/catalogs.sgml Typo BEFORE: + respctively. The publication type cannot be changed in other cases. AFTER: + respectively. The publication type cannot be changed in other cases. (2) doc/src/sgml/ref/create_publication.sgml BEFORE: + Create a publication that publishes all changes for all the tables present in +production schema: AFTER: + Create a publication that publishes all changes for all the tables present in +the schema "production": BEFORE: + Create a publication that publishes all changes for all the tables present in +marketing and sales schemas: AFTER: + Create a publication that publishes all changes for all the tables present in +the schemas "marketing" and "sales": (3) src/test/regress/expected/publication.out BEFORE: +-- Drop schema that is not preset in the publication AFTER: +-- Drop schema that is not present in the publication BEFORE: +--- Check create publication on a object which is not schema AFTER: +--- Check create publication on an object which is not schema (4) src/test/regress/sql/publication.sql BEFORE: +-- Drop schema that is not preset in the publication AFTER: +-- Drop schema that is not present in the publication Regards, Greg Nancarrow Fujitsu Australia
Hi I applied your V7* patch and complied it. The following warnings came out, please take a look. >pg_publication.c:688:22: warning: ‘tables’ may be used uninitialized in this function [-Wmaybe-uninitialized] > funcctx->user_fctx = (void *) tables; > ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ >describe.c: In function ‘describePublications’: >describe.c:6479:35: warning: ‘pubtype’ may be used uninitialized in this function [-Wmaybe-uninitialized] > else if (has_pubtype && pubtype == PUBTYPE_SCHEMA) For the warning in pg_publication.c, maybe we can replace the following 'else if' with 'else'. + else if (publication->pubtype == PUBTYPE_SCHEMA) For the warning in describe.c, initialization of 'pubtype' is needed. Regards Tang
On Thu, Jun 24, 2021 at 4:41 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > I applied your V7* patch and complied it. The following warnings came out, please take a look. > I encountered the following warnings when building with the v7 patches applied: (looks like missing #include "catalog/objectaddress.h") pg_prewarm.c:109:29: warning: implicit declaration of function ‘get_relkind_objtype’; did you mean ‘get_element_type’? [-Wimplicit-function-declaration] aclcheck_error(aclresult, get_relkind_objtype(rel->rd_rel->relkind), get_rel_name(relOid)); ^~~~~~~~~~~~~~~~~~~ get_element_type heap_surgery.c:391:9: warning: implicit declaration of function ‘get_relkind_objtype’; did you mean ‘get_publication_type’? [-Wimplicit-function-declaration] get_relkind_objtype(rel->rd_rel->relkind), ^~~~~~~~~~~~~~~~~~~ get_publication_type pgrowlocks.c:136:29: warning: implicit declaration of function ‘get_relkind_objtype’ [-Wimplicit-function-declaration] aclcheck_error(aclresult, get_relkind_objtype(rel->rd_rel->relkind), Regards, Greg Nancarrow Fujitsu Australia
On Wed, Jun 23, 2021 at 2:10 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Tue, Jun 22, 2021 at 2:15 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Updated patch has the fix for this, this also includes the fixes for > > the other comments you had given. > > I have removed the skip table patches to keep the focus on the main > > patch, once this patch gets into committable shape, I will focus on > > the skip table patch. > > > > I have the following initial comments on the v7 patches: > > v7-0001 > > (1) The patch comment is pretty rough and needs work. > > I suggest something like the following: > > This patch adds schema-level support for publication. > A new schema option allows one or more schemas to be specified, whose tables > are selected by the publisher for sending the data to the subscriber. > > pg_publication maintains information about the publication. Previously, the > "puballtables" bool column was used to indicate if the publication was the > "FOR ALL TABLES" type (if true) or the "FOR TABLE" type (if false). With the > introduction of the "FOR SCHEMA" publication type, it is not easy to determine > the publication type. Therefore, a new column "pubtype" has been added to the > pg_publication relation to indicate the publication type. > There was the possibility of avoiding addition of this new column, but that > would require checking puballtables of pg_publication and checking > pg_publication_rel for table type publication and then checking > pg_publication_schema for schema type publication. Instead, I preferred to add > the "pubtype" column, which makes things easier, and also will help support > new options in the future. > A new system table "pg_publication_schema" has been added, to maintain the > schemas that the user wants to publish through the publication. The > schema/publication/publication_schema dependency was created to handle the > corresponding renaming/removal of schemas to the publication/publication_schema > when the schema is renamed/dropped. The Decoder identifies if the relation is > part of the publication and replicates it to the subscriber. Changes were made > to pg_dump to handle pubtype updation in the pg_publication table when the > database is upgraded. > > Prototypes present in pg_publication.h have been moved to publicationcmds.h so > that minimal data structures are exported to pg_dump and psql clients, as the > rest of the information need not be exported. > > CATALOG_VERSION_NO needs to be updated while committing, as this feature > involves a catalog change. > > TODO: version checks for psql/pg_dump need to be changed from 140000 to 150000 > once the ongoing release is completed. > Modified. > (2) src/backend/catalog/objectaddress.c > getObjectIdentityParts(), case OCLASS_PUBLICATION_SCHEMA > > Shouldn't pubname/nspname be pfree()d, if objargs/objname are NULL? > Modified > (3) src/backend/catalog/pg_publication.c > > (i) > GetAllTablesPublications > > > BEFORE: > + * Gets list of relations published. > AFTER: > + * Gets the list of relations published. > > There are several other cases of newly-added "Gets list of ..." comments. > Modified > (ii) > GetAllTablesPublicationRelations > > BEFORE: > + * Gets list of all relation published by FOR SCHEMA publication(s). > AFTER: > + * Gets the list of all relations published by FOR SCHEMA publication(s). > Modified > (4) src/backend/commands/publicationcmds.c > > Missing function header for function "UpdatePublicationTypeTupleValue". > Included function header. > (5) src/backend/parser/gram.y > > Violation of function header comment format: > > BEFORE: > +/* makeSchemaSpec > + * Create a SchemaSpec with the given type > + */ > > AFTER: > +/* > + * makeSchemaSpec > + * Create a SchemaSpec with the given type > + */ > Modified it to: /* * makeSchemaSpec - Create a SchemaSpec with the given type */ > (6) src/bin/pg_dump/pg_dump.c > > The following code is within a loop that processes schemas. > I think that (in the comment) "Clean up and return" should instead say > "Clean up and process the next schema" > Also, should say "Schema is not a member". > > + /* > + * Schema is not member of any publications. Clean up and return. > + */ > + PQclear(res); > + continue; > Modified. > (7) src/bin/psql/describe.c > > Missing function header for function "addFooterToPublicationDesc". > Included it. > v7-0002 > > (1) doc/src/sgml/catalogs.sgml > Typo > > BEFORE: > + respctively. The publication type cannot be changed in other cases. > AFTER: > + respectively. The publication type cannot be changed in other cases. > Modified > (2) doc/src/sgml/ref/create_publication.sgml > > BEFORE: > + Create a publication that publishes all changes for all the tables > present in > +production schema: > AFTER: > + Create a publication that publishes all changes for all the tables > present in > +the schema "production": > > > BEFORE: > + Create a publication that publishes all changes for all the tables > present in > +marketing and sales schemas: > AFTER: > + Create a publication that publishes all changes for all the tables > present in > +the schemas "marketing" and "sales": > Modified. > (3) src/test/regress/expected/publication.out > > BEFORE: > +-- Drop schema that is not preset in the publication > AFTER: > +-- Drop schema that is not present in the publication > > BEFORE: > +--- Check create publication on a object which is not schema > AFTER: > +--- Check create publication on an object which is not schema > Modified. > (4) src/test/regress/sql/publication.sql > > BEFORE: > +-- Drop schema that is not preset in the publication > AFTER: > +-- Drop schema that is not present in the publication > Modified. Thanks for reviewing and providing the comments, Attached v8 patches have the fixes for the comments. Regards, Vignesh
Attachment
>
> On Thu, Jun 24, 2021 at 4:41 PM tanghy.fnst@fujitsu.com
> <tanghy.fnst@fujitsu.com> wrote:
> >
> > I applied your V7* patch and complied it. The following warnings came out, please take a look.
> >
>
> I encountered the following warnings when building with the v7 patches applied:
> (looks like missing #include "catalog/objectaddress.h")
>
> pg_prewarm.c:109:29: warning: implicit declaration of function
> ‘get_relkind_objtype’; did you mean ‘get_element_type’?
> [-Wimplicit-function-declaration]
> aclcheck_error(aclresult,
> get_relkind_objtype(rel->rd_rel->relkind), get_rel_name(relOid));
> ^~~~~~~~~~~~~~~~~~~
> get_element_type
>
> heap_surgery.c:391:9: warning: implicit declaration of function
> ‘get_relkind_objtype’; did you mean ‘get_publication_type’?
> [-Wimplicit-function-declaration]
> get_relkind_objtype(rel->rd_rel->relkind),
> ^~~~~~~~~~~~~~~~~~~
> get_publication_type
>
> pgrowlocks.c:136:29: warning: implicit declaration of function
> ‘get_relkind_objtype’ [-Wimplicit-function-declaration]
> aclcheck_error(aclresult, get_relkind_objtype(rel->rd_rel->relkind),
>
Thanks for reporting these warnings, I have fixed this in the v8 patch attached at [1].
On Thu, Jun 24, 2021 at 12:10 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > Hi > > I applied your V7* patch and complied it. The following warnings came out, please take a look. > > >pg_publication.c:688:22: warning: ‘tables’ may be used uninitialized in this function [-Wmaybe-uninitialized] > > funcctx->user_fctx = (void *) tables; > > ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ > >describe.c: In function ‘describePublications’: > >describe.c:6479:35: warning: ‘pubtype’ may be used uninitialized in this function [-Wmaybe-uninitialized] > > else if (has_pubtype && pubtype == PUBTYPE_SCHEMA) > > For the warning in pg_publication.c, maybe we can replace the following 'else if' with 'else'. > + else if (publication->pubtype == PUBTYPE_SCHEMA) > > For the warning in describe.c, initialization of 'pubtype' is needed. Thanks for reporting these warnings, I have fixed this in the v8 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm044P_cds1OxZvFse5rE_qQfhbUg5MdtMgsa7t_bZGJdw%40mail.gmail.com Regards, Vignesh
On Friday, June 25, 2021 2:25 AM vignesh C <vignesh21@gmail.com>wrote: > > Thanks for reporting these warnings, I have fixed this in the v8 patch > attached at [1]. > [1] - https://www.postgresql.org/message- > id/CALDaNm044P_cds1OxZvFse5rE_qQfhbUg5MdtMgsa7t_bZGJdw%40mail. > gmail.com > Thanks for your patch. The warnings are fixed. But I found an issue while using your V8 patch, which is similar to [1]. The case is as below: Drop a schema from publication and refresh publication at subscriber, then insert into publisher table, the inserts stillreplicated to subscriber. The expect result is that the data is no longer replicated. For example: ------publisher------ create schema s1; create table s1.t1 (a int primary key); create publication pub for schema s1; ------subscriber------ create schema s1; create table s1.t1 (a int primary key); create subscription sub connection 'dbname=postgres port=5432' publication pub; ------publisher------ insert into s1.t1 values (1); ------subscriber------ postgres=# select * from s1.t1; a --- 1 (1 row) ------publisher------ alter publication pub drop schema s1; insert into s1.t1 values (2); ------subscriber------ postgres=# select * from s1.t1; a --- 1 2 (2 rows) The similar issue [1] (related to "ALTER PUBLICATION .. DROP TABLE") was fixed by modifying in rel_sync_cache_publication_cbcallback, which is related to PUBLICATIONRELMAP syscache. In my case, I think it used PUBLICATIONSCHEMAMAPsyscache, and no callback was registered for it. Should we register a callback for it or fix it in otherways? [1]https://www.postgresql.org/message-id/flat/CALj2ACV%2B0UFpcZs5czYgBpujM9p0Hg1qdOZai_43OU7bqHU_xw%40mail.gmail.com Regards Tang
On Tuesday, June 29, 2021 11:25 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > Thanks for your patch. The warnings are fixed. > > But I found an issue while using your V8 patch, which is similar to [1]. The case > is as below: > Drop a schema from publication and refresh publication at subscriber, then > insert into publisher table, the inserts still replicated to subscriber. The expect > result is that the data is no longer replicated. > I also saw a problem while using "ALTER PUBLICATION ... ADD SCHEMA ...". The case is as below: Add a schema to publication, then refresh publication at subscriber. Insert into publisher table, the inserts couldn't replicateto subscriber. Steps to reproduce the case: ------publisher------ CREATE PUBLICATION testpub FOR SCHEMA public; ------subscriber------ CREATE SUBSCRIPTION testsub CONNECTION 'dbname=postgres port=5432' PUBLICATION testpub; ------publisher------ CREATE SCHEMA s1; CREATE TABLE s1.t1 (a int PRIMARY KEY); insert into s1.t1 values (1); ALTER PUBLICATION testpub ADD SCHEMA s1; ------subscriber------ CREATE SCHEMA s1; CREATE TABLE s1.t1 (a int PRIMARY KEY); ALTER SUBSCRIPTION testsub REFRESH PUBLICATION; postgres=# SELECT * FROM s1.t1; a --- 1 (1 row) ------publisher------ insert into s1.t1 values (2); ------subscriber------ postgres=# SELECT * FROM s1.t1; a --- 1 (1 row) when I executed "ALTER PUBLICATION ... ADD TABLE ...", rel_sync_cache_publication_cb callback function set replicate_validto false, then it would validate the entry in get_rel_sync_entry function, and marked the pubactions to true.so it worked ok. In the case of "ALTER PUBLICATION ... ADD SCHEMA ...", replicate_valid would not be set to false. Because of this, the pubactionswere still false in get_rel_sync_entry function. So I think the reason for it is similar to the one I reported before [1]. [1] https://www.postgresql.org/message-id/OS0PR01MB61134B20314DE45795DD384CFB029%40OS0PR01MB6113.jpnprd01.prod.outlook.com Regards Tang
>
> On Friday, June 25, 2021 2:25 AM vignesh C <vignesh21@gmail.com>wrote:
> >
> > Thanks for reporting these warnings, I have fixed this in the v8 patch
> > attached at [1].
> > [1] - https://www.postgresql.org/message-
> > id/CALDaNm044P_cds1OxZvFse5rE_qQfhbUg5MdtMgsa7t_bZGJdw%40mail.
> > gmail.com
> >
>
> Thanks for your patch. The warnings are fixed.
>
> But I found an issue while using your V8 patch, which is similar to [1]. The case is as below:
> Drop a schema from publication and refresh publication at subscriber, then insert into publisher table, the inserts still replicated to subscriber. The expect result is that the data is no longer replicated.
>
> For example:
> ------publisher------
> create schema s1;
> create table s1.t1 (a int primary key);
> create publication pub for schema s1;
>
> ------subscriber------
> create schema s1;
> create table s1.t1 (a int primary key);
> create subscription sub connection 'dbname=postgres port=5432' publication pub;
>
> ------publisher------
> insert into s1.t1 values (1);
>
> ------subscriber------
> postgres=# select * from s1.t1;
> a
> ---
> 1
> (1 row)
>
> ------publisher------
> alter publication pub drop schema s1;
> insert into s1.t1 values (2);
>
> ------subscriber------
> postgres=# select * from s1.t1;
> a
> ---
> 1
> 2
> (2 rows)
>
Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you reported at [1].
Attachment
On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you reported at [1].
Thanks for your patch. I confirmed that the two issues I reported has been fixed.
Regards
Tang
On Tue, Jun 22, 2021 at 10:11 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Tue, Jun 22, 2021 at 9:45 AM vignesh C <vignesh21@gmail.com> wrote: > > I have removed the skip table patches to keep the focus on the main > > patch, once this patch gets into committable shape, I will focus on > > the skip table patch. > > IMO it's a good idea to start a new thread for the "skip table" > feature so that we can discuss it separately. If required you can > specify in that thread that the idea of the "skip table" can be > applied to the "add schema level support" feature. Hi Vignesh, I will find sometime to review the v9 patch set. I'm curious to know whether the latest v9 patch set has the changes for the "skip table" feature? Or is it being discussed separately? Regards, Bharath Rupireddy.
On Thu, Jul 1, 2021 at 5:43 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Tue, Jun 22, 2021 at 10:11 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Tue, Jun 22, 2021 at 9:45 AM vignesh C <vignesh21@gmail.com> wrote: > > > I have removed the skip table patches to keep the focus on the main > > > patch, once this patch gets into committable shape, I will focus on > > > the skip table patch. > > > > IMO it's a good idea to start a new thread for the "skip table" > > feature so that we can discuss it separately. If required you can > > specify in that thread that the idea of the "skip table" can be > > applied to the "add schema level support" feature. > > Hi Vignesh, > > I will find sometime to review the v9 patch set. I'm curious to know > whether the latest v9 patch set has the changes for the "skip table" > feature? Or is it being discussed separately? The v9 patch does not include the "Skip table" feature, I'm planning to focus on that once the current patch is stabilized. Regards, Vignesh
On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you reported at[1]. A comment on v9: src/bin/psql/describe.c + if (pset.sversion >= 15000) I think it should be 150000. Regards Tang
On Fri, Jul 2, 2021 at 10:18 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you reportedat [1]. > > A comment on v9: > > src/bin/psql/describe.c > > + if (pset.sversion >= 15000) > > I think it should be 150000. Thanks for reporting this, I will fix this in the next version of the patch. Regards, Vignesh
On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> wrote: > Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you reported at[1]. Hi, I had a look at the patch, please consider following comments. (1) - if (pub->alltables) + if (pub->pubtype == PUBTYPE_ALLTABLES) { publish = true; if (pub->pubviaroot && am_partition) publish_as_relid = llast_oid(get_partition_ancestors(relid)); } + if (pub->pubtype == PUBTYPE_SCHEMA) + { + Oid schemaId = get_rel_namespace(relid); + List *pubschemas = GetPublicationSchemas(pub->oid); + + if (list_member_oid(pubschemas, schemaId)) + { It might be better use "else if" for the second check here. Like: else if (pub->pubtype == PUBTYPE_SCHEMA) Besides, we already have the {schemaoid, pubid} set here, it might be better to scan the cache PUBLICATIONSCHEMAMAP instead of invoking GetPublicationSchemas() which will scan the whole table. (2) + /* Identify which schemas should be dropped. */ + foreach(oldlc, oldschemaids) + { + Oid oldschemaid = lfirst_oid(oldlc); + ListCell *newlc; + bool found = false; + + foreach(newlc, schemaoidlist) + { + Oid newschemaid = lfirst_oid(newlc); + + if (newschemaid == oldschemaid) + { + found = true; + break; + } + } It seems we can use " if (list_member_oid(schemaoidlist, oldschemaid)) " to replace the second foreach loop. (3) there are some testcases change in 0001 patch, it might be better move them to 0002 patch. (4) + case OBJECT_PUBLICATION_SCHEMA: + objnode = (Node *) list_make2(linitial(name), linitial(args)); + break; case OBJECT_USER_MAPPING: objnode = (Node *) list_make2(linitial(name), linitial(args)); break; Does it looks better to merge these two switch cases ? Like: case OBJECT_PUBLICATION_SCHEMA: case OBJECT_USER_MAPPING: objnode = (Node *) list_make2(linitial(name), linitial(args)); break; best regards, houzj
On Thursday, July 8, 2021 11:47 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote > On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> > wrote: > > Thanks for reporting this issue, the attached v9 patch fixes this issue. This also > fixes the other issue you reported at [1]. > > Hi, > > I had a look at the patch, please consider following comments. Some more commets. Currently, postgres caches publication actions info in the RelationData::rd_pubactions, but after applying the patch, it seems rd_pubactions is not initialized when using schema level publication. It cound result in some unexpected behaviour when checking if command can be executed with current replica identity. ---- CheckCmdReplicaIdentity ... pubactions = GetRelationPublicationActions(rel); if (cmd == CMD_UPDATE && pubactions->pubupdate) ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), errmsg("cannot update table \"%s\" because it does not have a replica identity and publishes updates", RelationGetRelationName(rel)), errhint("To enable updating the table, set REPLICA IDENTITY using ALTER TABLE."))); ---- Based on the above code, we access rd_pubactions from relcache in function GetRelationPublicationActions(), if we don't have correct publication information, we won't get the error in publication, instead we could lead to the failure in subscription side. To fix, I think (1) the patch can modify the function GetRelationPublicationActions() to get the schema level publication related to the relation, then merge the publication action to the rd_pubactions. In addition, (2) it seems we also need to add the relcache invalidation code about schema level publication when alter publication options. I attached a rough code diff about (1) and (2), the diff is based on the v9 patchset. I hope it can help fix the above issues. Best regards, Hou zhijie
Attachment
On Fri, Jul 9, 2021 at 1:28 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > Currently, postgres caches publication actions info in the > RelationData::rd_pubactions, but after applying the patch, it seems > rd_pubactions is not initialized when using schema level publication. > > It cound result in some unexpected behaviour when checking if command can be > executed with current replica identity. > While testing this patch, I'm finding that for a FOR SCHEMA publication, UPDATEs and DELETEs on tables belonging to that schema are not getting replicated (but INSERTs and TRUNCATEs are). Could this be related to the issues that Hou-san has identified? Regards, Greg Nancarrow Fujitsu Australia
On Fri, Jul 9, 2021 at 8:58 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Thursday, July 8, 2021 11:47 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote > > On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> > > wrote: > > > Thanks for reporting this issue, the attached v9 patch fixes this issue. This also > > fixes the other issue you reported at [1]. > > > > Hi, > > > > I had a look at the patch, please consider following comments. > > Some more commets. > > Currently, postgres caches publication actions info in the > RelationData::rd_pubactions, but after applying the patch, it seems > rd_pubactions is not initialized when using schema level publication. > > It cound result in some unexpected behaviour when checking if command can be > executed with current replica identity. > > ---- > CheckCmdReplicaIdentity > ... > pubactions = GetRelationPublicationActions(rel); > if (cmd == CMD_UPDATE && pubactions->pubupdate) > ereport(ERROR, > (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > errmsg("cannot update table \"%s\" because it does not have a replica identity and publishesupdates", > RelationGetRelationName(rel)), > errhint("To enable updating the table, set REPLICA IDENTITY using ALTER TABLE."))); > ---- > Based on the above code, we access rd_pubactions from relcache in function > GetRelationPublicationActions(), if we don't have correct publication > information, we won't get the error in publication, instead we could lead to > the failure in subscription side. > > To fix, I think (1) the patch can modify the function > GetRelationPublicationActions() to get the schema level publication related to > the relation, then merge the publication action to the rd_pubactions. > > In addition, (2) it seems we also need to add the relcache invalidation code > about schema level publication when alter publication options. > > I attached a rough code diff about (1) and (2), the diff is based on the v9 patchset. > I hope it can help fix the above issues. Thanks for identifying the issues and also providing the fix for it. Those changes are required, I have taken your changes into my patch. The Attached patch has the changes for the same. Regards, Vignesh
Attachment
>
> On Fri, Jul 9, 2021 at 1:28 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > Currently, postgres caches publication actions info in the
> > RelationData::rd_pubactions, but after applying the patch, it seems
> > rd_pubactions is not initialized when using schema level publication.
> >
> > It cound result in some unexpected behaviour when checking if command can be
> > executed with current replica identity.
> >
>
> While testing this patch, I'm finding that for a FOR SCHEMA
> publication, UPDATEs and DELETEs on tables belonging to that schema
> are not getting replicated (but INSERTs and TRUNCATEs are).
> Could this be related to the issues that Hou-san has identified?
Thanks for reporting this issue. I felt this issue is the same as the issue which Hou San had reported. This issue is fixed in the v10 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com
Regards,
Vignesh
On Thu, Jul 8, 2021 at 9:16 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you reportedat [1]. > > Hi, > > I had a look at the patch, please consider following comments. > > (1) > - if (pub->alltables) > + if (pub->pubtype == PUBTYPE_ALLTABLES) > { > publish = true; > if (pub->pubviaroot && am_partition) > publish_as_relid = llast_oid(get_partition_ancestors(relid)); > } > > + if (pub->pubtype == PUBTYPE_SCHEMA) > + { > + Oid schemaId = get_rel_namespace(relid); > + List *pubschemas = GetPublicationSchemas(pub->oid); > + > + if (list_member_oid(pubschemas, schemaId)) > + { > > It might be better use "else if" for the second check here. > Like: else if (pub->pubtype == PUBTYPE_SCHEMA) > > Besides, we already have the {schemaoid, pubid} set here, it might be > better to scan the cache PUBLICATIONSCHEMAMAP instead of invoking > GetPublicationSchemas() which will scan the whole table. Modified. > (2) > + /* Identify which schemas should be dropped. */ > + foreach(oldlc, oldschemaids) > + { > + Oid oldschemaid = lfirst_oid(oldlc); > + ListCell *newlc; > + bool found = false; > + > + foreach(newlc, schemaoidlist) > + { > + Oid newschemaid = lfirst_oid(newlc); > + > + if (newschemaid == oldschemaid) > + { > + found = true; > + break; > + } > + } > > It seems we can use " if (list_member_oid(schemaoidlist, oldschemaid)) " > to replace the second foreach loop. > Modified. > (3) > there are some testcases change in 0001 patch, it might be better move them > to 0002 patch. These changes are required to modify the existing tests. I kept it in 0001 so that 0001 patch can be committed independently. I think we can keep the change as it is, I did not make any changes for this. > (4) > + case OBJECT_PUBLICATION_SCHEMA: > + objnode = (Node *) list_make2(linitial(name), linitial(args)); > + break; > case OBJECT_USER_MAPPING: > objnode = (Node *) list_make2(linitial(name), linitial(args)); > break; > > Does it looks better to merge these two switch cases ? > Like: > case OBJECT_PUBLICATION_SCHEMA: > case OBJECT_USER_MAPPING: > objnode = (Node *) list_make2(linitial(name), linitial(args)); > break; Modified. Thanks for the comments, these comments are fixed as part of the v10 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com Regards, Vignesh
On Fri, Jul 2, 2021 at 10:18 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Wednesday, June 30, 2021 7:43 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for reporting this issue, the attached v9 patch fixes this issue. This also fixes the other issue you reportedat [1]. > > A comment on v9: > > src/bin/psql/describe.c > > + if (pset.sversion >= 15000) > > I think it should be 150000. Thanks for reporting this issue, this issue is fixed in the v10 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com Regards, Vignesh
On Monday, July 12, 2021 5:36 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for reporting this issue, this issue is fixed in the v10 patch > attached at [1]. > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R- > sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com Thanks for fixing it. By applying your V10 patch, I saw three problems, please have a look. 1. An issue about pg_dump. When public schema was published, the publication was created in the output file, but public schema was not added to it.(Other schemas could be added as expected.) I looked into it and found that selectDumpableNamespace function marks DUMP_COMPONENT_DEFINITION as needless when the schemais public, leading to schema public is ignored in getPublicationSchemas. So we'd better check whether schemas shouldbe dumped in another way. I tried to fix it with the following change, please have a look. (Maybe we also need to add some comments for it.) diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index f6b4f12648..a327d2568b 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -4206,7 +4206,8 @@ getPublicationSchemas(Archive *fout, NamespaceInfo nspinfo[], int numSchemas) * Ignore publication membership of schemas whose definitions are not * to be dumped. */ - if (!(nsinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)) + if (!((nsinfo->dobj.dump & DUMP_COMPONENT_DEFINITION) + || (strcmp(nsinfo->dobj.name, "public") == 0 && nsinfo->dobj.dump != DUMP_COMPONENT_NONE))) continue; pg_log_info("reading publication membership for schema \"%s\"", 2. improper behavior for system schema I found I could create publication for system schema, such as pg_catalog. I think it's better to report an error messagehere, just like creating publication for system table is unallowed. 3. fix for dumpPublicationSchema Should we add a declaration for it and add const decorations to the it's second parameter? Like other similar functions. Regards Tang
On Mon, Jul 12, 2021 at 7:24 PM vignesh C <vignesh21@gmail.com> wrote: > > > Thanks for reporting this issue. I felt this issue is the same as the issue which Hou San had reported. This issue is fixedin the v10 patch attached at [1]. > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com > I did some testing and the issue that I reported does seem to be fixed by the v10 patch. I have some patch review comments for the v10 patch: (1) The following: + if (!OidIsValid(address.objectId)) + { + if (!missing_ok) + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_OBJECT), + errmsg("publication schema \"%s\" in publication \"%s\" does not exist", + schemaname, pubname))); + return address; + } + + return address; could just be simplified to: + if (!OidIsValid(address.objectId) && !missing_ok) + { + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_OBJECT), + errmsg("publication schema \"%s\" in publication \"%s\" does not exist", + schemaname, pubname))); + } + + return address; (2) src/backend/catalog/objectaddress.c I think there is a potential illegal memory access (psform->psnspcid) in the case of "!missing_ok", as the tuple is released from the cache on the previous line. + psform = (Form_pg_publication_schema) GETSTRUCT(tup); + pubname = get_publication_name(psform->pspubid, false); + nspname = get_namespace_name(psform->psnspcid); + if (!nspname) + { + pfree(pubname); + ReleaseSysCache(tup); + if (!missing_ok) + elog(ERROR, "cache lookup failed for schema %u", + psform->psnspcid); + break; + } I think this should be: + psform = (Form_pg_publication_schema) GETSTRUCT(tup); + pubname = get_publication_name(psform->pspubid, false); + nspname = get_namespace_name(psform->psnspcid); + if (!nspname) + { + Oid psnspcid = psform->psnspcid; + + pfree(pubname); + ReleaseSysCache(tup); + if (!missing_ok) + elog(ERROR, "cache lookup failed for schema %u", + psnspcid); + break; + } There are two cases of this that need correction (see: case OCLASS_PUBLICATION_SCHEMA). (3) Incomplete function header comment + * makeSchemaSpec - Create a SchemaSpec with the given type Should be: + * makeSchemaSpec - Create a SchemaSpec with the given type and location (4) src/bin/psql/describe.c Shouldn't the following comment say "version 15"? + /* Prior to version 14 check was based on all tables */ + if ((has_pubtype && pubtype == PUBTYPE_TABLE) || + (!has_pubtype && !puballtables)) (5) typedefs.list I think you also need to add "Form_pg_publication_schema" to typedefs.list. Regards, Greg Nancarrow Fujitsu Australia
On Tue, Jul 13, 2021 at 12:06 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Monday, July 12, 2021 5:36 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for reporting this issue, this issue is fixed in the v10 patch > > attached at [1]. > > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R- > > sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com > > Thanks for fixing it. > > By applying your V10 patch, I saw three problems, please have a look. > > 1. An issue about pg_dump. > When public schema was published, the publication was created in the output file, but public schema was not added to it.(Other schemas could be added as expected.) > > I looked into it and found that selectDumpableNamespace function marks DUMP_COMPONENT_DEFINITION as needless when the schemais public, leading to schema public is ignored in getPublicationSchemas. So we'd better check whether schemas shouldbe dumped in another way. > > I tried to fix it with the following change, please have a look. (Maybe we also need to add some comments for it.) > > diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c > index f6b4f12648..a327d2568b 100644 > --- a/src/bin/pg_dump/pg_dump.c > +++ b/src/bin/pg_dump/pg_dump.c > @@ -4206,7 +4206,8 @@ getPublicationSchemas(Archive *fout, NamespaceInfo nspinfo[], int numSchemas) > * Ignore publication membership of schemas whose definitions are not > * to be dumped. > */ > - if (!(nsinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)) > + if (!((nsinfo->dobj.dump & DUMP_COMPONENT_DEFINITION) > + || (strcmp(nsinfo->dobj.name, "public") == 0 && nsinfo->dobj.dump != DUMP_COMPONENT_NONE))) > continue; > > pg_log_info("reading publication membership for schema \"%s\"", I felt it is intentionally done like that as the pubic schema is created by default, hence it is not required to dump else we will get errors while restoring. Thougths? > 2. improper behavior for system schema > I found I could create publication for system schema, such as pg_catalog. I think it's better to report an error messagehere, just like creating publication for system table is unallowed. Modified. > 3. fix for dumpPublicationSchema > Should we add a declaration for it and add const decorations to the it's second parameter? Like other similar functions. Modified to include const, declaration is not required as the function definition is before function call, so not making this change. Thanks for your comments, the attached v11 patch fixes the issues. Regards, Vignesh
Attachment
>
> On Mon, Jul 12, 2021 at 7:24 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> >
> > Thanks for reporting this issue. I felt this issue is the same as the issue which Hou San had reported. This issue is fixed in the v10 patch attached at [1].
> > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com
> >
>
> I did some testing and the issue that I reported does seem to be fixed
> by the v10 patch.
>
> I have some patch review comments for the v10 patch:
>
> (1)
>
> The following:
>
> + if (!OidIsValid(address.objectId))
> + {
> + if (!missing_ok)
> + ereport(ERROR,
> + (errcode(ERRCODE_UNDEFINED_OBJECT),
> + errmsg("publication schema \"%s\" in publication \"%s\"
> does not exist",
> + schemaname, pubname)));
> + return address;
> + }
> +
> + return address;
>
> could just be simplified to:
>
> + if (!OidIsValid(address.objectId) && !missing_ok)
> + {
> + ereport(ERROR,
> + (errcode(ERRCODE_UNDEFINED_OBJECT),
> + errmsg("publication schema \"%s\" in publication \"%s\" does not exist",
> + schemaname, pubname)));
> + }
> +
> + return address;
Modified
> (2) src/backend/catalog/objectaddress.c
>
> I think there is a potential illegal memory access (psform->psnspcid)
> in the case of "!missing_ok", as the tuple is released from the cache
> on the previous line.
>
> + psform = (Form_pg_publication_schema) GETSTRUCT(tup);
> + pubname = get_publication_name(psform->pspubid, false);
> + nspname = get_namespace_name(psform->psnspcid);
> + if (!nspname)
> + {
> + pfree(pubname);
> + ReleaseSysCache(tup);
> + if (!missing_ok)
> + elog(ERROR, "cache lookup failed for schema %u",
> + psform->psnspcid);
> + break;
> + }
>
>
> I think this should be:
>
> + psform = (Form_pg_publication_schema) GETSTRUCT(tup);
> + pubname = get_publication_name(psform->pspubid, false);
> + nspname = get_namespace_name(psform->psnspcid);
> + if (!nspname)
> + {
> + Oid psnspcid = psform->psnspcid;
> +
> + pfree(pubname);
> + ReleaseSysCache(tup);
> + if (!missing_ok)
> + elog(ERROR, "cache lookup failed for schema %u",
> + psnspcid);
> + break;
> + }
>
> There are two cases of this that need correction (see: case
> OCLASS_PUBLICATION_SCHEMA).
Modified
> (3) Incomplete function header comment
>
> + * makeSchemaSpec - Create a SchemaSpec with the given type
>
> Should be:
>
> + * makeSchemaSpec - Create a SchemaSpec with the given type and location
Modified
> (4) src/bin/psql/describe.c
>
> Shouldn't the following comment say "version 15"?
>
> + /* Prior to version 14 check was based on all tables */
> + if ((has_pubtype && pubtype == PUBTYPE_TABLE) ||
> + (!has_pubtype && !puballtables))
Modified
> (5) typedefs.list
>
> I think you also need to add "Form_pg_publication_schema" to typedefs.list.
Modified.
Thanks for the comments, these comments are fixed in the v11 patch posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm1oZzaEsZC1W8MRNGZ6LWOayC54_UzyRV%2BnCh8w0yW74g%40mail.gmail.com
Regards,
Vignesh
Wednesday, July 14, 2021 6:17 PM vignesh C <vignesh21@gmail.com> wrote: > On Tue, Jul 13, 2021 at 12:06 PM tanghy.fnst@fujitsu.com > <tanghy.fnst@fujitsu.com> wrote: > > > > On Monday, July 12, 2021 5:36 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Thanks for reporting this issue, this issue is fixed in the v10 > > > patch attached at [1]. > > > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R- > > > sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com > > > > Thanks for fixing it. > > > > By applying your V10 patch, I saw three problems, please have a look. > > > > 1. An issue about pg_dump. > > When public schema was published, the publication was created in the > > output file, but public schema was not added to it. (Other schemas > > could be added as expected.) > > > > I looked into it and found that selectDumpableNamespace function marks > DUMP_COMPONENT_DEFINITION as needless when the schema is public, > leading to schema public is ignored in getPublicationSchemas. So we'd better > check whether schemas should be dumped in another way. > > > > I tried to fix it with the following change, please have a look. > > (Maybe we also need to add some comments for it.) > > > > diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c > > index f6b4f12648..a327d2568b 100644 > > --- a/src/bin/pg_dump/pg_dump.c > > +++ b/src/bin/pg_dump/pg_dump.c > > @@ -4206,7 +4206,8 @@ getPublicationSchemas(Archive *fout, > NamespaceInfo nspinfo[], int numSchemas) > > * Ignore publication membership of schemas whose > definitions are not > > * to be dumped. > > */ > > - if (!(nsinfo->dobj.dump & > DUMP_COMPONENT_DEFINITION)) > > + if (!((nsinfo->dobj.dump & > DUMP_COMPONENT_DEFINITION) > > + || (strcmp(nsinfo->dobj.name, "public") == 0 > > + && nsinfo->dobj.dump != DUMP_COMPONENT_NONE))) > > continue; > > > > pg_log_info("reading publication membership for schema > > \"%s\"", > > I felt it is intentionally done like that as the pubic schema is created by default, > hence it is not required to dump else we will get errors while restoring. > Thougths? Thanks for the new patches and I also looked at this issue. For user defined schema and publication: -------------------------- create schema s1; create publication pub2 for SCHEMA s1; -------------------------- pg_dump will only generate the following SQLs: ------pg_dump result------ CREATE PUBLICATION pub2 WITH (publish = 'insert, update, delete, truncate'); ALTER PUBLICATION pub2 ADD SCHEMA s1; -------------------------- But for the public schema: -------------------------- create publication pub for SCHEMA public; -------------------------- pg_dump will only generate the following SQL: ------pg_dump result------ CREATE PUBLICATION pub WITH (publish = 'insert, update, delete, truncate'); -------------------------- It didn't generate SQL like "ALTER PUBLICATION pub ADD SCHEMA public;" which means the public schema won't be published after restoring. So, I think we'd better let the pg_dump generate the ADD SCHEMA public SQL. Thoughts ? Best regards, Hou zhijie
On Wed, Jul 14, 2021 at 8:17 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for your comments, the attached v11 patch fixes the issues. > Thanks for your work on this. I have some minor review comments on the documentation: (1) wrong link (uses altersubscription instead of alterpublication) doc/src/sgml/catalogs.sgml BEFORE: + created as an empty publication type. When a table or schema is added to + the publication using <link linkend="sql-altersubscription"> + <command>ALTER PUBLICATION</command></link> then the publication type AFTER: + created as an empty publication type. When a table or schema is added to + the publication using <link linkend="sql-alterpublication"> + <command>ALTER PUBLICATION</command></link> then the publication type (2) Improve wording and suggest "or" instead of "and" doc/src/sgml/catalogs.sgml BEFORE: + If a publication is created without specifying any of + <literal>FOR ALL TABLES</literal>, <literal>FOR TABLE</literal> and + <literal>FOR SCHEMA</literal> option, then the publication will be AFTER: + If a publication is created without specifying any of the + <literal>FOR ALL TABLES</literal>, <literal>FOR TABLE</literal> or + <literal>FOR SCHEMA</literal> options, then the publication will be (3) space at start of literal doc/src/sgml/catalogs.sgml + and <literal> FOR SCHEMA</literal>, so for such publications there will be a (4) Should say "variants of this command change ..." ? + The fourth, fifth and sixth variants change which schemas are part of the Also, there seems to be an issue with ALTER PUBLICATION ... SET SCHEMA ... (PubType is getting set to 'e' instead of 's' test_pub=# create publication pub1; CREATE PUBLICATION test_pub=# create table myschema.test(i int); CREATE TABLE test_pub=# alter publication pub1 set schema myschema; ALTER PUBLICATION test_pub=# \dRp pub1 List of publications Name | Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root | PubType ------+-------+------------+---------+---------+---------+-----------+----------+--------- pub1 | gregn | f | t | t | t | t | f | e (1 row) test_pub=# alter publication pub1 add table test; ALTER PUBLICATION test_pub=# \dRp pub1 List of publications Name | Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root | PubType ------+-------+------------+---------+---------+---------+-----------+----------+--------- pub1 | gregn | f | t | t | t | t | f | t (1 row) When I use "ADD SCHEMA" instead of "SET SCHEMA" on an empty publication, it seems OK. Regards, Greg Nancarrow Fujitsu Australia
On Wed, Jul 14, 2021 at 6:25 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > Wednesday, July 14, 2021 6:17 PM vignesh C <vignesh21@gmail.com> wrote: > > On Tue, Jul 13, 2021 at 12:06 PM tanghy.fnst@fujitsu.com > > <tanghy.fnst@fujitsu.com> wrote: > > > > > > On Monday, July 12, 2021 5:36 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > Thanks for reporting this issue, this issue is fixed in the v10 > > > > patch attached at [1]. > > > > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R- > > > > sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com > > > > > > Thanks for fixing it. > > > > > > By applying your V10 patch, I saw three problems, please have a look. > > > > > > 1. An issue about pg_dump. > > > When public schema was published, the publication was created in the > > > output file, but public schema was not added to it. (Other schemas > > > could be added as expected.) > > > > > > I looked into it and found that selectDumpableNamespace function marks > > DUMP_COMPONENT_DEFINITION as needless when the schema is public, > > leading to schema public is ignored in getPublicationSchemas. So we'd better > > check whether schemas should be dumped in another way. > > > > > > I tried to fix it with the following change, please have a look. > > > (Maybe we also need to add some comments for it.) > > > > > > diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c > > > index f6b4f12648..a327d2568b 100644 > > > --- a/src/bin/pg_dump/pg_dump.c > > > +++ b/src/bin/pg_dump/pg_dump.c > > > @@ -4206,7 +4206,8 @@ getPublicationSchemas(Archive *fout, > > NamespaceInfo nspinfo[], int numSchemas) > > > * Ignore publication membership of schemas whose > > definitions are not > > > * to be dumped. > > > */ > > > - if (!(nsinfo->dobj.dump & > > DUMP_COMPONENT_DEFINITION)) > > > + if (!((nsinfo->dobj.dump & > > DUMP_COMPONENT_DEFINITION) > > > + || (strcmp(nsinfo->dobj.name, "public") == 0 > > > + && nsinfo->dobj.dump != DUMP_COMPONENT_NONE))) > > > continue; > > > > > > pg_log_info("reading publication membership for schema > > > \"%s\"", > > > > I felt it is intentionally done like that as the pubic schema is created by default, > > hence it is not required to dump else we will get errors while restoring. > > Thougths? > > Thanks for the new patches and I also looked at this issue. > > For user defined schema and publication: > -------------------------- > create schema s1; > create publication pub2 for SCHEMA s1; > -------------------------- > > pg_dump will only generate the following SQLs: > > ------pg_dump result------ > CREATE PUBLICATION pub2 WITH (publish = 'insert, update, delete, truncate'); > ALTER PUBLICATION pub2 ADD SCHEMA s1; > -------------------------- > > But for the public schema: > -------------------------- > create publication pub for SCHEMA public; > -------------------------- > > pg_dump will only generate the following SQL: > > ------pg_dump result------ > CREATE PUBLICATION pub WITH (publish = 'insert, update, delete, truncate'); > -------------------------- > > It didn't generate SQL like "ALTER PUBLICATION pub ADD SCHEMA public;" which > means the public schema won't be published after restoring. So, I think we'd > better let the pg_dump generate the ADD SCHEMA public SQL. Thoughts ? Thanks for reporting this issue, this issue is fixed in the v12 patch attached. Regards, Vignesh
Attachment
>
> On Wed, Jul 14, 2021 at 8:17 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for your comments, the attached v11 patch fixes the issues.
> >
>
> Thanks for your work on this.
>
> I have some minor review comments on the documentation:
>
> (1) wrong link (uses altersubscription instead of alterpublication)
> doc/src/sgml/catalogs.sgml
>
> BEFORE:
> + created as an empty publication type. When a table or schema is added to
> + the publication using <link linkend="sql-altersubscription">
> + <command>ALTER PUBLICATION</command></link> then the publication type
>
> AFTER:
> + created as an empty publication type. When a table or schema is added to
> + the publication using <link linkend="sql-alterpublication">
> + <command>ALTER PUBLICATION</command></link> then the publication type
Modified.
> (2) Improve wording and suggest "or" instead of "and"
> doc/src/sgml/catalogs.sgml
>
> BEFORE:
> + If a publication is created without specifying any of
> + <literal>FOR ALL TABLES</literal>, <literal>FOR TABLE</literal> and
> + <literal>FOR SCHEMA</literal> option, then the publication will be
>
> AFTER:
> + If a publication is created without specifying any of the
> + <literal>FOR ALL TABLES</literal>, <literal>FOR TABLE</literal> or
> + <literal>FOR SCHEMA</literal> options, then the publication will be
Modified.
> (3) space at start of literal
> doc/src/sgml/catalogs.sgml
>
> + and <literal> FOR SCHEMA</literal>, so for such publications there will be a
Modified.
> (4) Should say "variants of this command change ..." ?
>
> + The fourth, fifth and sixth variants change which schemas are part of the
Modified.
>
> Also, there seems to be an issue with ALTER PUBLICATION ... SET SCHEMA ...
> (PubType is getting set to 'e' instead of 's'
>
> test_pub=# create publication pub1;
> CREATE PUBLICATION
> test_pub=# create table myschema.test(i int);
> CREATE TABLE
> test_pub=# alter publication pub1 set schema myschema;
> ALTER PUBLICATION
> test_pub=# \dRp pub1
> List of publications
> Name | Owner | All tables | Inserts | Updates | Deletes | Truncates |
> Via root | PubType
> ------+-------+------------+---------+---------+---------+-----------+----------+---------
> pub1 | gregn | f | t | t | t | t |
> f | e
> (1 row)
>
> test_pub=# alter publication pub1 add table test;
> ALTER PUBLICATION
> test_pub=# \dRp pub1
> List of publications
> Name | Owner | All tables | Inserts | Updates | Deletes | Truncates |
> Via root | PubType
> ------+-------+------------+---------+---------+---------+-----------+----------+---------
> pub1 | gregn | f | t | t | t | t |
> f | t
> (1 row)
>
>
> When I use "ADD SCHEMA" instead of "SET SCHEMA" on an empty
> publication, it seems OK.
Modified.
Thanks for the comments, these issues are fixed as part of the v12 patch posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com
Vignesh
On Friday, July 16, 2021 6:13 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, Jul 16, 2021 at 9:25 AM Greg Nancarrow <mailto:gregn4422@gmail.com> wrote:
> >
> > Also, there seems to be an issue with ALTER PUBLICATION ... SET SCHEMA ...
> > (PubType is getting set to 'e' instead of 's'
> >
> > test_pub=# create publication pub1;
> > CREATE PUBLICATION
> > test_pub=# create table myschema.test(i int);
> > CREATE TABLE
> > test_pub=# alter publication pub1 set schema myschema;
> > ALTER PUBLICATION
> > test_pub=# \dRp pub1
> > List of publications
> > Name | Owner | All tables | Inserts | Updates | Deletes | Truncates |
> > Via root | PubType
> > ------+-------+------------+---------+---------+---------+-----------+----------+---------
> > pub1 | gregn | f | t | t | t | t |
> > f | e
> > (1 row)
> >
> > test_pub=# alter publication pub1 add table test;
> > ALTER PUBLICATION
> > test_pub=# \dRp pub1
> > List of publications
> > Name | Owner | All tables | Inserts | Updates | Deletes | Truncates |
> > Via root | PubType
> > ------+-------+------------+---------+---------+---------+-----------+----------+---------
> > pub1 | gregn | f | t | t | t | t |
> > f | t
> > (1 row)
> >
> >
> > When I use "ADD SCHEMA" instead of "SET SCHEMA" on an empty
> > publication, it seems OK.
>
> Modified.
>
Thanks for your patch. But there is a problem about "ALTER PUBLICATION … SET TABLE …", which is similar to the issue Greg reported at [1].
For example:
postgres=# CREATE TABLE public.t1 (a int);
CREATE TABLE
postgres=# CREATE PUBLICATION pub1;
CREATE PUBLICATION
postgres=# ALTER PUBLICATION pub1 SET TABLE public.t1;
ALTER PUBLICATION
postgres=# \dRp
List of publications
Name | Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root | PubType
------+----------+------------+---------+---------+---------+-----------+----------+---------
pub1 | postgres | f | t | t | t | t | f | e
(1 row)
I think PubType in this case should be 't' instead of 'e'. Please have a look.
Regards,
Tang
On Friday, July 16, 2021 6:10 PM vignesh C <vignesh21@gmail.com> > On Wed, Jul 14, 2021 at 6:25 PM houzj.fnst@fujitsu.com > <houzj.fnst@fujitsu.com> wrote: > > > > Wednesday, July 14, 2021 6:17 PM vignesh C <vignesh21@gmail.com> wrote: > > > On Tue, Jul 13, 2021 at 12:06 PM tanghy.fnst@fujitsu.com > > > <tanghy.fnst@fujitsu.com> wrote: > > > > > > > > On Monday, July 12, 2021 5:36 PM vignesh C <vignesh21@gmail.com> > wrote: > > > > > > > > > > Thanks for reporting this issue, this issue is fixed in the v10 > > > > > patch attached at [1]. > > > > > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R- > > > > > sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com > > > > > > > > Thanks for fixing it. > > > > > > > > By applying your V10 patch, I saw three problems, please have a look. > > > > > > > > 1. An issue about pg_dump. > > > > When public schema was published, the publication was created in the > > > > output file, but public schema was not added to it. (Other schemas > > > > could be added as expected.) > > > > > > > > I looked into it and found that selectDumpableNamespace function marks > > > DUMP_COMPONENT_DEFINITION as needless when the schema is public, > > > leading to schema public is ignored in getPublicationSchemas. So we'd better > > > check whether schemas should be dumped in another way. > > > > > > > > I tried to fix it with the following change, please have a look. > > > > (Maybe we also need to add some comments for it.) > > > > > > > > diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c > > > > index f6b4f12648..a327d2568b 100644 > > > > --- a/src/bin/pg_dump/pg_dump.c > > > > +++ b/src/bin/pg_dump/pg_dump.c > > > > @@ -4206,7 +4206,8 @@ getPublicationSchemas(Archive *fout, > > > NamespaceInfo nspinfo[], int numSchemas) > > > > * Ignore publication membership of schemas whose > > > definitions are not > > > > * to be dumped. > > > > */ > > > > - if (!(nsinfo->dobj.dump & > > > DUMP_COMPONENT_DEFINITION)) > > > > + if (!((nsinfo->dobj.dump & > > > DUMP_COMPONENT_DEFINITION) > > > > + || (strcmp(nsinfo->dobj.name, "public") == 0 > > > > + && nsinfo->dobj.dump != DUMP_COMPONENT_NONE))) > > > > continue; > > > > > > > > pg_log_info("reading publication membership for schema > > > > \"%s\"", > > > > > > I felt it is intentionally done like that as the pubic schema is created by default, > > > hence it is not required to dump else we will get errors while restoring. > > > Thougths? > > > > Thanks for the new patches and I also looked at this issue. > > > > For user defined schema and publication: > > -------------------------- > > create schema s1; > > create publication pub2 for SCHEMA s1; > > -------------------------- > > > > pg_dump will only generate the following SQLs: > > > > ------pg_dump result------ > > CREATE PUBLICATION pub2 WITH (publish = 'insert, update, delete, truncate'); > > ALTER PUBLICATION pub2 ADD SCHEMA s1; > > -------------------------- > > > > But for the public schema: > > -------------------------- > > create publication pub for SCHEMA public; > > -------------------------- > > > > pg_dump will only generate the following SQL: > > > > ------pg_dump result------ > > CREATE PUBLICATION pub WITH (publish = 'insert, update, delete, truncate'); > > -------------------------- > > > > It didn't generate SQL like "ALTER PUBLICATION pub ADD SCHEMA public;" which > > means the public schema won't be published after restoring. So, I think we'd > > better let the pg_dump generate the ADD SCHEMA public SQL. Thoughts ? > > Thanks for reporting this issue, this issue is fixed in the v12 patch attached. > I tested your v12 patch and found a problem in the following case. Step 1: postgres=# create schema s1; CREATE SCHEMA postgres=# create table s1.t1 (a int); CREATE TABLE postgres=# create publication pub_t for table s1.t1; CREATE PUBLICATION postgres=# create publication pub_s for schema s1; CREATE PUBLICATION Step 2: pg_dump -N s1 I dumped and excluded schema s1, pg_dump generated the following SQL: ------------------------------- ALTER PUBLICATION pub_s ADD SCHEMA s1; I think it was not expected because SQL like "ALTER PUBLICATION pub_t ADD TABLE s1.t1" was not generated in my case. Thoughts? Regards Tang
On Fri, Jul 16, 2021 at 8:13 PM vignesh C <vignesh21@gmail.com> wrote: > > Modified. > > Thanks for the comments, these issues are fixed as part of the v12 patch posted at [1]. > [1] - https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com > There seems to be a problem with ALTER PUBLICATION ... SET TABLE ... After that command, it still regards it as an empty (e) publication, so I can then ALTER PUBLICATION ... ADD SCHEMA ... e.g. test_pub=# create schema myschema; CREATE SCHEMA test_pub=# CREATE TABLE myschema.test (key int, value text, data jsonb); CREATE TABLE test_pub=# create publication pub1; CREATE PUBLICATION test_pub=# \dRp+ pub1 Publication pub1 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root | Pubtype -------+------------+---------+---------+---------+-----------+----------+--------- gregn | f | t | t | t | t | f | e (1 row) test_pub=# alter publication pub1 set table myschema.test; ALTER PUBLICATION test_pub=# \dRp+ pub1 Publication pub1 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root | Pubtype -------+------------+---------+---------+---------+-----------+----------+--------- gregn | f | t | t | t | t | f | e (1 row) test_pub=# alter publication pub1 add schema myschema; ALTER PUBLICATION test_pub=# \dRp+ pub1 Publication pub1 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root | Pubtype -------+------------+---------+---------+---------+-----------+----------+--------- gregn | f | t | t | t | t | f | s Schemas: "myschema" Regards, Greg Nancarrow Fujitsu Australia
On Fri, Jul 16, 2021 at 8:13 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Modified.
>
> Thanks for the comments, these issues are fixed as part of the v12 patch posted at [1].
> [1] - https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com
>
There seems to be a problem with ALTER PUBLICATION ... SET TABLE ...
After that command, it still regards it as an empty (e) publication,
so I can then ALTER PUBLICATION ... ADD SCHEMA ...
in AlterPublicationTables for SET TABLE command.
More broadly, I am not clear about the behaviour of the patch when a
publication is created to publish only certain tables, and is later altered to publish
a whole schema. I think such behaviour is legitimate. However,
AFAIU as per current code we can't update the publication type
from PUBTYPE_TABLE to PUBTYPE_SCHEMA.
I have some review comments as follows:
1.
In ConvertSchemaSpecListToOidList(List *schemas) function:
+ search_path = fetch_search_path(false);
+ nspname = get_namespace_name(linitial_oid(search_path));
+ if (nspname == NULL) /* recently-deleted namespace? */
+ ereport(ERROR,
+ errcode(ERRCODE_UNDEFINED_SCHEMA),
+ errmsg("no schema has been selected"));
+
+ schemoid = get_namespace_oid(nspname, false);
+ break;
The call get_namespace_oid() is perhaps not needed as fetch_search_path already fetches oids and simply
doing Schema oid = liinital_oid(search_path)); should be enough.
2. In the same function should there be an if else condition block instead of a switch case as
there are only two cases.
Thank you,
On Mon, Jul 19, 2021 at 8:43 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Friday, July 16, 2021 6:13 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Fri, Jul 16, 2021 at 9:25 AM Greg Nancarrow <mailto:gregn4422@gmail.com> wrote: > > > > > > > > Also, there seems to be an issue with ALTER PUBLICATION ... SET SCHEMA ... > > > > (PubType is getting set to 'e' instead of 's' > > > > > > > > test_pub=# create publication pub1; > > > > CREATE PUBLICATION > > > > test_pub=# create table myschema.test(i int); > > > > CREATE TABLE > > > > test_pub=# alter publication pub1 set schema myschema; > > > > ALTER PUBLICATION > > > > test_pub=# \dRp pub1 > > > > List of publications > > > > Name | Owner | All tables | Inserts | Updates | Deletes | Truncates | > > > > Via root | PubType > > > > ------+-------+------------+---------+---------+---------+-----------+----------+--------- > > > > pub1 | gregn | f | t | t | t | t | > > > > f | e > > > > (1 row) > > > > > > > > test_pub=# alter publication pub1 add table test; > > > > ALTER PUBLICATION > > > > test_pub=# \dRp pub1 > > > > List of publications > > > > Name | Owner | All tables | Inserts | Updates | Deletes | Truncates | > > > > Via root | PubType > > > > ------+-------+------------+---------+---------+---------+-----------+----------+--------- > > > > pub1 | gregn | f | t | t | t | t | > > > > f | t > > > > (1 row) > > > > > > > > > > > > When I use "ADD SCHEMA" instead of "SET SCHEMA" on an empty > > > > publication, it seems OK. > > > > > > Modified. > > > > > > > Thanks for your patch. But there is a problem about "ALTER PUBLICATION … SET TABLE …", which is similar to the issue Gregreported at [1]. > > > > For example: > > postgres=# CREATE TABLE public.t1 (a int); > > CREATE TABLE > > postgres=# CREATE PUBLICATION pub1; > > CREATE PUBLICATION > > postgres=# ALTER PUBLICATION pub1 SET TABLE public.t1; > > ALTER PUBLICATION > > postgres=# \dRp > > List of publications > > Name | Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root | PubType > > ------+----------+------------+---------+---------+---------+-----------+----------+--------- > > pub1 | postgres | f | t | t | t | t | f | e > > (1 row) > > > > I think PubType in this case should be 't' instead of 'e'. Please have a look. Thanks for reporting this issue, this issue is fixed in the attached v13 patch. I have changed relation name pg_publication_schema to pg_publication_sch so that the names are in similar lines with pg_publication_rel relation and similar changes were done for variable names too. Regards, Vignesh
Attachment
>
> On Friday, July 16, 2021 6:10 PM vignesh C <vignesh21@gmail.com>
> > On Wed, Jul 14, 2021 at 6:25 PM houzj.fnst@fujitsu.com
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > > Wednesday, July 14, 2021 6:17 PM vignesh C <vignesh21@gmail.com> wrote:
> > > > On Tue, Jul 13, 2021 at 12:06 PM tanghy.fnst@fujitsu.com
> > > > <tanghy.fnst@fujitsu.com> wrote:
> > > > >
> > > > > On Monday, July 12, 2021 5:36 PM vignesh C <vignesh21@gmail.com>
> > wrote:
> > > > > >
> > > > > > Thanks for reporting this issue, this issue is fixed in the v10
> > > > > > patch attached at [1].
> > > > > > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-
> > > > > > sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com
> > > > >
> > > > > Thanks for fixing it.
> > > > >
> > > > > By applying your V10 patch, I saw three problems, please have a look.
> > > > >
> > > > > 1. An issue about pg_dump.
> > > > > When public schema was published, the publication was created in the
> > > > > output file, but public schema was not added to it. (Other schemas
> > > > > could be added as expected.)
> > > > >
> > > > > I looked into it and found that selectDumpableNamespace function marks
> > > > DUMP_COMPONENT_DEFINITION as needless when the schema is public,
> > > > leading to schema public is ignored in getPublicationSchemas. So we'd better
> > > > check whether schemas should be dumped in another way.
> > > > >
> > > > > I tried to fix it with the following change, please have a look.
> > > > > (Maybe we also need to add some comments for it.)
> > > > >
> > > > > diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
> > > > > index f6b4f12648..a327d2568b 100644
> > > > > --- a/src/bin/pg_dump/pg_dump.c
> > > > > +++ b/src/bin/pg_dump/pg_dump.c
> > > > > @@ -4206,7 +4206,8 @@ getPublicationSchemas(Archive *fout,
> > > > NamespaceInfo nspinfo[], int numSchemas)
> > > > > * Ignore publication membership of schemas whose
> > > > definitions are not
> > > > > * to be dumped.
> > > > > */
> > > > > - if (!(nsinfo->dobj.dump &
> > > > DUMP_COMPONENT_DEFINITION))
> > > > > + if (!((nsinfo->dobj.dump &
> > > > DUMP_COMPONENT_DEFINITION)
> > > > > + || (strcmp(nsinfo->dobj.name, "public") == 0
> > > > > + && nsinfo->dobj.dump != DUMP_COMPONENT_NONE)))
> > > > > continue;
> > > > >
> > > > > pg_log_info("reading publication membership for schema
> > > > > \"%s\"",
> > > >
> > > > I felt it is intentionally done like that as the pubic schema is created by default,
> > > > hence it is not required to dump else we will get errors while restoring.
> > > > Thougths?
> > >
> > > Thanks for the new patches and I also looked at this issue.
> > >
> > > For user defined schema and publication:
> > > --------------------------
> > > create schema s1;
> > > create publication pub2 for SCHEMA s1;
> > > --------------------------
> > >
> > > pg_dump will only generate the following SQLs:
> > >
> > > ------pg_dump result------
> > > CREATE PUBLICATION pub2 WITH (publish = 'insert, update, delete, truncate');
> > > ALTER PUBLICATION pub2 ADD SCHEMA s1;
> > > --------------------------
> > >
> > > But for the public schema:
> > > --------------------------
> > > create publication pub for SCHEMA public;
> > > --------------------------
> > >
> > > pg_dump will only generate the following SQL:
> > >
> > > ------pg_dump result------
> > > CREATE PUBLICATION pub WITH (publish = 'insert, update, delete, truncate');
> > > --------------------------
> > >
> > > It didn't generate SQL like "ALTER PUBLICATION pub ADD SCHEMA public;" which
> > > means the public schema won't be published after restoring. So, I think we'd
> > > better let the pg_dump generate the ADD SCHEMA public SQL. Thoughts ?
> >
> > Thanks for reporting this issue, this issue is fixed in the v12 patch attached.
> >
>
> I tested your v12 patch and found a problem in the following case.
>
> Step 1:
> postgres=# create schema s1;
> CREATE SCHEMA
> postgres=# create table s1.t1 (a int);
> CREATE TABLE
> postgres=# create publication pub_t for table s1.t1;
> CREATE PUBLICATION
> postgres=# create publication pub_s for schema s1;
> CREATE PUBLICATION
>
> Step 2:
> pg_dump -N s1
>
> I dumped and excluded schema s1, pg_dump generated the following SQL:
> -------------------------------
> ALTER PUBLICATION pub_s ADD SCHEMA s1;
>
> I think it was not expected because SQL like "ALTER PUBLICATION pub_t ADD TABLE s1.t1" was not generated in my case. Thoughts?
Thanks for reporting this issue, this issue is fixed in the v13 patch posted at [1]
[1] - https://www.postgresql.org/message-id/CALDaNm0%3DMaXyAok5iq_-DeWUd81vpdF47-MZbbrsd%2BzB2P6WwA%40mail.gmail.com
Regards,
Vignesh
On Mon, Jul 19, 2021 at 2:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Fri, Jul 16, 2021 at 8:13 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Modified. > > > > Thanks for the comments, these issues are fixed as part of the v12 patch posted at [1]. > > [1] - https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com > > > > There seems to be a problem with ALTER PUBLICATION ... SET TABLE ... > After that command, it still regards it as an empty (e) publication, > so I can then ALTER PUBLICATION ... ADD SCHEMA ... > > e.g. > > test_pub=# create schema myschema; > CREATE SCHEMA > test_pub=# CREATE TABLE myschema.test (key int, value text, data jsonb); > CREATE TABLE > test_pub=# create publication pub1; > CREATE PUBLICATION > test_pub=# \dRp+ pub1 > Publication pub1 > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via > root | Pubtype > -------+------------+---------+---------+---------+-----------+----------+--------- > gregn | f | t | t | t | t | f | e > (1 row) > > test_pub=# alter publication pub1 set table myschema.test; > ALTER PUBLICATION > test_pub=# \dRp+ pub1 > Publication pub1 > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via > root | Pubtype > -------+------------+---------+---------+---------+-----------+----------+--------- > gregn | f | t | t | t | t | f | e > (1 row) > > test_pub=# alter publication pub1 add schema myschema; > ALTER PUBLICATION > test_pub=# \dRp+ pub1 > Publication pub1 > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via > root | Pubtype > -------+------------+---------+---------+---------+-----------+----------+--------- > gregn | f | t | t | t | t | f | s > Schemas: > "myschema" Thanks for reporting this issue, this issue is fixed in the v13 patch posted at [1] [1] - https://www.postgresql.org/message-id/CALDaNm0%3DMaXyAok5iq_-DeWUd81vpdF47-MZbbrsd%2BzB2P6WwA%40mail.gmail.com Regards, Vignesh
>
>
>
> On Mon, Jul 19, 2021 at 2:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>>
>> On Fri, Jul 16, 2021 at 8:13 PM vignesh C <vignesh21@gmail.com> wrote:
>> >
>> > Modified.
>> >
>> > Thanks for the comments, these issues are fixed as part of the v12 patch posted at [1].
>> > [1] - https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com
>> >
>>
>> There seems to be a problem with ALTER PUBLICATION ... SET TABLE ...
>> After that command, it still regards it as an empty (e) publication,
>> so I can then ALTER PUBLICATION ... ADD SCHEMA ...
>>
>
> One issue here is that the code to update publication type is missing
> in AlterPublicationTables for SET TABLE command.
Modified.
> More broadly, I am not clear about the behaviour of the patch when a
> publication is created to publish only certain tables, and is later altered to publish
> a whole schema. I think such behaviour is legitimate. However,
> AFAIU as per current code we can't update the publication type
> from PUBTYPE_TABLE to PUBTYPE_SCHEMA.
I initially thought this might not be required for users, I have not made any change for this, I will try to get a few more people's opinion on this and then fix it if required.
> I have some review comments as follows:
> 1.
> In ConvertSchemaSpecListToOidList(List *schemas) function:
> + search_path = fetch_search_path(false);
> + nspname = get_namespace_name(linitial_oid(search_path));
> + if (nspname == NULL) /* recently-deleted namespace? */
> + ereport(ERROR,
> + errcode(ERRCODE_UNDEFINED_SCHEMA),
> + errmsg("no schema has been selected"));
> +
> + schemoid = get_namespace_oid(nspname, false);
> + break;
>
> The call get_namespace_oid() is perhaps not needed as fetch_search_path already fetches oids and simply
> doing Schema oid = liinital_oid(search_path)); should be enough.
Modified
> 2. In the same function should there be an if else condition block instead of a switch case as
> there are only two cases.
Modified.
Thanks for the comments, these comments are fixed in the v13 patch posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0%3DMaXyAok5iq_-DeWUd81vpdF47-MZbbrsd%2BzB2P6WwA%40mail.gmail.com
Regards,
Vignesh
From: vignesh C <vignesh21@gmail.com>
Sent: Thursday, July 22, 2021 1:38 AM
To: Rahila Syed <rahilasyed90@gmail.com>
Cc: Greg Nancarrow <gregn4422@gmail.com>; Tang, Haiying/唐 海英 <tanghy.fnst@fujitsu.com>; Ajin Cherian <itsajin@gmail.com>; PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>; Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Subject: Re: Added schema level support for publication.
On Wed, Jul 21, 2021 at 3:14 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
>
>
>
> On Mon, Jul 19, 2021 at 2:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>>
>> On Fri, Jul 16, 2021 at 8:13 PM vignesh C <vignesh21@gmail.com> wrote:
>> >
>> > Modified.
>> >
>> > Thanks for the comments, these issues are fixed as part of the v12 patch posted at [1].
>> > [1] - https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com
>> >
>>
>> There seems to be a problem with ALTER PUBLICATION ... SET TABLE ...
>> After that command, it still regards it as an empty (e) publication,
>> so I can then ALTER PUBLICATION ... ADD SCHEMA ...
>>
>
> One issue here is that the code to update publication type is missing
> in AlterPublicationTables for SET TABLE command.
Modified.
> More broadly, I am not clear about the behaviour of the patch when a
> publication is created to publish only certain tables, and is later altered to publish
> a whole schema. I think such behaviour is legitimate. However,
> AFAIU as per current code we can't update the publication type
> from PUBTYPE_TABLE to PUBTYPE_SCHEMA.
I initially thought this might not be required for users, I have not made any change for this, I will try to get a few more people's opinion on this and then fix it if required.
> I have some review comments as follows:
> 1.
> In ConvertSchemaSpecListToOidList(List *schemas) function:
> + search_path = fetch_search_path(false);
> + nspname = get_namespace_name(linitial_oid(search_path));
> + if (nspname == NULL) /* recently-deleted namespace? */
> + ereport(ERROR,
> + errcode(ERRCODE_UNDEFINED_SCHEMA),
> + errmsg("no schema has been selected"));
> +
> + schemoid = get_namespace_oid(nspname, false);
> + break;
>
> The call get_namespace_oid() is perhaps not needed as fetch_search_path already fetches oids and simply
> doing Schema oid = liinital_oid(search_path)); should be enough.
Modified
> 2. In the same function should there be an if else condition block instead of a switch case as
> there are only two cases.
Modified.
Thanks for the comments, these comments are fixed in the v13 patch posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0%3DMaXyAok5iq_-DeWUd81vpdF47-MZbbrsd%2BzB2P6WwA%40mail.gmail.com
Regards,
Vignesh
On, July 22, 2021 1:38 AM vignesh C <vignesh21@gmail.com> wrote: >On Wed, Jul 21, 2021 at 3:14 PM Rahila Syed <mailto:rahilasyed90@gmail.com> wrote: >> More broadly, I am not clear about the behaviour of the patch when a >> publication is created to publish only certain tables, and is later altered to publish >> a whole schema. I think such behaviour is legitimate. However, >> AFAIU as per current code we can't update the publication type >> from PUBTYPE_TABLE to PUBTYPE_SCHEMA. > I initially thought this might not be required for users, I have not made any > change for this, I will try to get a few more people's opinion on this and then fix it if required. Currently, It's not allowed to ALTER a FOR TABLE PUBLICATION to a FOR ALL TABLES PUBLICATION. So, I am not sure it's legitimate to ALTER a FOR TABLE PUBLICATION to a FOR SCHEMA PUBLICATION. Personally, It sounds more like a separate feature which can be discussed in a separate thread. Best regards, houzj
On July 22, 2021 1:30 AM vignesh C <vignesh21@gmail.com> wrote > > I think PubType in this case should be 't' instead of 'e'. Please have a look. > > Thanks for reporting this issue, this issue is fixed in the attached v13 patch. > I have changed relation name pg_publication_schema to pg_publication_sch > so that the names are in similar lines with pg_publication_rel relation and similar > changes were done for variable names too. Hi, Thanks for the new version patches. I had a few comments. 1) + + appendPQExpBuffer(query, "ALTER PUBLICATION %s ", fmtId(pubrinfo->pubname)); + appendPQExpBuffer(query, "ADD SCHEMA %s;\n", fmtId(schemainfo->dobj.name)); + It seems we can combine these two function call. like appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD SCHEMA %s;\n", fmtId(pubrinfo->pubname), fmtId(schemainfo->dobj.name)); 2) + footers[0] = pstrdup("Publications:"); + This word seems need to be translated. footers[0] = pstrdup(_("Publications:")); 3) I think it might be better to add a testcase to cover the issue reported before [1]. [1] https://www.postgresql.org/message-id/CAJcOf-dsKOYKmdrU5nwWeFoHvhiACbmw_KU%3DJQMEeDp6WwijqA%40mail.gmail.com 4) Personally, the new name pg_publication_sch is not very easy to understand. (Maybe it's because I am not a native english speaker. If others feel ok, please ignore this comment) Best regards, Houzj
On Thu, Jul 22, 2021 at 1:42 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > Personally, the new name pg_publication_sch is not very easy to understand. > (Maybe it's because I am not a native english speaker. If others feel ok, > please ignore this comment) > I was actually thinking the same thing. I prefer the full SCHEMA/schema, even for all the internal variables/definitions which have been changed since the last patch version. I think Vignesh was trying to be consistent with pg_publication_rel and pg_subscription_rel, but maybe "rel" is better understood to be an abbreviation for "relation" than "sch" for "schema"? Thoughts from others? Regards, Greg Nancarrow Fujitsu Australia
On Thursday, July 22, 2021 1:30 AM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for reporting this issue, this issue is fixed in the attached v13 patch. > I have changed relation name pg_publication_schema to > pg_publication_sch so that the names are in similar lines with > pg_publication_rel relation and similar changes were done for variable > names too. Thanks for your fixing. The issue is fixed as you said. After applying your V13 patch. I noticed that if I specify duplicate schema names when using "ALTER PUBLICATION ... SET SCHEMA...", I would get the following error message: postgres=# ALTER PUBLICATION pub1 SET SCHEMA s1,s1; ERROR: duplicate key value violates unique constraint "pg_publication_sch_psnspcid_pspubid_index" DETAIL: Key (psnspcid, pspubid)=(16406, 16405) already exists. I think the error message is pretty hard to understand. Maybe we can do sth to improve this scenario. Here is two proposal: 1. Don't report error message, just add some code to make the above command to be executed successfully, just like "ALTER PUBLICATION ... SET TABLE ..." as follolws: postgres=# ALTER PUBLICATION pub2 SET TABLE t1,t1; ALTER PUBLICATION postgres=# \dRp+ pub2 Publication pub2 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root | Pubtype ----------+------------+---------+---------+---------+-----------+----------+--------- postgres | f | t | t | t | t | f | t Tables: "public.t1" 2. Report a easily understandable message like: Schema s1 specified more than once Thoughts? Regards Tang
On Fri, Jul 23, 2021 at 10:56 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > > After applying your V13 patch. I noticed that if I specify duplicate schema names when using "ALTER PUBLICATION ... SETSCHEMA ...", I would get the following error message: > > postgres=# ALTER PUBLICATION pub1 SET SCHEMA s1,s1; > ERROR: duplicate key value violates unique constraint "pg_publication_sch_psnspcid_pspubid_index" > DETAIL: Key (psnspcid, pspubid)=(16406, 16405) already exists. > That definitely seems to be a bug, since "ALTER PUBLICATION ... SET TABLE ..." ignores duplicates and there is no ERROR. "CREATE PUBLICATION ... SET SCHEMA s1, s1;" and "ALTER PUBLICATION ... ADD SCHEMA s1, s1;" also give the same kind of error. Regards, Greg Nancarrow Fujitsu Australia
On Thu, Jul 22, 2021 at 9:12 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On July 22, 2021 1:30 AM vignesh C <vignesh21@gmail.com> wrote > > > I think PubType in this case should be 't' instead of 'e'. Please have a look. > > > > Thanks for reporting this issue, this issue is fixed in the attached v13 patch. > > I have changed relation name pg_publication_schema to pg_publication_sch > > so that the names are in similar lines with pg_publication_rel relation and similar > > changes were done for variable names too. > > Hi, > > Thanks for the new version patches. > I had a few comments. > > 1) > + > + appendPQExpBuffer(query, "ALTER PUBLICATION %s ", fmtId(pubrinfo->pubname)); > + appendPQExpBuffer(query, "ADD SCHEMA %s;\n", fmtId(schemainfo->dobj.name)); > + > > It seems we can combine these two function call. > like appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD SCHEMA %s;\n", > fmtId(pubrinfo->pubname), > fmtId(schemainfo->dobj.name)); > Modified. > 2) > + footers[0] = pstrdup("Publications:"); > + > > This word seems need to be translated. > footers[0] = pstrdup(_("Publications:")); Modified > 3) > I think it might be better to add a testcase to cover the issue > reported before [1]. > > [1] https://www.postgresql.org/message-id/CAJcOf-dsKOYKmdrU5nwWeFoHvhiACbmw_KU%3DJQMEeDp6WwijqA%40mail.gmail.com Added > 4) > Personally, the new name pg_publication_sch is not very easy to understand. > (Maybe it's because I am not a native english speaker. If others feel ok, > please ignore this comment) I have changed it to pg_publication_schema as earlier, as Greg also felt the other name was better. Thanks for the comments, the attached v14 patch has the fixes for the same. Regards, Vignesh
Attachment
>
> On Thu, Jul 22, 2021 at 1:42 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > Personally, the new name pg_publication_sch is not very easy to understand.
> > (Maybe it's because I am not a native english speaker. If others feel ok,
> > please ignore this comment)
> >
>
> I was actually thinking the same thing.
> I prefer the full SCHEMA/schema, even for all the internal
> variables/definitions which have been changed since the last patch
> version.
> I think Vignesh was trying to be consistent with pg_publication_rel
> and pg_subscription_rel, but maybe "rel" is better understood to be an
> abbreviation for "relation" than "sch" for "schema"?
> Thoughts from others?
I have changed it to pg_pubication_schema as earlier as both you and Houzj San felt pg_publication_schema was better. This is fixed in the v14 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.gmail.com
On Fri, Jul 23, 2021 at 6:26 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Thursday, July 22, 2021 1:30 AM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for reporting this issue, this issue is fixed in the attached v13 patch. > > I have changed relation name pg_publication_schema to > > pg_publication_sch so that the names are in similar lines with > > pg_publication_rel relation and similar changes were done for variable > > names too. > > Thanks for your fixing. The issue is fixed as you said. > > After applying your V13 patch. I noticed that if I specify duplicate schema names when using "ALTER PUBLICATION ... SETSCHEMA ...", I would get the following error message: > > postgres=# ALTER PUBLICATION pub1 SET SCHEMA s1,s1; > ERROR: duplicate key value violates unique constraint "pg_publication_sch_psnspcid_pspubid_index" > DETAIL: Key (psnspcid, pspubid)=(16406, 16405) already exists. > > I think the error message is pretty hard to understand. Maybe we can do sth to improve this scenario. > > Here is two proposal: > 1. Don't report error message, just add some code to make the above command to be executed successfully, > just like "ALTER PUBLICATION ... SET TABLE ..." as follolws: > > postgres=# ALTER PUBLICATION pub2 SET TABLE t1,t1; > ALTER PUBLICATION > postgres=# \dRp+ pub2 > Publication pub2 > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root | Pubtype > ----------+------------+---------+---------+---------+-----------+----------+--------- > postgres | f | t | t | t | t | f | t > Tables: > "public.t1" > > 2. Report a easily understandable message like: Schema s1 specified more than once > I have changed it to not report any error, this issue is fixed in the v14 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.gmail.com Regards, Vignesh
On Friday, July 23, 2021 8:18 PM vignesh C <vignesh21@gmail.com> wrote: > > I have changed it to not report any error, this issue is fixed in the > v14 patch attached at [1]. > [1] - https://www.postgresql.org/message- > id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.g > mail.com > Thanks for your new patch. But there's a conflict when apply patch v14 on the latest HEAD (it seems caused by commit #678f5448c2d869),please rebase it. Besides, I tested your patch on old HEAD and confirmed that the issue I reported was fixed. Regards Tang
On Fri, Jul 23, 2021 at 10:16 PM vignesh C <vignesh21@gmail.com> wrote: > > I have changed it to pg_pubication_schema as earlier as both you and Houzj San felt pg_publication_schema was better. Thisis fixed in the v14 patch attached at [1]. > [1] - https://www.postgresql.org/message-id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.gmail.com > Thanks, I think it looks better. On another issue, there seems to be problems with pg_dump support for FOR SCHEMA publications. For example: CREATE PUBLICATION p FOR SCHEMA myschema; pg_dump is dumping: CREATE PUBLICATION p WITH (publish = 'insert, update, delete, truncate'); ... ALTER PUBLICATION p ADD SCHEMA p; Obviously that last line should instead be "ALTER PUBLICATION p ADD SCHEMA myschema;" I think the bug is caused by the following: + appendPQExpBuffer(query, + "ALTER PUBLICATION %s ADD SCHEMA %s;\n", + fmtId(pubsinfo->pubname), fmtId(schemainfo->dobj.name)); The comment for fmtId() says: * Note that the returned string must be used before calling fmtId again, * since we re-use the same return buffer each time. So I think there was a reason why the ALTER PUBLICATION and ADD SCHEMA were previously formatted separately (and so should NOT have been combined). It should be restored to how it was in the previous patch version. Regards, Greg Nancarrow Fujitsu Australia
On Mon, Jul 26, 2021 at 8:17 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Fri, Jul 23, 2021 at 10:16 PM vignesh C <vignesh21@gmail.com> wrote: > > > > I have changed it to pg_pubication_schema as earlier as both you and Houzj San felt pg_publication_schema was better.This is fixed in the v14 patch attached at [1]. > > [1] - https://www.postgresql.org/message-id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.gmail.com > > > > Thanks, I think it looks better. > > On another issue, there seems to be problems with pg_dump support for > FOR SCHEMA publications. > > For example: > > CREATE PUBLICATION p FOR SCHEMA myschema; > > pg_dump is dumping: > > CREATE PUBLICATION p WITH (publish = 'insert, update, delete, truncate'); > ... > ALTER PUBLICATION p ADD SCHEMA p; > > Obviously that last line should instead be "ALTER PUBLICATION p ADD > SCHEMA myschema;" > > I think the bug is caused by the following: > > + appendPQExpBuffer(query, > + "ALTER PUBLICATION %s ADD SCHEMA %s;\n", > + fmtId(pubsinfo->pubname), fmtId(schemainfo->dobj.name)); > > The comment for fmtId() says: > > * Note that the returned string must be used before calling fmtId again, > * since we re-use the same return buffer each time. > > So I think there was a reason why the ALTER PUBLICATION and ADD SCHEMA > were previously formatted separately (and so should NOT have been > combined). > It should be restored to how it was in the previous patch version. Thanks for the comment, this is modified in the v15 patch attached. Regards, Vignesh
Attachment
>
> On Friday, July 23, 2021 8:18 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have changed it to not report any error, this issue is fixed in the
> > v14 patch attached at [1].
> > [1] - https://www.postgresql.org/message-
> > id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.g
> > mail.com
> >
>
> Thanks for your new patch. But there's a conflict when apply patch v14 on the latest HEAD (it seems caused by commit #678f5448c2d869), please rebase it.
>
Thanks for reporting it, it is rebased at the v15 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm27LRWF9ney%3DcVeD-0jc2%2BJ5Y0wNQhighZB%3DAat4VbNBA%40mail.gmail.com
On Mon, Jul 26, 2021 at 3:21 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for the comment, this is modified in the v15 patch attached. > I have several minor review comments. (1) src/backend/catalog/objectaddress.c Should start comment sentences with an uppercase letter, for consistency. + /* fetch publication name and schema oid from input list */ I also notice that some 1-sentence comments end with "." (full-stop) and others don't. It seems to alternate all over the place, and so is quite noticeable. Unfortunately, it already seems to be like this in much of the code that this patch patches. Ideally (at least my personal preference is) 1-sentence comments should not end with a ".". (2) src/backend/catalog/pg_publication.c errdetail message I think the following should say "Temporary schemas ..." (since the existing error message for tables says "System tables cannot be added to publications."). + errdetail("Temporary schema cannot be added to publications."))); (3) src/backend/commands/publicationcmds.c PublicationAddTables I think that the Assert below is not correct (i.e. not restrictive enough). Although the condition passes, it is allowing, for example, stmt->for_all_tables==true if stmt->schemas==NIL, and that doesn't seem to be correct. I suggest the following change: BEFORE: + Assert(!stmt || !stmt->for_all_tables || !stmt->schemas); AFTER: + Assert(!stmt || (!stmt->for_all_tables && !stmt->schemas)); (4) src/backend/commands/publicationcmds.c PublicationAddSchemas Similarly, I think that the Assert below is not restrictive enough, and think it should be changed: BEFORE: + Assert(!stmt || !stmt->for_all_tables || !stmt->tables); AFTER: + Assert(!stmt || (!stmt->for_all_tables && !stmt->tables)); (5) src/bin/pg_dump/common.c Spelling mistake. BEFORE: + pg_log_info("reading publciation schemas"); AFTER: + pg_log_info("reading publication schemas"); Regards, Greg Nancarrow Fujitsu Australia
On Monday, July 26, 2021 1:25 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Jul 26, 2021 at 7:41 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:
> >
> > On Friday, July 23, 2021 8:18 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > I have changed it to not report any error, this issue is fixed in the
> > > v14 patch attached at [1].
> > > [1] - https://www.postgresql.org/message-
> > > id/CALDaNm3DTj535ezfmm8QHLOtOkcHF2ZcCfSjfR%3DVbTbLZXFRsA%40mail.g
> > > mail.com
> > >
> >
> > Thanks for your new patch. But there's a conflict when apply patch v14 on the latest HEAD (it seems caused by commit #678f5448c2d869), please rebase it.
> >
>
> Thanks for reporting it, it is rebased at the v15 patch attached at [1].
Thanks for your new patch. I applied your patch and it succeeded.
Here are some comments on the tests in your patch. The attached file included changes about the comments, please have a look.
1. src/test/regress/sql/publication.sql
There are some existing tests to verify that we can't add table to ‘for all tables publication', should we add some tests about adding schema to ‘for all tables publication’ / ‘for table publication’?
I added some cases in the attached file.
Besides, the following existing comment seems not suitable. It uses 'SET' but the comment says 'add'. And since we can add table or schema, I think we should point out what we add is table, thoughts?
-- fail - can't add to for all tables publication
ALTER PUBLICATION testpub_foralltables SET TABLE pub_test.testpub_nopk;
2. src/test/subscription/t/001_rep_changes.pl
2.1
+# Insert some data into few tables and verify that inserted data is replicated.
+$node_publisher->safe_psql('postgres', "INSERT INTO sch1.tab1 VALUES(generate_series(11,20))");
+$node_publisher->safe_psql('postgres', "INSERT INTO sch2.tab1 VALUES(generate_series(11,20))");
+
+$node_publisher->wait_for_catchup('tap_sub_schema');
There are two spaces between "INTO" and "sch1.tab1".
2.2
Most of publication names are lowercase, and some of them are uppercase. I think it will be better if all of them are lowercase.
I modified them in the attached file.
2.3
+# Verify that the subscription relation list is updated after refresh.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM pg_subscription_rel WHERE srsubid IN (SELECT oid FROM pg_subscription WHERE subname = 'tap_sub_schema')");
+is($result, qq(5),
+ 'check subscription relation status was dropped on subscriber');
Should it be 'check subscription relation status is not yet dropped on subscriber' here?
2.4
+# Drop publications as we don't need them anymore
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_schema");
There is only one publication, so the comment here should be 'Drop publication as we don't need it anymore'.
3.
There are some existing test cases about publication for table and publication for all tables in 002_pg_dump.pl, so I think we could add some test cases about publication for schema.
I tried to add some cases in the attached file.
Regards
Tang
Attachment
On Tue, Jul 27, 2021 at 12:21 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote > > Thanks for your new patch. I applied your patch and it succeeded. > > Here are some comments on the tests in your patch. The attached file included changes about the comments, please have alook. > > > > 1. src/test/regress/sql/publication.sql > > There are some existing tests to verify that we can't add table to ‘for all tables publication', should we add some testsabout adding schema to ‘for all tables publication’ / ‘for table publication’? > > I added some cases in the attached file. I have taken the changes. > > Besides, the following existing comment seems not suitable. It uses 'SET' but the comment says 'add'. And since we canadd table or schema, I think we should point out what we add is table, thoughts? > > -- fail - can't add to for all tables publication > > ALTER PUBLICATION testpub_foralltables SET TABLE pub_test.testpub_nopk; > Since this is in base code, you might want to post a patch on a separate thread. > 2. src/test/subscription/t/001_rep_changes.pl > > 2.1 > > +# Insert some data into few tables and verify that inserted data is replicated. > > +$node_publisher->safe_psql('postgres', "INSERT INTO sch1.tab1 VALUES(generate_series(11,20))"); > > +$node_publisher->safe_psql('postgres', "INSERT INTO sch2.tab1 VALUES(generate_series(11,20))"); > > + > > +$node_publisher->wait_for_catchup('tap_sub_schema'); > > > > There are two spaces between "INTO" and "sch1.tab1". I have taken the changes. > 2.2 > > Most of publication names are lowercase, and some of them are uppercase. I think it will be better if all of them are lowercase. > > I modified them in the attached file. I have taken the changes. > 2.3 > > +# Verify that the subscription relation list is updated after refresh. > > +$result = $node_subscriber->safe_psql('postgres', > > + "SELECT count(*) FROM pg_subscription_rel WHERE srsubid IN (SELECT oid FROM pg_subscription WHERE subname = 'tap_sub_schema')"); > > +is($result, qq(5), > > + 'check subscription relation status was dropped on subscriber'); > > > > Should it be 'check subscription relation status is not yet dropped on subscriber' here? I have taken the changes. > 2.4 > > +# Drop publications as we don't need them anymore > > +$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_schema"); > > > > There is only one publication, so the comment here should be 'Drop publication as we don't need it anymore'. I have taken the changes. > 3. > > There are some existing test cases about publication for table and publication for all tables in 002_pg_dump.pl, so I thinkwe could add some test cases about publication for schema. > > I tried to add some cases in the attached file. Thanks for the patch, I have merged the changes. Attached v16 patch has the fixes for the same. Regards, Vignesh
Attachment
>
> On Mon, Jul 26, 2021 at 3:21 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comment, this is modified in the v15 patch attached.
> >
>
> I have several minor review comments.
>
> (1) src/backend/catalog/objectaddress.c
> Should start comment sentences with an uppercase letter, for consistency.
>
> + /* fetch publication name and schema oid from input list */
>
> I also notice that some 1-sentence comments end with "." (full-stop)
> and others don't. It seems to alternate all over the place, and so is
> quite noticeable.
> Unfortunately, it already seems to be like this in much of the code
> that this patch patches.
> Ideally (at least my personal preference is) 1-sentence comments
> should not end with a ".".
Modified.
> (2) src/backend/catalog/pg_publication.c
> errdetail message
>
> I think the following should say "Temporary schemas ..." (since the
> existing error message for tables says "System tables cannot be added
> to publications.").
>
> + errdetail("Temporary schema cannot be added to publications.")));
>
Modified.
> (3) src/backend/commands/publicationcmds.c
> PublicationAddTables
>
> I think that the Assert below is not correct (i.e. not restrictive enough).
> Although the condition passes, it is allowing, for example,
> stmt->for_all_tables==true if stmt->schemas==NIL, and that doesn't
> seem to be correct.
> I suggest the following change:
>
> BEFORE:
> + Assert(!stmt || !stmt->for_all_tables || !stmt->schemas);
> AFTER:
> + Assert(!stmt || (!stmt->for_all_tables && !stmt->schemas));
>
Modified.
> (4) src/backend/commands/publicationcmds.c
> PublicationAddSchemas
>
> Similarly, I think that the Assert below is not restrictive enough,
> and think it should be changed:
>
> BEFORE:
> + Assert(!stmt || !stmt->for_all_tables || !stmt->tables);
> AFTER:
> + Assert(!stmt || (!stmt->for_all_tables && !stmt->tables));
>
Modified.
> (5) src/bin/pg_dump/common.c
>
> Spelling mistake.
>
> BEFORE:
> + pg_log_info("reading publciation schemas");
> AFTER:
> + pg_log_info("reading publication schemas");
Thanks for the comments, the comments are fixed in the v16 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm2LgV5XcLF80rJ60NwnjKpZj%3D%3DLxJpO4W2TG2G5XmUtDA%40mail.gmail.com
On July 28, 2021 6:44 PM vignesh C <vignesh21@gmail.com> wrote: > Thanks for the patch, I have merged the changes. Attached v16 patch has the > fixes for the same. Thanks for the new version patches. Here are a few comments: 1) + /* Identify which schemas should be dropped */ + foreach(oldlc, oldschemaids) + { + Oid oldschemaid = lfirst_oid(oldlc); + + if (!list_member_oid(schemaoidlist, oldschemaid)) + delschemas = lappend_oid(delschemas, oldschemaid); + } + We can use list_difference here to simplify the code. 2) + + /* Filter out duplicates if user specifies "sch1, sch1" */ + if (list_member_oid(schemaoidlist, schemaoid)) + continue; + + schemaoidlist = lappend_oid(schemaoidlist, schemaoid); It might be more concise to use list_append_unique_oid() here. 3) + <para> + Create a publication that publishes all changes for all the tables present in +the schema "production": +<programlisting> The second line seems not aligned. After: + <para> + Create a publication that publishes all changes for all the tables present in + the schema "production": +<programlisting> 4) + resetPQExpBuffer(query); + + /* Get the publication membership for the schema */ + appendPQExpBuffer(query, + "SELECT ps.psnspcid, ps.oid, p.pubname, p.oid AS pubid " + "FROM pg_publication_schema ps, pg_publication p " + "WHERE ps.psnspcid = '%u' " + "AND p.oid = ps.pspubid", + nsinfo->dobj.catId.oid); It seems we can use printfPQExpBuffer() here which is a convenience routine that does the same thing as resetPQExpBuffer() followed by appendPQExpBuffer(). Best regards, Houzj
On Fri, Jul 30, 2021 at 12:12 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On July 28, 2021 6:44 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for the patch, I have merged the changes. Attached v16 patch has the > > fixes for the same. > > Thanks for the new version patches. > Here are a few comments: > > 1) > > + /* Identify which schemas should be dropped */ > + foreach(oldlc, oldschemaids) > + { > + Oid oldschemaid = lfirst_oid(oldlc); > + > + if (!list_member_oid(schemaoidlist, oldschemaid)) > + delschemas = lappend_oid(delschemas, oldschemaid); > + } > + > > We can use list_difference here to simplify the code. Modified. > 2) > > + > + /* Filter out duplicates if user specifies "sch1, sch1" */ > + if (list_member_oid(schemaoidlist, schemaoid)) > + continue; > + > + schemaoidlist = lappend_oid(schemaoidlist, schemaoid); > > It might be more concise to use list_append_unique_oid() here. Modified. > 3) > > + <para> > + Create a publication that publishes all changes for all the tables present in > +the schema "production": > +<programlisting> > > The second line seems not aligned. > After: > + <para> > + Create a publication that publishes all changes for all the tables present in > + the schema "production": > +<programlisting> Modified. > 4) > > + resetPQExpBuffer(query); > + > + /* Get the publication membership for the schema */ > + appendPQExpBuffer(query, > + "SELECT ps.psnspcid, ps.oid, p.pubname, p.oid AS pubid " > + "FROM pg_publication_schema ps, pg_publication p " > + "WHERE ps.psnspcid = '%u' " > + "AND p.oid = ps.pspubid", > + nsinfo->dobj.catId.oid); > > It seems we can use printfPQExpBuffer() here which is a convenience routine > that does the same thing as resetPQExpBuffer() followed by appendPQExpBuffer(). Modified. Thanks for the comments, attached v17 patches has the fixes for the same. Regards, Vignesh
Attachment
On Monday, August 2, 2021 11:40 PM vignesh C <vignesh21@gmail.com>wrote: > > Thanks for the comments, attached v17 patches has the fixes for the same. Thanks for your new patch. I saw the following warning when compiling. It seems we don't need this variable any more. publicationcmds.c: In function ‘AlterPublicationSchemas’: publicationcmds.c:592:15: warning: unused variable ‘oldlc’ [-Wunused-variable] ListCell *oldlc; ^~~~~ Regards Tang
On August 2, 2021 11:40 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for the comments, attached v17 patches has the fixes for the same. Thanks for the new patch, it looks good to me except one minor thing: It might be better to add the [CREATE PUBLICATION xxx FOR SCHEMA ] in tab-complete.c Best regards, houzj
On Tuesday, August 3, 2021 4:10 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > On August 2, 2021 11:40 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for the comments, attached v17 patches has the fixes for the same. > > Thanks for the new patch, it looks good to me except one minor thing: > It might be better to add the [CREATE PUBLICATION xxx FOR SCHEMA ] in > tab-complete.c Sorry, the patch already had the logic, please ignore this comment. Best regards, houzj
On Tue, Aug 3, 2021 at 12:00 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Monday, August 2, 2021 11:40 PM vignesh C <vignesh21@gmail.com>wrote: > > > > Thanks for the comments, attached v17 patches has the fixes for the same. > > Thanks for your new patch. > > I saw the following warning when compiling. It seems we don't need this variable any more. > > publicationcmds.c: In function ‘AlterPublicationSchemas’: > publicationcmds.c:592:15: warning: unused variable ‘oldlc’ [-Wunused-variable] > ListCell *oldlc; > ^~~~~ Thanks for reporting this, this is fixed in the v18 patch attached. Regards, Vignesh
Attachment
On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for reporting this, this is fixed in the v18 patch attached. > I have started looking into this patch and below are some initial comments. 1. + /* Fetch publication name and schema oid from input list */ + schemaname = strVal(linitial(object)); + pubname = strVal(lsecond(object)); I think the comment should be: "Fetch schema name and publication name from input list" 2. @@ -3902,6 +3958,46 @@ getObjectDescription(const ObjectAddress *object, bool missing_ok) break; } + case OCLASS_PUBLICATION_SCHEMA: + { + HeapTuple tup; + char *pubname; + Form_pg_publication_schema psform; + char *nspname; + + tup = SearchSysCache1(PUBLICATIONSCHEMA, + ObjectIdGetDatum(object->objectId)); + if (!HeapTupleIsValid(tup)) + { + if (!missing_ok) + elog(ERROR, "cache lookup failed for publication schema %u", + object->objectId); + break; + } + + psform = (Form_pg_publication_schema) GETSTRUCT(tup); + pubname = get_publication_name(psform->pspubid, false); + nspname = get_namespace_name(psform->psnspcid); + if (!nspname) + { + Oid psnspcid = psform->psnspcid; + + pfree(pubname); + ReleaseSysCache(tup); + if (!missing_ok) + elog(ERROR, "cache lookup failed for schema %u", + psnspcid); + break; + } The above code in getObjectDescription looks quite similar to what you have in getObjectIdentityParts(). Can we extract the common code into a separate function? 3. Can we use column name pubkind (similar to relkind in pg_class) instead of pubtype? If so, please change PUBTYPE_ALLTABLES and similar other defines to PUBKIND_*. 4. @@ -3632,6 +3650,7 @@ typedef struct CreatePublicationStmt List *options; /* List of DefElem nodes */ List *tables; /* Optional list of tables to add */ bool for_all_tables; /* Special publication for all tables in db */ + List *schemas; /* Optional list of schemas */ } CreatePublicationStmt; Isn't it better to keep a schemas list after tables? 5. @@ -1163,12 +1168,27 @@ get_rel_sync_entry(PGOutputData *data, Oid relid) Publication *pub = lfirst(lc); bool publish = false; - if (pub->alltables) + if (pub->pubtype == PUBTYPE_ALLTABLES) { publish = true; if (pub->pubviaroot && am_partition) publish_as_relid = llast_oid(get_partition_ancestors(relid)); } + else if (pub->pubtype == PUBTYPE_SCHEMA) + { + Oid schemaId = get_rel_namespace(relid); + Oid psid = GetSysCacheOid2(PUBLICATIONSCHEMAMAP, + Anum_pg_publication_schema_oid, + ObjectIdGetDatum(schemaId), + ObjectIdGetDatum(pub->oid)); + + if (OidIsValid(psid)) + { + publish = true; + if (pub->pubviaroot && am_partition) + publish_as_relid = llast_oid(get_partition_ancestors(relid)); + } + } Isn't it better to get schema publications once as we get relation publications via GetRelationPublications and then decide whether to publish or not? I think that will save repeated cache searches for each publication requested by the subscriber? 6. + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */ + PublicationSchemaPsnspcidPspubidIndexId, + 2, + { + Anum_pg_publication_schema_psnspcid, + Anum_pg_publication_schema_pspubid, + 0, + 0 + }, Why don't we keep pubid as the first column in this index? 7. getPublicationSchemas() { .. + /* Get the publication membership for the schema */ + printfPQExpBuffer(query, + "SELECT ps.psnspcid, ps.oid, p.pubname, p.oid AS pubid " + "FROM pg_publication_schema ps, pg_publication p " + "WHERE ps.psnspcid = '%u' " + "AND p.oid = ps.pspubid", + nsinfo->dobj.catId.oid); .. } Why do you need to use join here? Why the query and another handling in this function be similar to what we have in getPublicationTables? Also, there is another function GetPublicationSchemas() in the patch, can we name one of these differently for the purpose of easy grepability? -- With Regards, Amit Kapila.
On Tuesday, August 3, 2021 11:08 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for reporting this, this is fixed in the v18 patch attached. Thanks for fixing it. Few suggestions for V18: 1. +# Clean up the tables on both publisher and subscriber as we don't need them +$node_publisher->safe_psql('postgres', "DROP SCHEMA sch1 cascade"); +$node_publisher->safe_psql('postgres', "DROP SCHEMA sch2 cascade"); +$node_publisher->safe_psql('postgres', "DROP SCHEMA sch3 cascade"); +$node_subscriber->safe_psql('postgres', "DROP SCHEMA sch1 cascade"); +$node_subscriber->safe_psql('postgres', "DROP SCHEMA sch2 cascade"); +$node_subscriber->safe_psql('postgres', "DROP SCHEMA sch3 cascade"); Should we change the comment to "Clean up the schemas ... ", instead of 'tables'? 2. +$result = $node_subscriber->safe_psql('postgres', + "SELECT count(*) FROM sch1.tab3"); Spaces are used here(and some other places), but in most places we use a TAB, so I think it's better to change it to a TAB. 3. List *tables; /* List of tables to add/drop */ bool for_all_tables; /* Special publication for all tables in db */ DefElemAction tableAction; /* What action to perform with the tables */ + List *schemas; /* Optional list of schemas */ } AlterPublicationStmt; Should we change the comment here to 'List of schemas to add/drop', then it can be consistent with the comment for 'tables'. I also noticed that 'tableAction' variable is used when we add/drop/set schemas, so maybe the variable name is not suitable any more. Besides, the following comment is above these codes. Should we add some comments for schema? /* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */ And it says 'add/drop', do we need to add 'set'? (it's not related to this patch, so I think I can add it in another thread[1] if needed, which is related to comment improvement) 4. I saw the existing tests about permissions in publication.sql, should we add tests for schema publication? Like this: diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql index 33dbdf7bed..c19337631e 100644 --- a/src/test/regress/sql/publication.sql +++ b/src/test/regress/sql/publication.sql @@ -160,16 +160,19 @@ GRANT CREATE ON DATABASE regression TO regress_publication_user2; SET ROLE regress_publication_user2; SET client_min_messages = 'ERROR'; CREATE PUBLICATION testpub2; -- ok +CREATE PUBLICATION testpub3; -- ok RESET client_min_messages; ALTER PUBLICATION testpub2 ADD TABLE testpub_tbl1; -- fail +ALTER PUBLICATION testpub3 ADD SCHEMA pub_test; -- fail SET ROLE regress_publication_user; GRANT regress_publication_user TO regress_publication_user2; SET ROLE regress_publication_user2; ALTER PUBLICATION testpub2 ADD TABLE testpub_tbl1; -- ok +ALTER PUBLICATION testpub3 ADD SCHEMA pub_test; -- ok -DROP PUBLICATION testpub2; +DROP PUBLICATION testpub2, testpub3; SET ROLE regress_publication_user; REVOKE CREATE ON DATABASE regression FROM regress_publication_user2; [1] https://www.postgresql.org/message-id/flat/OS0PR01MB6113480F937572BF1216DD61FBEF9%40OS0PR01MB6113.jpnprd01.prod.outlook.com Regards Tang
On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for reporting this, this is fixed in the v18 patch attached. > > > > I have started looking into this patch and below are some initial comments. > Few more comments: =================== 1. Do we need the previous column 'puballtables' after adding pubtype or pubkind in pg_publication? 2. @@ -224,6 +279,20 @@ CreatePublication(ParseState *pstate, CreatePublicationStmt *stmt) .. + nspcrel = table_open(NamespaceRelationId, ShareUpdateExclusiveLock); + PublicationAddSchemas(puboid, schemaoidlist, true, NULL); + table_close(nspcrel, ShareUpdateExclusiveLock); What is the reason for opening and taking a lock on NamespaceRelationId? Do you want to avoid dropping the corresponding schema during this duration? If so, that is not sufficient because what if somebody drops it just before you took lock on NamespaceRelationId. I think you need to use LockDatabaseObject to avoid dropping the schema and note that it should be unlocked only at the end of the transaction similar to what we do for tables. I guess you need to add this locking inside the function PublicationAddSchemas. Also, it would be good if you can add few comments in this part of the code to explain the reason for locking. 3. The function PublicationAddSchemas is called from multiple places in the patch but the locking protection is not there at all places. I think if you add locking as suggested in the previous point then it should be okay. I think you need similar locking for PublicationDropSchemas. 4. @@ -421,16 +537,84 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel, PublicationAddTables(pubid, rels, true, stmt); CloseTableList(delrels); + if (pubform->pubtype == PUBTYPE_EMPTY) + UpdatePublicationTypeTupleValue(rel, tup, + Anum_pg_publication_pubtype, + PUBTYPE_TABLE); } At the above and all other similar places, the patch first updates the pg_publication after performing the rel/schema operation. Isn't it better to first update pg_publication to remain in sync with how CreatePublication works? I am not able to see any issue with the way you have it in the patch but it is better to keep the code consistent across various similar functions to avoid confusion in the future. -- With Regards, Amit Kapila.
On Wed, Aug 4, 2021 at 12:08 AM vignesh C <vignesh21@gmail.com> wrote: > > On Tue, Aug 3, 2021 at 12:00 PM tanghy.fnst@fujitsu.com > <tanghy.fnst@fujitsu.com> wrote: > > > > On Monday, August 2, 2021 11:40 PM vignesh C <vignesh21@gmail.com>wrote: > > > > > > Thanks for the comments, attached v17 patches has the fixes for the same. > > > > Thanks for your new patch. > > > > I saw the following warning when compiling. It seems we don't need this variable any more. > > > > publicationcmds.c: In function ‘AlterPublicationSchemas’: > > publicationcmds.c:592:15: warning: unused variable ‘oldlc’ [-Wunused-variable] > > ListCell *oldlc; > > ^~~~~ > > Thanks for reporting this, this is fixed in the v18 patch attached. I've also started reviewing this patch. I've not looked at the patch yet but here are initial comments/questions based on using this feature: pg_publication catalog still has puballtables column but it's still necessary? IIUC since pubtype = 'a' means publishing all tables in the database puballtables seems no longer necessary. --- Suppose that a parent table and its child table are defined in different schemas, there is a publication for the schema where only the parent table is defined, and the subscriber subscribes to the publication, should changes for its child table be replicated to the subscriber? In FOR TABLE cases, i.g., where the subscriber subscribes to the publication that is only for the parent table, changes for its child table are replicated to the subscriber. As far as I tested v18 patch, changes for the child table are not replicated in FOR SCHEMA cases. Here is the test script: On publisher and subscriber: create schema p_schema; create schema c_schema; create table p_schema.p (a int) partition by list (a); create table c_schema.c partition of p_schema.p for values in (1); On publisher: create publication pub_p_schema for schema p_schema; On subscriber: create subscription pub connection 'dbname=postgres' publication pub_p_schema; On publisher: insert into p_schema.p values (1); select * from p_schema.p; a --- 1 (1 row) On subscriber: select * from p_schema.p; a --- (0 rows) Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for reporting this, this is fixed in the v18 patch attached. > > > > I have started looking into this patch and below are some initial comments. > > 1. > + /* Fetch publication name and schema oid from input list */ > + schemaname = strVal(linitial(object)); > + pubname = strVal(lsecond(object)); > > I think the comment should be: "Fetch schema name and publication name > from input list" > Modified. > 2. > @@ -3902,6 +3958,46 @@ getObjectDescription(const ObjectAddress > *object, bool missing_ok) > break; > } > > + case OCLASS_PUBLICATION_SCHEMA: > + { > + HeapTuple tup; > + char *pubname; > + Form_pg_publication_schema psform; > + char *nspname; > + > + tup = SearchSysCache1(PUBLICATIONSCHEMA, > + ObjectIdGetDatum(object->objectId)); > + if (!HeapTupleIsValid(tup)) > + { > + if (!missing_ok) > + elog(ERROR, "cache lookup failed for publication schema %u", > + object->objectId); > + break; > + } > + > + psform = (Form_pg_publication_schema) GETSTRUCT(tup); > + pubname = get_publication_name(psform->pspubid, false); > + nspname = get_namespace_name(psform->psnspcid); > + if (!nspname) > + { > + Oid psnspcid = psform->psnspcid; > + > + pfree(pubname); > + ReleaseSysCache(tup); > + if (!missing_ok) > + elog(ERROR, "cache lookup failed for schema %u", > + psnspcid); > + break; > + } > > The above code in getObjectDescription looks quite similar to what you > have in getObjectIdentityParts(). Can we extract the common code into > a separate function? Modified. > 3. Can we use column name pubkind (similar to relkind in pg_class) > instead of pubtype? If so, please change PUBTYPE_ALLTABLES and similar > other defines to PUBKIND_*. > Modified. > 4. > @@ -3632,6 +3650,7 @@ typedef struct CreatePublicationStmt > List *options; /* List of DefElem nodes */ > List *tables; /* Optional list of tables to add */ > bool for_all_tables; /* Special publication for all tables in db */ > + List *schemas; /* Optional list of schemas */ > } CreatePublicationStmt; > > Isn't it better to keep a schemas list after tables? Modified. > 5. > @@ -1163,12 +1168,27 @@ get_rel_sync_entry(PGOutputData *data, Oid relid) > Publication *pub = lfirst(lc); > bool publish = false; > > - if (pub->alltables) > + if (pub->pubtype == PUBTYPE_ALLTABLES) > { > publish = true; > if (pub->pubviaroot && am_partition) > publish_as_relid = llast_oid(get_partition_ancestors(relid)); > } > + else if (pub->pubtype == PUBTYPE_SCHEMA) > + { > + Oid schemaId = get_rel_namespace(relid); > + Oid psid = GetSysCacheOid2(PUBLICATIONSCHEMAMAP, > + Anum_pg_publication_schema_oid, > + ObjectIdGetDatum(schemaId), > + ObjectIdGetDatum(pub->oid)); > + > + if (OidIsValid(psid)) > + { > + publish = true; > + if (pub->pubviaroot && am_partition) > + publish_as_relid = llast_oid(get_partition_ancestors(relid)); > + } > + } > > Isn't it better to get schema publications once as we get relation > publications via GetRelationPublications and then decide whether to > publish or not? I think that will save repeated cache searches for > each publication requested by the subscriber? Modified. > 6. > + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */ > + PublicationSchemaPsnspcidPspubidIndexId, > + 2, > + { > + Anum_pg_publication_schema_psnspcid, > + Anum_pg_publication_schema_pspubid, > + 0, > + 0 > + }, > > Why don't we keep pubid as the first column in this index? I wanted to keep it similar to PUBLICATIONRELMAP, should we keep it as it is, thoughts? > 7. > getPublicationSchemas() > { > .. > + /* Get the publication membership for the schema */ > + printfPQExpBuffer(query, > + "SELECT ps.psnspcid, ps.oid, p.pubname, p.oid AS pubid " > + "FROM pg_publication_schema ps, pg_publication p " > + "WHERE ps.psnspcid = '%u' " > + "AND p.oid = ps.pspubid", > + nsinfo->dobj.catId.oid); > .. > } > > Why do you need to use join here? Why the query and another handling > in this function be similar to what we have in getPublicationTables? > Also, there is another function GetPublicationSchemas() in the patch, > can we name one of these differently for the purpose of easy > grepability? Modified it similar to getPublicationTables without joins. The function is renamed to getPublicationNamespaces. Thanks for the comments, the attached v19 patch has the fixes for the comments. Regards, Vignesh
Attachment
>
> On Tuesday, August 3, 2021 11:08 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for reporting this, this is fixed in the v18 patch attached.
>
> Thanks for fixing it.
>
> Few suggestions for V18:
>
> 1.
> +# Clean up the tables on both publisher and subscriber as we don't need them
> +$node_publisher->safe_psql('postgres', "DROP SCHEMA sch1 cascade");
> +$node_publisher->safe_psql('postgres', "DROP SCHEMA sch2 cascade");
> +$node_publisher->safe_psql('postgres', "DROP SCHEMA sch3 cascade");
> +$node_subscriber->safe_psql('postgres', "DROP SCHEMA sch1 cascade");
> +$node_subscriber->safe_psql('postgres', "DROP SCHEMA sch2 cascade");
> +$node_subscriber->safe_psql('postgres', "DROP SCHEMA sch3 cascade");
>
> Should we change the comment to "Clean up the schemas ... ", instead of 'tables'?
Modified.
> 2.
> +$result = $node_subscriber->safe_psql('postgres',
> + "SELECT count(*) FROM sch1.tab3");
>
> Spaces are used here(and some other places), but in most places we use a TAB, so
> I think it's better to change it to a TAB.
Modified.
> 3.
> List *tables; /* List of tables to add/drop */
> bool for_all_tables; /* Special publication for all tables in db */
> DefElemAction tableAction; /* What action to perform with the tables */
> + List *schemas; /* Optional list of schemas */
> } AlterPublicationStmt;
>
> Should we change the comment here to 'List of schemas to add/drop', then it can
> be consistent with the comment for 'tables'.
Modified.
> I also noticed that 'tableAction' variable is used when we add/drop/set schemas,
> so maybe the variable name is not suitable any more.
Changed the variable name.
> Besides, the following comment is above these codes. Should we add some comments
> for schema?
>
> /* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
Modified
> And it says 'add/drop', do we need to add 'set'? (it's not related to this
> patch, so I think I can add it in another thread[1] if needed, which is related
> to comment improvement)
You can include the change in the patch posted.
> 4.
> I saw the existing tests about permissions in publication.sql, should we add
> tests for schema publication? Like this:
>
> diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
> index 33dbdf7bed..c19337631e 100644
> --- a/src/test/regress/sql/publication.sql
> +++ b/src/test/regress/sql/publication.sql
> @@ -160,16 +160,19 @@ GRANT CREATE ON DATABASE regression TO regress_publication_user2;
> SET ROLE regress_publication_user2;
> SET client_min_messages = 'ERROR';
> CREATE PUBLICATION testpub2; -- ok
> +CREATE PUBLICATION testpub3; -- ok
> RESET client_min_messages;
>
> ALTER PUBLICATION testpub2 ADD TABLE testpub_tbl1; -- fail
> +ALTER PUBLICATION testpub3 ADD SCHEMA pub_test; -- fail
>
> SET ROLE regress_publication_user;
> GRANT regress_publication_user TO regress_publication_user2;
> SET ROLE regress_publication_user2;
> ALTER PUBLICATION testpub2 ADD TABLE testpub_tbl1; -- ok
> +ALTER PUBLICATION testpub3 ADD SCHEMA pub_test; -- ok
>
> -DROP PUBLICATION testpub2;
> +DROP PUBLICATION testpub2, testpub3;
>
> SET ROLE regress_publication_user;
> REVOKE CREATE ON DATABASE regression FROM regress_publication_user2;
Added.
The changes for the above are available in the v19 patch posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm3BMLBpWOSdS3Q2vwpsM%3D0yovpJm8dKHRqNyFpANbrhpw%40mail.gmail.com
Regards,
Vignesh
>
> On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > Thanks for reporting this, this is fixed in the v18 patch attached.
> > >
> >
> > I have started looking into this patch and below are some initial comments.
> >
>
> Few more comments:
> ===================
> 1. Do we need the previous column 'puballtables' after adding pubtype
> or pubkind in pg_publication?
I felt this should be retained as old client will still be using puballtables, like in case of old client executing \dRp+ commands.
> 2.
> @@ -224,6 +279,20 @@ CreatePublication(ParseState *pstate,
> CreatePublicationStmt *stmt)
> ..
> + nspcrel = table_open(NamespaceRelationId, ShareUpdateExclusiveLock);
> + PublicationAddSchemas(puboid, schemaoidlist, true, NULL);
> + table_close(nspcrel, ShareUpdateExclusiveLock);
>
> What is the reason for opening and taking a lock on
> NamespaceRelationId? Do you want to avoid dropping the corresponding
> schema during this duration? If so, that is not sufficient because
> what if somebody drops it just before you took lock on
> NamespaceRelationId. I think you need to use LockDatabaseObject to
> avoid dropping the schema and note that it should be unlocked only at
> the end of the transaction similar to what we do for tables. I guess
> you need to add this locking inside the function
> PublicationAddSchemas. Also, it would be good if you can add few
> comments in this part of the code to explain the reason for locking.
Modified.
> 3. The function PublicationAddSchemas is called from multiple places
> in the patch but the locking protection is not there at all places. I
> think if you add locking as suggested in the previous point then it
> should be okay. I think you need similar locking for
> PublicationDropSchemas.
Modified.
> 4.
> @@ -421,16 +537,84 @@ AlterPublicationTables(AlterPublicationStmt
> *stmt, Relation rel,
> PublicationAddTables(pubid, rels, true, stmt);
>
> CloseTableList(delrels);
> + if (pubform->pubtype == PUBTYPE_EMPTY)
> + UpdatePublicationTypeTupleValue(rel, tup,
> + Anum_pg_publication_pubtype,
> + PUBTYPE_TABLE);
> }
>
> At the above and all other similar places, the patch first updates the
> pg_publication after performing the rel/schema operation. Isn't it
> better to first update pg_publication to remain in sync with how
> CreatePublication works? I am not able to see any issue with the way
> you have it in the patch but it is better to keep the code consistent
> across various similar functions to avoid confusion in the future.
Modified.
Thanks for the comments, the changes for the above are available in the v19 patch posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm3BMLBpWOSdS3Q2vwpsM%3D0yovpJm8dKHRqNyFpANbrhpw%40mail.gmail.com
Regards,
Vignesh
On Fri, Aug 6, 2021 at 2:16 PM vignesh C <vignesh21@gmail.com> wrote: > > On Thu, Aug 5, 2021 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > Few more comments: > > =================== > > 1. Do we need the previous column 'puballtables' after adding pubtype > > or pubkind in pg_publication? > > I felt this should be retained as old client will still be using puballtables, like in case of old client executing \dRp+ commands. > But do we guarantee that old clients work with newer server versions? For example, psql docs say: "psql works best with servers of the same or an older major version. Backslash commands are particularly likely to fail if the server is of a newer version than psql itself." [1] (See Notes Section). Similarly, pg_dump docs say: "However, pg_dump cannot dump from PostgreSQL servers newer than its own major version; it will refuse to even try, rather than risk making an invalid dump." [2] (See Notes Section). It seems Sawada-San has the same question and IIUC docs suggest we don't need such compatibility, so what makes you think we need it? [1] - https://www.postgresql.org/docs/devel/app-psql.html [2] - https://www.postgresql.org/docs/devel/app-pgdump.html -- With Regards, Amit Kapila.
On Fri, Aug 6, 2021 at 2:02 PM vignesh C <vignesh21@gmail.com> wrote: > > On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote: > > > 6. > > + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */ > > + PublicationSchemaPsnspcidPspubidIndexId, > > + 2, > > + { > > + Anum_pg_publication_schema_psnspcid, > > + Anum_pg_publication_schema_pspubid, > > + 0, > > + 0 > > + }, > > > > Why don't we keep pubid as the first column in this index? > > I wanted to keep it similar to PUBLICATIONRELMAP, should we keep it as > it is, thoughts? > Okay, I see your point. I think for PUBLICATIONRELMAP, we need it because it is searched using the only relid in GetRelationPublications, so, similarly, in the patch, you are using schema_oid in GetSchemaPublications, so probably that will help. I was wondering why you haven't directly used the cache in GetSchemaPublications similar to GetRelationPublications? It seems there is a danger for concurrent object drop. Can you please check how the safety is ensured say when either one wants to drop the corresponding relation/schema or publication? Another point is why can't we use the other index (where the index is on relid or schema_oid (PublicationSchemaObjectIndexId))? -- With Regards, Amit Kapila.
On Fri, Aug 6, 2021 at 4:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Aug 6, 2021 at 2:16 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Thu, Aug 5, 2021 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > Few more comments: > > > =================== > > > 1. Do we need the previous column 'puballtables' after adding pubtype > > > or pubkind in pg_publication? > > > > I felt this should be retained as old client will still be using puballtables, like in case of old client executing \dRp+ commands. > > > > But do we guarantee that old clients work with newer server versions? > For example, psql docs say: "psql works best with servers of the same > or an older major version. Backslash commands are particularly likely > to fail if the server is of a newer version than psql itself." [1] > (See Notes Section). Similarly, pg_dump docs say: "However, pg_dump > cannot dump from PostgreSQL servers newer than its own major version; > it will refuse to even try, rather than risk making an invalid dump." > [2] (See Notes Section). > > It seems Sawada-San has the same question and IIUC docs suggest we > don't need such compatibility, so what makes you think we need it? Ok, I was not sure if we can remove any system table columns, hence had retained it. It seems that my understanding was wrong. I will remove this column in the next version of the patch. Regards, Vignesh
On Fri, Aug 6, 2021 at 4:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Aug 6, 2021 at 2:02 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > 6. > > > + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */ > > > + PublicationSchemaPsnspcidPspubidIndexId, > > > + 2, > > > + { > > > + Anum_pg_publication_schema_psnspcid, > > > + Anum_pg_publication_schema_pspubid, > > > + 0, > > > + 0 > > > + }, > > > > > > Why don't we keep pubid as the first column in this index? > > > > I wanted to keep it similar to PUBLICATIONRELMAP, should we keep it as > > it is, thoughts? > > > > Okay, I see your point. I think for PUBLICATIONRELMAP, we need it > because it is searched using the only relid in > GetRelationPublications, so, similarly, in the patch, you are using > schema_oid in GetSchemaPublications, so probably that will help. I was > wondering why you haven't directly used the cache in > GetSchemaPublications similar to GetRelationPublications? Both of the approaches work, I was not sure which one is better, If the approach in GetRelationPublications is better, I will change it to something similar to GetRelationPublications. Thoughts? It seems > there is a danger for concurrent object drop. Can you please check how > the safety is ensured say when either one wants to drop the > corresponding relation/schema or publication? If a table is dropped concurrently from another session during logical replication of some operation in that table, while we get get_rel_sync_entry the cache invalidations(rel_sync_cache_relation_cb) happen. The cache entry will be marked as false, also the schema_sent will be marked as false. It will resend the relation using the relation that was prepared while processing this transaction from ReorderBufferProcessTXN. I felt this is ok since the relation is dropped after the operation on the table. Similarly if the publication is dropped concurrently from another session during logical replication of some operation in that table, while we get get_rel_sync_entry the cache invalidations(publication_invalidation_cb) happen. The publications will be reloaded and validated again, the data will be replicated to the server. I felt this behavior is fine since the publication is dropped after the operation on the table. Another point is why > can't we use the other index (where the index is on relid or > schema_oid (PublicationSchemaObjectIndexId))? I felt this cannot be used because In this case the index is in the oid column of pg_publication_schema and not on psnspcid column. Regards, Vignesh
On Sun, Aug 8, 2021 at 2:52 PM vignesh C <vignesh21@gmail.com> wrote: > > On Fri, Aug 6, 2021 at 4:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Aug 6, 2021 at 2:02 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > 6. > > > > + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */ > > > > + PublicationSchemaPsnspcidPspubidIndexId, > > > > + 2, > > > > + { > > > > + Anum_pg_publication_schema_psnspcid, > > > > + Anum_pg_publication_schema_pspubid, > > > > + 0, > > > > + 0 > > > > + }, > > > > > > > > Why don't we keep pubid as the first column in this index? > > > > > > I wanted to keep it similar to PUBLICATIONRELMAP, should we keep it as > > > it is, thoughts? > > > > > > > Okay, I see your point. I think for PUBLICATIONRELMAP, we need it > > because it is searched using the only relid in > > GetRelationPublications, so, similarly, in the patch, you are using > > schema_oid in GetSchemaPublications, so probably that will help. I was > > wondering why you haven't directly used the cache in > > GetSchemaPublications similar to GetRelationPublications? > > Both of the approaches work, I was not sure which one is better, If > the approach in GetRelationPublications is better, I will change it to > something similar to GetRelationPublications. Thoughts? > I think it is better to use the cache as if we don't find the entry in the cache, then we will anyway search the required entry via sys table scan, see SearchCatCacheList. I think the point I wanted to ensure was that a concurrent session won't blow up the entry while we are looking at it. How is that ensured? -- With Regards, Amit Kapila.
On Fri, Aug 6, 2021 at 2:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Wed, Aug 4, 2021 at 12:08 AM vignesh C <vignesh21@gmail.com> wrote: > > > > On Tue, Aug 3, 2021 at 12:00 PM tanghy.fnst@fujitsu.com > > <tanghy.fnst@fujitsu.com> wrote: > > > > > > On Monday, August 2, 2021 11:40 PM vignesh C <vignesh21@gmail.com>wrote: > > > > > > > > Thanks for the comments, attached v17 patches has the fixes for the same. > > > > > > Thanks for your new patch. > > > > > > I saw the following warning when compiling. It seems we don't need this variable any more. > > > > > > publicationcmds.c: In function ‘AlterPublicationSchemas’: > > > publicationcmds.c:592:15: warning: unused variable ‘oldlc’ [-Wunused-variable] > > > ListCell *oldlc; > > > ^~~~~ > > > > Thanks for reporting this, this is fixed in the v18 patch attached. > > I've also started reviewing this patch. I've not looked at the patch > yet but here are initial comments/questions based on using this > feature: > > pg_publication catalog still has puballtables column but it's still > necessary? IIUC since pubtype = 'a' means publishing all tables in the > database puballtables seems no longer necessary. I will remove puballtables in my next version of the patch. > --- > Suppose that a parent table and its child table are defined in > different schemas, there is a publication for the schema where only > the parent table is defined, and the subscriber subscribes to the > publication, should changes for its child table be replicated to the > subscriber? I felt that in this case only the table data that is present in the publish schema should be sent to the subscriber. Since the child table schema is not part of the publication, I felt this child table data should not be replicated. Thoughts? I have kept the above same behavior in the case of publication created using PUBLISH_VIA_PARTITION_ROOT option i.e the child table data will not be sent. But now I'm feeling we should send the child table data since it is being sent through the parent table which is part of the publication. Also this way users can use this option if the user has the table having partitions designed across the schemas. Thoughts? > In FOR TABLE cases, i.g., where the subscriber subscribes to the > publication that is only for the parent table, changes for its child > table are replicated to the subscriber. > > As far as I tested v18 patch, changes for the child table are not > replicated in FOR SCHEMA cases. Here is the test script: > > On publisher and subscriber: > create schema p_schema; > create schema c_schema; > create table p_schema.p (a int) partition by list (a); > create table c_schema.c partition of p_schema.p for values in (1); > > On publisher: > create publication pub_p_schema for schema p_schema; > > On subscriber: > create subscription pub connection 'dbname=postgres' publication pub_p_schema; > > On publisher: > insert into p_schema.p values (1); > select * from p_schema.p; > a > --- > 1 > (1 row) > > On subscriber: > select * from p_schema.p; > a > --- > > (0 rows) I have kept this behavior intentionally, details explained above. Thoughts? Regards, Vignesh
> On Aug 6, 2021, at 1:32 AM, vignesh C <vignesh21@gmail.com> wrote: > > the attached v19 patch With v19 applied, a schema owner can publish the contents of a table regardless of ownership or permissions on that table: +CREATE ROLE user1; +GRANT CREATE ON DATABASE regression TO user1; +CREATE ROLE user2; +GRANT CREATE ON DATABASE regression TO user2; +SET SESSION AUTHORIZATION user1; +CREATE SCHEMA user1schema; +GRANT CREATE, USAGE ON SCHEMA user1schema TO user2; +RESET SESSION AUTHORIZATION; +SET SESSION AUTHORIZATION user2; +CREATE TABLE user1schema.user2private (junk text); +REVOKE ALL PRIVILEGES ON user1schema.user2private FROM PUBLIC; +REVOKE ALL PRIVILEGES ON user1schema.user2private FROM user1; +CREATE TABLE user1schema.user2public (junk text); +GRANT SELECT ON user1schema.user2public TO PUBLIC; +RESET SESSION AUTHORIZATION; +SET SESSION AUTHORIZATION user1; +SELECT * FROM user1schema.user2private; +ERROR: permission denied for table user2private +SELECT * FROM user1schema.user2public; + junk +------ +(0 rows) + +CREATE PUBLICATION user1pub; +WARNING: wal_level is insufficient to publish logical changes +HINT: Set wal_level to logical before creating subscriptions. +ALTER PUBLICATION user1pub + ADD TABLE user1schema.user2public; +ERROR: must be owner of table user2public +ALTER PUBLICATION user1pub + ADD TABLE user1schema.user2private, user1schema.user2public; +ERROR: must be owner of table user2private +SELECT * FROM pg_catalog.pg_publication_tables + WHERE pubname = 'user1pub'; + pubname | schemaname | tablename +---------+------------+----------- +(0 rows) + +ALTER PUBLICATION user1pub ADD SCHEMA user1schema; +SELECT * FROM pg_catalog.pg_publication_tables + WHERE pubname = 'user1pub'; + pubname | schemaname | tablename +----------+-------------+-------------- + user1pub | user1schema | user2private + user1pub | user1schema | user2public +(2 rows) It is a bit counterintuitive that schema owners do not have administrative privileges over tables within their schemas, butthat's how it is. The design of this patch seems to assume otherwise. Perhaps ALTER PUBLICATION ... ADD SCHEMA shouldbe restricted to superusers, just as FOR ALL TABLES? Alternatively, you could add ownership checks per table to mirror the behavior of ALTER PUBLICATION ... ADD TABLE, but thatwould foreclose the option of automatically updating the list of tables in the publication as new tables are added tothe schema, since those new tables would not necessarily belong to the schema owner, and having a error thrown during CREATETABLE would be quite unfriendly. I think until this is hammered out, it is safer to require superuser privileges andthen we can revisit this issue and loosen the requirement in a subsequent commit. What do you think? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Aug 9, 2021 at 9:50 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > > On Aug 6, 2021, at 1:32 AM, vignesh C <vignesh21@gmail.com> wrote: > > > > the attached v19 patch > > With v19 applied, a schema owner can publish the contents of a table regardless of ownership or permissions on that table: > ... ... > > It is a bit counterintuitive that schema owners do not have administrative privileges over tables within their schemas,but that's how it is. The design of this patch seems to assume otherwise. Perhaps ALTER PUBLICATION ... ADD SCHEMAshould be restricted to superusers, just as FOR ALL TABLES? > +1. Your suggestion sounds reasonable to me. > Alternatively, you could add ownership checks per table to mirror the behavior of ALTER PUBLICATION ... ADD TABLE, butthat would foreclose the option of automatically updating the list of tables in the publication as new tables are addedto the schema, since those new tables would not necessarily belong to the schema owner, and having a error thrown duringCREATE TABLE would be quite unfriendly. I think until this is hammered out, it is safer to require superuser privilegesand then we can revisit this issue and loosen the requirement in a subsequent commit. > I think the same argument can be made for "FOR ALL TABLES .." as well. So, let's leave such a requirement for another patch. -- With Regards, Amit Kapila.
On Fri, Aug 6, 2021 at 6:32 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for the comments, the attached v19 patch has the fixes for the comments. > Some more review comments, this time for the v19 patch: (1) In patch v19-0002, there's still a couple of instances where it says "publication type" instead of "publication kind". (2) src/backend/catalog/pg_publication.c "This should only be used for normal publications." What exactly is meant by that - what type is considered normal? Maybe that comment should be more specific. (3) src/backend/catalog/pg_publication.c GetSchemaPublications Function header says "Gets list of publication oids for publications marked as FOR SCHEMA." Shouldn't it say something like: "Gets the list of FOR SCHEMA publication oids associated with a specified schema oid." or something like that? (since the function accepts a schemaid parameter) (4) src/backend/commands/publicationcmds.c In AlterPublicationSchemas(), I notice that the two error cases "cannot be added to or dropped ..." don't check stmt->action for DEFELEM_ADD/DEFELEM_DROP. Is that OK? (i.e. should these cases error out if stmt->action is not DEFELEM_ADD/DEFELEM_DROP?) Also, I assume that the else part (not DEFELEM_ADD/DEFELEM_DROP) is the "Set" case? Maybe a comment should be added to the top of the else part. (5) src/backend/commands/publicationcmds.c Typo (same in 2 places): "automaically" -> "automatically" + * will be released automaically at the end of create publication See functions: (i) CreatePublication (ii) AlterPublicationSchemas (6) src/backend/commands/publicationcmds.c LockSchemaList Function header says "are locked in ShareUpdateExclusiveLock mode" but then code calls LockDatabaseObject using "AccessShareLock". Regards, Greg Nancarrow Fujitsu Australia
On Mon, Aug 9, 2021 at 11:31 AM vignesh C <vignesh21@gmail.com> wrote: > > On Fri, Aug 6, 2021 at 2:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > --- > > Suppose that a parent table and its child table are defined in > > different schemas, there is a publication for the schema where only > > the parent table is defined, and the subscriber subscribes to the > > publication, should changes for its child table be replicated to the > > subscriber? > > I felt that in this case only the table data that is present in the > publish schema should be sent to the subscriber. Since the child table > schema is not part of the publication, I felt this child table data > should not be replicated. > But, as point out by Sawada-San, the same is true for FOR TABLE case. I think we should be consistent here and should publish the data for the child table if the parent table's schema is published. > I have kept the above same behavior in the case of publication created > using PUBLISH_VIA_PARTITION_ROOT option i.e the child table data will > not be sent. But now I'm feeling we should send the child table data > since it is being sent through the parent table which is part of the > publication. Also this way users can use this option if the user has > the table having partitions designed across the schemas. > This sounds fine to me. -- With Regards, Amit Kapila.
On Mon, Aug 9, 2021 at 9:50 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > > > > On Aug 6, 2021, at 1:32 AM, vignesh C <vignesh21@gmail.com> wrote: > > > > the attached v19 patch > > With v19 applied, a schema owner can publish the contents of a table regardless of ownership or permissions on that table: > > +CREATE ROLE user1; > +GRANT CREATE ON DATABASE regression TO user1; > +CREATE ROLE user2; > +GRANT CREATE ON DATABASE regression TO user2; > +SET SESSION AUTHORIZATION user1; > +CREATE SCHEMA user1schema; > +GRANT CREATE, USAGE ON SCHEMA user1schema TO user2; > +RESET SESSION AUTHORIZATION; > +SET SESSION AUTHORIZATION user2; > +CREATE TABLE user1schema.user2private (junk text); > +REVOKE ALL PRIVILEGES ON user1schema.user2private FROM PUBLIC; > +REVOKE ALL PRIVILEGES ON user1schema.user2private FROM user1; > +CREATE TABLE user1schema.user2public (junk text); > +GRANT SELECT ON user1schema.user2public TO PUBLIC; > +RESET SESSION AUTHORIZATION; > +SET SESSION AUTHORIZATION user1; > +SELECT * FROM user1schema.user2private; > +ERROR: permission denied for table user2private > +SELECT * FROM user1schema.user2public; > + junk > +------ > +(0 rows) > + > +CREATE PUBLICATION user1pub; > +WARNING: wal_level is insufficient to publish logical changes > +HINT: Set wal_level to logical before creating subscriptions. > +ALTER PUBLICATION user1pub > + ADD TABLE user1schema.user2public; > +ERROR: must be owner of table user2public > +ALTER PUBLICATION user1pub > + ADD TABLE user1schema.user2private, user1schema.user2public; > +ERROR: must be owner of table user2private > +SELECT * FROM pg_catalog.pg_publication_tables > + WHERE pubname = 'user1pub'; > + pubname | schemaname | tablename > +---------+------------+----------- > +(0 rows) > + > +ALTER PUBLICATION user1pub ADD SCHEMA user1schema; > +SELECT * FROM pg_catalog.pg_publication_tables > + WHERE pubname = 'user1pub'; > + pubname | schemaname | tablename > +----------+-------------+-------------- > + user1pub | user1schema | user2private > + user1pub | user1schema | user2public > +(2 rows) > > It is a bit counterintuitive that schema owners do not have administrative privileges over tables within their schemas,but that's how it is. The design of this patch seems to assume otherwise. Perhaps ALTER PUBLICATION ... ADD SCHEMAshould be restricted to superusers, just as FOR ALL TABLES? I will handle this in the next version of the patch. Additionally I will add this check for "Alter publication add schema" and "Alter publication set schema". I'm not planning to add this check for "Alter publication drop schema" to keep the behavior similar to "Alter publication drop table". Also, the behavior of "Alter publication drop table" for which the user is not the owner is successful, Is this behavior correct? create table tbl1(c1 int); create table tbl2(c1 int); create publication mypub1 for table tbl1,tbl2; SET SESSION AUTHORIZATION user2; alter table tbl2 owner to user2; RESET SESSION AUTHORIZATION; postgres=> alter publication mypub1 drop table tbl2; ALTER PUBLICATION postgres=> alter publication mypub1 add table tbl2; ERROR: must be owner of table tbl2 Thoughts? Regards, Vignesh
> On Aug 10, 2021, at 10:59 PM, vignesh C <vignesh21@gmail.com> wrote: > > Also, the behavior of "Alter publication drop table" for which the > user is not the owner is successful, Is this behavior correct? I think that dropping a table from a publication should be allowed for the publication owner, without regard to the ownerof the table. Adding a table to a publication is different, as it exposes the contents of the table. Consider the following: +create user user1; +create user user2; +create table tbl1(c1 int); +create table tbl2(c1 int); +create publication pub1 for table tbl1,tbl2; +WARNING: wal_level is insufficient to publish logical changes +HINT: Set wal_level to logical before creating subscriptions. +alter table tbl1 owner to user1; +alter publication pub1 owner to user1; +alter table tbl2 owner to user2; +SET SESSION AUTHORIZATION user2; +alter publication pub1 drop table tbl1; +ERROR: must be owner of publication pub1 +alter publication pub1 drop table tbl2; +ERROR: must be owner of publication pub1 +alter publication pub1 add table tbl1; +ERROR: must be owner of publication pub1 +alter publication pub1 add table tbl2; +ERROR: must be owner of publication pub1 +RESET SESSION AUTHORIZATION; +SET SESSION AUTHORIZATION user1; +alter publication pub1 drop table tbl1; +alter publication pub1 drop table tbl2; +alter publication pub1 add table tbl1; +alter publication pub1 add table tbl2; +ERROR: must be owner of table tbl2 Clearly user2 cannot modify pub1, not even with respect to user2's own table. user1 can modify its own publication exceptfor adding someone else's table. This seems correct to me. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Aug 11, 2021 at 7:45 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > > On Aug 10, 2021, at 10:59 PM, vignesh C <vignesh21@gmail.com> wrote: > > > > Also, the behavior of "Alter publication drop table" for which the > > user is not the owner is successful, Is this behavior correct? > > I think that dropping a table from a publication should be allowed for the publication owner, without regard to the ownerof the table. > Adding a table to a publication is different, as it exposes the contents of the table. > .. .. > > Clearly user2 cannot modify pub1, not even with respect to user2's own table. user1 can modify its own publication exceptfor adding someone else's table. This seems correct to me. > That looks reasonable to me as well and I think we should follow the same for the schema. -- With Regards, Amit Kapila.
On Mon, Aug 9, 2021 at 1:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sun, Aug 8, 2021 at 2:52 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Fri, Aug 6, 2021 at 4:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Fri, Aug 6, 2021 at 2:02 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > 6. > > > > > + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */ > > > > > + PublicationSchemaPsnspcidPspubidIndexId, > > > > > + 2, > > > > > + { > > > > > + Anum_pg_publication_schema_psnspcid, > > > > > + Anum_pg_publication_schema_pspubid, > > > > > + 0, > > > > > + 0 > > > > > + }, > > > > > > > > > > Why don't we keep pubid as the first column in this index? > > > > > > > > I wanted to keep it similar to PUBLICATIONRELMAP, should we keep it as > > > > it is, thoughts? > > > > > > > > > > Okay, I see your point. I think for PUBLICATIONRELMAP, we need it > > > because it is searched using the only relid in > > > GetRelationPublications, so, similarly, in the patch, you are using > > > schema_oid in GetSchemaPublications, so probably that will help. I was > > > wondering why you haven't directly used the cache in > > > GetSchemaPublications similar to GetRelationPublications? > > > > Both of the approaches work, I was not sure which one is better, If > > the approach in GetRelationPublications is better, I will change it to > > something similar to GetRelationPublications. Thoughts? > > > > I think it is better to use the cache as if we don't find the entry in > the cache, then we will anyway search the required entry via sys table > scan, see SearchCatCacheList. +1 I had the same comment while reading the v19 patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
On Fri, Aug 6, 2021 at 5:33 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for the comments, the attached v19 patch has the fixes for the comments. Thank you for updating the patch! Here are some comments on v19 patch: + case OCLASS_PUBLICATION_SCHEMA: + RemovePublicationSchemaById(object->objectId); + break; +void +RemovePublicationSchemaById(Oid psoid) +{ + Relation rel; + HeapTuple tup; + + rel = table_open(PublicationSchemaRelationId, RowExclusiveLock); + + tup = SearchSysCache1(PUBLICATIONSCHEMA, ObjectIdGetDatum(psoid)); + + if (!HeapTupleIsValid(tup)) + elog(ERROR, "cache lookup failed for publication schema %u", + psoid); + + CatalogTupleDelete(rel, &tup->t_self); + + ReleaseSysCache(tup); + + table_close(rel, RowExclusiveLock); +} Since RemovePublicationSchemaById() does simple catalog tuple deletion, it seems to me that we can DropObjectById() to delete the row of pg_publication_schema. --- { - ScanKeyInit(&key[0], + ScanKeyData skey[1]; + + ScanKeyInit(&skey[0], Anum_pg_class_relkind, BTEqualStrategyNumber, F_CHAREQ, CharGetDatum(RELKIND_PARTITIONED_TABLE)); - scan = table_beginscan_catalog(classRel, 1, key); + scan = table_beginscan_catalog(classRel, 1, skey); Since we specify 1 as the number of keys in table_beginscan_catalog(), can we reuse 'key' instead of using 'skey'? --- Even if we drop all tables added to the publication from it, 'pubkind' doesn't go back to 'empty'. Is that intentional behavior? If we do that, we can save the lookup of pg_publication_rel and pg_publication_schema in some cases, and we can switch the publication that was created as FOR SCHEMA to FOR TABLE and vice versa. --- +static void +UpdatePublicationKindTupleValue(Relation rel, HeapTuple tup, int col, + char pubkind) Since all callers of this function specify Anum_pg_publication_pubkind to 'col', it seems 'col' is not necessary. --- +static void +AlterPublicationSchemas(AlterPublicationStmt *stmt, Relation rel, + HeapTuple tup, Form_pg_publication pubform) I think we don't need to pass 'pubform' to this function since we can get it by GETSTRUCT(tup). --- + Oid schemaId = get_rel_namespace(relid); List *pubids = GetRelationPublications(relid); + List *schemaPubids = GetSchemaPublications(schemaId); Can we defer to get the list of schema publications (i.g., 'schemaPubids') until we find the PUBKIND_SCHEMA publication? Perhaps the same is true for building 'pubids'. --- + List of publications + Name | Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root | PubKind +--------------------+--------------------------+------------+---------+---------+---------+-----------+----------+--------- + testpib_ins_trunct | regress_publication_user | f | t | f | f | f | f | e + testpub_default | regress_publication_user | f | f | t | f | f | f | e I think it's more readable if \dRp shows 'all tables', 'table', 'schema', and 'empty' in PubKind instead of the single character. I think 'Pub kind' is more consistent with other column names. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
On Thu, Aug 12, 2021 at 5:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Fri, Aug 6, 2021 at 5:33 PM vignesh C <vignesh21@gmail.com> wrote: > > > > --- > Even if we drop all tables added to the publication from it, 'pubkind' > doesn't go back to 'empty'. Is that intentional behavior? If we do > that, we can save the lookup of pg_publication_rel and > pg_publication_schema in some cases, and we can switch the publication > that was created as FOR SCHEMA to FOR TABLE and vice versa. > Do we really want to allow users to change a publication that is FOR SCHEMA to FOR TABLE? I see that we don't allow to do that FOR TABLES. postgres=# Alter Publication pub add table tbl1; ERROR: publication "pub" is defined as FOR ALL TABLES DETAIL: Tables cannot be added to or dropped from FOR ALL TABLES publications. -- With Regards, Amit Kapila.
On 13.08.21 04:59, Amit Kapila wrote: >> Even if we drop all tables added to the publication from it, 'pubkind' >> doesn't go back to 'empty'. Is that intentional behavior? If we do >> that, we can save the lookup of pg_publication_rel and >> pg_publication_schema in some cases, and we can switch the publication >> that was created as FOR SCHEMA to FOR TABLE and vice versa. >> > Do we really want to allow users to change a publication that is FOR > SCHEMA to FOR TABLE? I see that we don't allow to do that FOR TABLES. > postgres=# Alter Publication pub add table tbl1; > ERROR: publication "pub" is defined as FOR ALL TABLES > DETAIL: Tables cannot be added to or dropped from FOR ALL TABLES publications. I think the strict separation between publication-for-tables vs. publication-for-schemas is a mistake. Why can't I have a publication that publishes tables t1, t2, t3, *and* schemas s1, s2, s3. Also note that we have a pending patch to add sequences support to logical replication. So eventually, a publication will be able to contain a bunch of different objects of different kinds.
On Sat, Aug 14, 2021 at 3:02 PM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > On 13.08.21 04:59, Amit Kapila wrote: > >> Even if we drop all tables added to the publication from it, 'pubkind' > >> doesn't go back to 'empty'. Is that intentional behavior? If we do > >> that, we can save the lookup of pg_publication_rel and > >> pg_publication_schema in some cases, and we can switch the publication > >> that was created as FOR SCHEMA to FOR TABLE and vice versa. > >> > > Do we really want to allow users to change a publication that is FOR > > SCHEMA to FOR TABLE? I see that we don't allow to do that FOR TABLES. > > postgres=# Alter Publication pub add table tbl1; > > ERROR: publication "pub" is defined as FOR ALL TABLES > > DETAIL: Tables cannot be added to or dropped from FOR ALL TABLES publications. > > I think the strict separation between publication-for-tables vs. > publication-for-schemas is a mistake. Why can't I have a publication > that publishes tables t1, t2, t3, *and* schemas s1, s2, s3. Also note > that we have a pending patch to add sequences support to logical > replication. So eventually, a publication will be able to contain a > bunch of different objects of different kinds. > Valid point. -- With Regards, Amit Kapila.
On Fri, Aug 6, 2021 at 2:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Wed, Aug 4, 2021 at 12:08 AM vignesh C <vignesh21@gmail.com> wrote: > > > > On Tue, Aug 3, 2021 at 12:00 PM tanghy.fnst@fujitsu.com > > <tanghy.fnst@fujitsu.com> wrote: > > > > > > On Monday, August 2, 2021 11:40 PM vignesh C <vignesh21@gmail.com>wrote: > > > > > > > > Thanks for the comments, attached v17 patches has the fixes for the same. > > > > > > Thanks for your new patch. > > > > > > I saw the following warning when compiling. It seems we don't need this variable any more. > > > > > > publicationcmds.c: In function ‘AlterPublicationSchemas’: > > > publicationcmds.c:592:15: warning: unused variable ‘oldlc’ [-Wunused-variable] > > > ListCell *oldlc; > > > ^~~~~ > > > > Thanks for reporting this, this is fixed in the v18 patch attached. > > I've also started reviewing this patch. I've not looked at the patch > yet but here are initial comments/questions based on using this > feature: > > pg_publication catalog still has puballtables column but it's still > necessary? IIUC since pubtype = 'a' means publishing all tables in the > database puballtables seems no longer necessary. Removed puballtables. > --- > Suppose that a parent table and its child table are defined in > different schemas, there is a publication for the schema where only > the parent table is defined, and the subscriber subscribes to the > publication, should changes for its child table be replicated to the > subscriber? > > In FOR TABLE cases, i.g., where the subscriber subscribes to the > publication that is only for the parent table, changes for its child > table are replicated to the subscriber. Modified it to keep the behavior similar to FOR Table publication. > As far as I tested v18 patch, changes for the child table are not > replicated in FOR SCHEMA cases. Here is the test script: > > On publisher and subscriber: > create schema p_schema; > create schema c_schema; > create table p_schema.p (a int) partition by list (a); > create table c_schema.c partition of p_schema.p for values in (1); > > On publisher: > create publication pub_p_schema for schema p_schema; > > On subscriber: > create subscription pub connection 'dbname=postgres' publication pub_p_schema; > > On publisher: > insert into p_schema.p values (1); > select * from p_schema.p; > a > --- > 1 > (1 row) > > On subscriber: > select * from p_schema.p; > a > --- > > (0 rows) Modified to handle this. Thanks for the comments, the attached v20 patch handles the above issues. Regards, Vignesh
Attachment
>
>
>
> > On Aug 6, 2021, at 1:32 AM, vignesh C <vignesh21@gmail.com> wrote:
> >
> > the attached v19 patch
>
> With v19 applied, a schema owner can publish the contents of a table regardless of ownership or permissions on that table:
>
> +CREATE ROLE user1;
> +GRANT CREATE ON DATABASE regression TO user1;
> +CREATE ROLE user2;
> +GRANT CREATE ON DATABASE regression TO user2;
> +SET SESSION AUTHORIZATION user1;
> +CREATE SCHEMA user1schema;
> +GRANT CREATE, USAGE ON SCHEMA user1schema TO user2;
> +RESET SESSION AUTHORIZATION;
> +SET SESSION AUTHORIZATION user2;
> +CREATE TABLE user1schema.user2private (junk text);
> +REVOKE ALL PRIVILEGES ON user1schema.user2private FROM PUBLIC;
> +REVOKE ALL PRIVILEGES ON user1schema.user2private FROM user1;
> +CREATE TABLE user1schema.user2public (junk text);
> +GRANT SELECT ON user1schema.user2public TO PUBLIC;
> +RESET SESSION AUTHORIZATION;
> +SET SESSION AUTHORIZATION user1;
> +SELECT * FROM user1schema.user2private;
> +ERROR: permission denied for table user2private
> +SELECT * FROM user1schema.user2public;
> + junk
> +------
> +(0 rows)
> +
> +CREATE PUBLICATION user1pub;
> +WARNING: wal_level is insufficient to publish logical changes
> +HINT: Set wal_level to logical before creating subscriptions.
> +ALTER PUBLICATION user1pub
> + ADD TABLE user1schema.user2public;
> +ERROR: must be owner of table user2public
> +ALTER PUBLICATION user1pub
> + ADD TABLE user1schema.user2private, user1schema.user2public;
> +ERROR: must be owner of table user2private
> +SELECT * FROM pg_catalog.pg_publication_tables
> + WHERE pubname = 'user1pub';
> + pubname | schemaname | tablename
> +---------+------------+-----------
> +(0 rows)
> +
> +ALTER PUBLICATION user1pub ADD SCHEMA user1schema;
> +SELECT * FROM pg_catalog.pg_publication_tables
> + WHERE pubname = 'user1pub';
> + pubname | schemaname | tablename
> +----------+-------------+--------------
> + user1pub | user1schema | user2private
> + user1pub | user1schema | user2public
> +(2 rows)
>
> It is a bit counterintuitive that schema owners do not have administrative privileges over tables within their schemas, but that's how it is. The design of this patch seems to assume otherwise. Perhaps ALTER PUBLICATION ... ADD SCHEMA should be restricted to superusers, just as FOR ALL TABLES?
Thanks for the comment, this is handled in the v20 patch attached at [1].
> Alternatively, you could add ownership checks per table to mirror the behavior of ALTER PUBLICATION ... ADD TABLE, but that would foreclose the option of automatically updating the list of tables in the publication as new tables are added to the schema, since those new tables would not necessarily belong to the schema owner, and having a error thrown during CREATE TABLE would be quite unfriendly. I think until this is hammered out, it is safer to require superuser privileges and then we can revisit this issue and loosen the requirement in a subsequent commit.
I agree with Amit on this point for handling this as a separate patch, I have not made any change for this, kept the behavior as is.
[1] - https://www.postgresql.org/message-id/CALDaNm00X9SQBokUTy1OxN1Sa2DFsK8rg8j_wLgc-7ZuKcuh0Q%40mail.gmail.com
>
> On Fri, Aug 6, 2021 at 6:32 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comments, the attached v19 patch has the fixes for the comments.
> >
>
> Some more review comments, this time for the v19 patch:
>
>
> (1) In patch v19-0002, there's still a couple of instances where it
> says "publication type" instead of "publication kind".
Modified
> (2) src/backend/catalog/pg_publication.c
>
> "This should only be used for normal publications."
>
> What exactly is meant by that - what type is considered normal? Maybe
> that comment should be more specific.
Modified
> (3) src/backend/catalog/pg_publication.c
> GetSchemaPublications
>
> Function header says "Gets list of publication oids for publications
> marked as FOR SCHEMA."
>
> Shouldn't it say something like: "Gets the list of FOR SCHEMA
> publication oids associated with a specified schema oid." or something
> like that?
> (since the function accepts a schemaid parameter)
Modfified
> (4) src/backend/commands/publicationcmds.c
>
> In AlterPublicationSchemas(), I notice that the two error cases
> "cannot be added to or dropped ..." don't check stmt->action for
> DEFELEM_ADD/DEFELEM_DROP.
> Is that OK? (i.e. should these cases error out if stmt->action is not
> DEFELEM_ADD/DEFELEM_DROP?)
> Also, I assume that the else part (not DEFELEM_ADD/DEFELEM_DROP) is
> the "Set" case? Maybe a comment should be added to the top of the else
> part.
The error message should also include set, I have modified the error message accordingly.
> (5) src/backend/commands/publicationcmds.c
> Typo (same in 2 places): "automaically" -> "automatically"
>
> + * will be released automaically at the end of create publication
>
> See functions:
> (i) CreatePublication
> (ii) AlterPublicationSchemas
Modified.
> (6) src/backend/commands/publicationcmds.c
> LockSchemaList
>
> Function header says "are locked in ShareUpdateExclusiveLock mode" but
> then code calls LockDatabaseObject using "AccessShareLock".
Modified.
Thanks for the comments, these issues are fixed in the v20 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm00X9SQBokUTy1OxN1Sa2DFsK8rg8j_wLgc-7ZuKcuh0Q%40mail.gmail.com
>
> On Fri, Aug 6, 2021 at 5:33 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comments, the attached v19 patch has the fixes for the comments.
>
> Thank you for updating the patch!
>
> Here are some comments on v19 patch:
>
> + case OCLASS_PUBLICATION_SCHEMA:
> + RemovePublicationSchemaById(object->objectId);
> + break;
>
> +void
> +RemovePublicationSchemaById(Oid psoid)
> +{
> + Relation rel;
> + HeapTuple tup;
> +
> + rel = table_open(PublicationSchemaRelationId, RowExclusiveLock);
> +
> + tup = SearchSysCache1(PUBLICATIONSCHEMA, ObjectIdGetDatum(psoid));
> +
> + if (!HeapTupleIsValid(tup))
> + elog(ERROR, "cache lookup failed for publication schema %u",
> + psoid);
> +
> + CatalogTupleDelete(rel, &tup->t_self);
> +
> + ReleaseSysCache(tup);
> +
> + table_close(rel, RowExclusiveLock);
> +}
>
> Since RemovePublicationSchemaById() does simple catalog tuple
> deletion, it seems to me that we can DropObjectById() to delete the
> row of pg_publication_schema.
Relation cache invalidations were missing in the function, I have added and retained the function with invalidation changes.
> ---
> {
> - ScanKeyInit(&key[0],
> + ScanKeyData skey[1];
> +
> + ScanKeyInit(&skey[0],
> Anum_pg_class_relkind,
> BTEqualStrategyNumber, F_CHAREQ,
>
> CharGetDatum(RELKIND_PARTITIONED_TABLE));
>
> - scan = table_beginscan_catalog(classRel, 1, key);
> + scan = table_beginscan_catalog(classRel, 1, skey);
>
> Since we specify 1 as the number of keys in table_beginscan_catalog(),
> can we reuse 'key' instead of using 'skey'?
Modified.
> ---
> Even if we drop all tables added to the publication from it, 'pubkind'
> doesn't go back to 'empty'. Is that intentional behavior? If we do
> that, we can save the lookup of pg_publication_rel and
> pg_publication_schema in some cases, and we can switch the publication
> that was created as FOR SCHEMA to FOR TABLE and vice versa.
I felt this can be handled as a separate patch as the same scenario applies for all tables publication too. Thoughts?
> ---
> +static void
> +UpdatePublicationKindTupleValue(Relation rel, HeapTuple tup, int col,
> + char pubkind)
>
> Since all callers of this function specify Anum_pg_publication_pubkind
> to 'col', it seems 'col' is not necessary.
Modified
> ---
> +static void
> +AlterPublicationSchemas(AlterPublicationStmt *stmt, Relation rel,
> + HeapTuple tup,
> Form_pg_publication pubform)
>
> I think we don't need to pass 'pubform' to this function since we can
> get it by GETSTRUCT(tup).
Modified.
> ---
> + Oid schemaId = get_rel_namespace(relid);
> List *pubids = GetRelationPublications(relid);
> + List *schemaPubids = GetSchemaPublications(schemaId);
>
> Can we defer to get the list of schema publications (i.g.,
> 'schemaPubids') until we find the PUBKIND_SCHEMA publication? Perhaps
> the same is true for building 'pubids'.
I felt that we can get the publication information and use it whenever required instead of querying in the loop. Thoughts?
> ---
> + List of publications
> + Name | Owner | All tables | Inserts
> | Updates | Deletes | Truncates | Via root | PubKind
> +--------------------+--------------------------+------------+---------+---------+---------+-----------+----------+---------
> + testpib_ins_trunct | regress_publication_user | f | t
> | f | f | f | f | e
> + testpub_default | regress_publication_user | f | f
> | t | f | f | f | e
>
> I think it's more readable if \dRp shows 'all tables', 'table',
> 'schema', and 'empty' in PubKind instead of the single character.
Modified
> I think 'Pub kind' is more consistent with other column names.
Modified
Thanks for the comments, these issues are fixed in the v20 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm00X9SQBokUTy1OxN1Sa2DFsK8rg8j_wLgc-7ZuKcuh0Q%40mail.gmail.com
Vignesh
>
> On Sun, Aug 8, 2021 at 2:52 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Fri, Aug 6, 2021 at 4:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Aug 6, 2021 at 2:02 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > > 6.
> > > > > + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */
> > > > > + PublicationSchemaPsnspcidPspubidIndexId,
> > > > > + 2,
> > > > > + {
> > > > > + Anum_pg_publication_schema_psnspcid,
> > > > > + Anum_pg_publication_schema_pspubid,
> > > > > + 0,
> > > > > + 0
> > > > > + },
> > > > >
> > > > > Why don't we keep pubid as the first column in this index?
> > > >
> > > > I wanted to keep it similar to PUBLICATIONRELMAP, should we keep it as
> > > > it is, thoughts?
> > > >
> > >
> > > Okay, I see your point. I think for PUBLICATIONRELMAP, we need it
> > > because it is searched using the only relid in
> > > GetRelationPublications, so, similarly, in the patch, you are using
> > > schema_oid in GetSchemaPublications, so probably that will help. I was
> > > wondering why you haven't directly used the cache in
> > > GetSchemaPublications similar to GetRelationPublications?
> >
> > Both of the approaches work, I was not sure which one is better, If
> > the approach in GetRelationPublications is better, I will change it to
> > something similar to GetRelationPublications. Thoughts?
> >
>
> I think it is better to use the cache as if we don't find the entry in
> the cache, then we will anyway search the required entry via sys table
> scan, see SearchCatCacheList.
Modified. This is handled in the v20 patch posted at [1].
I think the point I wanted to ensure was
> that a concurrent session won't blow up the entry while we are looking
> at it. How is that ensured?
The concurrency points occur at two places, Walsender session and user session:
For Walsender process when we query the data from the cache we will get the results based on historic snapshot. I also debugged and verified that we get the results based on historic snapshot, if we check the cache during our operation (insert is just before drop) we will be able to get the dropped record from the cache as this drop occurred after our insert. And if we query the cache after the drop, we will not get the dropped information from the cache. So I feel our existing code is good enough which handles the concurrency through the historic snapshot.
For user sessions, user session checks for replica identity for update/delete operations. To prevent concurrency issues, when schema is added to the publication, the rel cache invalidation happens in publication_add_schema by calling InvalidatePublicationRels, similarly when a schema is dropped from the publication, the rel cache invalidation is handled in RemovePublicationSchemaById by calling InvalidatePublicationRels. Once the invalidation happens it will check the system tables again before deciding. I felt this rel cache invalidation will prevent concurrency issues.
[1] - https://www.postgresql.org/message-id/CALDaNm00X9SQBokUTy1OxN1Sa2DFsK8rg8j_wLgc-7ZuKcuh0Q%40mail.gmail.com
Regards,
Vignesh
>
> On Fri, Aug 6, 2021 at 2:16 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Thu, Aug 5, 2021 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Few more comments:
> > > ===================
> > > 1. Do we need the previous column 'puballtables' after adding pubtype
> > > or pubkind in pg_publication?
Removed puballtables, this is handled as part of v20 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm00X9SQBokUTy1OxN1Sa2DFsK8rg8j_wLgc-7ZuKcuh0Q%40mail.gmail.com
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: > I think the strict separation between publication-for-tables vs. > publication-for-schemas is a mistake. Why can't I have a publication > that publishes tables t1, t2, t3, *and* schemas s1, s2, s3. Also note > that we have a pending patch to add sequences support to logical > replication. So eventually, a publication will be able to contain a > bunch of different objects of different kinds. This seems like it's going to create a mess, because the meaning of "include schema S" will change over time as we add more features. That is, with the present patch (I suppose, haven't read it) we have "schema S" meaning "publish all tables in schema S". When that other patch lands, presumably that same publication definition would start meaning "publish all tables and sequences in schema S". And a few years down the road it might start meaning something else again. That sounds like the sort of potentially app-breaking change that we don't like to make. We could avoid that bug-in-waiting if the syntax were more like "FOR ALL TABLES IN SCHEMA s", which could later extend to "FOR ALL SEQUENCES IN SCHEMA s", etc. This is then a very clean intermediate step between publishing one table and "FOR ALL TABLES" without a schema limitation. I tend to agree that a single publication should be able to incorporate any of these options. regards, tom lane
On Fri, Aug 13, 2021 at 11:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Aug 12, 2021 at 5:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Fri, Aug 6, 2021 at 5:33 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > --- > > Even if we drop all tables added to the publication from it, 'pubkind' > > doesn't go back to 'empty'. Is that intentional behavior? If we do > > that, we can save the lookup of pg_publication_rel and > > pg_publication_schema in some cases, and we can switch the publication > > that was created as FOR SCHEMA to FOR TABLE and vice versa. > > > > Do we really want to allow users to change a publication that is FOR > SCHEMA to FOR TABLE? I see that we don't allow to do that FOR TABLES. > postgres=# Alter Publication pub add table tbl1; > ERROR: publication "pub" is defined as FOR ALL TABLES > DETAIL: Tables cannot be added to or dropped from FOR ALL TABLES publications. When it comes to FOR ALL TABLES, we can neither add/drop tables to/from the publication. So it makes sense to me that it never changes to 'empty'. I'm not sure there are use cases in practice where to change FOR SCHEMA to FOR TABLE. But it seems a bit weird to me that pubkind doesn't change to "empty" even if the publication actually gets empty. It might be just that "empty" is misleading, though. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
On Sun, Aug 15, 2021 at 12:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: > > I think the strict separation between publication-for-tables vs. > > publication-for-schemas is a mistake. Why can't I have a publication > > that publishes tables t1, t2, t3, *and* schemas s1, s2, s3. Also note > > that we have a pending patch to add sequences support to logical > > replication. So eventually, a publication will be able to contain a > > bunch of different objects of different kinds. > > This seems like it's going to create a mess, because the meaning of > "include schema S" will change over time as we add more features. > That is, with the present patch (I suppose, haven't read it) we have > "schema S" meaning "publish all tables in schema S". When that other > patch lands, presumably that same publication definition would start > meaning "publish all tables and sequences in schema S". And a few years > down the road it might start meaning something else again. That sounds > like the sort of potentially app-breaking change that we don't like > to make. > > We could avoid that bug-in-waiting if the syntax were more like > "FOR ALL TABLES IN SCHEMA s", which could later extend to > "FOR ALL SEQUENCES IN SCHEMA s", etc. This is then a very clean > intermediate step between publishing one table and "FOR ALL TABLES" > without a schema limitation. +1 > I tend to agree that a single publication should be able to incorporate > any of these options. I personally prefer that a single publication can include both all tables and all sequences in a database or a schema. It would be a more convenient way to specify replicating all objects (tables and sequences) in a database or a schema. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
On Sun, Aug 15, 2021 at 1:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: > > I think the strict separation between publication-for-tables vs. > > publication-for-schemas is a mistake. Why can't I have a publication > > that publishes tables t1, t2, t3, *and* schemas s1, s2, s3. Also note > > that we have a pending patch to add sequences support to logical > > replication. So eventually, a publication will be able to contain a > > bunch of different objects of different kinds. > > This seems like it's going to create a mess, because the meaning of > "include schema S" will change over time as we add more features. > That is, with the present patch (I suppose, haven't read it) we have > "schema S" meaning "publish all tables in schema S". When that other > patch lands, presumably that same publication definition would start > meaning "publish all tables and sequences in schema S". And a few years > down the road it might start meaning something else again. That sounds > like the sort of potentially app-breaking change that we don't like > to make. > > We could avoid that bug-in-waiting if the syntax were more like > "FOR ALL TABLES IN SCHEMA s", which could later extend to > "FOR ALL SEQUENCES IN SCHEMA s", etc. This is then a very clean > intermediate step between publishing one table and "FOR ALL TABLES" > without a schema limitation. > > I tend to agree that a single publication should be able to incorporate > any of these options. > How about if the improved syntax from Tom Lane [1] also allowed an "AND" keyword for combining whatever you wish? Then the question from Peter E. [2] "Why can't I have a publication that publishes tables t1, t2, t3, *and* schemas s1, s2, s3." would have an intuitive solution like: CREATE PUBLICATION pub1 FOR TABLE t1,t2,t3 AND FOR ALL TABLES IN SCHEMA s1,s2,s3; ------ [1] https://www.postgresql.org/message-id/155565.1628954580%40sss.pgh.pa.us [2] https://www.postgresql.org/message-id/4fb39707-dca9-1563-4482-b7a8315c36ca%40enterprisedb.com Kind Regards, Peter Smith. Fujitsu Australia
Peter Smith <smithpb2250@gmail.com> writes: > Then the question from Peter E. [2] "Why can't I have a publication > that publishes tables t1, t2, t3, *and* schemas s1, s2, s3." would > have an intuitive solution like: > CREATE PUBLICATION pub1 > FOR TABLE t1,t2,t3 AND > FOR ALL TABLES IN SCHEMA s1,s2,s3; That seems a bit awkward, since the existing precedent is to use commas. We shouldn't need more than one FOR noise-word, either. So I was imagining syntax more like, say, CREATE PUBLICATION pub1 FOR TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2, SEQUENCE seq1,seq2, ALL SEQUENCES IN SCHEMA s3,s4; Abstractly it'd be createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ] cpitem := ALL TABLES | TABLE name | ALL TABLES IN SCHEMA name | ALL SEQUENCES | SEQUENCE name | ALL SEQUENCES IN SCHEMA name | name The grammar output would need some post-analysis to attribute the right type to bare "name" items, but that doesn't seem difficult. regards, tom lane
On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Peter Smith <smithpb2250@gmail.com> writes: > > Then the question from Peter E. [2] "Why can't I have a publication > > that publishes tables t1, t2, t3, *and* schemas s1, s2, s3." would > > have an intuitive solution like: > > > CREATE PUBLICATION pub1 > > FOR TABLE t1,t2,t3 AND > > FOR ALL TABLES IN SCHEMA s1,s2,s3; > > That seems a bit awkward, since the existing precedent is > to use commas. We shouldn't need more than one FOR noise-word, > either. So I was imagining syntax more like, say, When I wrote that "AND" suggestion I had in mind that commas may get weird if there were objects with keyword names. e.g. if there was a schema called SEQUENCE and a SEQUENCE called SEQUENCE then this will be allowed. CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA SEQUENCE, SEQUENCE SEQUENCE; But probably I was just overthinking it. > > CREATE PUBLICATION pub1 FOR > TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2, > SEQUENCE seq1,seq2, ALL SEQUENCES IN SCHEMA s3,s4; > > Abstractly it'd be > > createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ] > > cpitem := ALL TABLES | > TABLE name | > ALL TABLES IN SCHEMA name | > ALL SEQUENCES | > SEQUENCE name | > ALL SEQUENCES IN SCHEMA name | > name > > The grammar output would need some post-analysis to attribute the > right type to bare "name" items, but that doesn't seem difficult. That last bare "name" cpitem. looks like it would permit the following syntax: CREATE PUBLICATION pub FOR a,b,c; Was that intentional? ------ Kind Regards, Peter Smith. Fujitsu Australia.
On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote: > > On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > > > CREATE PUBLICATION pub1 FOR > > TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2, > > SEQUENCE seq1,seq2, ALL SEQUENCES IN SCHEMA s3,s4; > > > > Abstractly it'd be > > > > createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ] > > > > cpitem := ALL TABLES | > > TABLE name | > > ALL TABLES IN SCHEMA name | > > ALL SEQUENCES | > > SEQUENCE name | > > ALL SEQUENCES IN SCHEMA name | > > name > > > > The grammar output would need some post-analysis to attribute the > > right type to bare "name" items, but that doesn't seem difficult. > > That last bare "name" cpitem. looks like it would permit the following syntax: > > CREATE PUBLICATION pub FOR a,b,c; > > Was that intentional? > I think so. IIUC, the idea is that after parsing we find out whether the given name is table, sequence, or any other object. Here, I think the name could be either of table or sequence because, for schema, we won't be knowing whether to include tables, sequences, or both in the schema. Also, we can have the same name for schema and table so it might be tricky to distinguish among those unless we give priority to one of those. -- With Regards, Amit Kapila.
Amit Kapila <amit.kapila16@gmail.com> writes: > On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote: >> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Abstractly it'd be >>> >>> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ] >>> >>> cpitem := ALL TABLES | >>> TABLE name | >>> ALL TABLES IN SCHEMA name | >>> ALL SEQUENCES | >>> SEQUENCE name | >>> ALL SEQUENCES IN SCHEMA name | >>> name >>> >>> The grammar output would need some post-analysis to attribute the >>> right type to bare "name" items, but that doesn't seem difficult. >> That last bare "name" cpitem. looks like it would permit the following syntax: >> CREATE PUBLICATION pub FOR a,b,c; >> Was that intentional? > I think so. I had supposed that we could throw an error at the post-processing stage, or alternatively default to assuming that such names are tables. Now you could instead make the grammar work like cpitem := ALL TABLES | TABLE name [, ...] | ALL TABLES IN SCHEMA name [, ...] | ALL SEQUENCES | SEQUENCE name [, ...] | ALL SEQUENCES IN SCHEMA name [, ...] which would result in a two-level-list data structure. I'm not sure that this is better, as any sort of mistake would result in a very uninformative generic "syntax error" from Bison. Errors out of a post-processing stage could be more specific than that. (Perhaps, though, we should *document* it like the latter way, even if the actual implementation is more like the first way.) regards, tom lane
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> Abstractly it'd be
> >>>
> >>> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ]
> >>>
> >>> cpitem := ALL TABLES |
> >>> TABLE name |
> >>> ALL TABLES IN SCHEMA name |
> >>> ALL SEQUENCES |
> >>> SEQUENCE name |
> >>> ALL SEQUENCES IN SCHEMA name |
> >>> name
> >>>
> >>> The grammar output would need some post-analysis to attribute the
> >>> right type to bare "name" items, but that doesn't seem difficult.
>
> >> That last bare "name" cpitem. looks like it would permit the following syntax:
> >> CREATE PUBLICATION pub FOR a,b,c;
> >> Was that intentional?
>
> > I think so.
>
> I had supposed that we could throw an error at the post-processing stage,
> or alternatively default to assuming that such names are tables.
>
> Now you could instead make the grammar work like
>
> cpitem := ALL TABLES |
> TABLE name [, ...] |
> ALL TABLES IN SCHEMA name [, ...] |
> ALL SEQUENCES |
> SEQUENCE name [, ...] |
> ALL SEQUENCES IN SCHEMA name [, ...]
>
> which would result in a two-level-list data structure. I'm not sure
> that this is better, as any sort of mistake would result in a very
> uninformative generic "syntax error" from Bison. Errors out of a
> post-processing stage could be more specific than that.
I preferred the implementation in the lines Tom Lane had proposed at [1]. Is it ok if the implementation is something like below:
CreatePublicationStmt:
CREATE PUBLICATION name FOR pub_obj_list opt_definition
{
CreatePublicationStmt *n = makeNode(CreatePublicationStmt);
n->pubname = $3;
n->options = $6;
n->pubobjects = (List *)$5;
$$ = (Node *)n;
}
;
pub_obj_list: PublicationObjSpec
{ $$ = list_make1($1); }
| pub_obj_list ',' PublicationObjSpec
{ $$ = lappend($1, $3); }
;
/* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */
PublicationObjSpec: TABLE pubobj_expr
{ ....}
| ALL TABLES IN_P SCHEMA pubobj_expr
{ ....}
| pubobj_expr
{ ....}
;
pubobj_expr:
any_name
{ ....}
| any_name '*'
{ ....}
| ONLY any_name
{ ....}
| ONLY '(' any_name ')'
{ ....}
| CURRENT_SCHEMA
{ ....}
;
I needed pubobj_expr to support the existing syntaxes supported by relation_expr and also to handle CURRENT_SCHEMA support in case of the "FOR ALL TABLES IN SCHEMA" feature. I changed the name to any_name to support objects like "sch1.t1".
I felt if a user specified "FOR ALL TABLES", the user should not be allowed to combine it with "FOR TABLE" and "FOR ALL TABLES IN SCHEMA" as "FOR ALL TABLES" anyway will include all the tables.
Should we support the similar syntax in case of alter publication, like "ALTER PUBLICATION pub1 ADD TABLE t1,t2, ALL TABLES IN SCHEMA sch1, sch2" or shall we keep these separate like "ALTER PUBLICATION pub1 ADD TABLE t1, t2" and "ALTER PUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1, sch2". I preferred to keep it separate as we have kept ADD/DROP separately which cannot be combined currently. I have not mentioned SEQUENCE handling separately, the sequence will be extended in similar lines. I thought of throwing an error at post processing, the first option that Tom Lane had suggested if the user specified syntax like: create publication pub1 for t1,t2;
Thoughts?
[1] - https://www.postgresql.org/message-id/877603.1629120678%40sss.pgh.pa.us
On Mon, Aug 23, 2021 at 11:16 PM vignesh C <vignesh21@gmail.com> wrote: > > On Tue, Aug 17, 2021 at 6:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Amit Kapila <amit.kapila16@gmail.com> writes: > > > On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote: > > >> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > >>> Abstractly it'd be > > >>> > > >>> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ] > > >>> > > >>> cpitem := ALL TABLES | > > >>> TABLE name | > > >>> ALL TABLES IN SCHEMA name | > > >>> ALL SEQUENCES | > > >>> SEQUENCE name | > > >>> ALL SEQUENCES IN SCHEMA name | > > >>> name > > >>> > > >>> The grammar output would need some post-analysis to attribute the > > >>> right type to bare "name" items, but that doesn't seem difficult. > > > > >> That last bare "name" cpitem. looks like it would permit the following syntax: > > >> CREATE PUBLICATION pub FOR a,b,c; > > >> Was that intentional? > > > > > I think so. > > > > I had supposed that we could throw an error at the post-processing stage, > > or alternatively default to assuming that such names are tables. > > > > Now you could instead make the grammar work like > > > > cpitem := ALL TABLES | > > TABLE name [, ...] | > > ALL TABLES IN SCHEMA name [, ...] | > > ALL SEQUENCES | > > SEQUENCE name [, ...] | > > ALL SEQUENCES IN SCHEMA name [, ...] > > > > which would result in a two-level-list data structure. I'm not sure > > that this is better, as any sort of mistake would result in a very > > uninformative generic "syntax error" from Bison. Errors out of a > > post-processing stage could be more specific than that. > > I preferred the implementation in the lines Tom Lane had proposed at [1]. Is it ok if the implementation is something likebelow: > CreatePublicationStmt: > CREATE PUBLICATION name FOR pub_obj_list opt_definition > { > CreatePublicationStmt *n = makeNode(CreatePublicationStmt); > n->pubname = $3; > n->options = $6; > n->pubobjects = (List *)$5; > $$ = (Node *)n; > } > ; > pub_obj_list: PublicationObjSpec > { $$ = list_make1($1); } > | pub_obj_list ',' PublicationObjSpec > { $$ = lappend($1, $3); } > ; > /* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */ > PublicationObjSpec: TABLE pubobj_expr > { ....} > | ALL TABLES IN_P SCHEMA pubobj_expr > { ....} > | pubobj_expr > { ....} > ; > pubobj_expr: > any_name > { ....} > | any_name '*' > { ....} > | ONLY any_name > { ....} > | ONLY '(' any_name ')' > { ....} > | CURRENT_SCHEMA > { ....} > ; "FOR ALL TABLES” (that includes all tables in the database) is missing in this syntax? > > I needed pubobj_expr to support the existing syntaxes supported by relation_expr and also to handle CURRENT_SCHEMA supportin case of the "FOR ALL TABLES IN SCHEMA" feature. I changed the name to any_name to support objects like "sch1.t1". I think that relation_expr also accepts objects like "sch1.t1", no? > I felt if a user specified "FOR ALL TABLES", the user should not be allowed to combine it with "FOR TABLE" and "FOR ALLTABLES IN SCHEMA" as "FOR ALL TABLES" anyway will include all the tables. I think so too. > Should we support the similar syntax in case of alter publication, like "ALTER PUBLICATION pub1 ADD TABLE t1,t2, ALL TABLESIN SCHEMA sch1, sch2" or shall we keep these separate like "ALTER PUBLICATION pub1 ADD TABLE t1, t2" and "ALTER PUBLICATIONpub1 ADD ALL TABLES IN SCHEMA sch1, sch2". I preferred to keep it separate as we have kept ADD/DROP separatelywhich cannot be combined currently. If we support the former syntax, the latter two syntaxes are also supported. Why do we want to support only the latter separate two syntaxes? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
On Wed, Aug 25, 2021 at 1:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Mon, Aug 23, 2021 at 11:16 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Tue, Aug 17, 2021 at 6:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > > > Amit Kapila <amit.kapila16@gmail.com> writes: > > > > On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > >> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > >>> Abstractly it'd be > > > >>> > > > >>> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ] > > > >>> > > > >>> cpitem := ALL TABLES | > > > >>> TABLE name | > > > >>> ALL TABLES IN SCHEMA name | > > > >>> ALL SEQUENCES | > > > >>> SEQUENCE name | > > > >>> ALL SEQUENCES IN SCHEMA name | > > > >>> name > > > >>> > > > >>> The grammar output would need some post-analysis to attribute the > > > >>> right type to bare "name" items, but that doesn't seem difficult. > > > > > > >> That last bare "name" cpitem. looks like it would permit the following syntax: > > > >> CREATE PUBLICATION pub FOR a,b,c; > > > >> Was that intentional? > > > > > > > I think so. > > > > > > I had supposed that we could throw an error at the post-processing stage, > > > or alternatively default to assuming that such names are tables. > > > > > > Now you could instead make the grammar work like > > > > > > cpitem := ALL TABLES | > > > TABLE name [, ...] | > > > ALL TABLES IN SCHEMA name [, ...] | > > > ALL SEQUENCES | > > > SEQUENCE name [, ...] | > > > ALL SEQUENCES IN SCHEMA name [, ...] > > > > > > which would result in a two-level-list data structure. I'm not sure > > > that this is better, as any sort of mistake would result in a very > > > uninformative generic "syntax error" from Bison. Errors out of a > > > post-processing stage could be more specific than that. > > > > I preferred the implementation in the lines Tom Lane had proposed at [1]. Is it ok if the implementation is somethinglike below: > > CreatePublicationStmt: > > CREATE PUBLICATION name FOR pub_obj_list opt_definition > > { > > CreatePublicationStmt *n = makeNode(CreatePublicationStmt); > > n->pubname = $3; > > n->options = $6; > > n->pubobjects = (List *)$5; > > $$ = (Node *)n; > > } > > ; > > pub_obj_list: PublicationObjSpec > > { $$ = list_make1($1); } > > | pub_obj_list ',' PublicationObjSpec > > { $$ = lappend($1, $3); } > > ; > > /* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */ > > PublicationObjSpec: TABLE pubobj_expr > > { ....} > > | ALL TABLES IN_P SCHEMA pubobj_expr > > { ....} > > | pubobj_expr > > { ....} > > ; > > pubobj_expr: > > any_name > > { ....} > > | any_name '*' > > { ....} > > | ONLY any_name > > { ....} > > | ONLY '(' any_name ')' > > { ....} > > | CURRENT_SCHEMA > > { ....} > > ; > > "FOR ALL TABLES” (that includes all tables in the database) is missing > in this syntax? "FOR ALL TABLES" is present in CreatePublicationStmt rule, sorry for not including all of CreatePublicationStmt rule in the previous mail, I thought of keeping the contents shorter: CreatePublicationStmt: CREATE PUBLICATION name opt_definition {....} | CREATE PUBLICATION name FOR ALL TABLES opt_definition {....} | CREATE PUBLICATION name FOR pub_obj_list opt_definition {....} ; It is not in pub_obj_list as the user will be able to specify either of "FOR ALL TABLES" or "FOR TABLE/ FOR ALL TABLES IN SCHEMA" along with create publication. > > > > I needed pubobj_expr to support the existing syntaxes supported by relation_expr and also to handle CURRENT_SCHEMA supportin case of the "FOR ALL TABLES IN SCHEMA" feature. I changed the name to any_name to support objects like "sch1.t1". > > I think that relation_expr also accepts objects like "sch1.t1", no? Earlier syntax only supported relations, the relations were parsed into RangeVar datatype. The new feature supports schema for which only the schema name is required. To keep the parsing rule common, I used any_name which will store the dotted name into a list for both relation and schema. I will later convert it into rangevar for relation and schema oid for schema names during the processing and create the publication. I felt relation_expr was able to handle dotted names because of qualified_name having "ColId indirection", here indirection rule takes care of handling the dotted names. > > I felt if a user specified "FOR ALL TABLES", the user should not be allowed to combine it with "FOR TABLE" and "FOR ALLTABLES IN SCHEMA" as "FOR ALL TABLES" anyway will include all the tables. > > I think so too. > > > Should we support the similar syntax in case of alter publication, like "ALTER PUBLICATION pub1 ADD TABLE t1,t2, ALLTABLES IN SCHEMA sch1, sch2" or shall we keep these separate like "ALTER PUBLICATION pub1 ADD TABLE t1, t2" and "ALTERPUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1, sch2". I preferred to keep it separate as we have kept ADD/DROP separatelywhich cannot be combined currently. > > If we support the former syntax, the latter two syntaxes are also > supported. Why do we want to support only the latter separate two > syntaxes? We can support either syntax, I was not sure which one is better. If alter also should support similar syntax I can do it as a separate patch so as to not increase the main patch size. Thoughts? Attached v21 patch has the changes based on the new syntax and fixes few of the other review comments provided by reviewers. Regards, Vignesh
Attachment
>
> On 13.08.21 04:59, Amit Kapila wrote:
> >> Even if we drop all tables added to the publication from it, 'pubkind'
> >> doesn't go back to 'empty'. Is that intentional behavior? If we do
> >> that, we can save the lookup of pg_publication_rel and
> >> pg_publication_schema in some cases, and we can switch the publication
> >> that was created as FOR SCHEMA to FOR TABLE and vice versa.
> >>
> > Do we really want to allow users to change a publication that is FOR
> > SCHEMA to FOR TABLE? I see that we don't allow to do that FOR TABLES.
> > postgres=# Alter Publication pub add table tbl1;
> > ERROR: publication "pub" is defined as FOR ALL TABLES
> > DETAIL: Tables cannot be added to or dropped from FOR ALL TABLES publications.
>
> I think the strict separation between publication-for-tables vs.
> publication-for-schemas is a mistake. Why can't I have a publication
> that publishes tables t1, t2, t3, *and* schemas s1, s2, s3. Also note
> that we have a pending patch to add sequences support to logical
> replication. So eventually, a publication will be able to contain a
> bunch of different objects of different kinds.
Thanks for the feedback, now the same publication can handle schemas as well as tables, and can be extended further to other objects. This is handled in the v21 patch attached at [1]. It is supported by the following syntax:
CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA s1,s2,s3, TABLE t1,t2,t3;
[1] - https://www.postgresql.org/message-id/CALDaNm2iKJvSdCyh0S%2BwYgFjMNB4hu3kYjk%3DYrEkpqTJY9zW%2Bw%40mail.gmail.com
Regards,
Vignesh
On Wednesday, August 25, 2021 5:37 PM vignesh C <vignesh21@gmail.com> wrote: > > Attached v21 patch has the changes based on the new syntax and fixes > few of the other review comments provided by reviewers. > Thanks for your new patch. I saw the following warning when building, please have a look. publicationcmds.c: In function ‘ConvertPubObjSpecListToOidList’: publicationcmds.c:212:23: warning: ‘prevobjtype’ may be used uninitialized in this function [-Wmaybe-uninitialized] pubobj->pubobjtype = prevobjtype; ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~ Regards Tang
On Wed, Aug 25, 2021 at 3:07 PM vignesh C <vignesh21@gmail.com> wrote: > > On Wed, Aug 25, 2021 at 1:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Mon, Aug 23, 2021 at 11:16 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Tue, Aug 17, 2021 at 6:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > > > > > Amit Kapila <amit.kapila16@gmail.com> writes: > > > > > On Tue, Aug 17, 2021 at 6:40 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > >> On Mon, Aug 16, 2021 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > >>> Abstractly it'd be > > > > >>> > > > > >>> createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ] > > > > >>> > > > > >>> cpitem := ALL TABLES | > > > > >>> TABLE name | > > > > >>> ALL TABLES IN SCHEMA name | > > > > >>> ALL SEQUENCES | > > > > >>> SEQUENCE name | > > > > >>> ALL SEQUENCES IN SCHEMA name | > > > > >>> name > > > > >>> > > > > >>> The grammar output would need some post-analysis to attribute the > > > > >>> right type to bare "name" items, but that doesn't seem difficult. > > > > > > > > >> That last bare "name" cpitem. looks like it would permit the following syntax: > > > > >> CREATE PUBLICATION pub FOR a,b,c; > > > > >> Was that intentional? > > > > > > > > > I think so. > > > > > > > > I had supposed that we could throw an error at the post-processing stage, > > > > or alternatively default to assuming that such names are tables. > > > > > > > > Now you could instead make the grammar work like > > > > > > > > cpitem := ALL TABLES | > > > > TABLE name [, ...] | > > > > ALL TABLES IN SCHEMA name [, ...] | > > > > ALL SEQUENCES | > > > > SEQUENCE name [, ...] | > > > > ALL SEQUENCES IN SCHEMA name [, ...] > > > > > > > > which would result in a two-level-list data structure. I'm not sure > > > > that this is better, as any sort of mistake would result in a very > > > > uninformative generic "syntax error" from Bison. Errors out of a > > > > post-processing stage could be more specific than that. > > > > > > I preferred the implementation in the lines Tom Lane had proposed at [1]. Is it ok if the implementation is somethinglike below: > > > CreatePublicationStmt: > > > CREATE PUBLICATION name FOR pub_obj_list opt_definition > > > { > > > CreatePublicationStmt *n = makeNode(CreatePublicationStmt); > > > n->pubname = $3; > > > n->options = $6; > > > n->pubobjects = (List *)$5; > > > $$ = (Node *)n; > > > } > > > ; > > > pub_obj_list: PublicationObjSpec > > > { $$ = list_make1($1); } > > > | pub_obj_list ',' PublicationObjSpec > > > { $$ = lappend($1, $3); } > > > ; > > > /* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */ > > > PublicationObjSpec: TABLE pubobj_expr > > > { ....} > > > | ALL TABLES IN_P SCHEMA pubobj_expr > > > { ....} > > > | pubobj_expr > > > { ....} > > > ; > > > pubobj_expr: > > > any_name > > > { ....} > > > | any_name '*' > > > { ....} > > > | ONLY any_name > > > { ....} > > > | ONLY '(' any_name ')' > > > { ....} > > > | CURRENT_SCHEMA > > > { ....} > > > ; > > > > "FOR ALL TABLES” (that includes all tables in the database) is missing > > in this syntax? > > "FOR ALL TABLES" is present in CreatePublicationStmt rule, sorry for > not including all of CreatePublicationStmt rule in the previous mail, > I thought of keeping the contents shorter: > > CreatePublicationStmt: > CREATE PUBLICATION name opt_definition > {....} > | CREATE PUBLICATION name FOR ALL TABLES opt_definition > {....} > | CREATE PUBLICATION name FOR pub_obj_list opt_definition > {....} > ; > > It is not in pub_obj_list as the user will be able to specify either > of "FOR ALL TABLES" or "FOR TABLE/ FOR ALL TABLES IN SCHEMA" along > with create publication. > > > > > > > I needed pubobj_expr to support the existing syntaxes supported by relation_expr and also to handle CURRENT_SCHEMAsupport in case of the "FOR ALL TABLES IN SCHEMA" feature. I changed the name to any_name to support objectslike "sch1.t1". > > > > I think that relation_expr also accepts objects like "sch1.t1", no? > > Earlier syntax only supported relations, the relations were parsed > into RangeVar datatype. The new feature supports schema for which only > the schema name is required. To keep the parsing rule common, I used > any_name which will store the dotted name into a list for both > relation and schema. I will later convert it into rangevar for > relation and schema oid for schema names during the processing and > create the publication. I felt relation_expr was able to handle dotted > names because of qualified_name having "ColId indirection", here > indirection rule takes care of handling the dotted names. > > > > I felt if a user specified "FOR ALL TABLES", the user should not be allowed to combine it with "FOR TABLE" and "FORALL TABLES IN SCHEMA" as "FOR ALL TABLES" anyway will include all the tables. > > > > I think so too. > > > > > Should we support the similar syntax in case of alter publication, like "ALTER PUBLICATION pub1 ADD TABLE t1,t2, ALLTABLES IN SCHEMA sch1, sch2" or shall we keep these separate like "ALTER PUBLICATION pub1 ADD TABLE t1, t2" and "ALTERPUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1, sch2". I preferred to keep it separate as we have kept ADD/DROP separatelywhich cannot be combined currently. > > > > If we support the former syntax, the latter two syntaxes are also > > supported. Why do we want to support only the latter separate two > > syntaxes? > > We can support either syntax, I was not sure which one is better. If > alter also should support similar syntax I can do it as a separate > patch so as to not increase the main patch size. Thoughts? I have implemented this in the 0003 patch, I have kept it separate to reduce the testing effort and also it will be easier if someone disagrees with the syntax. I will merge it to the main patch later based on the feedback. Attached v22 patch has the changes for the same. Thoughts? Regards, Vignesh
Attachment
On Thu, Aug 26, 2021 at 7:52 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Wednesday, August 25, 2021 5:37 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Attached v21 patch has the changes based on the new syntax and fixes > > few of the other review comments provided by reviewers. > > > > Thanks for your new patch. I saw the following warning when building, please have a look. > > publicationcmds.c: In function ‘ConvertPubObjSpecListToOidList’: > publicationcmds.c:212:23: warning: ‘prevobjtype’ may be used uninitialized in this function [-Wmaybe-uninitialized] > pubobj->pubobjtype = prevobjtype; > ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~ Thanks for reporting this, I have fixed this in the v22 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm1rNDbGwoo0FC9vF1BmueUy__u1ZM5yYOjEQW1Of6zdWQ%40mail.gmail.com Regards, VIgnesh
On Fri, Aug 27, 2021 at 11:43 AM vignesh C <vignesh21@gmail.com> wrote: > > On Wed, Aug 25, 2021 at 3:07 PM vignesh C <vignesh21@gmail.com> wrote: > > I have implemented this in the 0003 patch, I have kept it separate to > reduce the testing effort and also it will be easier if someone > disagrees with the syntax. I will merge it to the main patch later > based on the feedback. Attached v22 patch has the changes for the > same. > Few comments on v22-0001-Added-schema-level-support-for-publication: ======================================================== 1. Why in publication_add_schema(), you are registering invalidation for all schema relations? It seems this is to allow rebuilding the publication info for decoding sessions. But that is not required as you are registering rel_sync_cache_publication_cb for publication_schema relation. In rel_sync_cache_publication_cb, we are marking replicate_valid as false for each entry which will allow publication info to be rebuilt in get_rel_sync_entry. I see that it is done for a single relation in the current code in function publication_add_relation but I guess that is also not required. You can once test this. If you still think it is required, can you please explain me why and then we can accordingly add some comments in the patch. Peter E., Sawada-San, can you please let me know if I am missing something in this regard? In the code, I see a comment "/* Invalidate relcache so that publication info is rebuilt. */" in function publication_add_relation() but I don't understand why it is required as per my explanation above? 2. + * Publication object type + */ +typedef enum PublicationObjSpecType +{ + PUBLICATIONOBJ_TABLE, /* Table type */ + PUBLICATIONOBJ_SCHEMA, /* Schema type */ + PUBLICATIONOBJ_SEQUENCE, /* Sequence type */ Why add anything related to the sequence in this patch? 3. +get_object_address_publication_schema(List *object, bool missing_ok) +{ + ObjectAddress address; + char *pubname; + Publication *pub; + char *schemaname; + Oid schemaoid; + + ObjectAddressSet(address, PublicationSchemaRelationId, InvalidOid); + + /* Fetch schema name and publication name from input list */ + schemaname = strVal(linitial(object)); + pubname = strVal(lsecond(object)); + + schemaoid = get_namespace_oid(schemaname, false); + + /* Now look up the pg_publication tuple */ + pub = GetPublicationByName(pubname, missing_ok); + if (!pub) + return address; Why the schema name handling is different from publication name? Why can't we pass missing_ok for schema api and handle it similar publication api? 4. In getPublicationSchemaInfo(), why the missing_ok flag is not used in get_publication_name() whereas it is used for all other syscache searches in that function? 5. Don't we need to expose a view for publication schemas similar to pg_publication_tables? 6. publication_add_schema() { .. + /* Can't be system namespace */ + if (IsCatalogNamespace(schemaoid) || IsToastNamespace(schemaoid)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("\"%s\" is a system schema", + get_namespace_name(schemaoid)), + errdetail("System schemas cannot be added to publications."))); + + /* Can't be temporary namespace */ + if (isAnyTempNamespace(schemaoid)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("\"%s\" is a temporary schema", + get_namespace_name(schemaoid)), + errdetail("Temporary schemas cannot be added to publications."))); .. } Can we change the first detail message as: "This operation is not supported for system schemas." and the second detail message as: "Temporary schemas cannot be replicated."? This is to make these messages similar to corresponding messages for relations in function check_publication_add_relation(). Can we move these checks to a separate function? -- With Regards, Amit Kapila.
On Fri, Aug 27, 2021 at 4:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Aug 27, 2021 at 11:43 AM vignesh C <vignesh21@gmail.com> wrote: > > > > On Wed, Aug 25, 2021 at 3:07 PM vignesh C <vignesh21@gmail.com> wrote: > > > > I have implemented this in the 0003 patch, I have kept it separate to > > reduce the testing effort and also it will be easier if someone > > disagrees with the syntax. I will merge it to the main patch later > > based on the feedback. Attached v22 patch has the changes for the > > same. > > > > Few comments on v22-0001-Added-schema-level-support-for-publication: > ======================================================== > 1. Why in publication_add_schema(), you are registering invalidation > for all schema relations? It seems this is to allow rebuilding the > publication info for decoding sessions. But that is not required as > you are registering rel_sync_cache_publication_cb for > publication_schema relation. In rel_sync_cache_publication_cb, we are > marking replicate_valid as false for each entry which will allow > publication info to be rebuilt in get_rel_sync_entry. > > I see that it is done for a single relation in the current code in > function publication_add_relation but I guess that is also not > required. You can once test this. If you still think it is required, > can you please explain me why and then we can accordingly add some > comments in the patch. I felt this is required for handling the following concurrency scenario: create schema sch1; create table sch1.t1(c1 int); insert into sch1.t1 values(10); update sch1.t1 set c1 = 11; # update will be successful and relation cache will update publication actions based on the current state i.e no publication create publication pub1 for all tables in schema sch1; # After the above publication is created the relations present in this schema should be invalidated so that the next update should fail. If the relations are not invalidated the updates will be successful based on previous publication actions. update sch1.t1 set c1 = 11; I will add comments to mention the above details. Thoughts? > Peter E., Sawada-San, can you please let me know if I am missing > something in this regard? In the code, I see a comment "/* Invalidate > relcache so that publication info is rebuilt. */" in function > publication_add_relation() but I don't understand why it is required > as per my explanation above? > > 2. > + * Publication object type > + */ > +typedef enum PublicationObjSpecType > +{ > + PUBLICATIONOBJ_TABLE, /* Table type */ > + PUBLICATIONOBJ_SCHEMA, /* Schema type */ > + PUBLICATIONOBJ_SEQUENCE, /* Sequence type */ > > Why add anything related to the sequence in this patch? I will handle this in my next version. > 3. > +get_object_address_publication_schema(List *object, bool missing_ok) > +{ > + ObjectAddress address; > + char *pubname; > + Publication *pub; > + char *schemaname; > + Oid schemaoid; > + > + ObjectAddressSet(address, PublicationSchemaRelationId, InvalidOid); > + > + /* Fetch schema name and publication name from input list */ > + schemaname = strVal(linitial(object)); > + pubname = strVal(lsecond(object)); > + > + schemaoid = get_namespace_oid(schemaname, false); > + > + /* Now look up the pg_publication tuple */ > + pub = GetPublicationByName(pubname, missing_ok); > + if (!pub) > + return address; > > Why the schema name handling is different from publication name? Why > can't we pass missing_ok for schema api and handle it similar > publication api? I will handle this in my next version. > 4. In getPublicationSchemaInfo(), why the missing_ok flag is not used > in get_publication_name() whereas it is used for all other syscache > searches in that function? I will handle this in my next version. > 5. Don't we need to expose a view for publication schemas similar to > pg_publication_tables? pg_publication_tables is a common view for both "FOR TABLE", "FOR ALL TABLES" and "FOR ALL TABLES IN SCHEMA", this view will internally access pg_publication_rel and pg_publication_schema to get the corresponding tables. I felt we don't need a separate view for publication schemas. Thoughts? > 6. > publication_add_schema() > { > .. > + /* Can't be system namespace */ > + if (IsCatalogNamespace(schemaoid) || IsToastNamespace(schemaoid)) > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("\"%s\" is a system schema", > + get_namespace_name(schemaoid)), > + errdetail("System schemas cannot be added to publications."))); > + > + /* Can't be temporary namespace */ > + if (isAnyTempNamespace(schemaoid)) > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("\"%s\" is a temporary schema", > + get_namespace_name(schemaoid)), > + errdetail("Temporary schemas cannot be added to publications."))); > .. > } > > Can we change the first detail message as: "This operation is not > supported for system schemas." and the second detail message as: > "Temporary schemas cannot be replicated."? This is to make these > messages similar to corresponding messages for relations in function > check_publication_add_relation(). Can we move these checks to a > separate function? I will handle this in my next version. Regards, Vignesh
On Fri, Aug 27, 2021 at 6:09 PM vignesh C <vignesh21@gmail.com> wrote: > > On Fri, Aug 27, 2021 at 4:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Aug 27, 2021 at 11:43 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Wed, Aug 25, 2021 at 3:07 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > I have implemented this in the 0003 patch, I have kept it separate to > > > reduce the testing effort and also it will be easier if someone > > > disagrees with the syntax. I will merge it to the main patch later > > > based on the feedback. Attached v22 patch has the changes for the > > > same. > > > > > > > Few comments on v22-0001-Added-schema-level-support-for-publication: > > ======================================================== > > 1. Why in publication_add_schema(), you are registering invalidation > > for all schema relations? It seems this is to allow rebuilding the > > publication info for decoding sessions. But that is not required as > > you are registering rel_sync_cache_publication_cb for > > publication_schema relation. In rel_sync_cache_publication_cb, we are > > marking replicate_valid as false for each entry which will allow > > publication info to be rebuilt in get_rel_sync_entry. > > > > I see that it is done for a single relation in the current code in > > function publication_add_relation but I guess that is also not > > required. You can once test this. If you still think it is required, > > can you please explain me why and then we can accordingly add some > > comments in the patch. > > I felt this is required for handling the following concurrency scenario: > create schema sch1; > create table sch1.t1(c1 int); > insert into sch1.t1 values(10); > update sch1.t1 set c1 = 11; > # update will be successful and relation cache will update publication > actions based on the current state i.e no publication > create publication pub1 for all tables in schema sch1; > # After the above publication is created the relations present in this > schema should be invalidated so that the next update should fail. > What do you mean by update should fail? I think all the relations in RelationSyncCache via rel_sync_cache_publication_cb because you have updated pg_publication_schema and that should register syscache invalidation. > If > the relations are not invalidated the updates will be successful based > on previous publication actions. > update sch1.t1 set c1 = 11; > I think even without special relcache invalidations the relations should be invalidated because of syscache invalidation as mentioned in the previous point. Am I missing something here? > > > 5. Don't we need to expose a view for publication schemas similar to > > pg_publication_tables? > > pg_publication_tables is a common view for both "FOR TABLE", "FOR ALL > TABLES" and "FOR ALL TABLES IN SCHEMA", this view will internally > access pg_publication_rel and pg_publication_schema to get the > corresponding tables. I felt we don't need a separate view for > publication schemas. Thoughts? > Don't you think some users might want to know all the schema names for a publication? I am not completely sure on this point but I think it is good to have information for users. It might be also useful to have pg_publication_objects where we can display object types (like table, schema, sequence, etc) and then object names. If you are not convinced then we can wait and see what others think about this. -- With Regards, Amit Kapila.
On Sat, Aug 28, 2021 at 3:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Aug 27, 2021 at 6:09 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Fri, Aug 27, 2021 at 4:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Fri, Aug 27, 2021 at 11:43 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > On Wed, Aug 25, 2021 at 3:07 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > I have implemented this in the 0003 patch, I have kept it separate to > > > > reduce the testing effort and also it will be easier if someone > > > > disagrees with the syntax. I will merge it to the main patch later > > > > based on the feedback. Attached v22 patch has the changes for the > > > > same. > > > > > > > > > > Few comments on v22-0001-Added-schema-level-support-for-publication: > > > ======================================================== > > > 1. Why in publication_add_schema(), you are registering invalidation > > > for all schema relations? It seems this is to allow rebuilding the > > > publication info for decoding sessions. But that is not required as > > > you are registering rel_sync_cache_publication_cb for > > > publication_schema relation. In rel_sync_cache_publication_cb, we are > > > marking replicate_valid as false for each entry which will allow > > > publication info to be rebuilt in get_rel_sync_entry. > > > > > > I see that it is done for a single relation in the current code in > > > function publication_add_relation but I guess that is also not > > > required. You can once test this. If you still think it is required, > > > can you please explain me why and then we can accordingly add some > > > comments in the patch. > > > > I felt this is required for handling the following concurrency scenario: > > create schema sch1; > > create table sch1.t1(c1 int); > > insert into sch1.t1 values(10); > > update sch1.t1 set c1 = 11; > > # update will be successful and relation cache will update publication > > actions based on the current state i.e no publication > > create publication pub1 for all tables in schema sch1; > > # After the above publication is created the relations present in this > > schema should be invalidated so that the next update should fail. > > > > What do you mean by update should fail? I think all the relations in > RelationSyncCache via rel_sync_cache_publication_cb because you have > updated pg_publication_schema and that should register syscache > invalidation. By update should fail, I meant the updates without setting replica identity before creating the decoding context. The scenario is like below (all in the same session, the subscription is not created): create schema sch1; create table sch1.t1(c1 int); insert into sch1.t1 values(10); # Before updating we will check CheckCmdReplicaIdentity, as there are no publications on this table rd_pubactions will be set accordingly in relcache entry. update sch1.t1 set c1 = 11; # Now we will create the publication after rd_pubactions has been set in the cache. Now when we create this publication we should invalidate the relations present in the schema, this is required so that when the next update happens, we should check the publication actions again in CheckCmdReplicaIdentity and fail the update which does not set replica identity after the publication is created. create publication pub1 for all tables in schema sch1; # After the above publication is created the relations present in this schema will be invalidated. Now we will check the publication actions again in CheckCmdReplicaIdentity and the update will fail. update sch1.t1 set c1 = 11; ERROR: cannot update table "t1" because it does not have a replica identity and publishes updates HINT: To enable updating the table, set REPLICA IDENTITY using ALTER TABLE. The rel_sync_cache_publication_cb invalidations are registered once the decoding context is created. Since the decoding context is not created at this point of time we should explicitly invalidate all the relations present in this schema by calling InvalidatePublicationRels. > > If > > the relations are not invalidated the updates will be successful based > > on previous publication actions. > > update sch1.t1 set c1 = 11; > > > > I think even without special relcache invalidations the relations > should be invalidated because of syscache invalidation as mentioned in > the previous point. Am I missing something here? The rel_sync_cache_publication_cb invalidations are registered once the decoding context is created. Since the decoding context is not created at this point of time we should explicitly invalidate all the relations present in this schema by calling InvalidatePublicationRels. This is to prevent updates for which the replica identity is not set, I have mentioned more details above. > > > > > 5. Don't we need to expose a view for publication schemas similar to > > > pg_publication_tables? > > > > pg_publication_tables is a common view for both "FOR TABLE", "FOR ALL > > TABLES" and "FOR ALL TABLES IN SCHEMA", this view will internally > > access pg_publication_rel and pg_publication_schema to get the > > corresponding tables. I felt we don't need a separate view for > > publication schemas. Thoughts? > > > > Don't you think some users might want to know all the schema names for > a publication? I am not completely sure on this point but I think it > is good to have information for users. It might be also useful to have > pg_publication_objects where we can display object types (like table, > schema, sequence, etc) and then object names. If you are not convinced > then we can wait and see what others think about this. Ok, I will add this view in the next version. Regards, Vignesh
On Fri, Aug 27, 2021 at 4:13 PM vignesh C <vignesh21@gmail.com> wrote: > > I have implemented this in the 0003 patch, I have kept it separate to > reduce the testing effort and also it will be easier if someone > disagrees with the syntax. I will merge it to the main patch later > based on the feedback. Attached v22 patch has the changes for the > same. Just experimenting with the new syntax so far, and seeing some new messages and docs, I have the following suggestions for improvements: src/backend/commands/publicationcmds.c (1) BEFORE: for table/for all tables in schema should be specified before the object AFTER: FOR TABLE / FOR ALL TABLES IN SCHEMA should be specified before the table/schema name(s) (2) BEFORE: Tables cannot be added, dropped or set on FOR ALL TABLES publications. AFTER: Tables cannot be added to, dropped from, or set on FOR ALL TABLES publications. (3) BEFORE: Schemas cannot be added, dropped or set on FOR ALL TABLES publications. AFTER: Schemas cannot be added to, dropped from, or set on FOR ALL TABLES publications. v22-0002 doc/src/sgml/ref/create_publication.sgml (1) BEFORE: + Create a publication that publishes all changes for users and departments + table and that publishes all changes for all the tables present in the AFTER: + Create a publication that publishes all changes for tables "users" and + "departments" and that publishes all changes for all the tables present in the Regards, Greg Nancarrow Fujitsu Australia
On Friday, August 27, 2021 2:13 PM vignesh C <vignesh21@gmail.com> wrote: > > I have implemented this in the 0003 patch, I have kept it separate to reduce the > testing effort and also it will be easier if someone disagrees with the syntax. I > will merge it to the main patch later based on the feedback. Attached v22 patch > has the changes for the same. > Thoughts? Hi, Here are some comments for the new version patches. About 0001 1) + rel->relpersistence = RELPERSISTENCE_PERMANENT; It seems we don't need to set this since makeRangeVarFromNameList() already set it. 2) + if (!relids || !schemarelids) + tables = list_concat(relids, schemarelids); + else + tables = list_concat_unique_oid(relids, schemarelids); + } It seems we can simplify the above code like the following: tables = list_concat_unique_oid(relids, schemarelids); 3) + relids = GetPublicationRelations(pubform->oid, + PUBLICATION_PART_ALL); + schemarelids = GetAllSchemasPublicationRelations(pubform->oid, + PUBLICATION_PART_ALL); + relids = list_concat(relids, schemarelids); should we invoke list_concat_unique_oid here ? 4) + search_path = fetch_search_path(false); + if (search_path == NIL) /* nothing valid in search_path? */ It might be better to list_free(search_path) when not used. 5) + if (list_length(pubobj->name) == 1 && + (strcmp(relname, "CURRENT_SCHEMA") == 0)) + ereport(ERROR, + errcode(ERRCODE_SYNTAX_ERROR), + errmsg("invalid relation name at or near"), + parser_errposition(pstate, pubobj->location)); Maybe we don't need this check, because it will report an error in OpenTableList() anyway, "relation "CURRENT_SCHEMA" does not exist" , and that message seems readable to me. About 0002 6) diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl index 0c84d87873..0a479dfe36 100644 --- a/src/test/subscription/t/001_rep_changes.pl +++ b/src/test/subscription/t/001_rep_changes.pl @@ -6,7 +6,7 @@ use strict; use warnings; use PostgresNode; use TestLib; -use Test::More tests => 32; +use Test::More tests => 46; I think it might be better to move these testcases create a separate perl file. About 0003 7) The v22-0003 seems simple and can remove lots of code in patch v22-0001, so maybe we can merge 0001 and 0003 into one patch ? Best regards, Hou zj
On Sun, Aug 29, 2021 at 7:22 AM vignesh C <vignesh21@gmail.com> wrote: > > On Sat, Aug 28, 2021 at 3:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Aug 27, 2021 at 6:09 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > What do you mean by update should fail? I think all the relations in > > RelationSyncCache via rel_sync_cache_publication_cb because you have > > updated pg_publication_schema and that should register syscache > > invalidation. > > By update should fail, I meant the updates without setting replica > identity before creating the decoding context. The scenario is like > below (all in the same session, the subscription is not created): > create schema sch1; > create table sch1.t1(c1 int); > insert into sch1.t1 values(10); > # Before updating we will check CheckCmdReplicaIdentity, as there are > no publications on this table rd_pubactions will be set accordingly in > relcache entry. > update sch1.t1 set c1 = 11; > # Now we will create the publication after rd_pubactions has been set > in the cache. Now when we create this publication we should invalidate > the relations present in the schema, this is required so that when the > next update happens, we should check the publication actions again in > CheckCmdReplicaIdentity and fail the update which does not set > replica identity after the publication is created. > create publication pub1 for all tables in schema sch1; > # After the above publication is created the relations present in this > schema will be invalidated. Now we will check the publication actions > again in CheckCmdReplicaIdentity and the update will fail. > update sch1.t1 set c1 = 11; > ERROR: cannot update table "t1" because it does not have a replica > identity and publishes updates > HINT: To enable updating the table, set REPLICA IDENTITY using ALTER TABLE. > Okay, I got it but let's add few comments in the code related to it. Also, I noticed that the code in InvalidatePublicationRels() already exists in AlterPublicationOptions(). You can try to refactor the existing code as a separate initial patch. BTW, I noticed that "for all tables", we don't register invalidations in the above scenario, and then later that causes conflict on the subscriber. I think that is a bug in the current code and we can deal with that separately. -- With Regards, Amit Kapila.
On Friday, August 27, 2021 2:13 PM vignesh C <vignesh21@gmail.com> wrote: > > I have implemented this in the 0003 patch, I have kept it separate to > reduce the testing effort and also it will be easier if someone > disagrees with the syntax. I will merge it to the main patch later > based on the feedback. Attached v22 patch has the changes for the > same. > Thoughts? > Thanks for your new patch. Here are some suggestions: 1. If a publication published a table and the schema where the table belonged to, the publication name would show twice when using '\d+' for the table. Maybe we should add some check to avoid the duplication. Thought? For example: create schema sch1; create table sch1.tbl(a int); create publication pub for table sch1.tbl, all tables in schema sch1; postgres=# \d+ sch1.tbl Table "sch1.tbl" Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description --------+---------+-----------+----------+---------+---------+-------------+--------------+------------- a | integer | | | | plain | | | Publications: "pub" "pub" Access method: heap 2. doc/src/sgml/catalogs.sgml @@ -6169,6 +6174,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l publication instead of its own. </para></entry> </row> + </tbody> </tgroup> </table> It seems that we don't need this change. 3. src/bin/psql/tab-complete.c + /* Complete "CREATE PUBLICATION <name> FOR SCHEMA <schema>, ..." */ + else if (HeadMatches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES", "IN", "SCHEMA")) + COMPLETE_WITH_QUERY(Query_for_list_of_schemas + " UNION SELECT 'CURRENT_SCHEMA' " + "UNION SELECT 'WITH ('"); The comment should be updated to "FOR ALL TABLES IN SCHEMA". Regards Tang
On Fri, Aug 27, 2021 at 4:13 PM vignesh C <vignesh21@gmail.com> wrote: > > I have implemented this in the 0003 patch, I have kept it separate to > reduce the testing effort and also it will be easier if someone > disagrees with the syntax. I will merge it to the main patch later > based on the feedback. Attached v22 patch has the changes for the > same. I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1, TABLE sc1.test;" maintains the table separately and results in the following in the \dRp+ output: Tables: "sc1.test" Schemas: "sc1" and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;" still leaves the "sc1.test" table in the publication. Is there a reason why we don't/can't support "ALTER SUBSCRIPTION ... SET ALL TABLES;"? (I know it wasn't supported before, but now "ALTER SUBSCRIPTION ... SET ALL TABLES IN SCHEMA ..." is being supported) I notice that the v22-0003 documentation updates for ALTER SUBSCRIPTION are missing - but you're probably waiting on all feedback before proceeding with that. Regards, Greg Nancarrow Fujitsu Australia
On Fri, Aug 27, 2021 at 4:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Aug 27, 2021 at 11:43 AM vignesh C <vignesh21@gmail.com> wrote: > > > > On Wed, Aug 25, 2021 at 3:07 PM vignesh C <vignesh21@gmail.com> wrote: > > > > I have implemented this in the 0003 patch, I have kept it separate to > > reduce the testing effort and also it will be easier if someone > > disagrees with the syntax. I will merge it to the main patch later > > based on the feedback. Attached v22 patch has the changes for the > > same. > > > > Few comments on v22-0001-Added-schema-level-support-for-publication: > ======================================================== > 1. Why in publication_add_schema(), you are registering invalidation > for all schema relations? It seems this is to allow rebuilding the > publication info for decoding sessions. But that is not required as > you are registering rel_sync_cache_publication_cb for > publication_schema relation. In rel_sync_cache_publication_cb, we are > marking replicate_valid as false for each entry which will allow > publication info to be rebuilt in get_rel_sync_entry. > > I see that it is done for a single relation in the current code in > function publication_add_relation but I guess that is also not > required. You can once test this. If you still think it is required, > can you please explain me why and then we can accordingly add some > comments in the patch. Added a comment for this. > 2. > + * Publication object type > + */ > +typedef enum PublicationObjSpecType > +{ > + PUBLICATIONOBJ_TABLE, /* Table type */ > + PUBLICATIONOBJ_SCHEMA, /* Schema type */ > + PUBLICATIONOBJ_SEQUENCE, /* Sequence type */ > > Why add anything related to the sequence in this patch? This is removed. > 3. > +get_object_address_publication_schema(List *object, bool missing_ok) > +{ > + ObjectAddress address; > + char *pubname; > + Publication *pub; > + char *schemaname; > + Oid schemaoid; > + > + ObjectAddressSet(address, PublicationSchemaRelationId, InvalidOid); > + > + /* Fetch schema name and publication name from input list */ > + schemaname = strVal(linitial(object)); > + pubname = strVal(lsecond(object)); > + > + schemaoid = get_namespace_oid(schemaname, false); > + > + /* Now look up the pg_publication tuple */ > + pub = GetPublicationByName(pubname, missing_ok); > + if (!pub) > + return address; > > Why the schema name handling is different from publication name? Why > can't we pass missing_ok for schema api and handle it similar > publication api? Modified to handle it similarly. > 4. In getPublicationSchemaInfo(), why the missing_ok flag is not used > in get_publication_name() whereas it is used for all other syscache > searches in that function? Modified it to use missing_ok. > 5. Don't we need to expose a view for publication schemas similar to > pg_publication_tables? I have not handled in this version, I will do it in the next version. > 6. > publication_add_schema() > { > .. > + /* Can't be system namespace */ > + if (IsCatalogNamespace(schemaoid) || IsToastNamespace(schemaoid)) > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("\"%s\" is a system schema", > + get_namespace_name(schemaoid)), > + errdetail("System schemas cannot be added to publications."))); > + > + /* Can't be temporary namespace */ > + if (isAnyTempNamespace(schemaoid)) > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("\"%s\" is a temporary schema", > + get_namespace_name(schemaoid)), > + errdetail("Temporary schemas cannot be added to publications."))); > .. > } > > Can we change the first detail message as: "This operation is not > supported for system schemas." and the second detail message as: > "Temporary schemas cannot be replicated."? This is to make these > messages similar to corresponding messages for relations in function > check_publication_add_relation(). Can we move these checks to a > separate function? Modified. Attached v23 patch has the fixes for the same. Regards, Vignesh
Attachment
On Mon, Aug 30, 2021 at 8:23 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Aug 27, 2021 at 4:13 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have implemented this in the 0003 patch, I have kept it separate to
> > reduce the testing effort and also it will be easier if someone
> > disagrees with the syntax. I will merge it to the main patch later
> > based on the feedback. Attached v22 patch has the changes for the
> > same.
>
> Just experimenting with the new syntax so far, and seeing some new
> messages and docs, I have the following suggestions for improvements:
>
> src/backend/commands/publicationcmds.c
>
> (1)
> BEFORE:
> for table/for all tables in schema should be specified before the object
> AFTER:
> FOR TABLE / FOR ALL TABLES IN SCHEMA should be specified before the
> table/schema name(s)
Modified it to "FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before the table/schema name(s)", I felt we need not have space before and after "/", I had seen it is specified similarly in few other places.
> (2)
> BEFORE:
> Tables cannot be added, dropped or set on FOR ALL TABLES publications.
> AFTER:
> Tables cannot be added to, dropped from, or set on FOR ALL TABLES publications.
Modified
> (3)
> BEFORE:
> Schemas cannot be added, dropped or set on FOR ALL TABLES publications.
> AFTER:
> Schemas cannot be added to, dropped from, or set on FOR ALL TABLES publications.
Modified
> v22-0002
>
> doc/src/sgml/ref/create_publication.sgml
>
> (1)
> BEFORE:
> + Create a publication that publishes all changes for users and departments
> + table and that publishes all changes for all the tables present in the
> AFTER:
> + Create a publication that publishes all changes for tables "users" and
> + "departments" and that publishes all changes for all the tables
> present in the
Modified.
I have fixed these comments as part of v23 patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw%40mail.gmail.com
On Mon, Aug 30, 2021 at 9:10 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Friday, August 27, 2021 2:13 PM vignesh C <vignesh21@gmail.com> wrote: > > > > I have implemented this in the 0003 patch, I have kept it separate to reduce the > > testing effort and also it will be easier if someone disagrees with the syntax. I > > will merge it to the main patch later based on the feedback. Attached v22 patch > > has the changes for the same. > > Thoughts? > > Hi, > > Here are some comments for the new version patches. > > About 0001 > 1) > + rel->relpersistence = RELPERSISTENCE_PERMANENT; > > It seems we don't need to set this since makeRangeVarFromNameList() > already set it. Modified > 2) > + if (!relids || !schemarelids) > + tables = list_concat(relids, schemarelids); > + else > + tables = list_concat_unique_oid(relids, schemarelids); > + } > > It seems we can simplify the above code like the following: > tables = list_concat_unique_oid(relids, schemarelids); Modified > 3) > + relids = GetPublicationRelations(pubform->oid, > + PUBLICATION_PART_ALL); > + schemarelids = GetAllSchemasPublicationRelations(pubform->oid, > + PUBLICATION_PART_ALL); > + relids = list_concat(relids, schemarelids); > > should we invoke list_concat_unique_oid here ? Modified > 4) > > + search_path = fetch_search_path(false); > + if (search_path == NIL) /* nothing valid in search_path? */ > > It might be better to list_free(search_path) when not used. Modified > 5) > + if (list_length(pubobj->name) == 1 && > + (strcmp(relname, "CURRENT_SCHEMA") == 0)) > + ereport(ERROR, > + errcode(ERRCODE_SYNTAX_ERROR), > + errmsg("invalid relation name at or near"), > + parser_errposition(pstate, pubobj->location)); > > Maybe we don't need this check, because it will report an error in > OpenTableList() anyway, "relation "CURRENT_SCHEMA" does not exist" , and that > message seems readable to me. Allowing CURRENT_SCHEMA is required to support current schema for schema publications, currently I'm allowing this syntax during parsing and this error is thrown for relations later, this is done to keep the similar error as earlier before this feature support. I felt we can keep it like this to maintain the similar error. Thoughts? > About 0002 > 6) > > diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl > index 0c84d87873..0a479dfe36 100644 > --- a/src/test/subscription/t/001_rep_changes.pl > +++ b/src/test/subscription/t/001_rep_changes.pl > @@ -6,7 +6,7 @@ use strict; > use warnings; > use PostgresNode; > use TestLib; > -use Test::More tests => 32; > +use Test::More tests => 46; > > I think it might be better to move these testcases create a separate perl file. Modified, added to 025_rep_changes_for_schema.pl > About 0003 > 7) > The v22-0003 seems simple and can remove lots of code in patch v22-0001, so > maybe we can merge 0001 and 0003 into one patch ? I agree that the code becomes simpler, it reduces a lot of code. I had kept it like that as the testing effort might be more and also I was waiting if there was no objection for that syntax from anyone else. I will wait for a few more reviews and merge it to 0001 if there are no objections. I have fixed these comments as part of v23 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw%40mail.gmail.com Regards, Vignesh
On Mon, Aug 30, 2021 at 1:31 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Friday, August 27, 2021 2:13 PM vignesh C <vignesh21@gmail.com> wrote: > > > > I have implemented this in the 0003 patch, I have kept it separate to > > reduce the testing effort and also it will be easier if someone > > disagrees with the syntax. I will merge it to the main patch later > > based on the feedback. Attached v22 patch has the changes for the > > same. > > Thoughts? > > > > Thanks for your new patch. Here are some suggestions: > > 1. > If a publication published a table and the schema where the table belonged to, the > publication name would show twice when using '\d+' for the table. > Maybe we should add some check to avoid the duplication. Thought? > > For example: > > create schema sch1; > create table sch1.tbl(a int); > create publication pub for table sch1.tbl, all tables in schema sch1; > > postgres=# \d+ sch1.tbl > Table "sch1.tbl" > Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description > --------+---------+-----------+----------+---------+---------+-------------+--------------+------------- > a | integer | | | | plain | | | > Publications: > "pub" > "pub" > Access method: heap Modified > 2. doc/src/sgml/catalogs.sgml > @@ -6169,6 +6174,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l > publication instead of its own. > </para></entry> > </row> > + > </tbody> > </tgroup> > </table> > > It seems that we don't need this change. Modified > 3. src/bin/psql/tab-complete.c > > + /* Complete "CREATE PUBLICATION <name> FOR SCHEMA <schema>, ..." */ > + else if (HeadMatches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES", "IN", "SCHEMA")) > + COMPLETE_WITH_QUERY(Query_for_list_of_schemas > + " UNION SELECT 'CURRENT_SCHEMA' " > + "UNION SELECT 'WITH ('"); > > The comment should be updated to "FOR ALL TABLES IN SCHEMA". Modified. I have fixed these comments as part of v23 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw%40mail.gmail.com Regards, Vignesh
On Mon, Aug 30, 2021 at 2:14 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Fri, Aug 27, 2021 at 4:13 PM vignesh C <vignesh21@gmail.com> wrote: > > > > I have implemented this in the 0003 patch, I have kept it separate to > > reduce the testing effort and also it will be easier if someone > > disagrees with the syntax. I will merge it to the main patch later > > based on the feedback. Attached v22 patch has the changes for the > > same. > > I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1, > TABLE sc1.test;" maintains the table separately and results in the > following in the \dRp+ output: > > Tables: > "sc1.test" > Schemas: > "sc1" > > and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;" > still leaves the "sc1.test" table in the publication. I had intentionally implemented this way, the reason being it gives the flexibility to modify the publications based on the way the publication is created. My idea was that if a user specified a table/schema of the same schema while creating the publication, the user should be allowed to drop any of them at any time. In the above case if we don't maintain the results separately, users will not be able to drop the table from the publication at a later point of time. Thoughts? > Is there a reason why we don't/can't support "ALTER SUBSCRIPTION ... > SET ALL TABLES;"? > (I know it wasn't supported before, but now "ALTER SUBSCRIPTION ... > SET ALL TABLES IN SCHEMA ..." is being supported) I agree this can be implemented with the current design. I felt I can work on getting the current patches into a committable shape and then work on this after the current patch series is finished. Thoughts? > I notice that the v22-0003 documentation updates for ALTER > SUBSCRIPTION are missing - but you're probably waiting on all feedback > before proceeding with that. I have fixed this as part of v23 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw%40mail.gmail.com Regards, Vignesh
>
>
> Okay, I got it but let's add few comments in the code related to it.
> Also, I noticed that the code in InvalidatePublicationRels() already
> exists in AlterPublicationOptions(). You can try to refactor the
> existing code as a separate initial patch.
I have made these changes at the v23 patch attached at [1].
> BTW, I noticed that "for all tables", we don't register invalidations
> in the above scenario, and then later that causes conflict on the
> subscriber. I think that is a bug in the current code and we can deal
> with that separately.
I agree that the cache invalidation has been missed in case of "for all tables" publication, I have fixed these and posted a patch for the same at [2].
[1] - https://www.postgresql.org/message-id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CALDaNm0zkQznFrxzHBoWZUGsf%3DnKSxhEZZhZ1eTDWLpFok6zZw%40mail.gmail.com
Regards,
Vigneshhas
On Tue, Aug 31, 2021 at 1:41 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1, > > TABLE sc1.test;" maintains the table separately and results in the > > following in the \dRp+ output: > > > > Tables: > > "sc1.test" > > Schemas: > > "sc1" > > > > and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;" > > still leaves the "sc1.test" table in the publication. > > I had intentionally implemented this way, the reason being it gives > the flexibility to modify the publications based on the way the > publication is created. My idea was that if a user specified a > table/schema of the same schema while creating the publication, the > user should be allowed to drop any of them at any time. In the above > case if we don't maintain the results separately, users will not be > able to drop the table from the publication at a later point of time. > Thoughts? > Hmmm. I'm not sure it should work like that (but maybe I'm wrong - what do others think???). I thought that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1, TABLE sc1.test;" should silently just ignore the "TABLE sc1.test" part, as that is a table in schema sc1, so it's effectively a duplicate. Also, I noticed the following: postgres=# CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1, postgres-# TABLE sc1.test; CREATE PUBLICATION postgres=# \dRp+ Publication pub1 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root -------+------------+---------+---------+---------+-----------+---------- gregn | f | t | t | t | t | f Tables: "sc1.test" Schemas: "sc1" postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1; ALTER PUBLICATION postgres=# \dRp+ Publication pub1 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root -------+------------+---------+---------+---------+-----------+---------- gregn | f | t | t | t | t | f Tables: "sc1.test" postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1; ERROR: schema "sc1" is not part of the publication I think it seems odd to not drop table "sc1.test" from the publication after "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;". Also, after running that command again, it seems odd to report that schema sc1 is not part of the publication, when there remains one table from that schema in the publication. And shouldn't it say "tables from schema ... are not part of the publication" rather than "schema ... is not part of the publication"? I think the former is better and more accurate. Schemas can contain database objects other than tables. Similarly, I'm also thinking that in the "\dRp+" output, it should say "Tables from schemas:" instead of "Schemas:". Regards, Greg Nancarrow Fujitsu Australia
On Tue, Aug 31, 2021 at 10:50 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Tue, Aug 31, 2021 at 1:41 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1, > > > TABLE sc1.test;" maintains the table separately and results in the > > > following in the \dRp+ output: > > > > > > Tables: > > > "sc1.test" > > > Schemas: > > > "sc1" > > > > > > and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;" > > > still leaves the "sc1.test" table in the publication. > > > > I had intentionally implemented this way, the reason being it gives > > the flexibility to modify the publications based on the way the > > publication is created. My idea was that if a user specified a > > table/schema of the same schema while creating the publication, the > > user should be allowed to drop any of them at any time. In the above > > case if we don't maintain the results separately, users will not be > > able to drop the table from the publication at a later point of time. > > Thoughts? > > > > Hmmm. I'm not sure it should work like that (but maybe I'm wrong - > what do others think???). > I thought that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1, > TABLE sc1.test;" should silently just ignore the "TABLE sc1.test" > part, as that is a table in schema sc1, so it's effectively a > duplicate. > I find the way it is implemented to be more intuitive as that gives users more flexibility to retain certain tables from the schema and appears to be exactly what users intended by the command. I don't think finding duplicates among different object lists (schema, table) is a good idea because tomorrow for some other objects the same thing can happen. It might be better to get some other opinions on this matter though. > Also, I noticed the following: > > postgres=# CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1, > postgres-# TABLE sc1.test; > CREATE PUBLICATION > postgres=# \dRp+ > Publication pub1 > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root > -------+------------+---------+---------+---------+-----------+---------- > gregn | f | t | t | t | t | f > Tables: > "sc1.test" > Schemas: > "sc1" > > postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1; > ALTER PUBLICATION > postgres=# \dRp+ > Publication pub1 > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root > -------+------------+---------+---------+---------+-----------+---------- > gregn | f | t | t | t | t | f > Tables: > "sc1.test" > > postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1; > ERROR: schema "sc1" is not part of the publication > What will happen if you second time run the command as ALTER PUBLICATION pub1 DROP Table sc1.test? If that works, I think the behavior should be fine. -- With Regards, Amit Kapila.
On Tue, Aug 31, 2021 at 9:15 AM vignesh C <vignesh21@gmail.com> wrote: > > On Mon, Aug 30, 2021 at 12:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > Okay, I got it but let's add few comments in the code related to it. > > Also, I noticed that the code in InvalidatePublicationRels() already > > exists in AlterPublicationOptions(). You can try to refactor the > > existing code as a separate initial patch. > > I have made these changes at the v23 patch attached at [1]. > * +/* + * Invalidate the relations. + */ +static inline void +InvalidatePublicationRels(List *relids) I don't see the need to make this an inline function. * On similar lines, the code in function GetPubPartitionOptionRelations seems to be extracted from existing function GetPublicationRelations(). Can't we move it into the 0001 patch? -- With Regards, Amit Kapila.
On Tue, Aug 31, 2021 at 4:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Aug 31, 2021 at 10:50 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > > > On Tue, Aug 31, 2021 at 1:41 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > > > I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1, > > > > TABLE sc1.test;" maintains the table separately and results in the > > > > following in the \dRp+ output: > > > > > > > > Tables: > > > > "sc1.test" > > > > Schemas: > > > > "sc1" > > > > > > > > and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;" > > > > still leaves the "sc1.test" table in the publication. > > > > > > I had intentionally implemented this way, the reason being it gives > > > the flexibility to modify the publications based on the way the > > > publication is created. My idea was that if a user specified a > > > table/schema of the same schema while creating the publication, the > > > user should be allowed to drop any of them at any time. In the above > > > case if we don't maintain the results separately, users will not be > > > able to drop the table from the publication at a later point of time. > > > Thoughts? > > > > > > > Hmmm. I'm not sure it should work like that (but maybe I'm wrong - > > what do others think???). > > I thought that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1, > > TABLE sc1.test;" should silently just ignore the "TABLE sc1.test" > > part, as that is a table in schema sc1, so it's effectively a > > duplicate. > > > > I find the way it is implemented to be more intuitive as that gives > users more flexibility to retain certain tables from the schema and > appears to be exactly what users intended by the command. I don't > think finding duplicates among different object lists (schema, table) > is a good idea because tomorrow for some other objects the same thing > can happen. It might be better to get some other opinions on this > matter though. > > > Also, I noticed the following: > > > > postgres=# CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1, > > postgres-# TABLE sc1.test; > > CREATE PUBLICATION > > postgres=# \dRp+ > > Publication pub1 > > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root > > -------+------------+---------+---------+---------+-----------+---------- > > gregn | f | t | t | t | t | f > > Tables: > > "sc1.test" > > Schemas: > > "sc1" > > > > postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1; > > ALTER PUBLICATION > > postgres=# \dRp+ > > Publication pub1 > > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root > > -------+------------+---------+---------+---------+-----------+---------- > > gregn | f | t | t | t | t | f > > Tables: > > "sc1.test" > > > > postgres=# ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1; > > ERROR: schema "sc1" is not part of the publication > > > > What will happen if you second time run the command as ALTER > PUBLICATION pub1 DROP Table sc1.test? If that works, I think the > behavior should be fine. Alter publication drop table works fine: postgres=# ALTER PUBLICATION pub1 DROP table sc1.test ; ALTER PUBLICATION postgres=# \dRp+ Publication pub1 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root ---------+------------+---------+---------+---------+-----------+---------- vignesh | f | t | t | t | t | f (1 row) Regards, Vignesh
From Mon, Aug 30, 2021 11:26 PM vignesh C <vignesh21@gmail.com> wrote: > On Mon, Aug 30, 2021 at 9:10 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > > > 5) > > + if (list_length(pubobj->name) == 1 && > > + (strcmp(relname, "CURRENT_SCHEMA") == > 0)) > > + ereport(ERROR, > > + > errcode(ERRCODE_SYNTAX_ERROR), > > + errmsg("invalid relation > name at or near"), > > + > > + parser_errposition(pstate, pubobj->location)); > > > > Maybe we don't need this check, because it will report an error in > > OpenTableList() anyway, "relation "CURRENT_SCHEMA" does not exist" , > > and that message seems readable to me. > > Allowing CURRENT_SCHEMA is required to support current schema for schema > publications, currently I'm allowing this syntax during parsing and this error is > thrown for relations later, this is done to keep the similar error as earlier before > this feature support. I felt we can keep it like this to maintain the similar error. > Thoughts? Thanks for the explanation, I got the point. Here are some other comments for v23-000x patches. 1) @@ -6225,6 +6342,9 @@ describePublications(const char *pattern) bool has_pubtruncate; bool has_pubviaroot; + PQExpBufferData title; + printTableContent cont; + if (pset.sversion < 100000) { ... PQExpBufferData title; printTableOpt myopt = pset.popt.topt; printTableContent cont; Should we delete the inner declaration of 'title' and 'cont' ? 2) - /* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */ + /* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE/SCHEMA */ SCHEMA => ALL TABLES IN SCHEMA 3) + .description = "PUBLICATION SCHEMA", + .section = SECTION_POST_DATA, + .createStmt = query->data)); Is it better to use something like 'PUBLICATION TABLES IN SCHEMA' to describe the schema level table publication ? Because there could be some other type publication such as 'ALL SEQUENCES IN SCHEMA' in the future, it will be better to make it clear that we only publish table in schema in this patch. Best regards, Hou zj
On Tue, Aug 31, 2021 at 8:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I find the way it is implemented to be more intuitive as that gives > users more flexibility to retain certain tables from the schema and > appears to be exactly what users intended by the command. I don't > think finding duplicates among different object lists (schema, table) > is a good idea because tomorrow for some other objects the same thing > can happen. It might be better to get some other opinions on this > matter though. > I think that such functionality needs to be clearly documented (but currently the documentation doesn't sufficiently explain it). Yes, I would definitely like to hear other opinions on this. Note also that currently parts of the documentation are still referring to "ADD SCHEMA/DROP SCHEMA/SET SCHEMA" instead of the new syntax "ADD ALL TABLES IN SCHEMA/DROP ALL TABLES IN SCHEMA/SET ALL TABLES IN SCHEMA": e.g. v23-0003: doc/src/sgml/ref/alter_publication.sgml + The fourth, fifth and sixth variants of this command change which schemas + are part of the publication. The <literal>SET SCHEMA</literal> clause will + replace the list of schemas in the publication with the specified one. + The <literal>ADD SCHEMA</literal> and <literal>DROP SCHEMA</literal> clauses + will add and remove one or more schemas from the publication. Note that + adding schemas to a publication that is already subscribed to will require + a <literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on + the subscribing side in order to become effective. With the new syntax changes (SCHEMA -> ALL TABLES IN SCHEMA), it seems less intuitive that there are separate schema and table operations on the publication. I'd expect a lot of users to naturally think that "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;" would drop from the publication all tables that are in schema "sc1", which is not what it is currently doing. Since the syntax was changed to specifically refer to FOR ALL TABLES IN SCHEMA rather than FOR SCHEMA, then now it's clear we're referring to tables only, when specifying "... FOR ALL TABLES in sc1, TABLE sc1.test", so IMHO it's reasonable to remove duplicates here, rather than treating these as somehow separate ways of referencing the same table. One thing the current scheme doesn't allow is to publish all tables in a schema except for certain table(s) - and you can't achieve that by adding all tables from a schema to the publication and then selectively dropping some of those tables. I thought that this would be a more common pattern than adding separate tables from schemas that are already included as part of the publication, in order to "retain" them if "all tables from schema ..." are later dropped. postgres=# create schema sc1; CREATE SCHEMA postgres=# create table sc1.test(i int); CREATE TABLE postgres=# create publication pub1 for all tables in schema sc1; CREATE PUBLICATION postgres=# \dRp+ Publication pub1 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root -------+------------+---------+---------+---------+-----------+---------- gregn | f | t | t | t | t | f Schemas: "sc1" postgres=# alter publication pub1 drop table sc1.test; ERROR: relation "test" is not part of the publication The above error message seems slightly misleading (as that table IS published by that publication) and also note the relation is not schema-qualified. Regards, Greg Nancarrow Fujitsu Australia
> On Monday, August 30, 2021 11:28 PM vignesh C <vignesh21@gmail.com> wrote: > > I have fixed these comments as part of v23 patch attached at [1]. > [1] - https://www.postgresql.org/message- > id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw > %40mail.gmail.com > Thanks for your new patch. Here are some comments on v23 patch. 1. doc/src/sgml/ref/alter_publication.sgml + <para> + Add some schemas to the publication: +<programlisting> +ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing_june, sales_june; +</programlisting> + </para> This change seems to be added twice, both 0003 and 0004 patch have this change. 2. src/sgml/ref/create_publication.sgml There is the following description about "FOR TABLE" parameter: Only persistent base tables and partitioned tables can be part of a publication. Temporary tables, unlogged tables, foreign tables, materialized views, and regular views cannot be part of a publication. "FOR ALL TABLES IN SCHEMA" parameter also have restrictions, should we add some doc description for it? 3. When using '\dn+', I noticed that the list of publications only contains the publications for "SCHEMA", "FOR ALL TABLES" publications are not shown. Is it designed on purpose? (The result of '\d+' lists the publications of "SCHEAME" and "FOR ALL TABLES"). For example: create schema sch1; create table sch1.tbl(a int); create publication pub_schema for all tables in schema sch1; create publication pub_all_tables for all tables; postgres=# \d+ sch1.tbl Table "sch1.tbl" Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description --------+---------+-----------+----------+---------+---------+-------------+--------------+------------- a | integer | | | | plain | | | Publications: "pub_all_tables" "pub_schema" Access method: heap postgres=# \dn+ sch1 List of schemas Name | Owner | Access privileges | Description ------+----------+-------------------+------------- sch1 | postgres | | Publications: "pub_schema" Regards Tang
On Wed, Sep 1, 2021 at 8:52 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Tue, Aug 31, 2021 at 8:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > I find the way it is implemented to be more intuitive as that gives > > users more flexibility to retain certain tables from the schema and > > appears to be exactly what users intended by the command. I don't > > think finding duplicates among different object lists (schema, table) > > is a good idea because tomorrow for some other objects the same thing > > can happen. It might be better to get some other opinions on this > > matter though. > > > > I think that such functionality needs to be clearly documented (but > currently the documentation doesn't sufficiently explain it). > I agree that we should document it clearly if we decide to go in this direction. > Yes, I would definitely like to hear other opinions on this. > > Note also that currently parts of the documentation are still > referring to "ADD SCHEMA/DROP SCHEMA/SET SCHEMA" instead of the new > syntax "ADD ALL TABLES IN SCHEMA/DROP ALL TABLES IN SCHEMA/SET ALL > TABLES IN SCHEMA": > > e.g. > v23-0003: > doc/src/sgml/ref/alter_publication.sgml > + The fourth, fifth and sixth variants of this command change which schemas > + are part of the publication. The <literal>SET SCHEMA</literal> clause will > + replace the list of schemas in the publication with the specified one. > + The <literal>ADD SCHEMA</literal> and <literal>DROP SCHEMA</literal> clauses > + will add and remove one or more schemas from the publication. Note that > + adding schemas to a publication that is already subscribed to will require > + a <literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on > + the subscribing side in order to become effective. > > With the new syntax changes (SCHEMA -> ALL TABLES IN SCHEMA), it seems > less intuitive that there are separate schema and table operations on > the publication. > I think the documentation could be improved w.r.t above points because it is quite clear that we support schema and tables separately. > I'd expect a lot of users to naturally think that "ALTER PUBLICATION > pub1 DROP ALL TABLES IN SCHEMA sc1;" would drop from the publication > all tables that are in schema "sc1", which is not what it is currently > doing. > Since the syntax was changed to specifically refer to FOR ALL TABLES > IN SCHEMA rather than FOR SCHEMA, then now it's clear we're referring > to tables only, when specifying "... FOR ALL TABLES in sc1, TABLE > sc1.test", so IMHO it's reasonable to remove duplicates here, rather > than treating these as somehow separate ways of referencing the same > table. > I see your point and if we decide to go this path then it is better to give an error than silently removing duplicates. > One thing the current scheme doesn't allow is to publish all tables in > a schema except for certain table(s) > True, we can always support that as a separate feature. Note that same is true for existing FOR ALL TABLES syntax where users could expect to leave few tables with syntax like FOR ALL TABLES EXCEPT t1,t2,... but we don't have that yet. > - and you can't achieve that by > adding all tables from a schema to the publication and then > selectively dropping some of those tables. I thought that this would > be a more common pattern than adding separate tables from schemas that > are already included as part of the publication, in order to "retain" > them if "all tables from schema ..." are later dropped. > > postgres=# create schema sc1; > CREATE SCHEMA > postgres=# create table sc1.test(i int); > CREATE TABLE > postgres=# create publication pub1 for all tables in schema sc1; > CREATE PUBLICATION > postgres=# \dRp+ > > Publication pub1 > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root > -------+------------+---------+---------+---------+-----------+---------- > gregn | f | t | t | t | t | f > Schemas: > "sc1" > > postgres=# alter publication pub1 drop table sc1.test; > ERROR: relation "test" is not part of the publication > > The above error message seems slightly misleading (as that table IS > published by that publication) and also note the relation is not > schema-qualified. > I think we should try to improve this message, maybe we can give detail message similar to what we give For ALL Tables case (DETAIL: Tables that are part of SCHEMA cannot be dropped independently). -- With Regards, Amit Kapila.
On Tue, Aug 31, 2021 at 1:41 PM vignesh C <vignesh21@gmail.com> wrote: > > On Mon, Aug 30, 2021 at 2:14 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > > > On Fri, Aug 27, 2021 at 4:13 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > I have implemented this in the 0003 patch, I have kept it separate to > > > reduce the testing effort and also it will be easier if someone > > > disagrees with the syntax. I will merge it to the main patch later > > > based on the feedback. Attached v22 patch has the changes for the > > > same. > > > > I notice that "CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1, > > TABLE sc1.test;" maintains the table separately and results in the > > following in the \dRp+ output: > > > > Tables: > > "sc1.test" > > Schemas: > > "sc1" > > > > and also then "ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1;" > > still leaves the "sc1.test" table in the publication. > > I had intentionally implemented this way, the reason being it gives > the flexibility to modify the publications based on the way the > publication is created. My idea was that if a user specified a > table/schema of the same schema while creating the publication, the > user should be allowed to drop any of them at any time. In the above > case if we don't maintain the results separately, users will not be > able to drop the table from the publication at a later point of time. > Thoughts? I think ALL should mean ALL. When you say CREATE PUBLICATION pub1 FOR ALL TABLES; you know it means ALL tables; When you say CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sc1; you know it means ALL tables in the schema sc1; Similarly, when you say ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sc1; I would expect it means to drop ALL tables in sc1 - not all of them sometimes but not all of them at other times or even none of them. ~~ I thought a motivation for this patch was to make it easy for the user to specify tables en-masse. e.g, saying FOR ALL TABLES IN SCHEMA sc1 is a convenience instead of having to specify every schema table explicitly. e.g. What if there are 100s of tables explicitly in a publication. 1. CREATE PUBLICATION pub FOR TABLE sc1.t1,sc1,t2,sc1.t3,....,sc1.t999; 2. ALTER PUBLICATION pub DROP ALL TABLES IN SCHEMA sc1; IIUC the current patch behaviour for step 2 will do nothing at all. That doesn't seem right to me. Where is the user-convenience factor for dropping tables here? ~~ If the rule was simply "ALL means ALL" that hardly even needs documentation to describe it. OTOH, current patch behaviour is quirky wrt how the publication was created and would need to be carefully documented. It is easy to imagine a user unfamiliar with how the publication was originally created will be confused when ALTER PUBLICATION DROP ALL TABLES IN SCHEMA sc1 still leaves some sc1 tables lurking. ~~ Schema objects are not part of the publication. Current only TABLES are in publications, so I thought that \dRp+ output would just be the of "Tables" in the publication. Schemas would not even be displayed at all (except in the table name). Consider that everything is just going to get messier when SEQUENCES are added. If you list Schemas like is done now then what's that going to look like later? I think you'd need to display many lists like - "Tables" and "Tables for Schemas" and "Sequences" and "Sequences for Schema"... IMO it's all beginning to sound like it would become overly complex ~~ For all the above reasons I think ALL should mean ALL, and also \dRp+ should display tables only. (and later table and sequences only) YMMV. ------ Kind Regards, Peter Smith Fujitsu Australia
On Wed, Sep 1, 2021 at 12:06 PM Peter Smith <smithpb2250@gmail.com> wrote: > > Similarly, when you say ALTER PUBLICATION pub1 DROP ALL TABLES IN > SCHEMA sc1; I would expect it means to drop ALL tables in sc1 - not > all of them sometimes but not all of them at other times or even none > of them. > > ~~ > > I thought a motivation for this patch was to make it easy for the user > to specify tables en-masse. e.g, saying FOR ALL TABLES IN SCHEMA sc1 > is a convenience instead of having to specify every schema table > explicitly. > > e.g. What if there are 100s of tables explicitly in a publication. > 1. CREATE PUBLICATION pub FOR TABLE sc1.t1,sc1,t2,sc1.t3,....,sc1.t999; > 2. ALTER PUBLICATION pub DROP ALL TABLES IN SCHEMA sc1; > > IIUC the current patch behaviour for step 2 will do nothing at all. > That doesn't seem right to me. Where is the user-convenience factor > for dropping tables here? > It will give an error (ERROR: schema "sc1" is not part of the publication), we can probably add DETAIL and HINT message (like try dropping using DROP TABLE syntax) indicating the reason and what user need to do in this regard. I think if the user has specified tables individually instead of ALL TABLES IN SCHEMA, she should drop them individually rather than expecting them to be dropped via SCHEMA command. I think in a similar context you can also argue that we should have DROP ALL TABLES syntax to drop all the tables that were individually specified by the user with FOR TABLE command. I think this should be documented as well to avoid any confusion. > ~~ > > If the rule was simply "ALL means ALL" that hardly even needs > documentation to describe it. OTOH, current patch behaviour is quirky > wrt how the publication was created and would need to be carefully > documented. > > It is easy to imagine a user unfamiliar with how the publication was > originally created will be confused when ALTER PUBLICATION DROP ALL > TABLES IN SCHEMA sc1 still leaves some sc1 tables lurking. > > ~~ > > Schema objects are not part of the publication. Current only TABLES > are in publications, so I thought that \dRp+ output would just be the > of "Tables" in the publication. Schemas would not even be displayed at > all (except in the table name). > Hmm, I think this will lead to a lot of table names in output. I think displaying schema names separately is a better approach here. > Consider that everything is just going to get messier when SEQUENCES > are added. If you list Schemas like is done now then what's that going > to look like later? I think you'd need to display many lists like - > "Tables" and "Tables for Schemas" and "Sequences" and "Sequences for > Schema"... > I think that would be better than displaying all the tables and sequences as that will be very difficult for users to view/read. -- With Regards, Amit Kapila.
On Tue, Aug 31, 2021 at 6:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > I have made these changes at the v23 patch attached at [1]. > > > > * > +/* > + * Invalidate the relations. > + */ > +static inline void > +InvalidatePublicationRels(List *relids) > > I don't see the need to make this an inline function. Modified > * On similar lines, the code in function > GetPubPartitionOptionRelations seems to be extracted from existing > function GetPublicationRelations(). Can't we move it into the 0001 > patch? Modified Thanks for the comments, the attached v24 patch has the changes for the same. Regards, Vignesh
Attachment
- v24-0001-Made-the-existing-relation-cache-invalidation-an.patch
- v24-0002-Added-schema-level-support-for-publication.patch
- v24-0003-Tests-and-documentation-for-schema-level-support.patch
- v24-0004-Alter-publication-syntax-enhancement-to-keep-it-.patch
- v24-0005-Implemented-pg_publication_objects-view.patch
On Sat, Aug 28, 2021 at 3:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Don't you think some users might want to know all the schema names for > a publication? I am not completely sure on this point but I think it > is good to have information for users. It might be also useful to have > pg_publication_objects where we can display object types (like table, > schema, sequence, etc) and then object names. If you are not convinced > then we can wait and see what others think about this. Thanks for the comment, this is handled in the v24 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm27bs40Rxpy4oKfV97UgsPG%3DvVoZ5bj9pP_4BxnO-6DYA%40mail.gmail.com Regards, Vignesh
On Wed, Sep 1, 2021 at 6:58 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > Here are some other comments for v23-000x patches. > > 1) > > @@ -6225,6 +6342,9 @@ describePublications(const char *pattern) > bool has_pubtruncate; > bool has_pubviaroot; > > + PQExpBufferData title; > + printTableContent cont; > + > if (pset.sversion < 100000) > { > ... > PQExpBufferData title; > printTableOpt myopt = pset.popt.topt; > printTableContent cont; > > Should we delete the inner declaration of 'title' and 'cont' ? Modified > 2) > - /* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */ > + /* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE/SCHEMA */ > > SCHEMA => ALL TABLES IN SCHEMA Modified > 3) > > + .description = "PUBLICATION SCHEMA", > + .section = SECTION_POST_DATA, > + .createStmt = query->data)); > > Is it better to use something like 'PUBLICATION TABLES IN SCHEMA' to describe > the schema level table publication ? Because there could be some other type > publication such as 'ALL SEQUENCES IN SCHEMA' in the future, it will be better > to make it clear that we only publish table in schema in this patch. Modified Thanks for the comments, the v24 patch attached at [1] handles the comments. [1] - https://www.postgresql.org/message-id/CALDaNm27bs40Rxpy4oKfV97UgsPG%3DvVoZ5bj9pP_4BxnO-6DYA%40mail.gmail.com Regards, Vignesh
On Wed, Sep 1, 2021 at 11:14 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > > On Monday, August 30, 2021 11:28 PM vignesh C <vignesh21@gmail.com> wrote: > > > > I have fixed these comments as part of v23 patch attached at [1]. > > [1] - https://www.postgresql.org/message- > > id/CALDaNm0xmqJeQEfV5Wnj2BawM%3DsdFdfOXz5N%2BgGG3WB6k9%3Dtdw > > %40mail.gmail.com > > > > Thanks for your new patch. Here are some comments on v23 patch. > > 1. doc/src/sgml/ref/alter_publication.sgml > + <para> > + Add some schemas to the publication: > +<programlisting> > +ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing_june, sales_june; > +</programlisting> > + </para> > > This change seems to be added twice, both 0003 and 0004 patch have this change. Modified > 2. src/sgml/ref/create_publication.sgml > There is the following description about "FOR TABLE" parameter: > Only persistent base tables and partitioned tables can be part of a > publication. Temporary tables, unlogged tables, foreign tables, > materialized views, and regular views cannot be part of a publication. > > "FOR ALL TABLES IN SCHEMA" parameter also have restrictions, should we add > some doc description for it? Modified > 3. When using '\dn+', I noticed that the list of publications only contains the > publications for "SCHEMA", "FOR ALL TABLES" publications are not shown. Is it designed on purpose? > (The result of '\d+' lists the publications of "SCHEAME" and "FOR ALL TABLES"). > > For example: > create schema sch1; > create table sch1.tbl(a int); > create publication pub_schema for all tables in schema sch1; > create publication pub_all_tables for all tables; I'm not sure if it is intentional or not, Do you want to post the question in a separate thread and see if that should be handled? Thanks for the comments, the v24 patch attached at [1] handles the comments. [1] - https://www.postgresql.org/message-id/CALDaNm27bs40Rxpy4oKfV97UgsPG%3DvVoZ5bj9pP_4BxnO-6DYA%40mail.gmail.com Regards, Vignesh
From Wed, Sep 1, 2021 2:36 PM Peter Smith <smithpb2250@gmail.com> wrote: > Schema objects are not part of the publication. Current only TABLES are in > publications, so I thought that \dRp+ output would just be the of "Tables" in > the publication. Schemas would not even be displayed at all (except in the > table name). I think one use case of schema level publication is it can automatically publish new table created in the shcema(same as ALL TABLE publication). So, IMO, \dRp+ should output Schema level publication separately to make the user aware of it. Best regards, Hou zj
On Thu, Sep 2, 2021 at 11:58 AM vignesh C <vignesh21@gmail.com> wrote: > Below are few comments on v23. If you have already addressed anything in v24, then ignore it. 1. The commit message says: A new system table "pg_publication_schema" has been added, to maintain the schemas that the user wants to publish through the publication.". The alternative name for this system table could be "pg_publication_namespace". The reason why this alternative comes to my mind is that the current system table to store schema information is named pg_namespace. So shouldn't we be consistent here? What do others think about this? 2. In function check_publication_add_schema(), the primary error message should be "cannot add schema \"%s\" to publication". See check_publication_add_relation() for similar error messages. 3. +ObjectAddress +publication_add_schema(Oid pubid, Oid schemaoid, bool if_not_exists) Isn't it better to use 'schemaid' so that it is consistent with 'pubid'? 4. ConvertPubObjSpecListToOidList() { .. + schemaoid = linitial_oid(search_path); + nspname = get_namespace_name(schemaoid); + if (nspname == NULL) /* recently-deleted namespace? */ + ereport(ERROR, + errcode(ERRCODE_UNDEFINED_SCHEMA), + errmsg("no schema has been selected")); + + list_free(search_path); .. } You can free the memory for 'nspname' as that is not used afterward. 5. + schemaRels = GetSchemaPublicationRelations(schemaoid, PUBLICATION_PART_ALL); + + /* Invalidate relcache so that publication info is rebuilt. */ + InvalidatePublicationRels(schemaRels); It is better to write this comment above GetSchemaPublicationRelations, something like below: + /* Invalidate relcache so that publication info is rebuilt. */ + schemaRels = GetSchemaPublicationRelations(schemaoid, PUBLICATION_PART_ALL); + InvalidatePublicationRels(schemaRels); 6. +static List * +GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt, + Oid relid) I think it is better to name this function as GetPublicationPartOptRelations as that way it will be more consistent with existing functions and structure name PublicationPartOpt. 7. All the callers of PublicationAddSchemas() have a superuser check, then why there is again a check of owner/superuser in that function? 8. +/* + * Gets the list of FOR ALL TABLES IN SCHEMA publication oids associated with a + * specified schema oid + */ +List * +GetSchemaPublications(Oid schemaid) I find it a bit difficult to read this comment. Can we omit "FOR ALL TABLES IN SCHEMA" from this comment? 9. In the doc patch (v23-0003-Tests-and-documentation-for-schema-level-support), I see below line: <para> - To add a table to a publication, the invoking user must have ownership - rights on the table. The <command>FOR ALL TABLES</command> clause requires - the invoking user to be a superuser. + To add a table/schema to a publication, the invoking user must have + ownership rights on the table/schema. The <command>FOR ALL TABLES</command> + and <command>FOR ALL TABLES IN SCHEMA</command> clause requires the invoking + user to be a superuser. Is it correct to specify the schema in the first line? AFAIU, all forms of schema addition requires superuser privilege. -- With Regards, Amit Kapila.
On Thu, Sep 2, 2021 at 6:50 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > From Wed, Sep 1, 2021 2:36 PM Peter Smith <smithpb2250@gmail.com> wrote: > > Schema objects are not part of the publication. Current only TABLES are in > > publications, so I thought that \dRp+ output would just be the of "Tables" in > > the publication. Schemas would not even be displayed at all (except in the > > table name). > > I think one use case of schema level publication is it can automatically > publish new table created in the shcema(same as ALL TABLE publication). So, > IMO, \dRp+ should output Schema level publication separately to make the user > aware of it. OK. That is a fair point. ------ Kind Regards, Peter Smith. Fujitsu Australia
On Mon, Aug 30, 2021 at 8:56 PM vignesh C <vignesh21@gmail.com> wrote: > > On Mon, Aug 30, 2021 at 9:10 AM houzj.fnst@fujitsu.com > <houzj.fnst@fujitsu.com> wrote: > > > > > 5) > > + if (list_length(pubobj->name) == 1 && > > + (strcmp(relname, "CURRENT_SCHEMA") == 0)) > > + ereport(ERROR, > > + errcode(ERRCODE_SYNTAX_ERROR), > > + errmsg("invalid relation name at or near"), > > + parser_errposition(pstate, pubobj->location)); > > > > Maybe we don't need this check, because it will report an error in > > OpenTableList() anyway, "relation "CURRENT_SCHEMA" does not exist" , and that > > message seems readable to me. > > Allowing CURRENT_SCHEMA is required to support current schema for > schema publications, currently I'm allowing this syntax during parsing > and this error is thrown for relations later, this is done to keep the > similar error as earlier before this feature support. I felt we can > keep it like this to maintain the similar error. Thoughts? > I find this check quite ad-hoc in the code and I am not sure if we need to be consistent for the exact message in this case. So, I think it is better to remove it. > > > About 0003 > > 7) > > The v22-0003 seems simple and can remove lots of code in patch v22-0001, so > > maybe we can merge 0001 and 0003 into one patch ? > > I agree that the code becomes simpler, it reduces a lot of code. I had > kept it like that as the testing effort might be more and also I was > waiting if there was no objection for that syntax from anyone else. I > will wait for a few more reviews and merge it to 0001 if there are no > objections. > +1 to merge the patch as suggested by Hou-San. -- With Regards, Amit Kapila.
On Wed, Sep 1, 2021 at 12:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Sep 1, 2021 at 8:52 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > > > > I'd expect a lot of users to naturally think that "ALTER PUBLICATION > > pub1 DROP ALL TABLES IN SCHEMA sc1;" would drop from the publication > > all tables that are in schema "sc1", which is not what it is currently > > doing. > > Since the syntax was changed to specifically refer to FOR ALL TABLES > > IN SCHEMA rather than FOR SCHEMA, then now it's clear we're referring > > to tables only, when specifying "... FOR ALL TABLES in sc1, TABLE > > sc1.test", so IMHO it's reasonable to remove duplicates here, rather > > than treating these as somehow separate ways of referencing the same > > table. > > > > I see your point and if we decide to go this path then it is better to > give an error than silently removing duplicates. > Today, I have thought about this point again and it seems better to give an error in this case and let the user take the action rather than silently removing such tables to avoid any confusion. -- With Regards, Amit Kapila.
On Thu, Sep 2, 2021 at 5:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Sep 2, 2021 at 11:58 AM vignesh C <vignesh21@gmail.com> wrote: > > > > Below are few comments on v23. If you have already addressed anything > in v24, then ignore it. > More Review comments (on latest version v24): ======================================= 1. + Oid psnspcid BKI_LOOKUP(pg_class); /* Oid of the schema */ +} FormData_pg_publication_schema; Why in the above structure you have used pg_class instead of pg_namespace? 2. GetSchemaPublicationRelations() uses two different ways to skip non-publishable relations in two while loops. Both are correct but I think it would be easier to follow if they use the same way and in this case I would prefer a check like if (is_publishable_class(relid, relForm)). The comments atop function could also be modified to :"Get the list of publishable relation oids for a specified schema.". I think it is better to write some comments on why you need to scan and loop twice. 3. The other point about GetSchemaPublicationRelations() is that I am afraid that it will collect duplicate relation oids in the final list when the partitioned table and child tables are in the same schema. 4. In GetRelationPublicationActions(), the handling related to partitions seems to be missing for the schema. It won't be able to take care of child tables when they are in a different schema than the parent table. 5. If I modify the search path to remove public schema then I get the below error message: postgres=# Create publication mypub for all tables in schema current_schema; ERROR: no schema has been selected I think this message is not very clear. How about changing to something like "current_schema doesn't contain any valid schema"? This message is used in more than one place, so let's keep it the same at all the places if you agree to change it. 6. How about naming ConvertPubObjSpecListToOidList() as ObjectsInPublicationToOids()? I see somewhat similarly named functions in the code like objectsInSchemaToOids, objectNamesToOids. 7. + /* + * Schema lock is held until the publication is created to prevent + * concurrent schema deletion. No need to unlock the schemas, the + * locks will be released automatically at the end of create + * publication command. + */ In this comment no need to say create publication command as that is implicit, we can just write ".... at the end of command". 8. Can you please extract changes like tab-completion, dump/restore in separate patches? This can help to review the core (backend) part of the patch in more detail. -- With Regards, Amit Kapila.
From Thur, Sep 2, 2021 2:33 PM vignesh C <vignesh21@gmail.com> wrote: > On Wed, Sep 1, 2021 at 6:58 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > > > Here are some other comments for v23-000x patches. > > 3) > > > > + .description = > "PUBLICATION SCHEMA", > > + .section = > SECTION_POST_DATA, > > + .createStmt > > + = query->data)); > > > > Is it better to use something like 'PUBLICATION TABLES IN SCHEMA' to > > describe the schema level table publication ? Because there could be > > some other type publication such as 'ALL SEQUENCES IN SCHEMA' in the > > future, it will be better to make it clear that we only publish table in schema in > this patch. > > Modified Thanks for updating the patch. I think we might also need to mention the publication object 'table' in the following types: 1) + /* OCLASS_PUBLICATION_SCHEMA */ + { + "publication schema", OBJECT_PUBLICATION_SCHEMA + }, 2) + PUBLICATIONOBJ_SCHEMA, /* Schema type */ + PUBLICATIONOBJ_UNKNOWN /* Unknown type */ +} PublicationObjSpecType; 3) + DO_PUBLICATION_SCHEMA, I think it might be to change the typename like XX_REL_IN_SCHEMA, and adjust the comments. Best regards, Hou zj
From Thur, Sep 2, 2021 7:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > On Thu, Sep 2, 2021 at 11:58 AM vignesh C <vignesh21@gmail.com> wrote: > > > > Below are few comments on v23. If you have already addressed anything in v24, > then ignore it. > > 1. The commit message says: A new system table "pg_publication_schema" > has been added, to maintain the schemas that the user wants to publish > through the publication.". The alternative name for this system table could be > "pg_publication_namespace". The reason why this alternative comes to my > mind is that the current system table to store schema information is named > pg_namespace. So shouldn't we be consistent here? > What do others think about this? Since the new system table refer to pg_namespace, so, personally, I am +1 for "pg_publication_namespace". Best regards, Hou zj
On Thursday, September 2, 2021 2:36 PM vignesh C <vignesh21@gmail.com> wrote: > > On Wed, Sep 1, 2021 at 11:14 AM tanghy.fnst@fujitsu.com > <tanghy.fnst@fujitsu.com> wrote: > > > > 3. When using '\dn+', I noticed that the list of publications only contains the > > publications for "SCHEMA", "FOR ALL TABLES" publications are not shown. Is it > designed on purpose? > > (The result of '\d+' lists the publications of "SCHEAME" and "FOR ALL TABLES"). > > > > For example: > > create schema sch1; > > create table sch1.tbl(a int); > > create publication pub_schema for all tables in schema sch1; > > create publication pub_all_tables for all tables; > > I'm not sure if it is intentional or not, Do you want to post the > question in a separate thread and see if that should be handled? > Sorry, maybe I didn't make my last question clearly enough. In HEAD(where schema level is not supported for publication), there is no publication information in the result of '\dn+'. With this schema patch, '\dn+' shows the publications related to the schema, but ALL TABLES publications are not shown. Do you think we should add ALL TABLES publications, too? Regards Tang
On Mon, Sep 6, 2021 at 9:53 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Thursday, September 2, 2021 2:36 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Wed, Sep 1, 2021 at 11:14 AM tanghy.fnst@fujitsu.com > > <tanghy.fnst@fujitsu.com> wrote: > > > > > > 3. When using '\dn+', I noticed that the list of publications only contains the > > > publications for "SCHEMA", "FOR ALL TABLES" publications are not shown. Is it > > designed on purpose? > > > (The result of '\d+' lists the publications of "SCHEAME" and "FOR ALL TABLES"). > > > > > > For example: > > > create schema sch1; > > > create table sch1.tbl(a int); > > > create publication pub_schema for all tables in schema sch1; > > > create publication pub_all_tables for all tables; > > > > I'm not sure if it is intentional or not, Do you want to post the > > question in a separate thread and see if that should be handled? > > > > Sorry, maybe I didn't make my last question clearly enough. > > In HEAD(where schema level is not supported for publication), there is no publication > information in the result of '\dn+'. > > With this schema patch, '\dn+' shows the publications related to the schema, but ALL > TABLES publications are not shown. Do you think we should add ALL TABLES publications, too? > No, I don't think we need to display For All Tables publication under \dn+. It is already shown with \d+ <table_name> command. -- With Regards, Amit Kapila.
On Thu, Sep 2, 2021 at 4:28 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for the comments, the attached v24 patch has the changes for the same. > Just a note that these patches no longer apply due to a recent commit related to logical replication (0c6828fa98). Regards, Greg Nancarrow Fujitsu Australia
On Thu, Sep 2, 2021 at 5:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Sep 2, 2021 at 11:58 AM vignesh C <vignesh21@gmail.com> wrote: > > > > Below are few comments on v23. If you have already addressed anything > in v24, then ignore it. > > 1. The commit message says: A new system table "pg_publication_schema" > has been added, to maintain the schemas that the user wants to publish > through the publication.". The alternative name for this system table > could be "pg_publication_namespace". The reason why this alternative > comes to my mind is that the current system table to store schema > information is named pg_namespace. So shouldn't we be consistent here? > What do others think about this? Modified > 2. In function check_publication_add_schema(), the primary error > message should be "cannot add schema \"%s\" to publication". See > check_publication_add_relation() for similar error messages. Modified > 3. > +ObjectAddress > +publication_add_schema(Oid pubid, Oid schemaoid, bool if_not_exists) > > Isn't it better to use 'schemaid' so that it is consistent with 'pubid'? Modified > 4. > ConvertPubObjSpecListToOidList() > { > .. > + schemaoid = linitial_oid(search_path); > + nspname = get_namespace_name(schemaoid); > + if (nspname == NULL) /* recently-deleted namespace? */ > + ereport(ERROR, > + errcode(ERRCODE_UNDEFINED_SCHEMA), > + errmsg("no schema has been selected")); > + > + list_free(search_path); > .. > } > > You can free the memory for 'nspname' as that is not used afterward. I have removed get_namespace_name, no need to validate the schemaid, fetch_search_path will return valid schemaids. > 5. > + schemaRels = GetSchemaPublicationRelations(schemaoid, PUBLICATION_PART_ALL); > + > + /* Invalidate relcache so that publication info is rebuilt. */ > + InvalidatePublicationRels(schemaRels); > > It is better to write this comment above > GetSchemaPublicationRelations, something like below: > > + /* Invalidate relcache so that publication info is rebuilt. */ > + schemaRels = GetSchemaPublicationRelations(schemaoid, PUBLICATION_PART_ALL); > + InvalidatePublicationRels(schemaRels); Modified > 6. > +static List * > +GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt, > + Oid relid) > > I think it is better to name this function as > GetPublicationPartOptRelations as that way it will be more consistent > with existing functions and structure name PublicationPartOpt. Modified > 7. All the callers of PublicationAddSchemas() have a superuser check, > then why there is again a check of owner/superuser in that function? Modified to remove the check > 8. > +/* > + * Gets the list of FOR ALL TABLES IN SCHEMA publication oids associated with a > + * specified schema oid > + */ > +List * > +GetSchemaPublications(Oid schemaid) > > I find it a bit difficult to read this comment. Can we omit "FOR ALL > TABLES IN SCHEMA" from this comment? Modified. > 9. In the doc patch > (v23-0003-Tests-and-documentation-for-schema-level-support), I see > below line: > <para> > - To add a table to a publication, the invoking user must have ownership > - rights on the table. The <command>FOR ALL TABLES</command> clause requires > - the invoking user to be a superuser. > + To add a table/schema to a publication, the invoking user must have > + ownership rights on the table/schema. The <command>FOR ALL TABLES</command> > + and <command>FOR ALL TABLES IN SCHEMA</command> clause requires the invoking > + user to be a superuser. > > Is it correct to specify the schema in the first line? AFAIU, all > forms of schema addition requires superuser privilege. That is not required, modified. Attached v25 patch has the changes for the same. Currently I have used a different way to parse which does not require "Add PublicationTable and PublicationRelInfo structs" committed changes, I have currently removed the changes in this patch. I will analyze further and use whichever parsing is better. I will handle this in the next version. Regards, Vignesh
Attachment
- v25-0001-Made-the-existing-relation-cache-invalidation-an.patch
- v25-0002-Added-schema-level-support-for-publication.patch
- v25-0003-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v25-0004-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v25-0005-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v25-0006-Implemented-pg_publication_objects-view.patch
On Fri, Sep 3, 2021 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Aug 30, 2021 at 8:56 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Mon, Aug 30, 2021 at 9:10 AM houzj.fnst@fujitsu.com > > <houzj.fnst@fujitsu.com> wrote: > > > > > > > > 5) > > > + if (list_length(pubobj->name) == 1 && > > > + (strcmp(relname, "CURRENT_SCHEMA") == 0)) > > > + ereport(ERROR, > > > + errcode(ERRCODE_SYNTAX_ERROR), > > > + errmsg("invalid relation name at or near"), > > > + parser_errposition(pstate, pubobj->location)); > > > > > > Maybe we don't need this check, because it will report an error in > > > OpenTableList() anyway, "relation "CURRENT_SCHEMA" does not exist" , and that > > > message seems readable to me. > > > > Allowing CURRENT_SCHEMA is required to support current schema for > > schema publications, currently I'm allowing this syntax during parsing > > and this error is thrown for relations later, this is done to keep the > > similar error as earlier before this feature support. I felt we can > > keep it like this to maintain the similar error. Thoughts? > > > > I find this check quite ad-hoc in the code and I am not sure if we > need to be consistent for the exact message in this case. So, I think > it is better to remove it. Modified > > > About 0003 > > > 7) > > > The v22-0003 seems simple and can remove lots of code in patch v22-0001, so > > > maybe we can merge 0001 and 0003 into one patch ? > > > > I agree that the code becomes simpler, it reduces a lot of code. I had > > kept it like that as the testing effort might be more and also I was > > waiting if there was no objection for that syntax from anyone else. I > > will wait for a few more reviews and merge it to 0001 if there are no > > objections. > > > > +1 to merge the patch as suggested by Hou-San. Modified This is handled as part of v25 patch attached at [1] [1] - https://www.postgresql.org/message-id/CALDaNm2SytXy2TDnzzYkXWKgNp74ssPBXrkMXEyac1qVYSRkbw%40mail.gmail.com Regards, Vignesh
On Mon, Sep 6, 2021 at 6:56 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > From Thur, Sep 2, 2021 7:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Sep 2, 2021 at 11:58 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > Below are few comments on v23. If you have already addressed anything in v24, > > then ignore it. > > > > 1. The commit message says: A new system table "pg_publication_schema" > > has been added, to maintain the schemas that the user wants to publish > > through the publication.". The alternative name for this system table could be > > "pg_publication_namespace". The reason why this alternative comes to my > > mind is that the current system table to store schema information is named > > pg_namespace. So shouldn't we be consistent here? > > What do others think about this? > > Since the new system table refer to pg_namespace, so, personally, I am +1 for > "pg_publication_namespace". I have changed it as part of v25 patch attached at [1] [1] - https://www.postgresql.org/message-id/CALDaNm2SytXy2TDnzzYkXWKgNp74ssPBXrkMXEyac1qVYSRkbw%40mail.gmail.com Regards, Vignesh
On Fri, Sep 3, 2021 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Sep 2, 2021 at 5:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Sep 2, 2021 at 11:58 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > Below are few comments on v23. If you have already addressed anything > > in v24, then ignore it. > > > > More Review comments (on latest version v24): > ======================================= > 1. > + Oid psnspcid BKI_LOOKUP(pg_class); /* Oid of the schema */ > +} FormData_pg_publication_schema; > > Why in the above structure you have used pg_class instead of pg_namespace? It should be pg_namespace, I have changed it. > 2. GetSchemaPublicationRelations() uses two different ways to skip > non-publishable relations in two while loops. Both are correct but I > think it would be easier to follow if they use the same way and in > this case I would prefer a check like if (is_publishable_class(relid, > relForm)). The comments atop function could also be modified to :"Get > the list of publishable relation oids for a specified schema.". I > think it is better to write some comments on why you need to scan and > loop twice. Modified > 3. The other point about GetSchemaPublicationRelations() is that I am > afraid that it will collect duplicate relation oids in the final list > when the partitioned table and child tables are in the same schema. Modified it to prepare a list without duplicate relation ids. > 4. In GetRelationPublicationActions(), the handling related to > partitions seems to be missing for the schema. It won't be able to > take care of child tables when they are in a different schema than the > parent table. Modified > 5. > If I modify the search path to remove public schema then I get the > below error message: > postgres=# Create publication mypub for all tables in schema current_schema; > ERROR: no schema has been selected > > I think this message is not very clear. How about changing to > something like "current_schema doesn't contain any valid schema"? This > message is used in more than one place, so let's keep it the same at > all the places if you agree to change it. I would prefer to use the existing messages as we have used this in a few other places similarly. Thoughts? > 6. How about naming ConvertPubObjSpecListToOidList() as > ObjectsInPublicationToOids()? I see somewhat similarly named functions > in the code like objectsInSchemaToOids, objectNamesToOids. Modified > 7. > + /* > + * Schema lock is held until the publication is created to prevent > + * concurrent schema deletion. No need to unlock the schemas, the > + * locks will be released automatically at the end of create > + * publication command. > + */ > > In this comment no need to say create publication command as that is > implicit, we can just write ".... at the end of command". Modified > 8. Can you please extract changes like tab-completion, dump/restore in > separate patches? This can help to review the core (backend) part of > the patch in more detail. Modified This is handled as part of v25 patch attached at [1] [1] - https://www.postgresql.org/message-id/CALDaNm2SytXy2TDnzzYkXWKgNp74ssPBXrkMXEyac1qVYSRkbw%40mail.gmail.com Regards, Vignesh
On Fri, Sep 3, 2021 at 4:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Sep 1, 2021 at 12:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Wed, Sep 1, 2021 at 8:52 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > > > > > > > I'd expect a lot of users to naturally think that "ALTER PUBLICATION > > > pub1 DROP ALL TABLES IN SCHEMA sc1;" would drop from the publication > > > all tables that are in schema "sc1", which is not what it is currently > > > doing. > > > Since the syntax was changed to specifically refer to FOR ALL TABLES > > > IN SCHEMA rather than FOR SCHEMA, then now it's clear we're referring > > > to tables only, when specifying "... FOR ALL TABLES in sc1, TABLE > > > sc1.test", so IMHO it's reasonable to remove duplicates here, rather > > > than treating these as somehow separate ways of referencing the same > > > table. > > > > > > > I see your point and if we decide to go this path then it is better to > > give an error than silently removing duplicates. > > > > Today, I have thought about this point again and it seems better to > give an error in this case and let the user take the action rather > than silently removing such tables to avoid any confusion. I have handled this to throw an error. This is handled as part of v25 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm2SytXy2TDnzzYkXWKgNp74ssPBXrkMXEyac1qVYSRkbw%40mail.gmail.com Regards, Vignesh
On Tue, Sep 7, 2021 at 12:45 PM vignesh C <vignesh21@gmail.com> wrote: > > On Fri, Sep 3, 2021 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > 5. > > If I modify the search path to remove public schema then I get the > > below error message: > > postgres=# Create publication mypub for all tables in schema current_schema; > > ERROR: no schema has been selected > > > > I think this message is not very clear. How about changing to > > something like "current_schema doesn't contain any valid schema"? This > > message is used in more than one place, so let's keep it the same at > > all the places if you agree to change it. > > I would prefer to use the existing messages as we have used this in a > few other places similarly. Thoughts? > Yeah, I also see the same message in code but I think here usage is a bit different. If you see a similar SQL statement that causes the same error message then can you please give an example? Few comments on latest patch (v25-0002-Added-schema-level-support-for-publication): ===================================================================== 1. getPublicationSchemaInfo() .. + *nspname = get_namespace_name(pnform->pnnspcid); + if (!(*nspname)) + { + Oid schemaid = pnform->pnpubid; + + pfree(*pubname); + ReleaseSysCache(tup); + if (!missing_ok) + elog(ERROR, "cache lookup failed for schema %u", + schemaid); Why are you using pnform->pnpubid to get schemaid? I think you need pnform->pnnspcid here and it was like that in the previous version, not sure, why you changed it? 2. @@ -369,15 +531,20 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), errmsg("publication \"%s\" is defined as FOR ALL TABLES", NameStr(pubform->pubname)), - errdetail("Tables cannot be added to or dropped from FOR ALL TABLES publications."))); + errdetail("Tables cannot be added to, dropped from, or set on FOR ALL TABLES publications."))); Why is this message changed? Have we changed anything related to this as part of this patch? 3. + Oid pnnspcid BKI_LOOKUP(pg_namespace); /* Oid of the schema */ +} FormData_pg_publication_namespace; + ... ... +DECLARE_UNIQUE_INDEX(pg_publication_namespace_pnnspcid_pnpubid_index, 8903, PublicationNamespacePnnspcidPnpubidIndexId, on pg_publication_namespace using btree(pnnspcid oid_ops, pnpubid oid_ops)); Let's use nspid instead nspcid at all places as before we refer that way in the code. The way you have done is also okay but it is better to be consistent with existing code. 4. Need to think of comments in GetSchemaPublicationRelations. + /* get all the ordinary tables present in schema schemaid */ .. Let's change the above comment to something like: "get all the relations present in the given schema" + + /* + * Get all relations that inherit from the partition table, directly or + * indirectly. + */ + scan = table_beginscan_catalog(classRel, keycount, key); Let's change the above comment to something like: "It is quite possible that some of the partitions are in a different schema than the parent table, so we need to get such partitions separately." 5. + if (list_member_oid(schemaidlist, relSchemaId)) + ereport(ERROR, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("cannot add relation \"%s.%s\" to publication", + get_namespace_name(relSchemaId), + RelationGetRelationName(rel)), + errdetail("Table's schema is already included as part of ALL TABLES IN SCHEMA option.")); I think the better errdetail would be: "Table's schema \"%s\" is already part of the publication." + if (list_member_oid(schemaidlist, relSchemaId)) + ereport(ERROR, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("cannot add schema \"%s\" to publication", + get_namespace_name(get_rel_namespace(tableOid))), + errdetail("Table \"%s.%s\" is part of publication, adding same schema \"%s\" is not supported", + get_namespace_name(get_rel_namespace(tableOid)), + get_rel_name(tableOid), + get_namespace_name(get_rel_namespace(tableOid)))); I think this errdetail message can also be improved. One idea could be: "Table \"%s\" in schema \"%s\" is already part of the publication, adding the same schema is not supported.". Do you or anyone else have better ideas? 6. I think instead of two different functions CheckRelschemaInSchemaList and CheckSchemaInRels, let's have a single function RelSchemaIsMemberOfSchemaList and have a boolean variable to distinguish the two cases. -- With Regards, Amit Kapila.
On Tue, Sep 7, 2021 at 5:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Sep 7, 2021 at 12:45 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Fri, Sep 3, 2021 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > 5. > > > If I modify the search path to remove public schema then I get the > > > below error message: > > > postgres=# Create publication mypub for all tables in schema current_schema; > > > ERROR: no schema has been selected > > > > > > I think this message is not very clear. How about changing to > > > something like "current_schema doesn't contain any valid schema"? This > > > message is used in more than one place, so let's keep it the same at > > > all the places if you agree to change it. > > > > I would prefer to use the existing messages as we have used this in a > > few other places similarly. Thoughts? > > > > Yeah, I also see the same message in code but I think here usage is a > bit different. If you see a similar SQL statement that causes the same > error message then can you please give an example? I was referring to the error message in create table postgres=# set search_path='non_existent_schema'; SET postgres=# create table t1(c1 int); ERROR: no schema has been selected to create in LINE 1: create table t1(c1 int); If it is not very useful in the case of creating a publication, then I can change it. Thoughts? Regards, Vignesh
On Wed, Sep 8, 2021 at 10:48 AM vignesh C <vignesh21@gmail.com> wrote: > > On Tue, Sep 7, 2021 at 5:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Sep 7, 2021 at 12:45 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Fri, Sep 3, 2021 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > 5. > > > > If I modify the search path to remove public schema then I get the > > > > below error message: > > > > postgres=# Create publication mypub for all tables in schema current_schema; > > > > ERROR: no schema has been selected > > > > > > > > I think this message is not very clear. How about changing to > > > > something like "current_schema doesn't contain any valid schema"? This > > > > message is used in more than one place, so let's keep it the same at > > > > all the places if you agree to change it. > > > > > > I would prefer to use the existing messages as we have used this in a > > > few other places similarly. Thoughts? > > > > > > > Yeah, I also see the same message in code but I think here usage is a > > bit different. If you see a similar SQL statement that causes the same > > error message then can you please give an example? > > I was referring to the error message in create table > postgres=# set search_path='non_existent_schema'; > SET > postgres=# create table t1(c1 int); > ERROR: no schema has been selected to create in > LINE 1: create table t1(c1 int); > > If it is not very useful in the case of creating a publication, then I > can change it. Thoughts? > If you want to be consistent with the existing message then why did you left the trailing part (".... to create in") of the sentence? -- With Regards, Amit Kapila.
On Tue, Sep 7, 2021 at 5:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Sep 7, 2021 at 12:45 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Fri, Sep 3, 2021 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > 5. > > > If I modify the search path to remove public schema then I get the > > > below error message: > > > postgres=# Create publication mypub for all tables in schema current_schema; > > > ERROR: no schema has been selected > > > > > > I think this message is not very clear. How about changing to > > > something like "current_schema doesn't contain any valid schema"? This > > > message is used in more than one place, so let's keep it the same at > > > all the places if you agree to change it. > > > > I would prefer to use the existing messages as we have used this in a > > few other places similarly. Thoughts? > > > > Yeah, I also see the same message in code but I think here usage is a > bit different. If you see a similar SQL statement that causes the same > error message then can you please give an example? Changed it to "no schema has been selected for CURRENT_SCHEMA" > Few comments on latest patch > (v25-0002-Added-schema-level-support-for-publication): > ===================================================================== > 1. > getPublicationSchemaInfo() > .. > + *nspname = get_namespace_name(pnform->pnnspcid); > + if (!(*nspname)) > + { > + Oid schemaid = pnform->pnpubid; > + > + pfree(*pubname); > + ReleaseSysCache(tup); > + if (!missing_ok) > + elog(ERROR, "cache lookup failed for schema %u", > + schemaid); > > Why are you using pnform->pnpubid to get schemaid? I think you need > pnform->pnnspcid here and it was like that in the previous version, > not sure, why you changed it? Modified > 2. > @@ -369,15 +531,20 @@ AlterPublicationTables(AlterPublicationStmt > *stmt, Relation rel, > (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > errmsg("publication \"%s\" is defined as FOR ALL TABLES", > NameStr(pubform->pubname)), > - errdetail("Tables cannot be added to or dropped from FOR ALL TABLES > publications."))); > + errdetail("Tables cannot be added to, dropped from, or set on FOR > ALL TABLES publications."))); > > Why is this message changed? Have we changed anything related to this > as part of this patch? This change is not required in this patch, reverted > 3. > + Oid pnnspcid BKI_LOOKUP(pg_namespace); /* Oid of the schema */ > +} FormData_pg_publication_namespace; > + > ... > ... > +DECLARE_UNIQUE_INDEX(pg_publication_namespace_pnnspcid_pnpubid_index, > 8903, PublicationNamespacePnnspcidPnpubidIndexId, on > pg_publication_namespace using btree(pnnspcid oid_ops, pnpubid > oid_ops)); > > Let's use nspid instead nspcid at all places as before we refer that > way in the code. The way you have done is also okay but it is better > to be consistent with existing code. Modified > 4. Need to think of comments in GetSchemaPublicationRelations. > + /* get all the ordinary tables present in schema schemaid */ > .. > > Let's change the above comment to something like: "get all the > relations present in the given schema" > > + > + /* > + * Get all relations that inherit from the partition table, directly or > + * indirectly. > + */ > + scan = table_beginscan_catalog(classRel, keycount, key); > > > Let's change the above comment to something like: "It is quite > possible that some of the partitions are in a different schema than > the parent table, so we need to get such partitions separately." Modified > 5. > + if (list_member_oid(schemaidlist, relSchemaId)) > + ereport(ERROR, > + errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("cannot add relation \"%s.%s\" to publication", > + get_namespace_name(relSchemaId), > + RelationGetRelationName(rel)), > + errdetail("Table's schema is already included as part of ALL TABLES > IN SCHEMA option.")); > > I think the better errdetail would be: "Table's schema \"%s\" is > already part of the publication." > > + if (list_member_oid(schemaidlist, relSchemaId)) > + ereport(ERROR, > + errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("cannot add schema \"%s\" to publication", > + get_namespace_name(get_rel_namespace(tableOid))), > + errdetail("Table \"%s.%s\" is part of publication, adding same > schema \"%s\" is not supported", > + get_namespace_name(get_rel_namespace(tableOid)), > + get_rel_name(tableOid), > + get_namespace_name(get_rel_namespace(tableOid)))); > > I think this errdetail message can also be improved. One idea could > be: "Table \"%s\" in schema \"%s\" is already part of the publication, > adding the same schema is not supported.". Do you or anyone else have > better ideas? Modified > 6. I think instead of two different functions > CheckRelschemaInSchemaList and CheckSchemaInRels, let's have a single > function RelSchemaIsMemberOfSchemaList and have a boolean variable to > distinguish the two cases. Modified Thanks for the comments, the attached v26 patch has the changes for the same. Regards, VIgnesh
Attachment
- v26-0001-Made-the-existing-relation-cache-invalidation-an.patch
- v26-0002-Added-schema-level-support-for-publication.patch
- v26-0003-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v26-0004-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v26-0005-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v26-0006-Implemented-pg_publication_objects-view.patch
On Mon, Sep 6, 2021 at 6:56 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > From Thur, Sep 2, 2021 2:33 PM vignesh C <vignesh21@gmail.com> wrote: > > On Wed, Sep 1, 2021 at 6:58 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > > > > > Here are some other comments for v23-000x patches. > > > 3) > > > > > > + .description = > > "PUBLICATION SCHEMA", > > > + .section = > > SECTION_POST_DATA, > > > + .createStmt > > > + = query->data)); > > > > > > Is it better to use something like 'PUBLICATION TABLES IN SCHEMA' to > > > describe the schema level table publication ? Because there could be > > > some other type publication such as 'ALL SEQUENCES IN SCHEMA' in the > > > future, it will be better to make it clear that we only publish table in schema in > > this patch. > > > > Modified > > Thanks for updating the patch. > > I think we might also need to mention the publication object 'table' in the > following types: > > 1) > + /* OCLASS_PUBLICATION_SCHEMA */ > + { > + "publication schema", OBJECT_PUBLICATION_SCHEMA > + }, > > 2) > + PUBLICATIONOBJ_SCHEMA, /* Schema type */ > + PUBLICATIONOBJ_UNKNOWN /* Unknown type */ > +} PublicationObjSpecType; > > 3) > + DO_PUBLICATION_SCHEMA, > > I think it might be to change the typename like XX_REL_IN_SCHEMA, > and adjust the comments. Thanks for the comments, this is handled in the v26 patch attached at [1] [1] - https://www.postgresql.org/message-id/CALDaNm3EwAVma8n4YpV1%2BQWiccuVPxpqNfbbrUU3s3XTHcTXew%40mail.gmail.com Regards, Vignesh
On Mon, Aug 16, 2021 at 7:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Peter Smith <smithpb2250@gmail.com> writes: > > Then the question from Peter E. [2] "Why can't I have a publication > > that publishes tables t1, t2, t3, *and* schemas s1, s2, s3." would > > have an intuitive solution like: > > > CREATE PUBLICATION pub1 > > FOR TABLE t1,t2,t3 AND > > FOR ALL TABLES IN SCHEMA s1,s2,s3; > > That seems a bit awkward, since the existing precedent is > to use commas. > AFAICS, the closest to this proposal we have is Grant/Revoke syntax where we can give privilege on individual objects and all objects in the schema. Is that you are referring to existing precedent or something else? > We shouldn't need more than one FOR noise-word, > either. So I was imagining syntax more like, say, > > CREATE PUBLICATION pub1 FOR > TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2, > SEQUENCE seq1,seq2, ALL SEQUENCES IN SCHEMA s3,s4; > > Abstractly it'd be > > createpub := CREATE PUBLICATION pubname FOR cpitem [, ... ] [ WITH ... ] > > cpitem := ALL TABLES | > TABLE name | > ALL TABLES IN SCHEMA name | > ALL SEQUENCES | > SEQUENCE name | > ALL SEQUENCES IN SCHEMA name | > name > > The grammar output would need some post-analysis to attribute the > right type to bare "name" items, but that doesn't seem difficult. > The current patch (v26-0002-Added-schema-level-support-for-publication at [1]) implements this syntax in roughly the way you have proposed here. But, one thing I find a bit awkward is how it needs to keep a separate flag to distinguish between names of different objects for the post-analysis phase. The reason is that in CREATE PUBLICATION syntax [2] one could supply additional decorators like *, ONLY with table name but the same shouldn't be allowed be with schema name or other object names. Is that okay or do you have any better ideas about the same? OTOH, if we implement something like Grant/Revoke where we can give privilege on individual objects and all objects in the schema but not in the same statement then such special flags won't be required to distinguish different object names and we can build something on the lines of current "privilege_target:" in gram.y. [1] - https://www.postgresql.org/message-id/CALDaNm3EwAVma8n4YpV1%2BQWiccuVPxpqNfbbrUU3s3XTHcTXew%40mail.gmail.com [2] - https://www.postgresql.org/docs/devel/sql-createpublication.html -- With Regards, Amit Kapila.
From Wed, Sept 8, 2021 7:44 PM vignesh C <vignesh21@gmail.com> wrote: > Modified > Thanks for the comments, the attached v26 patch has the changes for the same. Hi, Thanks for updating the patch, I have a suggestion for the gram.y. Currently, we have the following two members in PublicationObjSpec to distinguish between names of different objects for the post-analysis phase. >bool inh; >bool spl_rel_type_syn; I was thinking do we have another way to distinguish that which can make code smaller. I tried serval approaches and found a possible better approach. First, I refer to the design of Grant/Revoke syntax, it use two members 'targtype' and 'objtype' to mark different type. 'targtype' can be ACL_TARGET_OBJECT(single target) or ACL_TARGET_ALL_IN_SCHEMA(schema level) .'objtype' is the actual type which can be OBJECT_SEQUENCE or OBJECT_TABLE or ... . I think if we follow this way, the code could be cleaner. Second, we can move the special relation expression into a separate rule 'speical_relation_expr' like the following, this can remove duplicate code used by pubobj_expr. ------ relation_expr: qualified_name {} | speical_relation_expr { $$ = $1; } ; speical_relation_expr: qualified_name '*' {} | ONLY qualified_name {} | ONLY '(' qualified_name ')' {} ; ------ Finally, the gram.y will look like the following. Personally, the code looks cleaner in this approach. ----- pubobj_expr: any_name { /* inheritance query, implicitly */ $$ = makeNode(PublicationObjSpec); $$->targettype = TARGETOBJ_UNKNOWN; $$->object = $1; } | special_relation_expr { $$ = makeNode(PublicationObjSpec); $$->targettype = TARGETOBJ_TABLE; $$->object = $1; } | CURRENT_SCHEMA { $$ = makeNode(PublicationObjSpec); $$->object = list_make1(makeString("CURRENT_SCHEMA")); $$->targettype = TARGETOBJ_SCHEMA; } ; /* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */ PublicationObjSpec: TABLE pubobj_expr { $$ = $2; $$->pubobjtype = PUBLICATIONOBJ_TABLE_LIST; $$->location = @1; } | ALL TABLES IN_P SCHEMA pubobj_expr { $$ = $5; $$->pubobjtype = PUBLICATIONOBJ_SCHEMA_LIST; $$->location = @1; } | pubobj_expr { $$ = $1; $$->pubobjtype = PUBLICATIONOBJ_UNKNOWN; $$->location = @1; } ; Attach a diff patch based on v26-0002 patch, which change the corresponding code based on the design above. Please have a look. Best regards, Hou zj
Attachment
From Friday, September 10, 2021 10:33 AM Hou Zhijie<houzj.fnst@fujitsu.com> wrote: > From Wed, Sept 8, 2021 7:44 PM vignesh C <vignesh21@gmail.com> wrote: > > Modified > > Thanks for the comments, the attached v26 patch has the changes for the > same. > > Hi, > > Thanks for updating the patch, I have a suggestion for the gram.y. > > Currently, we have the following two members in PublicationObjSpec to > distinguish > between names of different objects for the post-analysis phase. > > >bool inh; > >bool spl_rel_type_syn; > > I was thinking do we have another way to distinguish that which can make code > smaller. I tried serval approaches and found a possible better approach. > > First, I refer to the design of Grant/Revoke syntax, it use two members > 'targtype' and 'objtype' to mark different type. 'targtype' can be > ACL_TARGET_OBJECT(single target) or ACL_TARGET_ALL_IN_SCHEMA(schema > level) > .'objtype' is the actual type which can be OBJECT_SEQUENCE or OBJECT_TABLE > or > ... . I think if we follow this way, the code could be cleaner. > > Second, we can move the special relation expression into a separate rule > 'speical_relation_expr' like the following, this can remove duplicate code > used by pubobj_expr. > ------ > relation_expr: > qualified_name > {} > | speical_relation_expr > { > $$ = $1; > } > ; > speical_relation_expr: > qualified_name '*' > {} > | ONLY qualified_name > {} > | ONLY '(' qualified_name ')' > {} > ; > ------ > > Finally, the gram.y will look like the following. > Personally, the code looks cleaner in this approach. Besides, If we don't want to use a new flag to distinguish tablename and schemaname, We can only check the NodeTag to distinguish the difference. Attach two diff patches based on the latest schema patch which change the code with a flag and without a flag. Best regards, Hou zj
Attachment
On Fri, Sep 10, 2021 at 8:54 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > From Friday, September 10, 2021 10:33 AM Hou Zhijie<houzj.fnst@fujitsu.com> wrote: > > Besides, If we don't want to use a new flag to distinguish tablename and schemaname, > We can only check the NodeTag to distinguish the difference. > > Attach two diff patches based on the latest schema patch > which change the code with a flag and without a flag. > I would prefer a version without additional flags unless you think it is difficult to extend it in the future for other objects like sequences which as far as I can see shouldn't be the case. Is there a reason to define pubobj_name similar to any_name? If so, then please do add the comments. One reason I could think of is that any_name is not used for schema names currently which might have motivated you to define a separate naming convention for publication. -- With Regards, Amit Kapila.
From Friday, September 10, 2021 1:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > On Fri, Sep 10, 2021 at 8:54 AM Hou zhijie <houzj.fnst@fujitsu.com> wrote: > > > > From Friday, September 10, 2021 10:33 AM Hou > Zhijie<houzj.fnst@fujitsu.com> wrote: > > > > Besides, If we don't want to use a new flag to distinguish tablename and > schemaname, > > We can only check the NodeTag to distinguish the difference. > > > > Attach two diff patches based on the latest schema patch > > which change the code with a flag and without a flag. > > > > I would prefer a version without additional flags unless you think it > is difficult to extend it in the future for other objects like > sequences which as far as I can see shouldn't be the case. Ok, I agreed. > Is there a > reason to define pubobj_name similar to any_name? If so, then please > do add the comments. One reason I could think of is that any_name is > not used for schema names currently which might have motivated you to > define a separate naming convention for publication. When I used any_name, Bison reported that the dot('.') in rule attr would have a shift/reduce conflict with the dot('.') in rule indirection_el which also used in pubobj_expr. So, I declared a new rule which will directly use indirection_el to resolve the conflicts. Attach the without-flag version and add comments about the pubobj_name. Best regards, Hou zj
Attachment
On Fri, Sep 10, 2021 at 11:21 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > From Friday, September 10, 2021 1:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Sep 10, 2021 at 8:54 AM Hou zhijie <houzj.fnst@fujitsu.com> wrote: > > > > > > From Friday, September 10, 2021 10:33 AM Hou > > Zhijie<houzj.fnst@fujitsu.com> wrote: > > > > > > Besides, If we don't want to use a new flag to distinguish tablename and > > schemaname, > > > We can only check the NodeTag to distinguish the difference. > > > > > > Attach two diff patches based on the latest schema patch > > > which change the code with a flag and without a flag. > > > > > > > I would prefer a version without additional flags unless you think it > > is difficult to extend it in the future for other objects like > > sequences which as far as I can see shouldn't be the case. > > Ok, I agreed. > > > Is there a > > reason to define pubobj_name similar to any_name? If so, then please > > do add the comments. One reason I could think of is that any_name is > > not used for schema names currently which might have motivated you to > > define a separate naming convention for publication. > > When I used any_name, Bison reported that the dot('.') in rule attr > would have a shift/reduce conflict with the dot('.') in rule indirection_el > which also used in pubobj_expr. So, I declared a new rule which will directly > use indirection_el to resolve the conflicts. > > Attach the without-flag version and add comments about the pubobj_name. Thanks for the changes, the suggested changes make the parsing code simpler. I have merged the changes to the main patch. Attached v27 patch has the changes for the same. Regards, Vignesh
Attachment
- v27-0001-Made-the-existing-relation-cache-invalidation-an.patch
- v27-0002-Added-schema-level-support-for-publication.patch
- v27-0003-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v27-0004-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v27-0005-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v27-0006-Implemented-pg_publication_objects-view.patch
On Sun, Sep 12, 2021 at 8:43 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for the changes, the suggested changes make the parsing code > simpler. I have merged the changes to the main patch. Attached v27 > patch has the changes for the same. > +pubobj_name: ColId { $$ = list_make1(makeString($1)); } + | ColId indirection { $$ = lcons(makeString($1), $2); } ; I think "ColId indirection" should handle catalog and schema name as we are doing in qualified_name. See attached (this can be applied atop v27-0002). The one other improvement we can do here is to extract the common code from qualified_name (especially for "ColId indirection") and pubobj_name. What do you think? -- With Regards, Amit Kapila.
Attachment
On Sunday, September 12, 2021 11:13 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for the changes, the suggested changes make the parsing code > simpler. I have merged the changes to the main patch. Attached v27 > patch has the changes for the same. > Thanks for your new patch. I had a look at the changes related to document and tried the patch. Here are several comments. 1. <para> You must own the publication to use <command>ALTER PUBLICATION</command>. Adding a table to a publication additionally requires owning that table. + The <literal>ADD SCHEMA</literal> and <literal>SET SCHEMA</literal> to a + publication requires the invoking user to be a superuser. To alter the owner, you must also be a direct or indirect member of the new owning role. The new owner must have <literal>CREATE</literal> privilege on the database. Also, the new owner of a <literal>FOR ALL TABLES</literal> publication must be a superuser. However, a superuser can change the ownership of a publication regardless of these restrictions. </para> ADD SCHEMA -> ADD ALL TABLES IN SCHEMA SET SCHEMA -> SET ALL TABLES IN SCHEMA 2. + <para> + ADD some tables and schemas to the publication: +<programlisting> +ALTER PUBLICATION production_publication ADD TABLE users, departments, ALL TABLES IN SCHEMA production; +</programlisting> + </para> ADD some tables and schemas to the publication: -> Add some tables and schemas to the publication: 3. + <para> + Drop some schema from the publication: +<programlisting> +ALTER PUBLICATION production_quarterly_publication DROP ALL TABLES IN SCHEMA production_july; +</programlisting> + </para> Drop some schema from the publication: -> Drop some schemas from the publication: 4. + The catalog <structname>pg_publication_namespace</structname> contains the + mapping between schemas and publications in the database. This is a + many-to-many mapping. There are two Spaces at the end of the paragraph. 5. Adding a table and the schema where the table belonged to is not supported. But it didn't report error message when I try to add them in the same statement by using 'ALTER PUBLICATION'. For example: postgres=# create publication pub; CREATE PUBLICATION postgres=# alter publication pub add all tables in schema s1, table s1.tbl; ALTER PUBLICATION postgres=# \dRp+ Publication pub Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root ----------+------------+---------+---------+---------+-----------+---------- postgres | f | t | t | t | t | f Tables: "s1.tbl" Tables from schemas: "s1" It didn't check if table 's1.tbl' is member of schema 's1'. 6. I think if I use 'ALTER PUBLICATION ... SET', both the list of tables and the list of all tables in schemas should be reset. The publication should only contain the tables and all tables in schemas which user specified. If user only specified all tables in schema, and didn't specify tables, the tables which used to be part of the publication should be dropped, too. But currently, if I didn't specify tables, the list of tables wouldn't be set to empty. Thoughts? For example: postgres=# create publication pub for table s1.tbl; CREATE PUBLICATION postgres=# alter publication pub set all tables in schema s2; ALTER PUBLICATION postgres=# \dRp+ Publication pub Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root ----------+------------+---------+---------+---------+-----------+---------- postgres | f | t | t | t | t | f Tables: "s1.tbl" Tables from schemas: "s2" Regards Tang
From Sun, Sept 12, 2021 11:13 PM vignesh C <vignesh21@gmail.com> wrote: > On Fri, Sep 10, 2021 at 11:21 AM Hou Zhijie <houzj.fnst@fujitsu.com> wrote: > > Attach the without-flag version and add comments about the pubobj_name. > > Thanks for the changes, the suggested changes make the parsing code simpler. > I have merged the changes to the main patch. Attached v27 patch has the > changes for the same. Hi, I have some suggestions for the documents and comments of the new syntax. IMO, the document could be clearer in the following format. ------ Synopsis CREATE PUBLICATION name [ FOR ALL TABLES | FOR publication object [, ... ] ] [ WITH ( publication_parameter [= value] [, ... ] ) ] where publication object is one of: TABLE [ ONLY ] table_name [ * ] [, ... ] ALL TABLES IN SCHEMA { schema_name | CURRENT_SCHEMA } [, ... ] ------ Attach a diff(based on v27-*) which change the doc and comments like the following. Best regards, Hou zj
Attachment
On Mon, Sep 13, 2021 at 7:06 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > 6. > I think if I use 'ALTER PUBLICATION ... SET', both the list of tables and the > list of all tables in schemas should be reset. The publication should only > contain the tables and all tables in schemas which user specified. If user only > specified all tables in schema, and didn't specify tables, the tables which used > to be part of the publication should be dropped, too. But currently, if I didn't > specify tables, the list of tables wouldn't be set to empty. Thoughts? > I think we can go either way here but it seems like we should drop the tables in the case you mentioned. The idea is that the SET variant in ALTER PUBLICATION should replace the set of tables and schemas for the publication which seems to be in line with the current behavior where we replace the set of tables. Anyone else wants to weigh in on this? -- With Regards, Amit Kapila.
Amit Kapila <amit.kapila16@gmail.com> writes: > On Mon, Sep 13, 2021 at 7:06 PM tanghy.fnst@fujitsu.com > <tanghy.fnst@fujitsu.com> wrote: >> I think if I use 'ALTER PUBLICATION ... SET', both the list of tables and the >> list of all tables in schemas should be reset. The publication should only >> contain the tables and all tables in schemas which user specified. If user only >> specified all tables in schema, and didn't specify tables, the tables which used >> to be part of the publication should be dropped, too. But currently, if I didn't >> specify tables, the list of tables wouldn't be set to empty. Thoughts? > I think we can go either way here but it seems like we should drop the > tables in the case you mentioned. The idea is that the SET variant in > ALTER PUBLICATION should replace the set of tables and schemas for the > publication which seems to be in line with the current behavior where > we replace the set of tables. Yeah, I think it's sensible to define that there is just one SET variant that replaces both the list-of-tables and the list-of-schemas. (Of course, the syntax for it has to permit both lists to be written.) You could imagine having two independent SET commands for the two lists, but that seems fairly confusing. regards, tom lane
On Mon, Sep 13, 2021 at 5:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sun, Sep 12, 2021 at 8:43 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for the changes, the suggested changes make the parsing code > > simpler. I have merged the changes to the main patch. Attached v27 > > patch has the changes for the same. > > > > +pubobj_name: ColId { $$ = list_make1(makeString($1)); } > + | ColId indirection { $$ = lcons(makeString($1), $2); } > ; > > I think "ColId indirection" should handle catalog and schema name as > we are doing in qualified_name. See attached (this can be applied atop > v27-0002). The one other improvement we can do here is to extract the > common code from qualified_name (especially for "ColId indirection") > and pubobj_name. What do you think? > One more suggestion for changes in gram.y: @@ -12430,7 +12476,14 @@ relation_expr: $$->inh = true; $$->alias = NULL; } - | qualified_name '*' + | special_relation_expr + { + $$ = $1; + } + ; + +special_relation_expr: + qualified_name '*' I am not sure if naming the above rule as special_relation_expr is a good idea. Can we name it as extended_relation_expr? -- With Regards, Amit Kapila.
From Tue, Sept 14, 2021 11:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > On Mon, Sep 13, 2021 at 7:06 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > > > 6. > > I think if I use 'ALTER PUBLICATION ... SET', both the list of tables > > and the list of all tables in schemas should be reset. The publication > > should only contain the tables and all tables in schemas which user > > specified. If user only specified all tables in schema, and didn't > > specify tables, the tables which used to be part of the publication > > should be dropped, too. But currently, if I didn't specify tables, the list of > tables wouldn't be set to empty. Thoughts? > > > > I think we can go either way here but it seems like we should drop the tables in > the case you mentioned. The idea is that the SET variant in ALTER PUBLICATION > should replace the set of tables and schemas for the publication which seems > to be in line with the current behavior where we replace the set of tables. > > Anyone else wants to weigh in on this? I agree that the one SET variant should replaces both the list-of-tables and the list-of-schemas. Best regards, Hou zj
On Mon, Sep 13, 2021 at 5:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sun, Sep 12, 2021 at 8:43 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for the changes, the suggested changes make the parsing code > > simpler. I have merged the changes to the main patch. Attached v27 > > patch has the changes for the same. > > > > +pubobj_name: ColId { $$ = list_make1(makeString($1)); } > + | ColId indirection { $$ = lcons(makeString($1), $2); } > ; > > I think "ColId indirection" should handle catalog and schema name as > we are doing in qualified_name. See attached (this can be applied atop > v27-0002). The one other improvement we can do here is to extract the > common code from qualified_name (especially for "ColId indirection") > and pubobj_name. What do you think? I have handled this in the patch attached. Regards, Vignesh
Attachment
- v28-0001-Made-the-existing-relation-cache-invalidation-an.patch
- v28-0002-Added-schema-level-support-for-publication.patch
- v28-0003-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v28-0004-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v28-0005-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v28-0006-Implemented-pg_publication_objects-view.patch
On Mon, Sep 13, 2021 at 7:06 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Sunday, September 12, 2021 11:13 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for the changes, the suggested changes make the parsing code > > simpler. I have merged the changes to the main patch. Attached v27 > > patch has the changes for the same. > > > > Thanks for your new patch. I had a look at the changes related to document and tried > the patch. Here are several comments. > > 1. > <para> > You must own the publication to use <command>ALTER PUBLICATION</command>. > Adding a table to a publication additionally requires owning that table. > + The <literal>ADD SCHEMA</literal> and <literal>SET SCHEMA</literal> to a > + publication requires the invoking user to be a superuser. > To alter the owner, you must also be a direct or indirect member of the new > owning role. The new owner must have <literal>CREATE</literal> privilege on > the database. Also, the new owner of a <literal>FOR ALL TABLES</literal> > publication must be a superuser. However, a superuser can change the > ownership of a publication regardless of these restrictions. > </para> > > ADD SCHEMA > -> > ADD ALL TABLES IN SCHEMA > > SET SCHEMA > -> > SET ALL TABLES IN SCHEMA Modified > 2. > + <para> > + ADD some tables and schemas to the publication: > +<programlisting> > +ALTER PUBLICATION production_publication ADD TABLE users, departments, ALL TABLES IN SCHEMA production; > +</programlisting> > + </para> > > ADD some tables and schemas to the publication: > -> > Add some tables and schemas to the publication: Modified > 3. > + <para> > + Drop some schema from the publication: > +<programlisting> > +ALTER PUBLICATION production_quarterly_publication DROP ALL TABLES IN SCHEMA production_july; > +</programlisting> > + </para> > > Drop some schema from the publication: > -> > Drop some schemas from the publication: Modified > 4. > + The catalog <structname>pg_publication_namespace</structname> contains the > + mapping between schemas and publications in the database. This is a > + many-to-many mapping. > > There are two Spaces at the end of the paragraph. I felt this is ok, as both single space and double space are used at various places. > 5. > Adding a table and the schema where the table belonged to is not supported. But > it didn't report error message when I try to add them in the same statement by > using 'ALTER PUBLICATION'. > > For example: > postgres=# create publication pub; > CREATE PUBLICATION > postgres=# alter publication pub add all tables in schema s1, table s1.tbl; > ALTER PUBLICATION > postgres=# \dRp+ > Publication pub > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root > ----------+------------+---------+---------+---------+-----------+---------- > postgres | f | t | t | t | t | f > Tables: > "s1.tbl" > Tables from schemas: > "s1" > > It didn't check if table 's1.tbl' is member of schema 's1'. Modified. > 6. > I think if I use 'ALTER PUBLICATION ... SET', both the list of tables and the > list of all tables in schemas should be reset. The publication should only > contain the tables and all tables in schemas which user specified. If user only > specified all tables in schema, and didn't specify tables, the tables which used > to be part of the publication should be dropped, too. But currently, if I didn't > specify tables, the list of tables wouldn't be set to empty. Thoughts? > > For example: > postgres=# create publication pub for table s1.tbl; > CREATE PUBLICATION > postgres=# alter publication pub set all tables in schema s2; > ALTER PUBLICATION > postgres=# \dRp+ > Publication pub > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root > ----------+------------+---------+---------+---------+-----------+---------- > postgres | f | t | t | t | t | f > Tables: > "s1.tbl" > Tables from schemas: > "s2" Modified I have fixed the comments in the v28 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm0OudeDeFN7bSWPro0hgKx%3D1zPgcNFWnvU_G6w3mDPX0Q%40mail.gmail.com Regards, Vignesh
On Tue, Sep 14, 2021 at 6:31 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > From Sun, Sept 12, 2021 11:13 PM vignesh C <vignesh21@gmail.com> wrote: > > On Fri, Sep 10, 2021 at 11:21 AM Hou Zhijie <houzj.fnst@fujitsu.com> wrote: > > > Attach the without-flag version and add comments about the pubobj_name. > > > > Thanks for the changes, the suggested changes make the parsing code simpler. > > I have merged the changes to the main patch. Attached v27 patch has the > > changes for the same. > > Hi, > > I have some suggestions for the documents and comments of the new syntax. > > IMO, the document could be clearer in the following format. > ------ > Synopsis > CREATE PUBLICATION name > [ FOR ALL TABLES > | FOR publication object [, ... ] ] > [ WITH ( publication_parameter [= value] [, ... ] ) ] > > where publication object is one of: > > TABLE [ ONLY ] table_name [ * ] [, ... ] > ALL TABLES IN SCHEMA { schema_name | CURRENT_SCHEMA } [, ... ] > ------ > > Attach a diff(based on v27-*) which change the doc and comments like the > following. Thanks for the comments and the changes, I have made a few changes and merged it into the v28 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm0OudeDeFN7bSWPro0hgKx%3D1zPgcNFWnvU_G6w3mDPX0Q%40mail.gmail.com Regards, Vignesh
On Tue, Sep 14, 2021 at 2:08 PM vignesh C <vignesh21@gmail.com> wrote: > > I have handled this in the patch attached. > Few comments: ============= 1. + * CREATE PUBLICATION FOR pub_obj [, pub_obj2] [WITH options] + * + * pub_obj is one of: + * + * TABLE table [, table2] .. .. - * ALTER PUBLICATION name ADD TABLE table [, table2] + * ALTER PUBLICATION name ADD pub_obj [, pub_obj ...] * - * ALTER PUBLICATION name DROP TABLE table [, table2] + * ALTER PUBLICATION name DROP pub_obj [, pub_obj ...] * - * ALTER PUBLICATION name SET TABLE table [, table2] + * ALTER PUBLICATION name SET pub_obj [, pub_obj ...] In all the above places, the object names mentioned in square brackets are not consistent. I suggest using [, ...] everywhere as that is what we are using in docs as well. 2. +/* + * Check if the relation schema is member of the schema list. + */ +static void +RelSchemaIsMemberOfSchemaList(List *rels, List *schemaidlist, bool schemacheck) Can we change the above comment as: "Check if any of the given relation's schema is a member of the given schema list."? 3. + errmsg("cannot add relation \"%s.%s\" to publication", + get_namespace_name(relSchemaId), + RelationGetRelationName(rel)), + errdetail("Table's schema \"%s\" is already part of the publication.", + get_namespace_name(relSchemaId))); This and other parts of error messages in +RelSchemaIsMemberOfSchemaList are not aligned. I think you can now run pgindent on your patches that will solve the indentation issues in the patch. 4. AlterPublicationSchemas() { .. + /* + * If the table option was not specified remove the existing tables + * from the publication. + */ + if (!tables) + { + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT); + PublicationDropTables(pubform->oid, rels, false, true); + } + + /* Identify which schemas should be dropped */ + delschemas = list_difference_oid(oldschemaids, schemaidlist); + + /* And drop them */ + PublicationDropSchemas(pubform->oid, delschemas, true); Here, you have neither locked tables to be dropped nor schemas. I think both need to be locked as we do for tables in similar code in AlterPublicationTables(). Can you please test via debugger what happens if we try to drop without taking lock here and concurrently try to drop the actual object? It should give some error. If we decide to lock here then we should be able to pass the list of relations to PublicationDropTables() instead of Oids which would then obviate the need for any change to that function. Similarly don't we need to lock schemas before dropping them in AlterPublicationTables()? 5. +/* + * Find the ObjectAddress for a publication tables in schema. The first + * element of the object parameter is the schema name, the second is the + * publication name. + */ +static ObjectAddress +get_object_address_publication_schema(List *object, bool missing_ok) The first part of the above comment is not clear. Can we write it as: "Find the ObjectAddress for a publication schema. .."? 6. +List * +GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt) +{ + Relation classRel; + ScanKeyData key[3]; + TableScanDesc scan; + HeapTuple tuple; + List *result = NIL; + int keycount = 0; + + Assert(schemaid != InvalidOid); Isn't it better to use OidIsValid() in the above assert? 7. @@ -974,6 +974,7 @@ EventTriggerSupportsObjectType(ObjectType obtype) case OBJECT_PROCEDURE: case OBJECT_PUBLICATION: case OBJECT_PUBLICATION_REL: + case OBJECT_PUBLICATION_REL_IN_NAMESPACE: case OBJECT_ROUTINE: case OBJECT_RULE: case OBJECT_SCHEMA: @@ -1050,6 +1051,7 @@ EventTriggerSupportsObjectClass(ObjectClass objclass) case OCLASS_EXTENSION: case OCLASS_POLICY: case OCLASS_PUBLICATION: + case OCLASS_PUBLICATION_NAMESPACE: case OCLASS_PUBLICATION_REL: case OCLASS_SUBSCRIPTION: case OCLASS_TRANSFORM: @@ -2127,6 +2129,7 @@ stringify_grant_objtype(ObjectType objtype) case OBJECT_POLICY: case OBJECT_PUBLICATION: case OBJECT_PUBLICATION_REL: + case OBJECT_PUBLICATION_REL_IN_NAMESPACE: case OBJECT_ROLE: case OBJECT_RULE: case OBJECT_STATISTIC_EXT: @@ -2209,6 +2212,7 @@ stringify_adefprivs_objtype(ObjectType objtype) case OBJECT_POLICY: case OBJECT_PUBLICATION: case OBJECT_PUBLICATION_REL: + case OBJECT_PUBLICATION_REL_IN_NAMESPACE: What is the reason for using different names for object_class and object_type? Normally, we use the same. Is it making things clear in any place? 8. + if (stmt->action == DEFELEM_ADD) + { + List *pubschemas = GetPublicationSchemas(pubid); + + /* check if the relation is member of the schema list specified */ + RelSchemaIsMemberOfSchemaList(rels, schemaidlist, false); + + /* + * Check if the relation is member of the existing schema in the + * publication. + */ + RelSchemaIsMemberOfSchemaList(rels, pubschemas, false); Isn't it better to concat the list of schemas and then check the membership of relations once? -- With Regards, Amit Kapila.
On Tue, Sep 14, 2021 at 6:38 PM vignesh C <vignesh21@gmail.com> wrote: > > I have handled this in the patch attached. > Regarding the following function in the v28-0002 patch: +/* + * Check if the relation schema is member of the schema list. + */ +static void +RelSchemaIsMemberOfSchemaList(List *rels, List *schemaidlist, bool schemacheck) I think this function is not well named or commented, and I don't like how the "schemacheck" bool parameter determines the type of objects in the "rels" list. I would suggest you simply split this function into two separate functions, corresponding to each of the blocks of the "if-else" within the for-loop of the existing RelSchemaIsMemberOfSchemaList function. The "Is" part of the existing "RelSchemaIsMemberOfSchemaList" function name implies a boolean return value, so seems misleading. So I think the names of the two functions that I am suggesting should be "CheckXXXXNotAlreadyInPublication" or something similar. Regards, Greg Nancarrow Fujitsu Australia
On Tuesday, September 14, 2021 4:39 PM vignesh C <vignesh21@gmail.com> wrote: > > I have handled this in the patch attached. Thanks for updating the patch. Here are some comments. 1) +static void +AlterPublicationSchemas(AlterPublicationStmt *stmt, Relation rel, ... + /* + * If the table option was not specified remove the existing tables + * from the publication. + */ + if (!tables) + { + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT); + PublicationDropTables(pubform->oid, rels, false, true); + } It seems not natural to drop tables in AlterPublication*Schemas*, I think we'd better do it in AlterPublicationTables. 2) static void AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel, ... + /* + * If ALL TABLES IN SCHEMA option was not specified remove the + * existing schemas from the publication. + */ + List *pubschemas = GetPublicationSchemas(pubid); + PublicationDropSchemas(pubform->oid, pubschemas, false); Same as 1), Is it better to modify the schema list in AlterPublicationSchemas ? 3) static void AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel, ... /* check if the relation is member of the schema list specified */ RelSchemaIsMemberOfSchemaList(rels, schemaidlist, false); IIRC, The check here is to check the specified tables and schemas in the command. Personally, this seems a common operation which can be placed in function AlterPublication(). If we move this check to AlterPublication() and if comment 1) and 2) makes sense to you, then we don't need the new function parameters in AlterPublicationTables() and AlterPublicationSchemas(). Best regards, Hou zj
On Wed, Sep 15, 2021 at 12:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Sep 14, 2021 at 2:08 PM vignesh C <vignesh21@gmail.com> wrote: > > > > I have handled this in the patch attached. > > > > 4. > AlterPublicationSchemas() > { > .. > + /* > + * If the table option was not specified remove the existing tables > + * from the publication. > + */ > + if (!tables) > + { > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT); > + PublicationDropTables(pubform->oid, rels, false, true); > + } > + > + /* Identify which schemas should be dropped */ > + delschemas = list_difference_oid(oldschemaids, schemaidlist); > + > + /* And drop them */ > + PublicationDropSchemas(pubform->oid, delschemas, true); > > Here, you have neither locked tables to be dropped nor schemas. I > think both need to be locked as we do for tables in similar code in > AlterPublicationTables(). Can you please test via debugger what > happens if we try to drop without taking lock here and concurrently > try to drop the actual object? It should give some error. If we decide > to lock here then we should be able to pass the list of relations to > PublicationDropTables() instead of Oids which would then obviate the > need for any change to that function. > > Similarly don't we need to lock schemas before dropping them in > AlterPublicationTables()? > I think there is one more similar locking problem. AlterPublicationSchemas() { .. + if (stmt->action == DEFELEM_ADD) + { + List *rels; + + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT); + RelSchemaIsMemberOfSchemaList(rels, schemaidlist, true); ... ... } Here, we don't have a lock on the relation. So, what if the relation is concurrently dropped after you get the rel list by GetPublicationRelations? -- With Regards, Amit Kapila.
On Wed, Sep 15, 2021 at 4:45 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Tue, Sep 14, 2021 at 6:38 PM vignesh C <vignesh21@gmail.com> wrote: > > > > I have handled this in the patch attached. > > > > Regarding the following function in the v28-0002 patch: > > +/* > + * Check if the relation schema is member of the schema list. > + */ > +static void > +RelSchemaIsMemberOfSchemaList(List *rels, List *schemaidlist, bool schemacheck) > > I think this function is not well named or commented, and I don't like > how the "schemacheck" bool parameter determines the type of objects in > the "rels" list. > I think after fixing the comments in my previous email, the rels list will become the same for this function but surely the extra parameter is required for giving object-specific errors. > I would suggest you simply split this function into two separate > functions, corresponding to each of the blocks of the "if-else" within > the for-loop of the existing RelSchemaIsMemberOfSchemaList function. > The "Is" part of the existing "RelSchemaIsMemberOfSchemaList" function > name implies a boolean return value, so seems misleading. > So I think the names of the two functions that I am suggesting should > be "CheckXXXXNotAlreadyInPublication" or something similar. > I think if we write individual functions then we need to add new functions as and when we add new object types like sequences. The other idea could be to keep a single function like now say CheckObjSchemaNotAlreadyInPublication and instead of the bool parameter as the patch has now, we can keep an enum parameter "add_obj_type" for 'rel', 'schema', 'sequence'. We can either use exiting enum PublicationObjSpecType or define a new one for the same. -- With Regards, Amit Kapila.
On Thu, Sep 16, 2021 at 8:59 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Tuesday, September 14, 2021 4:39 PM vignesh C <vignesh21@gmail.com> wrote: > > > > I have handled this in the patch attached. > > Thanks for updating the patch. > Here are some comments. > > 1) > +static void > +AlterPublicationSchemas(AlterPublicationStmt *stmt, Relation rel, > ... > + /* > + * If the table option was not specified remove the existing tables > + * from the publication. > + */ > + if (!tables) > + { > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT); > + PublicationDropTables(pubform->oid, rels, false, true); > + } > > > It seems not natural to drop tables in AlterPublication*Schemas*, > I think we'd better do it in AlterPublicationTables. I felt keeping the current way keeps it better to avoid additional checks. Thoughts? > 2) > static void > AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel, > ... > + /* > + * If ALL TABLES IN SCHEMA option was not specified remove the > + * existing schemas from the publication. > + */ > + List *pubschemas = GetPublicationSchemas(pubid); > + PublicationDropSchemas(pubform->oid, pubschemas, false); > > Same as 1), Is it better to modify the schema list in AlterPublicationSchemas ? This is similar to above. > 3) > static void > AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel, > ... > /* check if the relation is member of the schema list specified */ > RelSchemaIsMemberOfSchemaList(rels, schemaidlist, false); > > IIRC, The check here is to check the specified tables and schemas in the > command. Personally, this seems a common operation which can be placed in > function AlterPublication(). If we move this check to AlterPublication() and if > comment 1) and 2) makes sense to you, then we don't need the new function > parameters in AlterPublicationTables() and AlterPublicationSchemas(). I felt we can keep the checks as is currently, else we will have to extra checks outside and addition calls for conversion from oid to Relation like: if (stmt->options) AlterPublicationOptions(pstate, stmt, rel, tup); else { if (relations) { if (stmt->action != DEFELEM_DROP) { List *rels = OpenTableList(relations); /* check if relation is member of the schema list specified */ RelSchemaIsMemberOfSchemaList(rels, schemaidlist, false); CloseTableList(rels); } AlterPublicationTables(stmt, rel, tup, relations, list_length(schemaidlist)); } if (schemaidlist) AlterPublicationSchemas(stmt, rel, tup, schemaidlist, list_length(relations)); } Thoughts? Regards, Vignesh
On Wed, Sep 15, 2021 at 12:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Sep 14, 2021 at 2:08 PM vignesh C <vignesh21@gmail.com> wrote: > > > > I have handled this in the patch attached. > > > > Few comments: > ============= > 1. > + * CREATE PUBLICATION FOR pub_obj [, pub_obj2] [WITH options] > + * > + * pub_obj is one of: > + * > + * TABLE table [, table2] > .. > .. > - * ALTER PUBLICATION name ADD TABLE table [, table2] > + * ALTER PUBLICATION name ADD pub_obj [, pub_obj ...] > * > - * ALTER PUBLICATION name DROP TABLE table [, table2] > + * ALTER PUBLICATION name DROP pub_obj [, pub_obj ...] > * > - * ALTER PUBLICATION name SET TABLE table [, table2] > + * ALTER PUBLICATION name SET pub_obj [, pub_obj ...] > > In all the above places, the object names mentioned in square brackets > are not consistent. I suggest using [, ...] everywhere as that is what > we are using in docs as well. Modified > 2. > +/* > + * Check if the relation schema is member of the schema list. > + */ > +static void > +RelSchemaIsMemberOfSchemaList(List *rels, List *schemaidlist, bool schemacheck) > > Can we change the above comment as: "Check if any of the given > relation's schema is a member of the given schema list."? Modified > 3. > + errmsg("cannot add relation \"%s.%s\" to publication", > + get_namespace_name(relSchemaId), > + RelationGetRelationName(rel)), > + errdetail("Table's schema \"%s\" is already part of the publication.", > + get_namespace_name(relSchemaId))); > > This and other parts of error messages in > +RelSchemaIsMemberOfSchemaList are not aligned. I think you can now > run pgindent on your patches that will solve the indentation issues in > the patch. Changed by running pgindent. > 4. > AlterPublicationSchemas() > { > .. > + /* > + * If the table option was not specified remove the existing tables > + * from the publication. > + */ > + if (!tables) > + { > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT); > + PublicationDropTables(pubform->oid, rels, false, true); > + } > + > + /* Identify which schemas should be dropped */ > + delschemas = list_difference_oid(oldschemaids, schemaidlist); > + > + /* And drop them */ > + PublicationDropSchemas(pubform->oid, delschemas, true); > > Here, you have neither locked tables to be dropped nor schemas. I > think both need to be locked as we do for tables in similar code in > AlterPublicationTables(). Can you please test via debugger what > happens if we try to drop without taking lock here and concurrently > try to drop the actual object? It should give some error. If we decide > to lock here then we should be able to pass the list of relations to > PublicationDropTables() instead of Oids which would then obviate the > need for any change to that function. > > Similarly don't we need to lock schemas before dropping them in > AlterPublicationTables()? we will get the following error, if concurrently dropped from another session during debugging: postgres=# alter publication pub1 set all tables in schema sch2; ERROR: cache lookup failed for publication table 16418 Modified to add locking > 5. > +/* > + * Find the ObjectAddress for a publication tables in schema. The first > + * element of the object parameter is the schema name, the second is the > + * publication name. > + */ > +static ObjectAddress > +get_object_address_publication_schema(List *object, bool missing_ok) > > The first part of the above comment is not clear. Can we write it as: > "Find the ObjectAddress for a publication schema. .."? Modified > 6. > +List * > +GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt) > +{ > + Relation classRel; > + ScanKeyData key[3]; > + TableScanDesc scan; > + HeapTuple tuple; > + List *result = NIL; > + int keycount = 0; > + > + Assert(schemaid != InvalidOid); > > Isn't it better to use OidIsValid() in the above assert? Modified > 7. > @@ -974,6 +974,7 @@ EventTriggerSupportsObjectType(ObjectType obtype) > case OBJECT_PROCEDURE: > case OBJECT_PUBLICATION: > case OBJECT_PUBLICATION_REL: > + case OBJECT_PUBLICATION_REL_IN_NAMESPACE: > case OBJECT_ROUTINE: > case OBJECT_RULE: > case OBJECT_SCHEMA: > @@ -1050,6 +1051,7 @@ EventTriggerSupportsObjectClass(ObjectClass objclass) > case OCLASS_EXTENSION: > case OCLASS_POLICY: > case OCLASS_PUBLICATION: > + case OCLASS_PUBLICATION_NAMESPACE: > case OCLASS_PUBLICATION_REL: > case OCLASS_SUBSCRIPTION: > case OCLASS_TRANSFORM: > @@ -2127,6 +2129,7 @@ stringify_grant_objtype(ObjectType objtype) > case OBJECT_POLICY: > case OBJECT_PUBLICATION: > case OBJECT_PUBLICATION_REL: > + case OBJECT_PUBLICATION_REL_IN_NAMESPACE: > case OBJECT_ROLE: > case OBJECT_RULE: > case OBJECT_STATISTIC_EXT: > @@ -2209,6 +2212,7 @@ stringify_adefprivs_objtype(ObjectType objtype) > case OBJECT_POLICY: > case OBJECT_PUBLICATION: > case OBJECT_PUBLICATION_REL: > + case OBJECT_PUBLICATION_REL_IN_NAMESPACE: > > What is the reason for using different names for object_class and > object_type? Normally, we use the same. Is it making things clear in > any place? I thought it might be easier to review, I have changed it to the standard way of naming object_class and object_type. Renamed it to OBJECT_PUBLICATION_NAMESPACE. > 8. > + if (stmt->action == DEFELEM_ADD) > + { > + List *pubschemas = GetPublicationSchemas(pubid); > + > + /* check if the relation is member of the schema list specified */ > + RelSchemaIsMemberOfSchemaList(rels, schemaidlist, false); > + > + /* > + * Check if the relation is member of the existing schema in the > + * publication. > + */ > + RelSchemaIsMemberOfSchemaList(rels, pubschemas, false); > > Isn't it better to concat the list of schemas and then check the > membership of relations once? Modified. Attached v29 patch has the fixes for the same. Regards, Vignesh
Attachment
- v29-0001-Made-the-existing-relation-cache-invalidation-an.patch
- v29-0002-Added-schema-level-support-for-publication.patch
- v29-0003-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v29-0004-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v29-0005-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v29-0006-Implemented-pg_publication_objects-view.patch
On Thu, Sep 16, 2021 at 9:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Sep 15, 2021 at 12:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Sep 14, 2021 at 2:08 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > I have handled this in the patch attached. > > > > > > > 4. > > AlterPublicationSchemas() > > { > > .. > > + /* > > + * If the table option was not specified remove the existing tables > > + * from the publication. > > + */ > > + if (!tables) > > + { > > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT); > > + PublicationDropTables(pubform->oid, rels, false, true); > > + } > > + > > + /* Identify which schemas should be dropped */ > > + delschemas = list_difference_oid(oldschemaids, schemaidlist); > > + > > + /* And drop them */ > > + PublicationDropSchemas(pubform->oid, delschemas, true); > > > > Here, you have neither locked tables to be dropped nor schemas. I > > think both need to be locked as we do for tables in similar code in > > AlterPublicationTables(). Can you please test via debugger what > > happens if we try to drop without taking lock here and concurrently > > try to drop the actual object? It should give some error. If we decide > > to lock here then we should be able to pass the list of relations to > > PublicationDropTables() instead of Oids which would then obviate the > > need for any change to that function. > > > > Similarly don't we need to lock schemas before dropping them in > > AlterPublicationTables()? > > > > I think there is one more similar locking problem. > AlterPublicationSchemas() > { > .. > + if (stmt->action == DEFELEM_ADD) > + { > + List *rels; > + > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT); > + RelSchemaIsMemberOfSchemaList(rels, schemaidlist, true); > ... > ... > } > > Here, we don't have a lock on the relation. So, what if the relation > is concurrently dropped after you get the rel list by > GetPublicationRelations? This works fine without locking even after concurrent drop, I felt this works because of MVCC. Regards, Vignesh
On Thu, Sep 16, 2021 at 11:24 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Sep 15, 2021 at 4:45 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > > > On Tue, Sep 14, 2021 at 6:38 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > I have handled this in the patch attached. > > > > > > > Regarding the following function in the v28-0002 patch: > > > > +/* > > + * Check if the relation schema is member of the schema list. > > + */ > > +static void > > +RelSchemaIsMemberOfSchemaList(List *rels, List *schemaidlist, bool schemacheck) > > > > I think this function is not well named or commented, and I don't like > > how the "schemacheck" bool parameter determines the type of objects in > > the "rels" list. > > > > I think after fixing the comments in my previous email, the rels list > will become the same for this function but surely the extra parameter > is required for giving object-specific errors. > > > I would suggest you simply split this function into two separate > > functions, corresponding to each of the blocks of the "if-else" within > > the for-loop of the existing RelSchemaIsMemberOfSchemaList function. > > The "Is" part of the existing "RelSchemaIsMemberOfSchemaList" function > > name implies a boolean return value, so seems misleading. > > So I think the names of the two functions that I am suggesting should > > be "CheckXXXXNotAlreadyInPublication" or something similar. > > > > I think if we write individual functions then we need to add new > functions as and when we add new object types like sequences. The > other idea could be to keep a single function like now say > CheckObjSchemaNotAlreadyInPublication and instead of the bool > parameter as the patch has now, we can keep an enum parameter > "add_obj_type" for 'rel', 'schema', 'sequence'. We can either use > exiting enum PublicationObjSpecType or define a new one for the same. Modified the function name and changed the parameter to PublicationObjSpecType. The changes are present at the v29 patch posted at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm1Wb%3D_HGd85wp2WM%2BfLc-8PSJ824TOZEJ6nDz3akWTidw%40mail.gmail.com Regards, Vignesh
On Fri, Sep 17, 2021 at 5:39 PM vignesh C <vignesh21@gmail.com> wrote: > > On Wed, Sep 15, 2021 at 12:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > 4. > > AlterPublicationSchemas() > > { > > .. > > + /* > > + * If the table option was not specified remove the existing tables > > + * from the publication. > > + */ > > + if (!tables) > > + { > > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT); > > + PublicationDropTables(pubform->oid, rels, false, true); > > + } > > + > > + /* Identify which schemas should be dropped */ > > + delschemas = list_difference_oid(oldschemaids, schemaidlist); > > + > > + /* And drop them */ > > + PublicationDropSchemas(pubform->oid, delschemas, true); > > > > Here, you have neither locked tables to be dropped nor schemas. I > > think both need to be locked as we do for tables in similar code in > > AlterPublicationTables(). Can you please test via debugger what > > happens if we try to drop without taking lock here and concurrently > > try to drop the actual object? It should give some error. If we decide > > to lock here then we should be able to pass the list of relations to > > PublicationDropTables() instead of Oids which would then obviate the > > need for any change to that function. > > > > Similarly don't we need to lock schemas before dropping them in > > AlterPublicationTables()? > > we will get the following error, if concurrently dropped from another > session during debugging: > postgres=# alter publication pub1 set all tables in schema sch2; > ERROR: cache lookup failed for publication table 16418 > Modified to add locking > But you haven't followed my other suggestion related to PublicationDropTables(). I don't think after doing this, you need to pass 'true' as the last parameter to PublicationDropTables. In fact, you can remove that parameter altogether or in other words, we don't need any change in PublicationDropTables for this patch. Is there a reason why we shouldn't make this change? Few other comments: =================== 1. The ordering of lock acquisition for schema and relation in AlterPublicationSchemas() and AlterPublicationTables() is opposite which would generally lead to deadlock but it won't here because we acquire share lock on the schema. But, I think it may still be better to keep the locking order the same and it might help us to keep schema and relation code separate 2. One more thing, I think one can concurrently add-relation for a particular schema and that particular schema. To protect that AlterPublication should acquire an exclusive lock similar to how we do in AlterSubscription. 3. + /* + * If the table option was not specified remove the existing tables + * from the publication. + */ + if (!relsSpecified) + { + List *relations = NIL; + List *tables = NIL; + + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT); + tables = RelationOidsToRangevars(rels); + relations = OpenTableList(tables); One problem with using OpenTableList here is that it might try to lock inherited children twice. Also, you don't need to first convert to rangevar for locking relations, you can directly use table_open here. 4. + | extended_relation_expr + { + $$ = makeNode(PublicationObjSpec); + $$->object = (Node *)$1; + } + | CURRENT_SCHEMA + { + $$ = makeNode(PublicationObjSpec); + $$->object = (Node *)makeString("CURRENT_SCHEMA"); + } ; -publication_for_tables: - FOR TABLE publication_table_list +/* This can be either a schema or relation name. */ +pubobj_name: + ColId { - $$ = (Node *) $3; + $$ = (Node *) makeString($1); } - | FOR ALL TABLES + | ColId indirection { - $$ = (Node *) makeInteger(true); + $$ = (Node *) makeRangeVarFromQualifiedName($1, $2, @1, yyscanner); } In some places, you have given space after (Node *) and at other places, there is no space. Isn't it better to be consistent? 5. +/* This can be either a schema or relation name. */ +pubobj_name: Here, we can modify the comment as "This can be either a schema or relation name. For relations, the inheritance will be implicit." And then remove the inheritance related comment from code below: + /* inheritance query, implicitly */ + $$ = makeNode(PublicationObjSpec); + $$->object = $1; -- With Regards, Amit Kapila.
On Fri, Sep 17, 2021 at 5:40 PM vignesh C <vignesh21@gmail.com> wrote: > > On Thu, Sep 16, 2021 at 9:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > I think there is one more similar locking problem. > > AlterPublicationSchemas() > > { > > .. > > + if (stmt->action == DEFELEM_ADD) > > + { > > + List *rels; > > + > > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT); > > + RelSchemaIsMemberOfSchemaList(rels, schemaidlist, true); > > ... > > ... > > } > > > > Here, we don't have a lock on the relation. So, what if the relation > > is concurrently dropped after you get the rel list by > > GetPublicationRelations? > > This works fine without locking even after concurrent drop, I felt > this works because of MVCC. > Can you share the exact scenario you have tested? I think here it can give a wrong error because it might access invalid cache entry, so I think a lock is required here. Also, as said before, this might help to make the rel list consistent in function CheckObjSchemaNotAlreadyInPublication(). -- With Regards, Amit Kapila.
On Mon, Sep 20, 2021 at 3:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Sep 17, 2021 at 5:40 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Thu, Sep 16, 2021 at 9:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > I think there is one more similar locking problem. > > > AlterPublicationSchemas() > > > { > > > .. > > > + if (stmt->action == DEFELEM_ADD) > > > + { > > > + List *rels; > > > + > > > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT); > > > + RelSchemaIsMemberOfSchemaList(rels, schemaidlist, true); > > > ... > > > ... > > > } > > > > > > Here, we don't have a lock on the relation. So, what if the relation > > > is concurrently dropped after you get the rel list by > > > GetPublicationRelations? > > > > This works fine without locking even after concurrent drop, I felt > > this works because of MVCC. > > > > Can you share the exact scenario you have tested? I think here it can > give a wrong error because it might access invalid cache entry, so I > think a lock is required here. Also, as said before, this might help > to make the rel list consistent in function > CheckObjSchemaNotAlreadyInPublication(). This is the scenario I tried: create schema sch1; create table sch1.t1 (c1 int); create publication pub1 for table sch1.t1; alter publication pub1 add all tables in schema sch1; -- concurrently drop table sch1.t1 from another session. I will add the locking and changing of CheckObjSchemaNotAlreadyInPublication in the next version. Regards, Vignesh
On Fri, Sep 17, 2021 at 10:09 PM vignesh C <vignesh21@gmail.com> wrote: > > Attached v29 patch has the fixes for the same. > Some minor comments on the v29-0002 patch: (1) In get_object_address_publication_schema(), the error message: + errmsg("publication tables of schema \"%s\" in publication \"%s\" does not exist", isn't grammatically correct. It should probably be: + errmsg("publication tables of schema \"%s\" in publication \"%s\" do not exist", (2) getPublicationSchemaInfo() function header. "string" should be "strings" BEFORE: + * nspname. Both pubname and nspname are palloc'd string which will be freed by AFTER: + * nspname. Both pubname and nspname are palloc'd strings which will be freed by (3) getPublicationSchemaInfo() In the case of "if (!(*nspname))", the following line should probably be added before returning false: *pubname = NULL; (4) GetAllSchemasPublicationRelations() function header Shouldn't: + * Gets the list of all relations published by FOR ALL TABLES IN SCHEMA + * publication(s). actually say: + * Gets the list of all relations published by a FOR ALL TABLES IN SCHEMA + * publication. since it is for a specified publication? (5) I'm wondering, in CheckObjSchemaNotAlreadyInPublication(), instead of checking "checkobjtype" each loop iteration, wouldn't it be better to just use the same for-loop in each IF block? Regards, Greg Nancarrow Fujitsu Australia
On Tue, Sep 21, 2021 at 9:03 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Fri, Sep 17, 2021 at 10:09 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Attached v29 patch has the fixes for the same. > > > > Some minor comments on the v29-0002 patch: > > (1) > In get_object_address_publication_schema(), the error message: > > + errmsg("publication tables of schema \"%s\" in publication \"%s\" > does not exist", > > isn't grammatically correct. It should probably be: > > + errmsg("publication tables of schema \"%s\" in publication \"%s\" do > not exist", "does not exist" is used across the file. Should we keep it like that to maintain consistency. Thoughts? > (2) > getPublicationSchemaInfo() function header. > "string" should be "strings" > > BEFORE: > + * nspname. Both pubname and nspname are palloc'd string which will be freed by > AFTER: > + * nspname. Both pubname and nspname are palloc'd strings which will > be freed by I will change it in the next version. > (3) > getPublicationSchemaInfo() > > In the case of "if (!(*nspname))", the following line should probably > be added before returning false: > > *pubname = NULL; In case of failure we return false and don't access it. I felt we could keep it as it is. Thoughts? > (4) > GetAllSchemasPublicationRelations() function header > > Shouldn't: > > + * Gets the list of all relations published by FOR ALL TABLES IN SCHEMA > + * publication(s). > > actually say: > > + * Gets the list of all relations published by a FOR ALL TABLES IN SCHEMA > + * publication. > > since it is for a specified publication? I will change it in the next version. > (5) > I'm wondering, in CheckObjSchemaNotAlreadyInPublication(), instead of > checking "checkobjtype" each loop iteration, wouldn't it be better to > just use the same for-loop in each IF block? I will be changing it to: static void CheckObjSchemaNotAlreadyInPublication(List *rels, List *schemaidlist, PublicationObjSpecType checkobjtype) { ListCell *lc; foreach(lc, rels) { Relation rel = (Relation) lfirst(lc); Oid relSchemaId = RelationGetNamespace(rel); if (list_member_oid(schemaidlist, relSchemaId)) { if (checkobjtype == PUBLICATIONOBJ_REL_IN_SCHEMA) ereport(ERROR, errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("cannot add schema \"%s\" to publication", get_namespace_name(relSchemaId)), errdetail("Table \"%s\" in schema \"%s\" is already part of the publication, adding the same schema is not supported.", RelationGetRelationName(rel), get_namespace_name(relSchemaId))); else if (checkobjtype == PUBLICATIONOBJ_TABLE) ereport(ERROR, errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("cannot add relation \"%s.%s\" to publication", get_namespace_name(relSchemaId), RelationGetRelationName(rel)), errdetail("Table's schema \"%s\" is already part of the publication.", get_namespace_name(relSchemaId))); } } } After the change checkobjtype will be checked only once in case of error. Regards, Vignesh
On Tue, Sep 21, 2021 at 4:12 PM vignesh C <vignesh21@gmail.com> wrote: > > > (1) > > In get_object_address_publication_schema(), the error message: > > > > + errmsg("publication tables of schema \"%s\" in publication \"%s\" > > does not exist", > > > > isn't grammatically correct. It should probably be: > > > > + errmsg("publication tables of schema \"%s\" in publication \"%s\" do > > not exist", > > "does not exist" is used across the file. Should we keep it like that > to maintain consistency. Thoughts? > When it's singular, "does not exist" is correct. I think currently only this case exists in the publication code. e.g. "publication \"%s\" does not exist" "publication relation \"%s\" in publication \"%s\" does not exist" But "publication tables" is plural, so it needs to say "do not exist" rather than "does not exist". > > > > In the case of "if (!(*nspname))", the following line should probably > > be added before returning false: > > > > *pubname = NULL; > > In case of failure we return false and don't access it. I felt we > could keep it as it is. Thoughts? > OK then, I might be being a bit pedantic. (I was just thinking, strictly speaking, we probably shouldn't be writing into the caller's pass-by-reference parameters in the case false is returned) > > (5) > > I'm wondering, in CheckObjSchemaNotAlreadyInPublication(), instead of > > checking "checkobjtype" each loop iteration, wouldn't it be better to > > just use the same for-loop in each IF block? > > I will be changing it to: > static void > CheckObjSchemaNotAlreadyInPublication(List *rels, List *schemaidlist, > PublicationObjSpecType checkobjtype) > { > ListCell *lc; > > foreach(lc, rels) > { > Relation rel = (Relation) lfirst(lc); > Oid relSchemaId = RelationGetNamespace(rel); > > if (list_member_oid(schemaidlist, relSchemaId)) > { > if (checkobjtype == PUBLICATIONOBJ_REL_IN_SCHEMA) > ereport(ERROR, > errcode(ERRCODE_INVALID_PARAMETER_VALUE), > errmsg("cannot add schema \"%s\" to publication", > get_namespace_name(relSchemaId)), > errdetail("Table \"%s\" in schema \"%s\" is already part > of the publication, adding the same schema is not supported.", > RelationGetRelationName(rel), > get_namespace_name(relSchemaId))); > else if (checkobjtype == PUBLICATIONOBJ_TABLE) > ereport(ERROR, > errcode(ERRCODE_INVALID_PARAMETER_VALUE), > errmsg("cannot add relation \"%s.%s\" to publication", > get_namespace_name(relSchemaId), > RelationGetRelationName(rel)), > errdetail("Table's schema \"%s\" is already part of the > publication.", > get_namespace_name(relSchemaId))); > } > } > } > After the change checkobjtype will be checked only once in case of error. > OK. One thing related to this code is the following: i) postgres=# create publication pub1 for all tables in schema sch1, table sch1.test; ERROR: cannot add relation "sch1.test" to publication DETAIL: Table's schema "sch1" is already part of the publication. ii) postgres=# create publication pub1 for table sch1.test, all tables in schema sch1; ERROR: cannot add relation "sch1.test" to publication DETAIL: Table's schema "sch1" is already part of the publication. Notice that in case (ii), the same error message is used, but the order of items to be "added" to the publication is the reverse of case (i), and really implies the table "sch1.test" was added first, but this is not reflected by the error message. So it seems slightly odd to say the schema is already part of the publication, when the table was actually listed first. I'm wondering if this can be improved? One idea I had was the following more generic type of message, but I'm not 100% happy with the wording: DETAIL: Schema "sch1" and one of its tables can't separately be part of the publication. Regards, Greg Nancarrow Fujitsu Australia
On Mon, Sep 20, 2021 at 3:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Sep 17, 2021 at 5:39 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Wed, Sep 15, 2021 at 12:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > 4. > > > AlterPublicationSchemas() > > > { > > > .. > > > + /* > > > + * If the table option was not specified remove the existing tables > > > + * from the publication. > > > + */ > > > + if (!tables) > > > + { > > > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT); > > > + PublicationDropTables(pubform->oid, rels, false, true); > > > + } > > > + > > > + /* Identify which schemas should be dropped */ > > > + delschemas = list_difference_oid(oldschemaids, schemaidlist); > > > + > > > + /* And drop them */ > > > + PublicationDropSchemas(pubform->oid, delschemas, true); > > > > > > Here, you have neither locked tables to be dropped nor schemas. I > > > think both need to be locked as we do for tables in similar code in > > > AlterPublicationTables(). Can you please test via debugger what > > > happens if we try to drop without taking lock here and concurrently > > > try to drop the actual object? It should give some error. If we decide > > > to lock here then we should be able to pass the list of relations to > > > PublicationDropTables() instead of Oids which would then obviate the > > > need for any change to that function. > > > > > > Similarly don't we need to lock schemas before dropping them in > > > AlterPublicationTables()? > > > > we will get the following error, if concurrently dropped from another > > session during debugging: > > postgres=# alter publication pub1 set all tables in schema sch2; > > ERROR: cache lookup failed for publication table 16418 > > Modified to add locking > > > > But you haven't followed my other suggestion related to > PublicationDropTables(). I don't think after doing this, you need to > pass 'true' as the last parameter to PublicationDropTables. In fact, > you can remove that parameter altogether or in other words, we don't > need any change in PublicationDropTables for this patch. Is there a > reason why we shouldn't make this change? Modified. > Few other comments: > =================== > 1. The ordering of lock acquisition for schema and relation in > AlterPublicationSchemas() and AlterPublicationTables() is opposite > which would generally lead to deadlock but it won't here because we > acquire share lock on the schema. But, I think it may still be better > to keep the locking order the same and it might help us to keep schema > and relation code separate Modified > 2. One more thing, I think one can concurrently add-relation for a > particular schema and that particular schema. To protect that > AlterPublication should acquire an exclusive lock similar to how we do > in AlterSubscription. Modified > 3. > + /* > + * If the table option was not specified remove the existing tables > + * from the publication. > + */ > + if (!relsSpecified) > + { > + List *relations = NIL; > + List *tables = NIL; > + > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT); > + tables = RelationOidsToRangevars(rels); > + relations = OpenTableList(tables); > > One problem with using OpenTableList here is that it might try to lock > inherited children twice. Also, you don't need to first convert to > rangevar for locking relations, you can directly use table_open here. Modified > 4. > + | extended_relation_expr > + { > + $$ = makeNode(PublicationObjSpec); > + $$->object = (Node *)$1; > + } > + | CURRENT_SCHEMA > + { > + $$ = makeNode(PublicationObjSpec); > + $$->object = (Node *)makeString("CURRENT_SCHEMA"); > + } > ; > > -publication_for_tables: > - FOR TABLE publication_table_list > +/* This can be either a schema or relation name. */ > +pubobj_name: > + ColId > { > - $$ = (Node *) $3; > + $$ = (Node *) makeString($1); > } > - | FOR ALL TABLES > + | ColId indirection > { > - $$ = (Node *) makeInteger(true); > + $$ = (Node *) makeRangeVarFromQualifiedName($1, $2, @1, yyscanner); > } > > In some places, you have given space after (Node *) and at other > places, there is no space. Isn't it better to be consistent? I have changed it to "(Node *)" without space in *.y, I did not change in *.c as I noticed it is used with space like "(Node *) " in other places of *.c files. > 5. > +/* This can be either a schema or relation name. */ > +pubobj_name: > > Here, we can modify the comment as "This can be either a schema or > relation name. For relations, the inheritance will be implicit." Modified And > then remove the inheritance related comment from code below: > > + /* inheritance query, implicitly */ > + $$ = makeNode(PublicationObjSpec); > + $$->object = $1; Modified. Attached v30 patch has the fixes for the same. Regards, Vignesh
Attachment
- v30-0001-Made-the-existing-relation-cache-invalidation-an.patch
- v30-0002-Added-schema-level-support-for-publication.patch
- v30-0003-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v30-0004-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v30-0005-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v30-0006-Implemented-pg_publication_objects-view.patch
On Mon, Sep 20, 2021 at 4:20 PM vignesh C <vignesh21@gmail.com> wrote: > > On Mon, Sep 20, 2021 at 3:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Sep 17, 2021 at 5:40 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Thu, Sep 16, 2021 at 9:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > I think there is one more similar locking problem. > > > > AlterPublicationSchemas() > > > > { > > > > .. > > > > + if (stmt->action == DEFELEM_ADD) > > > > + { > > > > + List *rels; > > > > + > > > > + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT); > > > > + RelSchemaIsMemberOfSchemaList(rels, schemaidlist, true); > > > > ... > > > > ... > > > > } > > > > > > > > Here, we don't have a lock on the relation. So, what if the relation > > > > is concurrently dropped after you get the rel list by > > > > GetPublicationRelations? > > > > > > This works fine without locking even after concurrent drop, I felt > > > this works because of MVCC. > > > > > > > Can you share the exact scenario you have tested? I think here it can > > give a wrong error because it might access invalid cache entry, so I > > think a lock is required here. Also, as said before, this might help > > to make the rel list consistent in function > > CheckObjSchemaNotAlreadyInPublication(). > > This is the scenario I tried: > create schema sch1; > create table sch1.t1 (c1 int); > create publication pub1 for table sch1.t1; > alter publication pub1 add all tables in schema sch1; -- concurrently > drop table sch1.t1 from another session. > > I will add the locking and changing of > CheckObjSchemaNotAlreadyInPublication in the next version. I have made the changes for the above at the v30 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm3aFtXpkD4m28-ENG9F4faBEVdGNUrEhgKV0pHr2S_C2g%40mail.gmail.com Regards, Vignesh
On Tue, Sep 21, 2021 at 9:03 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Fri, Sep 17, 2021 at 10:09 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Attached v29 patch has the fixes for the same. > > > > Some minor comments on the v29-0002 patch: > > (1) > In get_object_address_publication_schema(), the error message: > > + errmsg("publication tables of schema \"%s\" in publication \"%s\" > does not exist", > > isn't grammatically correct. It should probably be: > > + errmsg("publication tables of schema \"%s\" in publication \"%s\" do > not exist", Modified > (2) > getPublicationSchemaInfo() function header. > "string" should be "strings" > > BEFORE: > + * nspname. Both pubname and nspname are palloc'd string which will be freed by > AFTER: > + * nspname. Both pubname and nspname are palloc'd strings which will > be freed by Modified > (3) > getPublicationSchemaInfo() > > In the case of "if (!(*nspname))", the following line should probably > be added before returning false: > > *pubname = NULL; I left it as it is, as we don't access it in case of return false. > (4) > GetAllSchemasPublicationRelations() function header > > Shouldn't: > > + * Gets the list of all relations published by FOR ALL TABLES IN SCHEMA > + * publication(s). > > actually say: > > + * Gets the list of all relations published by a FOR ALL TABLES IN SCHEMA > + * publication. > > since it is for a specified publication? Modified > (5) > I'm wondering, in CheckObjSchemaNotAlreadyInPublication(), instead of > checking "checkobjtype" each loop iteration, wouldn't it be better to > just use the same for-loop in each IF block? Modified I have made the changes for the above at the v30 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm3aFtXpkD4m28-ENG9F4faBEVdGNUrEhgKV0pHr2S_C2g%40mail.gmail.com Regards, Vignesh
On Wed, Sep 22, 2021 at 4:02 AM vignesh C <vignesh21@gmail.com> wrote: > > Attached v30 patch has the fixes for the same. > Thanks for all the patch updates. I have some suggested updates to the v30-0005 documentation patch: doc/src/sgml/ref/alter_publication.sgml (1) I'm thinking it might be better to simplify the description, because it's a bit wordy and difficult to read with the "all tables in schema" bits. Suggested update is below (thoughts?): BEFORE: + The first three variants change which tables and/or all tables in schema are + part of the publication. The <literal>SET</literal> clause will replace + the list of tables and/or all tables in schema in the publication with the + specified one, the existing tables and all tables in schema that were + present in the publication will be removed. The <literal>ADD</literal> + clause will add one or more tables and/or all tables in schema to the + publication. The <literal>DROP</literal> clauses will remove one or more + tables and/or all tables in schema from the publication. Note that adding + tables and/or all tables in schema to a publication that is already + subscribed to will require a <literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> + action on the subscribing side in order to become effective. AFTER: + The first three variants change which tables/schemas are + part of the publication. The <literal>SET</literal> clause will replace + the list of tables/schemas in the publication with the + specified list; the existing tables/schemas that were + present in the publication will be removed. The <literal>ADD</literal> + clause will add one or more tables/schemas to the + publication. The <literal>DROP</literal> clauses will remove one or more + tables/schemas from the publication. Note that adding + tables/schemas to a publication that is already + subscribed to will require a <literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> + action on the subscribing side in order to become effective. doc/src/sgml/ref/create_publication.sgml (2) I suggest an update similar to the following: BEFORE: + Specifying a table that is part of schema specified in + <literal>FOR ALL TABLES IN SCHEMA</literal> option is not supported. AFTER: + Specifying a table that is part of a schema already included in the publication is not supported. (3) I find the following description a little unclear: + <para> + Specifying a schema along with schema's table specified as part of + <literal>FOR TABLE</literal> option is not supported. + </para> Perhaps the following would be better: + <para> + Specifying a schema that contains a table already included in the + publication is not supported. + </para> (4) Minor fix: BEFORE: + rights on the table. The <command>FOR ALL TABLES</command> and + <command>FOR ALL TABLES IN SCHEMA</command> clause requires the invoking + user to be a superuser. AFTER: + rights on the table. The <command>FOR ALL TABLES</command> and + <command>FOR ALL TABLES IN SCHEMA</command> clauses require the invoking + user to be a superuser. Regards, Greg Nancarrow Fujitsu Australia
On Wednesday, September 22, 2021 2:02 AM vignesh C <vignesh21@gmail.com> wrote: > Attached v30 patch has the fixes for the same. Thanks for updating the patches. I have one comment. @@ -474,7 +707,75 @@ AlterPublication(ParseState *pstate, AlterPublicationStmt *stmt) ... + if (list_length(relations)) + { ... + /* remove the existing schemas from the publication */ + PublicationDropSchemas(pubform->oid, delschemas, false); ... + } After more thoughts on it, I still don't think drop all the schemas under " if (list_length(relations))" is a good idea. I think 1) we'd better keep schema and relation code separate. 2) if we support other type object(SEQUENCE) I the future and only SET xx SEQUENCE, I think the above logic won't work because both relations and schemaidlist will be NIL. Same as the logic of drop all tables under " if (list_length(schemaidlist))". Thoughs ? Best regards, Hou zj
On Wed, Sep 22, 2021 at 3:02 AM vignesh C <vignesh21@gmail.com> wrote: > > > Attached v30 patch has the fixes for the same. > Thank you for updating the patches. Here are random comments on v30-0002 patch: + + if (stmt->action == DEFELEM_SET && !list_length(schemaidlist)) + { + delschemas = GetPublicationSchemas(pubform->oid); + LockSchemaList(delschemas); + } I think "list_length(schemaidlist) > 0" would be more readable. --- This patch introduces some new static functions to publicationcmds.c but some have function prototypes but some don't. As far as I checked, ObjectsInPublicationToOids() CheckObjSchemaNotAlreadyInPublication() GetAlterPublicationDelRelations() AlterPublicationSchemas() CheckPublicationAlterTables() CheckPublicationAlterSchemas() LockSchemaList() OpenReliIdList() PublicationAddSchemas() PublicationDropSchemas() are newly introduced but only four functions: OpenReliIdList() LockSchemaList() PublicationAddSchemas() PublicationDropSchemas() have function prototypes. Actually, there already are functions that don't have their prototype in publicationcmds.c. But it seems better to clarify what kind of functions don't need to have a prototype at least in this file. --- ISTM we can inline the contents of three functions: GetAlterPublicationDelRelations(), CheckPublicationAlterTables(), and CheckPublicationAlterSchemas(). These have only one caller and ISTM makes the readability worse. I think it's not necessary to make functions for them. --- + * This is dispatcher function for AlterPublicationOptions, + * AlterPublicationSchemas and AlterPublicationTables. As this comment mentioned, AlterPublication() calls AlterPublicationTables() and AlterPublicationSchemas() but this function also a lot of pre-processing such as building the list and some checks, depending on stmt->action before calling these two functions. And those two functions also perform some operation depending on stmt->action. So ISTM it's better to move those pre-processing to these two functions and have AlterPublication() just call these two functions. What do you think? --- +List * +GetAllSchemasPublicationRelations(Oid puboid, PublicationPartOpt pub_partopt) Since this function gets all relations in the schema publication, I think GetAllSchemaPublicationRelations() would be better as a function name (removing 's' before 'P'). --- + if (!IsA(node, String)) + ereport(ERROR, + errcode(ERRCODE_SYNTAX_ERROR), + errmsg("invalid schema name at or near"), + parser_errposition(pstate, pubobj->location)); The error message should mention where the invalid schema name is at or near. Also, In the following example, the error position in the error message seems not to be where the invalid schemaname s.s is: postgres(1:47707)=# create publication p for all tables in schema s.s; ERROR: invalid schema name at or near LINE 1: create publication p for all tables in schema s.s; ^ --- + if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE) + { + if (IsA(node, RangeVar)) + *rels = lappend(*rels, (RangeVar *) node); + else if (IsA(node, String)) + { + RangeVar *rel = makeRangeVar(NULL, strVal(node), + pubobj->location); + + *rels = lappend(*rels, rel); + } + } + else if (pubobj->pubobjtype == PUBLICATIONOBJ_REL_IN_SCHEMA) + { (snip) + /* Filter out duplicates if user specifies "sch1, sch1" */ + *schemas = list_append_unique_oid(*schemas, schemaid); + } Do we need to filter out duplicates also in PUBLICATIONOBJ_TABLE case since users can specify things like "TABLE tbl, tbl, tbl"? --- + if ((action == DEFELEM_ADD || action == DEFELEM_SET) && !superuser()) + ereport(ERROR, + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("must be superuser to add or set schemas"))); Why do we require the superuser privilege only for ADD and SET but not for DROP? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
On Wed, Sep 22, 2021 at 8:02 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Wednesday, September 22, 2021 2:02 AM vignesh C <vignesh21@gmail.com> wrote: > > Attached v30 patch has the fixes for the same. > > Thanks for updating the patches. > > I have one comment. > @@ -474,7 +707,75 @@ AlterPublication(ParseState *pstate, AlterPublicationStmt *stmt) > ... > + if (list_length(relations)) > + { > ... > + /* remove the existing schemas from the publication */ > + PublicationDropSchemas(pubform->oid, delschemas, false); > ... > + } > > After more thoughts on it, I still don't think drop all the schemas under " if > (list_length(relations))" is a good idea. I think 1) we'd better keep schema > and relation code separate. > How do you suggest changing it? -- With Regards, Amit Kapila.
On Wednesday, September 22, 2021 11:22 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > --- > + if (!IsA(node, String)) > + ereport(ERROR, > + errcode(ERRCODE_SYNTAX_ERROR), > + errmsg("invalid schema > name at or near"), > + > parser_errposition(pstate, pubobj->location)); > > The error message should mention where the invalid schema name is at > or near. Also, In the following example, the error position in the > error message seems not to be where the invalid schemaname s.s is: > > postgres(1:47707)=# create publication p for all tables in schema s.s; > ERROR: invalid schema name at or near > LINE 1: create publication p for all tables in schema s.s; > ^ > I noticed this, too. And I think it could be fixed by the following change, thoughts? --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -9681,7 +9681,7 @@ PublicationObjSpec: TABLE pubobj_expr { $$ = $5; $$->pubobjtype = PUBLICATIONOBJ_REL_IN_SCHEMA; - $$->location = @1; + $$->location = @5; } | pubobj_expr { Besides, about this change in tab-complete.c: + else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "SCHEMA")) + COMPLETE_WITH_QUERY(Query_for_list_of_schemas + " UNION SELECT 'CURRENT_SCHEMA'"); It should be "ALL TABLES IN SCHEMA" not "SCHEMA" at the first line, right? Regards Tang
On Tue, Sep 21, 2021 at 11:39 PM vignesh C <vignesh21@gmail.com> wrote: > > On Tue, Sep 21, 2021 at 9:03 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > > > On Fri, Sep 17, 2021 at 10:09 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Attached v29 patch has the fixes for the same. > > > > > > > Some minor comments on the v29-0002 patch: > > > > (1) > > In get_object_address_publication_schema(), the error message: > > > > + errmsg("publication tables of schema \"%s\" in publication \"%s\" > > does not exist", > > > > isn't grammatically correct. It should probably be: > > > > + errmsg("publication tables of schema \"%s\" in publication \"%s\" do > > not exist", > > Modified > I still see the old message in v30. But I have a different suggestion for this message. How about changing it to: "publication schema \"%s\" in publication \"%s\" does not exist"? This will make it similar to other messages and I don't see the need here to add 'tables' as we have it in grammar. -- With Regards, Amit Kapila.
On Wed, Sep 22, 2021 1:29 PMAmit Kapila <amit.kapila16@gmail.com> wrote: > On Wed, Sep 22, 2021 at 8:02 AM houzj.fnst@fujitsu.com > <houzj.fnst@fujitsu.com> wrote: > > > > On Wednesday, September 22, 2021 2:02 AM vignesh C > <vignesh21@gmail.com> wrote: > > > Attached v30 patch has the fixes for the same. > > > > Thanks for updating the patches. > > > > I have one comment. > > @@ -474,7 +707,75 @@ AlterPublication(ParseState *pstate, > > AlterPublicationStmt *stmt) ... > > + if (list_length(relations)) > > + { > > ... > > + /* remove the existing schemas from the publication > */ > > + PublicationDropSchemas(pubform->oid, > > + delschemas, false); > > ... > > + } > > > > After more thoughts on it, I still don't think drop all the schemas > > under " if (list_length(relations))" is a good idea. I think 1) we'd > > better keep schema and relation code separate. > > > > How do you suggest changing it? Personally, I think we'd better move the code about changing publication's tablelist into AlterPublicationTables and the code about changing publication's schemalist into AlterPublicationSchemas. It's similar to what the v29-patchset did, the difference is the SET action, I suggest we drop all the tables in function AlterPublicationTables when user only set schemas and drop all the schema in AlterPublicationSchemas when user only set tables. In this approach, we can keep schema and relation code separate and don't need to worry about the locking order. Attach a top-up patch which refactor the code like above. Thoughts ? Best regards, Hou zj
Attachment
On Wed, Sep 22, 2021 at 9:33 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > > > > How do you suggest changing it? > > Personally, I think we'd better move the code about changing publication's > tablelist into AlterPublicationTables and the code about changing publication's > schemalist into AlterPublicationSchemas. It's similar to what the v29-patchset > did, the difference is the SET action, I suggest we drop all the tables in > function AlterPublicationTables when user only set schemas and drop all the > schema in AlterPublicationSchemas when user only set tables. In this approach, > we can keep schema and relation code separate and don't need to worry > about the locking order. > > Attach a top-up patch which refactor the code like above. > Thoughts ? > Sounds like a good idea. Is it possible to incorporate the existing CheckPublicationAlterTables() and CheckPublicationAlterSchemas() functions into your suggested update? I think it might tidy up the error-checking a bit. Regards, Greg Nancarrow Fujitsu Australia
On Wed, Sep 22, 2021 at 5:03 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > Personally, I think we'd better move the code about changing publication's > tablelist into AlterPublicationTables and the code about changing publication's > schemalist into AlterPublicationSchemas. It's similar to what the v29-patchset > did, the difference is the SET action, I suggest we drop all the tables in > function AlterPublicationTables when user only set schemas and drop all the > schema in AlterPublicationSchemas when user only set tables. In this approach, > we can keep schema and relation code separate and don't need to worry > about the locking order. > > Attach a top-up patch which refactor the code like above. > Good suggestion. I think it would still be better if we can move the checks related to superuser and puballtables into a separate function that gets called before taking a lock on publication. -- With Regards, Amit Kapila.
On Wed, Sep 22, 2021 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Wed, Sep 22, 2021 at 3:02 AM vignesh C <vignesh21@gmail.com> wrote: > > > --- > This patch introduces some new static functions to publicationcmds.c > but some have function prototypes but some don't. As far as I checked, > > ObjectsInPublicationToOids() > CheckObjSchemaNotAlreadyInPublication() > GetAlterPublicationDelRelations() > AlterPublicationSchemas() > CheckPublicationAlterTables() > CheckPublicationAlterSchemas() > LockSchemaList() > OpenReliIdList() > PublicationAddSchemas() > PublicationDropSchemas() > > are newly introduced but only four functions: > > OpenReliIdList() > LockSchemaList() > PublicationAddSchemas() > PublicationDropSchemas() > > have function prototypes. Actually, there already are functions that > don't have their prototype in publicationcmds.c. But it seems better > to clarify what kind of functions don't need to have a prototype at > least in this file. > I think if the function is defined after its use then we declare it at the top. Do you prefer to declare all static functions to allow ease of usage? Do you have something else in mind? > > --- > + if ((action == DEFELEM_ADD || action == DEFELEM_SET) && !superuser()) > + ereport(ERROR, > + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > + errmsg("must be superuser to add or > set schemas"))); > > Why do we require the superuser privilege only for ADD and SET but not for DROP? > For Add/Set of for all tables of Schema is similar to all tables publication requirement. For Drop, I don't think it is mandatory to allow only to superuser. The same applies to Alter Publication ... Drop table case where you don't need to be table owner whereas, for Add, you need to be. We had a discussion on these points in this thread. See [1] and some emails prior to it. [1] - https://www.postgresql.org/message-id/CAA4eK1KqhUBHbcpT92VMPvUUDgGvyOK0ekXOwjNR6L%3DY_bcsGw%40mail.gmail.com -- With Regards, Amit Kapila.
From Thurs, Sep 23, 2021 12:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > On Wed, Sep 22, 2021 at 5:03 PM Hou Zhijie <houzj.fnst@fujitsu.com> wrote: > > > > Personally, I think we'd better move the code about changing publication's > > tablelist into AlterPublicationTables and the code about changing > publication's > > schemalist into AlterPublicationSchemas. It's similar to what the > v29-patchset > > did, the difference is the SET action, I suggest we drop all the tables in > > function AlterPublicationTables when user only set schemas and drop all the > > schema in AlterPublicationSchemas when user only set tables. In this > approach, > > we can keep schema and relation code separate and don't need to worry > > about the locking order. > > > > Attach a top-up patch which refactor the code like above. > > > > Good suggestion. I think it would still be better if we can move the > checks related to superuser and puballtables into a separate function > that gets called before taking a lock on publication. I agreed. I noticed v30-0001 has been committed with some minor changes, and the V30-0002 patchset need to be rebased accordingly. Attach a rebased version patch set to make cfbot happy. Also Attach the two top-up patches which refactor the code as suggested. (top-up patch 1 is to keep schema and table code separate, top-up patch 2 is to move some cheap check into a function and invoke it before locking.) Best regards, Hou zj
Attachment
- top-up-1-schema-refactor.patch
- top-up-2-move-check-into-a-separate-function.patch
- v31-0001-Added-schema-level-support-for-publication.patch
- v31-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v31-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v31-0004-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v31-0005-Implemented-pg_publication_objects-view.patch
On Thur, Sep 23, 2021 11:06 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > On Wed, Sep 22, 2021 at 9:33 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > > > > > > > How do you suggest changing it? > > > > Personally, I think we'd better move the code about changing > > publication's tablelist into AlterPublicationTables and the code about > > changing publication's schemalist into AlterPublicationSchemas. It's > > similar to what the v29-patchset did, the difference is the SET > > action, I suggest we drop all the tables in function > > AlterPublicationTables when user only set schemas and drop all the > > schema in AlterPublicationSchemas when user only set tables. In this > > approach, we can keep schema and relation code separate and don't need to > worry about the locking order. > > > > Attach a top-up patch which refactor the code like above. > > Thoughts ? > > > > Sounds like a good idea. > Is it possible to incorporate the existing > CheckPublicationAlterTables() and CheckPublicationAlterSchemas() functions > into your suggested update? > I think it might tidy up the error-checking a bit. I agreed we can put the check about ALL TABLE and superuser into a function like what the v30-patchset did. But I have some hesitations about the code related to CheckObjSchemaNotAlreadyInPublication(). Currently, we need to open and lock the table before invoking the CheckObjxxx function, ISTM we'd better open the table in function AlterPublicationTables. Maybe we can wait for the author's(Vignesh) opinion. Best regards, Hou zj
On Thu, Sep 23, 2021 at 5:02 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > > Sounds like a good idea. > > Is it possible to incorporate the existing > > CheckPublicationAlterTables() and CheckPublicationAlterSchemas() functions > > into your suggested update? > > I think it might tidy up the error-checking a bit. > > I agreed we can put the check about ALL TABLE and superuser into a function > like what the v30-patchset did. But I have some hesitations about the code > related to CheckObjSchemaNotAlreadyInPublication(). Currently, we need to open > and lock the table before invoking the CheckObjxxx function, ISTM we'd better > open the table in function AlterPublicationTables. Maybe we can wait for the > author's(Vignesh) opinion. > Yes, I think you're right, the code related to CheckObjSchemaNotAlreadyInPublication() should be left where it is (according to your schema refactor patch). Regards, Greg Nancarrow Fujitsu Australia
On Wed, Sep 22, 2021 at 6:57 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Wed, Sep 22, 2021 at 4:02 AM vignesh C <vignesh21@gmail.com> wrote: > > > > Attached v30 patch has the fixes for the same. > > > > Thanks for all the patch updates. > I have some suggested updates to the v30-0005 documentation patch: > > doc/src/sgml/ref/alter_publication.sgml > (1) > I'm thinking it might be better to simplify the description, because > it's a bit wordy and difficult to read with the "all tables in schema" > bits. > Suggested update is below (thoughts?): > > BEFORE: > + The first three variants change which tables and/or all tables in schema are > + part of the publication. The <literal>SET</literal> clause will replace > + the list of tables and/or all tables in schema in the publication with the > + specified one, the existing tables and all tables in schema that were > + present in the publication will be removed. The <literal>ADD</literal> > + clause will add one or more tables and/or all tables in schema to the > + publication. The <literal>DROP</literal> clauses will remove one or more > + tables and/or all tables in schema from the publication. Note that adding > + tables and/or all tables in schema to a publication that is already > + subscribed to will require a <literal>ALTER SUBSCRIPTION ... > REFRESH PUBLICATION</literal> > + action on the subscribing side in order to become effective. > AFTER: > + The first three variants change which tables/schemas are > + part of the publication. The <literal>SET</literal> clause will replace > + the list of tables/schemas in the publication with the > + specified list; the existing tables/schemas that were > + present in the publication will be removed. The <literal>ADD</literal> > + clause will add one or more tables/schemas to the > + publication. The <literal>DROP</literal> clauses will remove one or more > + tables/schemas from the publication. Note that adding > + tables/schemas to a publication that is already > + subscribed to will require a <literal>ALTER SUBSCRIPTION ... > REFRESH PUBLICATION</literal> > + action on the subscribing side in order to become effective. Modified > > doc/src/sgml/ref/create_publication.sgml > (2) > I suggest an update similar to the following: > > BEFORE: > + Specifying a table that is part of schema specified in > + <literal>FOR ALL TABLES IN SCHEMA</literal> option is not supported. > AFTER: > + Specifying a table that is part of a schema already included in > the publication is not supported. This doc content is for the following example: create publication pub1 for all tables in schema sch1, table sch1.t1; It is about the schema specified in all tables in schema option along with table option. I think the existing content is better. > (3) > I find the following description a little unclear: > > + <para> > + Specifying a schema along with schema's table specified as part of > + <literal>FOR TABLE</literal> option is not supported. > + </para> > > Perhaps the following would be better: > > + <para> > + Specifying a schema that contains a table already included in the > + publication is not supported. > + </para> Similar to above > (4) > Minor fix: > BEFORE: > + rights on the table. The <command>FOR ALL TABLES</command> and > + <command>FOR ALL TABLES IN SCHEMA</command> clause requires the invoking > + user to be a superuser. > AFTER: > + rights on the table. The <command>FOR ALL TABLES</command> and > + <command>FOR ALL TABLES IN SCHEMA</command> clauses require the invoking > + user to be a superuser. Modified Attached v32 patch has the fixes for the same. Regards, Vignesh
Attachment
- v32-0001-Added-schema-level-support-for-publication.patch
- v32-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v32-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v32-0004-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v32-0005-Implemented-pg_publication_objects-view.patch
On Tue, Sep 21, 2021 at 6:05 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Tue, Sep 21, 2021 at 4:12 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > (1) > > > In get_object_address_publication_schema(), the error message: > > > > > > + errmsg("publication tables of schema \"%s\" in publication \"%s\" > > > does not exist", > > > > > > isn't grammatically correct. It should probably be: > > > > > > + errmsg("publication tables of schema \"%s\" in publication \"%s\" do > > > not exist", > > > > "does not exist" is used across the file. Should we keep it like that > > to maintain consistency. Thoughts? > > > > When it's singular, "does not exist" is correct. > I think currently only this case exists in the publication code. > e.g. > "publication \"%s\" does not exist" > "publication relation \"%s\" in publication \"%s\" does not exist" > > But "publication tables" is plural, so it needs to say "do not exist" > rather than "does not exist". > > > > > > > In the case of "if (!(*nspname))", the following line should probably > > > be added before returning false: > > > > > > *pubname = NULL; > > > > In case of failure we return false and don't access it. I felt we > > could keep it as it is. Thoughts? > > > > OK then, I might be being a bit pedantic. > (I was just thinking, strictly speaking, we probably shouldn't be > writing into the caller's pass-by-reference parameters in the case > false is returned) > > > > (5) > > > I'm wondering, in CheckObjSchemaNotAlreadyInPublication(), instead of > > > checking "checkobjtype" each loop iteration, wouldn't it be better to > > > just use the same for-loop in each IF block? > > > > I will be changing it to: > > static void > > CheckObjSchemaNotAlreadyInPublication(List *rels, List *schemaidlist, > > PublicationObjSpecType checkobjtype) > > { > > ListCell *lc; > > > > foreach(lc, rels) > > { > > Relation rel = (Relation) lfirst(lc); > > Oid relSchemaId = RelationGetNamespace(rel); > > > > if (list_member_oid(schemaidlist, relSchemaId)) > > { > > if (checkobjtype == PUBLICATIONOBJ_REL_IN_SCHEMA) > > ereport(ERROR, > > errcode(ERRCODE_INVALID_PARAMETER_VALUE), > > errmsg("cannot add schema \"%s\" to publication", > > get_namespace_name(relSchemaId)), > > errdetail("Table \"%s\" in schema \"%s\" is already part > > of the publication, adding the same schema is not supported.", > > RelationGetRelationName(rel), > > get_namespace_name(relSchemaId))); > > else if (checkobjtype == PUBLICATIONOBJ_TABLE) > > ereport(ERROR, > > errcode(ERRCODE_INVALID_PARAMETER_VALUE), > > errmsg("cannot add relation \"%s.%s\" to publication", > > get_namespace_name(relSchemaId), > > RelationGetRelationName(rel)), > > errdetail("Table's schema \"%s\" is already part of the > > publication.", > > get_namespace_name(relSchemaId))); > > } > > } > > } > > After the change checkobjtype will be checked only once in case of error. > > > > OK. > > One thing related to this code is the following: > > i) > postgres=# create publication pub1 for all tables in schema sch1, > table sch1.test; > ERROR: cannot add relation "sch1.test" to publication > DETAIL: Table's schema "sch1" is already part of the publication. > > ii) > postgres=# create publication pub1 for table sch1.test, all tables in > schema sch1; > ERROR: cannot add relation "sch1.test" to publication > DETAIL: Table's schema "sch1" is already part of the publication. > > Notice that in case (ii), the same error message is used, but the > order of items to be "added" to the publication is the reverse of case > (i), and really implies the table "sch1.test" was added first, but > this is not reflected by the error message. So it seems slightly odd > to say the schema is already part of the publication, when the table > was actually listed first. > I'm wondering if this can be improved? > > One idea I had was the following more generic type of message, but I'm > not 100% happy with the wording: > > DETAIL: Schema "sch1" and one of its tables can't separately be > part of the publication. This is common code applicable for the following scenarios: i) create publication pub1 for all tables in schema sch1, table sch1.test; ii) create publication pub1 for table sch1.test, all tables in schema sch1; iii) create publication pub1 for table sch1.test; alter publication pub1 add all tables in schema sch1; I have changed it to make it suitable for all the cases. ereport(ERROR, errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("cannot add relation \"%s.%s\" to publication", get_namespace_name(relSchemaId), RelationGetRelationName(rel)), errdetail("Table's schema \"%s\" is already part of the publication or part of the specified schema list.", get_namespace_name(relSchemaId))); The v32 patch attached at [1] handles the above. [1] - https://www.postgresql.org/message-id/CALDaNm1R-xbQvz4LU5OXu3KKwbWOz3uDcT_YjRU6V0R5FZDYDg%40mail.gmail.com Regards, Vignesh
On Wed, Sep 22, 2021 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Wed, Sep 22, 2021 at 3:02 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Attached v30 patch has the fixes for the same. > > > > Thank you for updating the patches. > > Here are random comments on v30-0002 patch: > > + > + if (stmt->action == DEFELEM_SET && > !list_length(schemaidlist)) > + { > + delschemas = > GetPublicationSchemas(pubform->oid); > + LockSchemaList(delschemas); > + } > > I think "list_length(schemaidlist) > 0" would be more readable. We have refactored the patch which has removed these changes. > --- > This patch introduces some new static functions to publicationcmds.c > but some have function prototypes but some don't. As far as I checked, > > ObjectsInPublicationToOids() > CheckObjSchemaNotAlreadyInPublication() > GetAlterPublicationDelRelations() > AlterPublicationSchemas() > CheckPublicationAlterTables() > CheckPublicationAlterSchemas() > LockSchemaList() > OpenReliIdList() > PublicationAddSchemas() > PublicationDropSchemas() > > are newly introduced but only four functions: > > OpenReliIdList() > LockSchemaList() > PublicationAddSchemas() > PublicationDropSchemas() > > have function prototypes. Actually, there already are functions that > don't have their prototype in publicationcmds.c. But it seems better > to clarify what kind of functions don't need to have a prototype at > least in this file. My thoughts are the same as that Amit had replied at [1]. > --- > ISTM we can inline the contents of three functions: > GetAlterPublicationDelRelations(), CheckPublicationAlterTables(), and > CheckPublicationAlterSchemas(). These have only one caller and ISTM > makes the readability worse. I think it's not necessary to make > functions for them. We have refactored the patch which has removed these changes. > --- > + * This is dispatcher function for AlterPublicationOptions, > + * AlterPublicationSchemas and AlterPublicationTables. > > As this comment mentioned, AlterPublication() calls > AlterPublicationTables() and AlterPublicationSchemas() but this > function also a lot of pre-processing such as building the list and > some checks, depending on stmt->action before calling these two > functions. And those two functions also perform some operation > depending on stmt->action. So ISTM it's better to move those > pre-processing to these two functions and have AlterPublication() just > call these two functions. What do you think? We have refactored the patch which has removed these changes. > --- > +List * > +GetAllSchemasPublicationRelations(Oid puboid, PublicationPartOpt pub_partopt) > > Since this function gets all relations in the schema publication, I > think GetAllSchemaPublicationRelations() would be better as a function > name (removing 's' before 'P'). Modified > --- > + if (!IsA(node, String)) > + ereport(ERROR, > + errcode(ERRCODE_SYNTAX_ERROR), > + errmsg("invalid schema > name at or near"), > + > parser_errposition(pstate, pubobj->location)); > > The error message should mention where the invalid schema name is at > or near. Also, In the following example, the error position in the > error message seems not to be where the invalid schemaname s.s is: > postgres(1:47707)=# create publication p for all tables in schema s.s; > ERROR: invalid schema name at or near > LINE 1: create publication p for all tables in schema s.s; > ^ Modified > --- > + if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE) > + { > + if (IsA(node, RangeVar)) > + *rels = lappend(*rels, (RangeVar *) node); > + else if (IsA(node, String)) > + { > + RangeVar *rel = makeRangeVar(NULL, > strVal(node), > + > pubobj->location); > + > + *rels = lappend(*rels, rel); > + } > + } > + else if (pubobj->pubobjtype == PUBLICATIONOBJ_REL_IN_SCHEMA) > + { > (snip) > + /* Filter out duplicates if user specifies > "sch1, sch1" */ > + *schemas = list_append_unique_oid(*schemas, schemaid); > + } > > Do we need to filter out duplicates also in PUBLICATIONOBJ_TABLE case > since users can specify things like "TABLE tbl, tbl, tbl"? Currently the handling of tables is taken care at OpenTableList: ..... /* * Filter out duplicates if user specifies "foo, foo". * * Note that this algorithm is known to not be very efficient (O(N^2)) * but given that it only works on list of tables given to us by user * it's deemed acceptable. */ if (list_member_oid(relids, myrelid)) { table_close(rel, ShareUpdateExclusiveLock); continue; } ..... > --- > + if ((action == DEFELEM_ADD || action == DEFELEM_SET) && !superuser()) > + ereport(ERROR, > + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > + errmsg("must be superuser to add or > set schemas"))); > > Why do we require the superuser privilege only for ADD and SET but not for DROP? Amit had replied with the comments for this at [1]. The v32 patch attached at [2] has the fixes for the above. [1] - https://www.postgresql.org/message-id/CAA4eK1KmccaVdKFrwKLXhewDt6rSCD2msOoYbWWWxYK5%3DbX5cg%40mail.gmail.com [2] - https://www.postgresql.org/message-id/CALDaNm1R-xbQvz4LU5OXu3KKwbWOz3uDcT_YjRU6V0R5FZDYDg%40mail.gmail.com Regards, Vignesh
On Wed, Sep 22, 2021 at 11:27 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Wednesday, September 22, 2021 11:22 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > --- > > + if (!IsA(node, String)) > > + ereport(ERROR, > > + errcode(ERRCODE_SYNTAX_ERROR), > > + errmsg("invalid schema > > name at or near"), > > + > > parser_errposition(pstate, pubobj->location)); > > > > The error message should mention where the invalid schema name is at > > or near. Also, In the following example, the error position in the > > error message seems not to be where the invalid schemaname s.s is: > > > > postgres(1:47707)=# create publication p for all tables in schema s.s; > > ERROR: invalid schema name at or near > > LINE 1: create publication p for all tables in schema s.s; > > ^ > > > > I noticed this, too. And I think it could be fixed by the following change, thoughts? I fixed it by updating the location at pubobj_expr > Besides, about this change in tab-complete.c: > > + else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "SCHEMA")) > + COMPLETE_WITH_QUERY(Query_for_list_of_schemas > + " UNION SELECT 'CURRENT_SCHEMA'"); > > It should be "ALL TABLES IN SCHEMA" not "SCHEMA" at the first line, right? Modified. The v32 patch attached at [1] handles the above. [1] - https://www.postgresql.org/message-id/CALDaNm1R-xbQvz4LU5OXu3KKwbWOz3uDcT_YjRU6V0R5FZDYDg%40mail.gmail.com Regards. Vignesh
On Wed, Sep 22, 2021 at 11:31 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Sep 21, 2021 at 11:39 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Tue, Sep 21, 2021 at 9:03 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > > > > > On Fri, Sep 17, 2021 at 10:09 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > Attached v29 patch has the fixes for the same. > > > > > > > > > > Some minor comments on the v29-0002 patch: > > > > > > (1) > > > In get_object_address_publication_schema(), the error message: > > > > > > + errmsg("publication tables of schema \"%s\" in publication \"%s\" > > > does not exist", > > > > > > isn't grammatically correct. It should probably be: > > > > > > + errmsg("publication tables of schema \"%s\" in publication \"%s\" do > > > not exist", > > > > Modified > > > > I still see the old message in v30. But I have a different suggestion > for this message. How about changing it to: "publication schema \"%s\" > in publication \"%s\" does not exist"? This will make it similar to > other messages and I don't see the need here to add 'tables' as we > have it in grammar. Modified The v32 patch attached at [1] handles the above. [1] - https://www.postgresql.org/message-id/CALDaNm1R-xbQvz4LU5OXu3KKwbWOz3uDcT_YjRU6V0R5FZDYDg%40mail.gmail.com Regards, Vignesh
On Thu, Sep 23, 2021 at 12:22 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > From Thurs, Sep 23, 2021 12:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Sep 22, 2021 at 5:03 PM Hou Zhijie <houzj.fnst@fujitsu.com> wrote: > > > > > > Personally, I think we'd better move the code about changing publication's > > > tablelist into AlterPublicationTables and the code about changing > > publication's > > > schemalist into AlterPublicationSchemas. It's similar to what the > > v29-patchset > > > did, the difference is the SET action, I suggest we drop all the tables in > > > function AlterPublicationTables when user only set schemas and drop all the > > > schema in AlterPublicationSchemas when user only set tables. In this > > approach, > > > we can keep schema and relation code separate and don't need to worry > > > about the locking order. > > > > > > Attach a top-up patch which refactor the code like above. > > > > > > > Good suggestion. I think it would still be better if we can move the > > checks related to superuser and puballtables into a separate function > > that gets called before taking a lock on publication. > > I agreed. > > I noticed v30-0001 has been committed with some minor changes, and the V30-0002 > patchset need to be rebased accordingly. Attach a rebased version patch set to > make cfbot happy. Also Attach the two top-up patches which refactor the code as > suggested. (top-up patch 1 is to keep schema and table code separate, top-up > patch 2 is to move some cheap check into a function and invoke it before > locking.) Thanks for the patches, the changes simplifies alterpublications code and handles the drop object in a better way. I have merged it to 0001 patch at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm1R-xbQvz4LU5OXu3KKwbWOz3uDcT_YjRU6V0R5FZDYDg%40mail.gmail.com Regards, Vignesh
On Thu, Sep 23, 2021 at 1:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Sep 22, 2021 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Wed, Sep 22, 2021 at 3:02 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > --- > > This patch introduces some new static functions to publicationcmds.c > > but some have function prototypes but some don't. As far as I checked, > > > > ObjectsInPublicationToOids() > > CheckObjSchemaNotAlreadyInPublication() > > GetAlterPublicationDelRelations() > > AlterPublicationSchemas() > > CheckPublicationAlterTables() > > CheckPublicationAlterSchemas() > > LockSchemaList() > > OpenReliIdList() > > PublicationAddSchemas() > > PublicationDropSchemas() > > > > are newly introduced but only four functions: > > > > OpenReliIdList() > > LockSchemaList() > > PublicationAddSchemas() > > PublicationDropSchemas() > > > > have function prototypes. Actually, there already are functions that > > don't have their prototype in publicationcmds.c. But it seems better > > to clarify what kind of functions don't need to have a prototype at > > least in this file. > > > > I think if the function is defined after its use then we declare it at > the top. Do you prefer to declare all static functions to allow ease > of usage? Do you have something else in mind? I prefer to declare all static functions since if we have a function prototype we don't need to include the change about the function in future (unrelated) commits that might add a new function which uses the function and is defined before their declarations. But it seems to me that the policy varies per file. For instance, all functions in vacuumlazy.c have their function prototype but functions in publicationcmds.c seems not. I'm not going to insist on that so please ignore this comment. > > > > > --- > > + if ((action == DEFELEM_ADD || action == DEFELEM_SET) && !superuser()) > > + ereport(ERROR, > > + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > > + errmsg("must be superuser to add or > > set schemas"))); > > > > Why do we require the superuser privilege only for ADD and SET but not for DROP? > > > > For Add/Set of for all tables of Schema is similar to all tables > publication requirement. For Drop, I don't think it is mandatory to > allow only to superuser. The same applies to Alter Publication ... > Drop table case where you don't need to be table owner whereas, for > Add, you need to be. We had a discussion on these points in this > thread. See [1] and some emails prior to it. Thank you for sharing the link. That makes sense. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
On Thu, Sep 23, 2021 at 12:32 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Thur, Sep 23, 2021 11:06 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Wed, Sep 22, 2021 at 9:33 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > > > > > > > > > > How do you suggest changing it? > > > > > > Personally, I think we'd better move the code about changing > > > publication's tablelist into AlterPublicationTables and the code about > > > changing publication's schemalist into AlterPublicationSchemas. It's > > > similar to what the v29-patchset did, the difference is the SET > > > action, I suggest we drop all the tables in function > > > AlterPublicationTables when user only set schemas and drop all the > > > schema in AlterPublicationSchemas when user only set tables. In this > > > approach, we can keep schema and relation code separate and don't need to > > worry about the locking order. > > > > > > Attach a top-up patch which refactor the code like above. > > > Thoughts ? > > > > > > > Sounds like a good idea. > > Is it possible to incorporate the existing > > CheckPublicationAlterTables() and CheckPublicationAlterSchemas() functions > > into your suggested update? > > I think it might tidy up the error-checking a bit. > > I agreed we can put the check about ALL TABLE and superuser into a function > like what the v30-patchset did. But I have some hesitations about the code > related to CheckObjSchemaNotAlreadyInPublication(). Currently, we need to open > and lock the table before invoking the CheckObjxxx function, ISTM we'd better > open the table in function AlterPublicationTables. Maybe we can wait for the > author's(Vignesh) opinion. I felt keeping the code related to CheckObjSchemaNotAlreadyInPublication as it is in AlterPublicationTables and AlterPublicationSchemas is better. Regards, Vignesh
On Fri, Sep 24, 2021 at 11:46 AM vignesh C <vignesh21@gmail.com> wrote: > Attached v33 patch has the preprocess_pubobj_list review comment fix suggested by Alvaro at [1]. The v33-0006-Alternate-grammar-for-ALL-TABLES-IN-SCHEMA.patch patch has the grammar changes as suggested by Alvaro at [1]. If we agree this is better, I will merge this into the 0001 patch. [1] - https://www.postgresql.org/message-id/202109241325.eag5g6mpvoup%40alvherre.pgsql Regards, Vignesh
Attachment
- v33-0001-Added-schema-level-support-for-publication.patch
- v33-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v33-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v33-0004-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v33-0005-Implemented-pg_publication_objects-view.patch
- v33-0006-Alternate-grammar-for-ALL-TABLES-IN-SCHEMA.patch
On Monday, September 27, 2021 12:32 PM vignesh C <vignesh21@gmail.com> wrote: > On Fri, Sep 24, 2021 at 11:46 AM vignesh C <vignesh21@gmail.com> wrote: > > > > Attached v33 patch has the preprocess_pubobj_list review comment fix > suggested by Alvaro at [1]. The > v33-0006-Alternate-grammar-for-ALL-TABLES-IN-SCHEMA.patch patch has > the grammar changes as suggested by Alvaro at [1]. If we agree this is better, I > will merge this into the 0001 patch. > [1] - https://www.postgresql.org/message-id/202109241325.eag5g6mpvoup%40alvherre.pgsql Hi, The grammar change basically looks good to me. Only one suggestion is that it will be better to add some more comments in gram.y to describe the rule PublicationObjSpec. Because it's a new style syntax in postgresql, people might wonder how the code work and why we choose this design when they first time see this rule in gram.y. Maybe something like the following: +/* + * FOR TABLE and FOR ALL TABLES IN SCHEMA specifications + * + * This rule parses publication object with and without keyword prefix. + * + * The actual type of the object without keyword prefix depends on the previous + * one with keyword prefix. It will be preprocessed in preprocess_pubobj_list(). + * + * For the object without keyword prefix, we cannot just use relation_expr here, + * because some extended expression in relation_expr cannot be used as a + * schemaname and we cannot differentiate it. So, we extract the rules from + * relation_expr here. + */ PublicationObjSpec: TABLE relation_expr ... My words might not be good, but I think it will be better to add some comments to explain a bit about the code in gram.y. Thoughts ? Best regards, Hou zj
On Mon, Sep 27, 2021 at 2:32 PM vignesh C <vignesh21@gmail.com> wrote: > > Attached v33 patch has the preprocess_pubobj_list review comment fix > suggested by Alvaro at [1]. A minor point I noticed in the v33-0002 patch, in the code added to the listSchemas() function of src/bin/psql/describe.c, shouldn't it "return false" (not true) if PSQLexec() fails? Also, since the PQExpBufferData buf is re-used in the added code, it's handling is a little inconsistent to similar existing code. See below for suggested update. Regards, Greg Nancarrow Fujitsu Australia diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c index 953e1f52cf..1d28809050 100644 --- a/src/bin/psql/describe.c +++ b/src/bin/psql/describe.c @@ -5077,9 +5077,11 @@ listSchemas(const char *pattern, bool verbose, bool showSystem) appendPQExpBufferStr(&buf, "ORDER BY 1;"); res = PSQLexec(buf.data); - termPQExpBuffer(&buf); if (!res) + { + termPQExpBuffer(&buf); return false; + } myopt.nullPrint = NULL; myopt.title = _("List of schemas"); @@ -5100,7 +5102,10 @@ listSchemas(const char *pattern, bool verbose, bool showSystem) pattern); result = PSQLexec(buf.data); if (!result) - return true; + { + termPQExpBuffer(&buf); + return false; + } else pub_schema_tuples = PQntuples(result); @@ -5132,6 +5137,7 @@ listSchemas(const char *pattern, bool verbose, bool showSystem) printQuery(res, &myopt, pset.queryFout, false, pset.logfile); + termPQExpBuffer(&buf); PQclear(res); /* Free the memory allocated for the footer */
On Mon, Sep 27, 2021 at 2:32 PM vignesh C <vignesh21@gmail.com> wrote: > > Attached v33 patch has the preprocess_pubobj_list review comment fix > suggested by Alvaro at [1]. In the v33-0003 patch, there's a couple of error-case tests that have comments copied from success-case tests: +-- should be able to add table to schema publication ... +-- should be able to drop table from schema publication ... These should be changed to something similar to that used for other error-case tests, like: +-- fail - can't add a table of the same schema to the schema publication +-- fail - can't drop a table from the schema publication which isn't in the publication Also, for the following error: ERROR: cannot add ... to publication DETAIL: Table's schema "xxxx" is already part of the publication or part of the specified schema list. there needs to be a test case to test the "... or part of the specified schema list" case. Regards, Greg Nancarrow Fujitsu Australia
On Monday, September 27, 2021 1:32 PM, vignesh C <vignesh21@gmail.com> wrote: >Attached v33 patch has the preprocess_pubobj_list review comment fix >suggested by Alvaro at [1]. The >v33-0006-Alternate-grammar-for-ALL-TABLES-IN-SCHEMA.patch patch has >the grammar changes as suggested by Alvaro at [1]. If we agree this is >better, I will merge this into the 0001 patch. >[1] - https://www.postgresql.org/message-id/202109241325.eag5g6mpvoup%40alvherre.pgsql About the schema patch, I think a schema and a table which belongs to this schema shouldn't be specified at the same time. But what if someone uses "ALTER TABLE ... SET SCHEMA ..." after "CREATE PUBLICATION"? For example: create schema sch1; create schema sch2; create table sch2.t (a int); create publication pub1 for all tables in schema sch1, table sch2.t; alter table sch2.t set schema sch1; postgres=# \dRp+ Publication pub1 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root ----------+------------+---------+---------+---------+-----------+------ ----------+------------+---------+---------+---------+-----------+---- postgres | f | t | t | t | t | f Tables: "sch1.t" Tables from schemas: "sch1" Table t has been output twice. I think this should not be supported, should we do something for this scenario? Regards Tang
On Mon, Sep 27, 2021 at 12:15 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Monday, September 27, 2021 12:32 PM vignesh C <vignesh21@gmail.com> wrote: > > On Fri, Sep 24, 2021 at 11:46 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > Attached v33 patch has the preprocess_pubobj_list review comment fix > > suggested by Alvaro at [1]. The > > v33-0006-Alternate-grammar-for-ALL-TABLES-IN-SCHEMA.patch patch has > > the grammar changes as suggested by Alvaro at [1]. If we agree this is better, I > > will merge this into the 0001 patch. > > [1] - https://www.postgresql.org/message-id/202109241325.eag5g6mpvoup%40alvherre.pgsql > > Hi, > > The grammar change basically looks good to me. Only one suggestion is that it > will be better to add some more comments in gram.y to describe the rule > PublicationObjSpec. Because it's a new style syntax in postgresql, people might > wonder how the code work and why we choose this design when they first time > see this rule in gram.y. > > Maybe something like the following: > > +/* > + * FOR TABLE and FOR ALL TABLES IN SCHEMA specifications > + * > + * This rule parses publication object with and without keyword prefix. > + * > + * The actual type of the object without keyword prefix depends on the previous > + * one with keyword prefix. It will be preprocessed in preprocess_pubobj_list(). > + * > + * For the object without keyword prefix, we cannot just use relation_expr here, > + * because some extended expression in relation_expr cannot be used as a > + * schemaname and we cannot differentiate it. So, we extract the rules from > + * relation_expr here. > + */ > PublicationObjSpec: > TABLE relation_expr > ... > > My words might not be good, but I think it will be better to add some comments > to explain a bit about the code in gram.y. Thoughts ? Modified. Attached v34 patch has the changes for the same. Regards, Vignesh
Attachment
- v34-0001-Added-schema-level-support-for-publication.patch
- v34-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v34-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v34-0004-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v34-0005-Implemented-pg_publication_objects-view.patch
On Mon, Sep 27, 2021 at 2:46 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Mon, Sep 27, 2021 at 2:32 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Attached v33 patch has the preprocess_pubobj_list review comment fix > > suggested by Alvaro at [1]. > > A minor point I noticed in the v33-0002 patch, in the code added to > the listSchemas() function of src/bin/psql/describe.c, shouldn't it > "return false" (not true) if PSQLexec() fails? > Also, since the PQExpBufferData buf is re-used in the added code, it's > handling is a little inconsistent to similar existing code. > See below for suggested update. Modified This is handled in the v34 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm2Z9TfuoCf09YGKfwy7F1NwC4iCXJGTaZS%3DchH6VHtadQ%40mail.gmail.com Regards, Vignesh
On Mon, Sep 27, 2021 at 4:51 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Mon, Sep 27, 2021 at 2:32 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Attached v33 patch has the preprocess_pubobj_list review comment fix > > suggested by Alvaro at [1]. > > In the v33-0003 patch, there's a couple of error-case tests that have > comments copied from success-case tests: > > +-- should be able to add table to schema publication > ... > +-- should be able to drop table from schema publication > ... > > These should be changed to something similar to that used for other > error-case tests, like: > > +-- fail - can't add a table of the same schema to the schema publication > +-- fail - can't drop a table from the schema publication which isn't > in the publication Modified > Also, for the following error: > > ERROR: cannot add ... to publication > DETAIL: Table's schema "xxxx" is already part of the publication > or part of the specified schema list. > > there needs to be a test case to test the "... or part of the > specified schema list" case. Added the test This is handled in the v34 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm2Z9TfuoCf09YGKfwy7F1NwC4iCXJGTaZS%3DchH6VHtadQ%40mail.gmail.com Regards, Vignesh
On Tue, Sep 28, 2021 at 4:35 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Monday, September 27, 2021 1:32 PM, vignesh C <vignesh21@gmail.com> wrote: > > >Attached v33 patch has the preprocess_pubobj_list review comment fix > >suggested by Alvaro at [1]. The > >v33-0006-Alternate-grammar-for-ALL-TABLES-IN-SCHEMA.patch patch has > >the grammar changes as suggested by Alvaro at [1]. If we agree this is > >better, I will merge this into the 0001 patch. > >[1] - https://www.postgresql.org/message-id/202109241325.eag5g6mpvoup%40alvherre.pgsql > > About the schema patch, I think a schema and a table which belongs to this schema shouldn't be specified at the same time. > But what if someone uses "ALTER TABLE ... SET SCHEMA ..." after "CREATE PUBLICATION"? > > For example: > > create schema sch1; > create schema sch2; > create table sch2.t (a int); > create publication pub1 for all tables in schema sch1, table sch2.t; alter table sch2.t set schema sch1; > > postgres=# \dRp+ > Publication pub1 > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root > ----------+------------+---------+---------+---------+-----------+------ > ----------+------------+---------+---------+---------+-----------+---- > postgres | f | t | t | t | t | f > Tables: > "sch1.t" > Tables from schemas: > "sch1" > > Table t has been output twice. > I think this should not be supported, should we do something for this scenario? Yes this should not be supported, we should throw an error in this case. This is handled in the v34 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm2Z9TfuoCf09YGKfwy7F1NwC4iCXJGTaZS%3DchH6VHtadQ%40mail.gmail.com Regards, Vignesh
On Monday, Tuesday, September 28, 2021 10:49 PM, vignesh C <vignesh21@gmail.com> wrote: > > Yes this should not be supported, we should throw an error in this case. > This is handled in the v34 patch attached at [1]. > [1] - https://www.postgresql.org/message- > id/CALDaNm2Z9TfuoCf09YGKfwy7F1NwC4iCXJGTaZS%3DchH6VHtadQ%40mail.g > mail.com > Thanks for fixing it. I confirmed the error can be output as expected. Here is a problem related to publish_via_partition_root option when using this patch. With this option on, I think pg_get_publication_tables function gave an unexcepted result and the subscriber would get dual data during table sync. For example: (I used pg_publication_tables view to make it looks clearer) create schema sch1; create table sch1.tbl1 (a int) partition by range ( a ); create table sch1.tbl1_part1 partition of sch1.tbl1 for values from (1) to (10); create table sch1.tbl1_part2 partition of sch1.tbl1 for values from (10) to (20); create table sch1.tbl1_part3 partition of sch1.tbl1 for values from (20) to (30); create publication pub for all tables in schema sch1 with(publish_via_partition_root=1); postgres=# select * from pg_publication_tables where pubname='pub'; pubname | schemaname | tablename ---------+------------+------------ pub | sch1 | tbl1_part1 pub | sch1 | tbl1_part2 pub | sch1 | tbl1_part3 pub | sch1 | tbl1 (4 rows) It shows both the partitioned table and its leaf partitions. But the result of FOR ALL TABLES publication couldn't show the leaf partitions. postgres=# create publication pub_all for all tables with(publish_via_partition_root=1); CREATE PUBLICATION postgres=# select * from pg_publication_tables where pubname='pub_all'; pubname | schemaname | tablename ---------+------------+----------- pub_all | sch1 | tbl1 (1 row) How about make the following change to avoid it? I tried it and it also fixed dual data issue during table sync. diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c index 04e785b192..4e8ccdabc6 100644 --- a/src/backend/catalog/pg_publication.c +++ b/src/backend/catalog/pg_publication.c @@ -632,7 +632,8 @@ GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt) Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple); Oid relid = relForm->oid; - if (is_publishable_class(relid, relForm)) + if (is_publishable_class(relid, relForm) && + !(relForm->relispartition && pub_partopt == PUBLICATION_PART_ROOT)) result = lappend_oid(result, relid); } Regards Tang
On Tues, Sep 28, 2021 10:46 PM vignesh C <vignesh21@gmail.com> wrote: > Attached v34 patch has the changes for the same. Thanks for updating the patch. Here are a few comments. 1) + * ALL TABLES IN SCHEMA schema [[, ...] [[ -> [ 2) + /* ALTER PUBLICATION ... ADD/DROP TABLE/ALL TABLES IN SCHEMA parameters */ The two '/' seems a bit unclear and it doesn't mention the SET case. Maybe we can write like: /* parameters used for ALTER PUBLICATION ... ADD/DROP/SET publication objects */ 3) + /* + * Check if setting the relation to a different schema will result in the + * publication having schema and same schema's table in the publication. + */ + if (stmt->objectType == OBJECT_TABLE) + { + ListCell *lc; + List *schemaPubids = GetSchemaPublications(nspOid); + foreach(lc, schemaPubids) + { + Oid pubid = lfirst_oid(lc); + if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL), + relid)) + ereport(ERROR, How about we check this case like the following ? List *schemaPubids = GetSchemaPublications(nspOid); List *relPubids = GetRelationPublications(RelationGetRelid(rel)); if (list_intersection(schemaPubids, relPubids)) ereport(ERROR, ... Best regards, Hou zj
On Wed, Sep 29, 2021 at 9:07 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Tues, Sep 28, 2021 10:46 PM vignesh C <vignesh21@gmail.com> wrote: > > Attached v34 patch has the changes for the same. > > 3) > + /* > + * Check if setting the relation to a different schema will result in the > + * publication having schema and same schema's table in the publication. > + */ > + if (stmt->objectType == OBJECT_TABLE) > + { > + ListCell *lc; > + List *schemaPubids = GetSchemaPublications(nspOid); > + foreach(lc, schemaPubids) > + { > + Oid pubid = lfirst_oid(lc); > + if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL), > + relid)) > + ereport(ERROR, > > How about we check this case like the following ? > > List *schemaPubids = GetSchemaPublications(nspOid); > List *relPubids = GetRelationPublications(RelationGetRelid(rel)); > if (list_intersection(schemaPubids, relPubids)) > ereport(ERROR, ... > Won't this will allow changing one of the partitions for which only partitioned table is part of the target schema? And then probably we won't be able to provide the exact publication in the error message if we followed the above? -- With Regards, Amit Kapila.
On Tue, Sep 28, 2021 at 8:15 PM vignesh C <vignesh21@gmail.com> wrote: > > On Mon, Sep 27, 2021 at 12:15 PM houzj.fnst@fujitsu.com > <houzj.fnst@fujitsu.com> wrote: > > Attached v34 patch has the changes for the same. > Few comments on v34-0001-Added-schema-level-support-for-publication ========================================================== 1. + * This rule parses publication object with and without keyword prefix. I think we should write it as: "This rule parses publication objects with and without keyword prefixes." 2. + * For the object without keyword prefix, we cannot just use relation_expr here, + * because some extended expression in relation_expr cannot be used as a /expression/expressions 3. +/* + * Process pubobjspec_list to check for errors in any of the objects and + * convert PUBLICATIONOBJ_CONTINUATION into appropriate PublicationObjSpecType + * type. 4. + /* + * Check if setting the relation to a different schema will result in the + * publication having schema and same schema's table in the publication. + */ + if (stmt->objectType == OBJECT_TABLE) + { + ListCell *lc; + List *schemaPubids = GetSchemaPublications(nspOid); + foreach(lc, schemaPubids) + { + Oid pubid = lfirst_oid(lc); + if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL), + relid)) + ereport(ERROR, + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot move table \"%s\" to schema \"%s\"", + RelationGetRelationName(rel), stmt->newschema), + errdetail("Altering table will result in having schema \"%s\" and same schema's table \"%s\" in the publication \"%s\" which is not supported.", + stmt->newschema, + RelationGetRelationName(rel), + get_publication_name(pubid, false))); + } + } Let's slightly change the comment as: "Check that setting the relation to a different schema won't result in the publication having schema and the same schema's table." and errdetail as: "The schema \"%s\" and same schema's table \"%s\" cannot be part of the same publication \"%s\"." Maybe it is better to specify that this will disallow the partition table case. 5. ObjectsInPublicationToOids() { .. + pubobj = (PublicationObjSpec *) lfirst(cell); + if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE) It is better to keep an empty line between above two lines. 6. List *schemaPubids = GetSchemaPublications(nspOid); foreach(lc, schemaPubids) .. Oid pubid = lfirst_oid(lc); if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL), Add an empty line between each of the above two lines. 7. + /* + * Schema lock is held until the publication is altered to prevent + * concurrent schema deletion. No need to unlock the schemas, the locks + * will be released automatically at the end of alter publication command. + */ + LockSchemaList(schemaidlist); I think it is better to add a similar comment at other places where this function is called. And we can shorten the comment atop LockSchemaList to something like: "The schemas specified in the schema list are locked in AccessShareLock mode in order to prevent concurrent schema deletion." 8. In CreatePublication(), the check if (stmt->for_all_tables) can be the first check and then in else if we can process tables and schemas. 9. AlterPublication() { .. + /* Lock the publication so nobody else can do anything with it. */ + LockDatabaseObject(PublicationRelationId, pubform->oid, 0, + AccessExclusiveLock); I think it is better to say why we need this lock. So, can we change the comment to something like: "Lock the publication so nobody else can do anything with it. This prevents concurrent alter to add table(s) that were already going to become part of the publication by adding corresponding schema(s) via this command and similarly it will prevent the concurrent addition of schema(s) for which there is any corresponding table being added by this command." -- With Regards, Amit Kapila.
On Wed, Sep 29, 2021 at 3:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > 4. > + /* > + * Check if setting the relation to a different schema will result in the > + * publication having schema and same schema's table in the publication. > + */ > + if (stmt->objectType == OBJECT_TABLE) > + { > + ListCell *lc; > + List *schemaPubids = GetSchemaPublications(nspOid); > + foreach(lc, schemaPubids) > + { > + Oid pubid = lfirst_oid(lc); > + if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL), > + relid)) > + ereport(ERROR, > + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > + errmsg("cannot move table \"%s\" to schema \"%s\"", > + RelationGetRelationName(rel), stmt->newschema), > + errdetail("Altering table will result in having schema \"%s\" and > same schema's table \"%s\" in the publication \"%s\" which is not > supported.", > + stmt->newschema, > + RelationGetRelationName(rel), > + get_publication_name(pubid, false))); > + } > + } > > Let's slightly change the comment as: "Check that setting the relation > to a different schema won't result in the publication having schema > and the same schema's table." and errdetail as: "The schema \"%s\" and > same schema's table \"%s\" cannot be part of the same publication > \"%s\"." > Since this code is in AlterTableNamespace() and the relation being checked may or may not be part of a publication, I'd use "a publication" instead of "the publication" in the comment. Also, I'd say that we're doing the check because the mentioned combination is not supported. i.e. "Check that setting the relation to a different schema won't result in a publication having both a schema and the same schema's table, as this is not supported." Regards, Greg Nancarrow Fujitsu Australia
On Wed, Sep 29, 2021 12:34 PM Amit Kapila <amit.kapila16@gmail.com> > On Wed, Sep 29, 2021 at 9:07 AM houzj.fnst@fujitsu.com > <houzj.fnst@fujitsu.com> wrote: > > > > On Tues, Sep 28, 2021 10:46 PM vignesh C <vignesh21@gmail.com> wrote: > > > Attached v34 patch has the changes for the same. > > > > 3) > > + /* > > + * Check if setting the relation to a different schema will result in the > > + * publication having schema and same schema's table in the > publication. > > + */ > > + if (stmt->objectType == OBJECT_TABLE) > > + { > > + ListCell *lc; > > + List *schemaPubids = > GetSchemaPublications(nspOid); > > + foreach(lc, schemaPubids) > > + { > > + Oid pubid = lfirst_oid(lc); > > + if (list_member_oid(GetPublicationRelations(pubid, > PUBLICATION_PART_ALL), > > + relid)) > > + ereport(ERROR, > > > > How about we check this case like the following ? > > > > List *schemaPubids = GetSchemaPublications(nspOid); > > List *relPubids = GetRelationPublications(RelationGetRelid(rel)); > > if (list_intersection(schemaPubids, relPubids)) > > ereport(ERROR, ... > > > > Won't this will allow changing one of the partitions for which only partitioned > table is part of the target schema? I think it still disallow changing partition's schema to the published one. I tested with the following SQLs. ----- create schema sch1; create schema sch2; create schema sch3; create table sch1.tbl1 (a int) partition by range ( a ); create table sch2.tbl1_part1 partition of sch1.tbl1 for values from (1) to (101); create table sch3.tbl1_part2 partition of sch1.tbl1 for values from (101) to (200); create publication pub for ALL TABLES IN schema sch1, TABLE sch2.tbl1_part1; alter table sch2.tbl1_part1 set schema sch1; ---* It will report an error here * ----- Did I miss something ? Best regards, Hou zj
On Wed, Sep 29, 2021 at 11:59 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Wed, Sep 29, 2021 12:34 PM Amit Kapila <amit.kapila16@gmail.com> > > On Wed, Sep 29, 2021 at 9:07 AM houzj.fnst@fujitsu.com > > <houzj.fnst@fujitsu.com> wrote: > > > > > > On Tues, Sep 28, 2021 10:46 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Attached v34 patch has the changes for the same. > > > > > > 3) > > > + /* > > > + * Check if setting the relation to a different schema will result in the > > > + * publication having schema and same schema's table in the > > publication. > > > + */ > > > + if (stmt->objectType == OBJECT_TABLE) > > > + { > > > + ListCell *lc; > > > + List *schemaPubids = > > GetSchemaPublications(nspOid); > > > + foreach(lc, schemaPubids) > > > + { > > > + Oid pubid = lfirst_oid(lc); > > > + if (list_member_oid(GetPublicationRelations(pubid, > > PUBLICATION_PART_ALL), > > > + relid)) > > > + ereport(ERROR, > > > > > > How about we check this case like the following ? > > > > > > List *schemaPubids = GetSchemaPublications(nspOid); > > > List *relPubids = GetRelationPublications(RelationGetRelid(rel)); > > > if (list_intersection(schemaPubids, relPubids)) > > > ereport(ERROR, ... > > > > > > > Won't this will allow changing one of the partitions for which only partitioned > > table is part of the target schema? > > I think it still disallow changing partition's schema to the published one. > I tested with the following SQLs. > ----- > create schema sch1; > create schema sch2; > create schema sch3; > > create table sch1.tbl1 (a int) partition by range ( a ); > create table sch2.tbl1_part1 partition of sch1.tbl1 for values from (1) to (101); > create table sch3.tbl1_part2 partition of sch1.tbl1 for values from (101) to (200); > create publication pub for ALL TABLES IN schema sch1, TABLE sch2.tbl1_part1; > alter table sch2.tbl1_part1 set schema sch1; > ---* It will report an error here * > ----- > Use all steps before "create publication" and then try below. These will give an error with the patch proposed but if I change it to what you are proposing then it won't give an error. create publication pub for ALL TABLES IN schema sch2, Table sch1.tbl1; alter table sch3.tbl1_part2 set schema sch2; But now again thinking about it, I am not sure if we really want to give error in this case. What do you think? Also, if we use list_intersection trick, then how will we tell the publication due to which this problem has occurred, or do you think we should leave that as an exercise for the user? -- With Regards, Amit Kapila.
On Wed, Sep 29, 2021 5:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > On Wed, Sep 29, 2021 at 11:59 AM houzj.fnst@fujitsu.com > <houzj.fnst@fujitsu.com> wrote: > > > > On Wed, Sep 29, 2021 12:34 PM Amit Kapila <amit.kapila16@gmail.com> > > > On Wed, Sep 29, 2021 at 9:07 AM houzj.fnst@fujitsu.com > > > <houzj.fnst@fujitsu.com> wrote: > > > > > > > > On Tues, Sep 28, 2021 10:46 PM vignesh C <vignesh21@gmail.com> > wrote: > > > > > Attached v34 patch has the changes for the same. > > > > How about we check this case like the following ? > > > > > > > > List *schemaPubids = GetSchemaPublications(nspOid); > > > > List *relPubids = GetRelationPublications(RelationGetRelid(rel)); > > > > if (list_intersection(schemaPubids, relPubids)) > > > > ereport(ERROR, ... > > > > > > > > > > Won't this will allow changing one of the partitions for which only > > > partitioned table is part of the target schema? > > > > I think it still disallow changing partition's schema to the published one. > > I tested with the following SQLs. > > ----- > > create schema sch1; > > create schema sch2; > > create schema sch3; > > > > create table sch1.tbl1 (a int) partition by range ( a ); create table > > sch2.tbl1_part1 partition of sch1.tbl1 for values from (1) to (101); > > create table sch3.tbl1_part2 partition of sch1.tbl1 for values from > > (101) to (200); create publication pub for ALL TABLES IN schema sch1, > > TABLE sch2.tbl1_part1; alter table sch2.tbl1_part1 set schema sch1; > > ---* It will report an error here * > > ----- > > > > Use all steps before "create publication" and then try below. These will give an > error with the patch proposed but if I change it to what you are proposing then > it won't give an error. > create publication pub for ALL TABLES IN schema sch2, Table sch1.tbl1; alter > table sch3.tbl1_part2 set schema sch2; > > But now again thinking about it, I am not sure if we really want to give error in > this case. What do you think? Personally, I think we can allow the above case. Because if user specify the partitioned table in the publication like above, they cannot drop the partition separately. And the partitioned table is the actual one in pg_publication_rel. So, I think allowing this case seems won't make people feel confused. Besides, in the current patch, we have allowed similar case in CREATE/ALTER PUBLICATION cases. In this SQL: "create publication pub for ALL TABLES IN schema sch2, Table sch1.tbl1;", one of the partitions ' sch2.tbl1_part1' is from schema 'sch2' which is published. It might be better to make the behavior consistent. > Also, if we use list_intersection trick, then how will > we tell the publication due to which this problem has occurred, or do you think > we should leave that as an exercise for the user? I thought list_intersection will return a puboids list in which the puboid exists in both input list. We can choose the first puboid and output it in the error message which seems the same as the current patch. But I noticed list_intersection doesn't support T_OidList, so we might need to search the puboid Manually. Like: foreach(cell, relPubids) { if (list_member_oid(schemaPubids, lfirst_oid(cell))) ereport(ERROR, errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("cannot move table \"%s\" to schema \"%s\"", RelationGetRelationName(rel), stmt->newschema), errdetail("Altering table will result in having schema \"%s\" and same schema's table \"%s\" in thepublication \"%s\" which is not supported.", stmt->newschema, RelationGetRelationName(rel), get_publication_name(lfirst_oid(cell), false))); } Best regards, Hou zj
On Wed, Sep 29, 2021 at 5:48 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Wed, Sep 29, 2021 5:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Sep 29, 2021 at 11:59 AM houzj.fnst@fujitsu.com > > <houzj.fnst@fujitsu.com> wrote: > > > > > > On Wed, Sep 29, 2021 12:34 PM Amit Kapila <amit.kapila16@gmail.com> > > > > On Wed, Sep 29, 2021 at 9:07 AM houzj.fnst@fujitsu.com > > > > <houzj.fnst@fujitsu.com> wrote: > > > > > > > > > > On Tues, Sep 28, 2021 10:46 PM vignesh C <vignesh21@gmail.com> > > wrote: > > > > > > Attached v34 patch has the changes for the same. > > > > > How about we check this case like the following ? > > > > > > > > > > List *schemaPubids = GetSchemaPublications(nspOid); > > > > > List *relPubids = GetRelationPublications(RelationGetRelid(rel)); > > > > > if (list_intersection(schemaPubids, relPubids)) > > > > > ereport(ERROR, ... > > > > > > > > > > > > > Won't this will allow changing one of the partitions for which only > > > > partitioned table is part of the target schema? > > > > > > I think it still disallow changing partition's schema to the published one. > > > I tested with the following SQLs. > > > ----- > > > create schema sch1; > > > create schema sch2; > > > create schema sch3; > > > > > > create table sch1.tbl1 (a int) partition by range ( a ); create table > > > sch2.tbl1_part1 partition of sch1.tbl1 for values from (1) to (101); > > > create table sch3.tbl1_part2 partition of sch1.tbl1 for values from > > > (101) to (200); create publication pub for ALL TABLES IN schema sch1, > > > TABLE sch2.tbl1_part1; alter table sch2.tbl1_part1 set schema sch1; > > > ---* It will report an error here * > > > ----- > > > > > > > Use all steps before "create publication" and then try below. These will give an > > error with the patch proposed but if I change it to what you are proposing then > > it won't give an error. > > create publication pub for ALL TABLES IN schema sch2, Table sch1.tbl1; alter > > table sch3.tbl1_part2 set schema sch2; > > > > But now again thinking about it, I am not sure if we really want to give error in > > this case. What do you think? > > Personally, I think we can allow the above case. > > Because if user specify the partitioned table in the publication like above, > they cannot drop the partition separately. And the partitioned table is the > actual one in pg_publication_rel. So, I think allowing this case seems won't > make people feel confused. > Yeah, I also thought on similar lines. So, let's allow this case. > > > Also, if we use list_intersection trick, then how will > > we tell the publication due to which this problem has occurred, or do you think > > we should leave that as an exercise for the user? > > I thought list_intersection will return a puboids list in which the puboid exists in both input list. > We can choose the first puboid and output it in the error message which seems the same as > the current patch. > > But I noticed list_intersection doesn't support T_OidList, so we might need to search the puboid > Manually. Like: > > foreach(cell, relPubids) > { > if (list_member_oid(schemaPubids, lfirst_oid(cell))) > ereport(ERROR, > errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > errmsg("cannot move table \"%s\" to schema \"%s\"", > RelationGetRelationName(rel), stmt->newschema), > errdetail("Altering table will result in having schema \"%s\" and sameschema's table \"%s\" in the publication \"%s\" which is not supported.", > stmt->newschema, > RelationGetRelationName(rel), > get_publication_name(lfirst_oid(cell), false))); > } > Looks good to me. -- With Regards, Amit Kapila.
On Wed, Sep 29, 2021 at 8:47 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Monday, Tuesday, September 28, 2021 10:49 PM, vignesh C <vignesh21@gmail.com> wrote: > > > > Yes this should not be supported, we should throw an error in this case. > > This is handled in the v34 patch attached at [1]. > > [1] - https://www.postgresql.org/message- > > id/CALDaNm2Z9TfuoCf09YGKfwy7F1NwC4iCXJGTaZS%3DchH6VHtadQ%40mail.g > > mail.com > > > > Thanks for fixing it. I confirmed the error can be output as expected. > > Here is a problem related to publish_via_partition_root option when using this > patch. With this option on, I think pg_get_publication_tables function gave an > unexcepted result and the subscriber would get dual data during table sync. > > > For example: > (I used pg_publication_tables view to make it looks clearer) > > create schema sch1; > create table sch1.tbl1 (a int) partition by range ( a ); > create table sch1.tbl1_part1 partition of sch1.tbl1 for values from (1) to (10); > create table sch1.tbl1_part2 partition of sch1.tbl1 for values from (10) to (20); > create table sch1.tbl1_part3 partition of sch1.tbl1 for values from (20) to (30); > create publication pub for all tables in schema sch1 with(publish_via_partition_root=1); > > postgres=# select * from pg_publication_tables where pubname='pub'; > pubname | schemaname | tablename > ---------+------------+------------ > pub | sch1 | tbl1_part1 > pub | sch1 | tbl1_part2 > pub | sch1 | tbl1_part3 > pub | sch1 | tbl1 > (4 rows) > > > It shows both the partitioned table and its leaf partitions. But the result of > FOR ALL TABLES publication couldn't show the leaf partitions. > > > postgres=# create publication pub_all for all tables with(publish_via_partition_root=1); > CREATE PUBLICATION > postgres=# select * from pg_publication_tables where pubname='pub_all'; > pubname | schemaname | tablename > ---------+------------+----------- > pub_all | sch1 | tbl1 > (1 row) > > > How about make the following change to avoid it? I tried it and it also fixed dual > data issue during table sync. > > > diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c > index 04e785b192..4e8ccdabc6 100644 > --- a/src/backend/catalog/pg_publication.c > +++ b/src/backend/catalog/pg_publication.c > @@ -632,7 +632,8 @@ GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt) > Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple); > Oid relid = relForm->oid; > > - if (is_publishable_class(relid, relForm)) > + if (is_publishable_class(relid, relForm) && > + !(relForm->relispartition && pub_partopt == PUBLICATION_PART_ROOT)) > result = lappend_oid(result, relid); > } The suggested change works, I have modified it in the attached patch. Regards, Vignesh
Attachment
- v35-0001-Added-schema-level-support-for-publication.patch
- v35-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v35-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v35-0004-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v35-0005-Implemented-pg_publication_objects-view.patch
On Wed, Sep 29, 2021 at 9:07 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Tues, Sep 28, 2021 10:46 PM vignesh C <vignesh21@gmail.com> wrote: > > Attached v34 patch has the changes for the same. > > Thanks for updating the patch. > Here are a few comments. > > 1) > + * ALL TABLES IN SCHEMA schema [[, ...] > > [[ -> [ Modified > 2) > + /* ALTER PUBLICATION ... ADD/DROP TABLE/ALL TABLES IN SCHEMA parameters */ > > The two '/' seems a bit unclear and it doesn't mention the SET case. > Maybe we can write like: > > /* parameters used for ALTER PUBLICATION ... ADD/DROP/SET publication objects */ Modified > 3) > + /* > + * Check if setting the relation to a different schema will result in the > + * publication having schema and same schema's table in the publication. > + */ > + if (stmt->objectType == OBJECT_TABLE) > + { > + ListCell *lc; > + List *schemaPubids = GetSchemaPublications(nspOid); > + foreach(lc, schemaPubids) > + { > + Oid pubid = lfirst_oid(lc); > + if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL), > + relid)) > + ereport(ERROR, > > How about we check this case like the following ? > > List *schemaPubids = GetSchemaPublications(nspOid); > List *relPubids = GetRelationPublications(RelationGetRelid(rel)); > if (list_intersection(schemaPubids, relPubids)) > ereport(ERROR, ... Modified it with slight changes without using list_intersection. These comments are handled in the v35 version patch attached at [1] [1] - https://www.postgresql.org/message-id/CALDaNm2yJOEPCqR%3DgTMEwveJujH9c9_z4LhKmk2T3vZH7T1DLQ%40mail.gmail.com Regards, VIgnesh
On Wed, Sep 29, 2021 at 10:46 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Sep 28, 2021 at 8:15 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Mon, Sep 27, 2021 at 12:15 PM houzj.fnst@fujitsu.com > > <houzj.fnst@fujitsu.com> wrote: > > > > Attached v34 patch has the changes for the same. > > > > Few comments on v34-0001-Added-schema-level-support-for-publication > ========================================================== > 1. > + * This rule parses publication object with and without keyword prefix. > > I think we should write it as: "This rule parses publication objects > with and without keyword prefixes." Modified > 2. > + * For the object without keyword prefix, we cannot just use > relation_expr here, > + * because some extended expression in relation_expr cannot be used as a > > /expression/expressions Modified > 3. > +/* > + * Process pubobjspec_list to check for errors in any of the objects and > + * convert PUBLICATIONOBJ_CONTINUATION into appropriate PublicationObjSpecType > + * type. Modified to remove type as discussed offline. > 4. > + /* > + * Check if setting the relation to a different schema will result in the > + * publication having schema and same schema's table in the publication. > + */ > + if (stmt->objectType == OBJECT_TABLE) > + { > + ListCell *lc; > + List *schemaPubids = GetSchemaPublications(nspOid); > + foreach(lc, schemaPubids) > + { > + Oid pubid = lfirst_oid(lc); > + if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL), > + relid)) > + ereport(ERROR, > + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > + errmsg("cannot move table \"%s\" to schema \"%s\"", > + RelationGetRelationName(rel), stmt->newschema), > + errdetail("Altering table will result in having schema \"%s\" and > same schema's table \"%s\" in the publication \"%s\" which is not > supported.", > + stmt->newschema, > + RelationGetRelationName(rel), > + get_publication_name(pubid, false))); > + } > + } > > Let's slightly change the comment as: "Check that setting the relation > to a different schema won't result in the publication having schema > and the same schema's table." and errdetail as: "The schema \"%s\" and > same schema's table \"%s\" cannot be part of the same publication > \"%s\"." > > Maybe it is better to specify that this will disallow the partition table case. Modified, I did not add the above as we are allowing it. > 5. > ObjectsInPublicationToOids() > { > .. > + pubobj = (PublicationObjSpec *) lfirst(cell); > + if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE) > > It is better to keep an empty line between above two lines. Modified > 6. > List *schemaPubids = GetSchemaPublications(nspOid); > foreach(lc, schemaPubids) > .. > Oid pubid = lfirst_oid(lc); > if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL), > > Add an empty line between each of the above two lines. Modified > 7. > + /* > + * Schema lock is held until the publication is altered to prevent > + * concurrent schema deletion. No need to unlock the schemas, the locks > + * will be released automatically at the end of alter publication command. > + */ > + LockSchemaList(schemaidlist); > > I think it is better to add a similar comment at other places where > this function is called. And we can shorten the comment atop > LockSchemaList to something like: "The schemas specified in the schema > list are locked in AccessShareLock mode in order to prevent concurrent > schema deletion." Modified > 8. In CreatePublication(), the check if (stmt->for_all_tables) can be > the first check and then in else if we can process tables and schemas. Modified > 9. > AlterPublication() > { > .. > + /* Lock the publication so nobody else can do anything with it. */ > + LockDatabaseObject(PublicationRelationId, pubform->oid, 0, > + AccessExclusiveLock); > > I think it is better to say why we need this lock. So, can we change > the comment to something like: "Lock the publication so nobody else > can do anything with it. This prevents concurrent alter to add > table(s) that were already going to become part of the publication by > adding corresponding schema(s) via this command and similarly it will > prevent the concurrent addition of schema(s) for which there is any > corresponding table being added by this command." Modified These comments are handled in the v35 version patch attached at [1] [1] - https://www.postgresql.org/message-id/CALDaNm2yJOEPCqR%3DgTMEwveJujH9c9_z4LhKmk2T3vZH7T1DLQ%40mail.gmail.com Regards, Vignesh
On Wed, Sep 29, 2021 at 11:49 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Wed, Sep 29, 2021 at 3:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > 4. > > + /* > > + * Check if setting the relation to a different schema will result in the > > + * publication having schema and same schema's table in the publication. > > + */ > > + if (stmt->objectType == OBJECT_TABLE) > > + { > > + ListCell *lc; > > + List *schemaPubids = GetSchemaPublications(nspOid); > > + foreach(lc, schemaPubids) > > + { > > + Oid pubid = lfirst_oid(lc); > > + if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL), > > + relid)) > > + ereport(ERROR, > > + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > > + errmsg("cannot move table \"%s\" to schema \"%s\"", > > + RelationGetRelationName(rel), stmt->newschema), > > + errdetail("Altering table will result in having schema \"%s\" and > > same schema's table \"%s\" in the publication \"%s\" which is not > > supported.", > > + stmt->newschema, > > + RelationGetRelationName(rel), > > + get_publication_name(pubid, false))); > > + } > > + } > > > > Let's slightly change the comment as: "Check that setting the relation > > to a different schema won't result in the publication having schema > > and the same schema's table." and errdetail as: "The schema \"%s\" and > > same schema's table \"%s\" cannot be part of the same publication > > \"%s\"." > > > > Since this code is in AlterTableNamespace() and the relation being > checked may or may not be part of a publication, I'd use "a > publication" instead of "the publication" in the comment. > Also, I'd say that we're doing the check because the mentioned > combination is not supported. > > i.e. "Check that setting the relation to a different schema won't > result in a publication having both a schema and the same schema's > table, as this is not supported." Thanks for the comment, I have handled it in the v35 version patch attached at [1] [1] - https://www.postgresql.org/message-id/CALDaNm2yJOEPCqR%3DgTMEwveJujH9c9_z4LhKmk2T3vZH7T1DLQ%40mail.gmail.com Regards, Vignesh
On Thu, Sep 30, 2021 at 3:39 PM vignesh C <vignesh21@gmail.com> wrote: > > The suggested change works, I have modified it in the attached patch. > I have reviewed the latest version and made a number of changes to the 0001 patch. The changes are in v1-0001-Changes-by-Amit. It includes (a) Changed preprocess_pubobj_list() to make the code easy to understand, (b) the handling of few variables was missing in equal function, (c) the ordering of functions, and a few parameters were not matching .c and .h files, (d) added/edited multiple comments and other cosmetic changes. Apart from that, I have few other comments: 1. It seems you have started using unique list variants in GetPubPartitionOptionRelations because one of its caller GetSchemaPublicationRelations need it. I think the unique variants are costlier, so isn't it better to use it where it is required? I think it would be good to use in GetPubPartitionOptionRelations, if most of its caller requires the same. 2. In GetSchemaPublicationRelations(), I think we need to perform a second scan using RELKIND_PARTITIONED_TABLE only if we publish_via_root (aka pub_partopt is PUBLICATION_PART_ROOT). This is what we are doing in GetAllTablesPublicationRelations. Is there a reason to be different here? 3. @@ -538,7 +788,7 @@ RemovePublicationById(Oid pubid) if (!HeapTupleIsValid(tup)) elog(ERROR, "cache lookup failed for publication %u", pubid); - pubform = (Form_pg_publication)GETSTRUCT(tup); + pubform = (Form_pg_publication) GETSTRUCT(tup); We don't need the above change for this patch. I think this may be due pgindent but we can do this separately rather than as part of this patch. -- With Regards, Amit Kapila.
Attachment
On Sat, Oct 2, 2021 at 1:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Sep 30, 2021 at 3:39 PM vignesh C <vignesh21@gmail.com> wrote: > > > > The suggested change works, I have modified it in the attached patch. > > > > I have reviewed the latest version and made a number of changes to the > 0001 patch. The changes are in v1-0001-Changes-by-Amit. It includes > (a) Changed preprocess_pubobj_list() to make the code easy to > understand, (b) the handling of few variables was missing in equal > function, (c) the ordering of functions, and a few parameters were not > matching .c and .h files, (d) added/edited multiple comments and other > cosmetic changes. I have merged these changes into the main patch. > Apart from that, I have few other comments: > 1. It seems you have started using unique list variants in > GetPubPartitionOptionRelations because one of its caller > GetSchemaPublicationRelations need it. I think the unique variants are > costlier, so isn't it better to use it where it is required? I think > it would be good to use in GetPubPartitionOptionRelations, if most of > its caller requires the same. I have removed unique list changes from GetPubPartitionOptionRelations and handled it in GetSchemaPublicationRelations. > 2. In GetSchemaPublicationRelations(), I think we need to perform a > second scan using RELKIND_PARTITIONED_TABLE only if we > publish_via_root (aka pub_partopt is PUBLICATION_PART_ROOT). This is > what we are doing in GetAllTablesPublicationRelations. Is there a > reason to be different here? In the first table scan we are getting all the ordinary tables present in the schema. In the second table scan we will get all the partitioned table present in the schema and the relations will be added based on pub_partopt. I felt if we have the check we will not get the relations in the following case: create schema sch1; create schema sch2; create table sch1.p (a int) partition by list (a); create table sch2.c1 partition of sch1.p for values in (1); > 3. > @@ -538,7 +788,7 @@ RemovePublicationById(Oid pubid) > if (!HeapTupleIsValid(tup)) > elog(ERROR, "cache lookup failed for publication %u", pubid); > > - pubform = (Form_pg_publication)GETSTRUCT(tup); > + pubform = (Form_pg_publication) GETSTRUCT(tup); > > We don't need the above change for this patch. I think this may be due > pgindent but we can do this separately rather than as part of this > patch. Removed this change. Attached v36 patch has the changes for the same. Regards, Vignesh
Attachment
- v36-0001-Added-schema-level-support-for-publication.patch
- v36-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v36-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v36-0004-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v36-0005-Implemented-pg_publication_objects-view.patch
On Sun, Oct 3, 2021 at 11:25 PM vignesh C <vignesh21@gmail.com> wrote: > > On Sat, Oct 2, 2021 at 1:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > 2. In GetSchemaPublicationRelations(), I think we need to perform a > > second scan using RELKIND_PARTITIONED_TABLE only if we > > publish_via_root (aka pub_partopt is PUBLICATION_PART_ROOT). This is > > what we are doing in GetAllTablesPublicationRelations. Is there a > > reason to be different here? > > In the first table scan we are getting all the ordinary tables present > in the schema. In the second table scan we will get all the > partitioned table present in the schema and the relations will be > added based on pub_partopt. I felt if we have the check we will not > get the relations in the following case: > create schema sch1; > create schema sch2; > create table sch1.p (a int) partition by list (a); > create table sch2.c1 partition of sch1.p for values in (1); > But we don't need to get the partitioned tables for the invalidations, see the corresponding case for tables. So, I am not sure why you have used two scans to the system table for such scenarios? Few additional comments on v36-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN: ========================================================================= 1. @@ -3961,21 +3965,25 @@ getPublications(Archive *fout, int *numPublications) appendPQExpBuffer(query, "SELECT p.tableoid, p.oid, p.pubname, " "(%s p.pubowner) AS rolname, " - "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, p.pubviaroot " + "p.puballtables, p.pubinsert, p.pubupdate, " + "p.pubdelete, p.pubtruncate, p.pubviaroot " "FROM pg_publication p", username_subquery); else if (fout->remoteVersion >= 110000) appendPQExpBuffer(query, "SELECT p.tableoid, p.oid, p.pubname, " "(%s p.pubowner) AS rolname, " - "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false AS pubviaroot " + "p.puballtables, p.pubinsert, p.pubupdate, " + "p.pubdelete, p.pubtruncate, false AS pubviaroot " "FROM pg_publication p", username_subquery); else appendPQExpBuffer(query, "SELECT p.tableoid, p.oid, p.pubname, " "(%s p.pubowner) AS rolname, " - "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, false AS pubtruncate, false AS pubviaroot " + "p.puballtables, p.pubinsert, p.pubupdate, " + "p.pubdelete, false AS pubtruncate, " + "false AS pubviaroot " "FROM pg_publication p", username_subquery); Is there a reason to change this part of the code? 2. @@ -257,6 +257,9 @@ getSchemaData(Archive *fout, int *numTablesPtr) pg_log_info("reading publication membership"); getPublicationTables(fout, tblinfo, numTables); + pg_log_info("reading publication tables in schemas"); + getPublicationNamespaces(fout, nspinfo, numNamespaces); I think for the above change, the first should be changed to "reading publication membership of tables" and the second one should be changed to "reading publication membership of schemas". 3. The function names getPublicationNamespaces and dumpPublicationSchema are not in sync. Let's name the second one as dumpPublicationNamespace. 4. It is not clear to me why the patch has introduced a new component type object DUMP_COMPONENT_PUBSCHEMA. In particular, in the below code if we are already setting DUMP_COMPONENT_ALL, how the additional setting of DUMP_COMPONENT_PUBSCHEMA helps? @@ -1631,9 +1631,13 @@ selectDumpableNamespace(NamespaceInfo *nsinfo, Archive *fout) if (nsinfo->nspowner == ROLE_PG_DATABASE_OWNER) nsinfo->dobj.dump &= ~DUMP_COMPONENT_DEFINITION; nsinfo->dobj.dump_contains = DUMP_COMPONENT_ALL; + nsinfo->dobj.dump |= DUMP_COMPONENT_PUBSCHEMA; } else + { nsinfo->dobj.dump_contains = nsinfo->dobj.dump = DUMP_COMPONENT_ALL; + nsinfo->dobj.dump |= DUMP_COMPONENT_PUBSCHEMA; + } -- With Regards, Amit Kapila.
On Mon, Oct 4, 2021 at 4:55 AM vignesh C <vignesh21@gmail.com> wrote: > > Attached v36 patch has the changes for the same. > I have some comments on the v36-0001 patch: src/backend/commands/publicationcmds.c (1) GetPublicationSchemas() + /* Find all publications associated with the schema */ + pubschsrel = table_open(PublicationNamespaceRelationId, AccessShareLock); I think the comment is not correct. It should say: + /* Find all schemas associated with the publication */ (2) AlterPublicationSchemas I think that a comment should be added for the following lines, something like the comment used in the similar check in AlterPublicationTables(): + if (!schemaidlist && stmt->action != DEFELEM_SET) + return; (3) CheckAlterPublication Minor comment fix suggested: BEFORE: + * Check if relations and schemas can be in given publication and throws AFTER: + * Check if relations and schemas can be in a given publication and throw (4) LockSchemaList() Suggest re-word of comment, to match imperative comment style used elsewhere in this code. BEFORE: + * The schemas specified in the schema list are locked in AccessShareLock mode AFTER: + * Lock the schemas specified in the schema list in AccessShareLock mode src/backend/commands/tablecmds.c (5) Code has been added to prevent a table being set (via ALTER TABLE) to UNLOGGED if it is part of a publication, but I found that I could still add all tables of a schema having a table that is UNLOGGED: postgres=# create schema sch; CREATE SCHEMA postgres=# create unlogged table sch.test(i int); CREATE TABLE postgres=# create publication pub for table sch.test; ERROR: cannot add relation "test" to publication DETAIL: Temporary and unlogged relations cannot be replicated. postgres=# create publication pub for all tables in schema sch; CREATE PUBLICATION Regards, Greg Nancarrow Fujitsu Australia
On Tue, Oct 5, 2021 at 6:57 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > (5) > > Code has been added to prevent a table being set (via ALTER TABLE) to > UNLOGGED if it is part of a publication, but I found that I could > still add all tables of a schema having a table that is UNLOGGED: > > postgres=# create schema sch; > CREATE SCHEMA > postgres=# create unlogged table sch.test(i int); > CREATE TABLE > postgres=# create publication pub for table sch.test; > ERROR: cannot add relation "test" to publication > DETAIL: Temporary and unlogged relations cannot be replicated. > postgres=# create publication pub for all tables in schema sch; > CREATE PUBLICATION > What about when you use "create publication pub for all tables;"? I think that also works, now on similar lines shouldn't the behavior of "all tables in schema" publication be the same? I mean if we want we can detect and give an error but is it advisable to give an error if there are just one or few tables in schema that are unlogged? -- With Regards, Amit Kapila.
On Tue, Oct 5, 2021 at 3:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > Code has been added to prevent a table being set (via ALTER TABLE) to > > UNLOGGED if it is part of a publication, but I found that I could > > still add all tables of a schema having a table that is UNLOGGED: > > > > postgres=# create schema sch; > > CREATE SCHEMA > > postgres=# create unlogged table sch.test(i int); > > CREATE TABLE > > postgres=# create publication pub for table sch.test; > > ERROR: cannot add relation "test" to publication > > DETAIL: Temporary and unlogged relations cannot be replicated. > > postgres=# create publication pub for all tables in schema sch; > > CREATE PUBLICATION > > > > What about when you use "create publication pub for all tables;"? I > think that also works, now on similar lines shouldn't the behavior of > "all tables in schema" publication be the same? I mean if we want we > can detect and give an error but is it advisable to give an error if > there are just one or few tables in schema that are unlogged? > OK, it seems that for the ALL TABLES case, there is no such error check, and it just silently skips replication of any temporary/unlogged tables. This is intentional, right? I couldn't see any documentation specifically related to this, so I think perhaps it should be updated to describe this behaviour. At the moment, the existing documentation just states FOR TABLE that "Temporary tables, unlogged tables, foreign tables, materialized views, and regular views cannot be part of a publication". Yes, I agree that ALL TABLES IN SCHEMA should behave the same as the ALL TABLES case. Problem is, shouldn't setting UNLOGGED on a table only then be disallowed if that table was publicised using FOR TABLE? With the patch applied: postgres=# create publication pub3 for all tables in schema sch; CREATE PUBLICATION postgres=# create table sch.test3(i int); CREATE TABLE postgres=# alter table sch.test3 set unlogged; ERROR: cannot change table "test3" to unlogged because it is part of a publication DETAIL: Unlogged relations cannot be replicated. Regards, Greg Nancarrow Fujitsu Australia
On Tue, Oct 5, 2021 at 4:40 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > >At the > moment, the existing documentation just states FOR TABLE that > "Temporary tables, unlogged tables, foreign tables, materialized > views, and regular views cannot be part of a publication". Oh, I see that in the v36-0004 doc update patch, it has added the following for CREATE PUBLICATION ... FOR ALL TABLES IN SCHEMA: "Only persistent base tables and partitioned tables present in the schema will be included as part of the publication. Temporary tables, unlogged tables, foreign tables, materialized views, and regular views from the schema will not be part of the publication." Regards, Greg Nancarrow Fujitsu Australia
On Tue, Oct 5, 2021 at 11:10 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Tue, Oct 5, 2021 at 3:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > Code has been added to prevent a table being set (via ALTER TABLE) to > > > UNLOGGED if it is part of a publication, but I found that I could > > > still add all tables of a schema having a table that is UNLOGGED: > > > > > > postgres=# create schema sch; > > > CREATE SCHEMA > > > postgres=# create unlogged table sch.test(i int); > > > CREATE TABLE > > > postgres=# create publication pub for table sch.test; > > > ERROR: cannot add relation "test" to publication > > > DETAIL: Temporary and unlogged relations cannot be replicated. > > > postgres=# create publication pub for all tables in schema sch; > > > CREATE PUBLICATION > > > > > > > What about when you use "create publication pub for all tables;"? I > > think that also works, now on similar lines shouldn't the behavior of > > "all tables in schema" publication be the same? I mean if we want we > > can detect and give an error but is it advisable to give an error if > > there are just one or few tables in schema that are unlogged? > > > > .. > Yes, I agree that ALL TABLES IN SCHEMA should behave the same as the > ALL TABLES case. > Problem is, shouldn't setting UNLOGGED on a table only then be > disallowed if that table was publicised using FOR TABLE? > Right, I also think so. Let us see what Vignesh or others think on this matter. -- With Regards, Amit Kapila.
On Tue, Oct 5, 2021 at 4:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Oct 5, 2021 at 11:10 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > > > On Tue, Oct 5, 2021 at 3:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > Code has been added to prevent a table being set (via ALTER TABLE) to > > > > UNLOGGED if it is part of a publication, but I found that I could > > > > still add all tables of a schema having a table that is UNLOGGED: > > > > > > > > postgres=# create schema sch; > > > > CREATE SCHEMA > > > > postgres=# create unlogged table sch.test(i int); > > > > CREATE TABLE > > > > postgres=# create publication pub for table sch.test; > > > > ERROR: cannot add relation "test" to publication > > > > DETAIL: Temporary and unlogged relations cannot be replicated. > > > > postgres=# create publication pub for all tables in schema sch; > > > > CREATE PUBLICATION > > > > > > > > > > What about when you use "create publication pub for all tables;"? I > > > think that also works, now on similar lines shouldn't the behavior of > > > "all tables in schema" publication be the same? I mean if we want we > > > can detect and give an error but is it advisable to give an error if > > > there are just one or few tables in schema that are unlogged? > > > > > > > > .. > > Yes, I agree that ALL TABLES IN SCHEMA should behave the same as the > > ALL TABLES case. > > Problem is, shouldn't setting UNLOGGED on a table only then be > > disallowed if that table was publicised using FOR TABLE? > > > > Right, I also think so. Let us see what Vignesh or others think on this matter. Even I felt ALL TABLES IN SCHEMA should behave the same way as the ALL TABLES case. I will keep the create publication behavior as it is i.e. to allow even if unlogged tables are present and change the below alter table behavior which was throwing error to be successful to keep it similar to ALL TABLES publication: alter table sch.test3 set unlogged; ERROR: cannot change table "test3" to unlogged because it is part of a publication DETAIL: Unlogged relations cannot be replicated. Regards, Vignesh
On Mon, Oct 4, 2021 at 5:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sun, Oct 3, 2021 at 11:25 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Sat, Oct 2, 2021 at 1:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > 2. In GetSchemaPublicationRelations(), I think we need to perform a > > > second scan using RELKIND_PARTITIONED_TABLE only if we > > > publish_via_root (aka pub_partopt is PUBLICATION_PART_ROOT). This is > > > what we are doing in GetAllTablesPublicationRelations. Is there a > > > reason to be different here? > > > > In the first table scan we are getting all the ordinary tables present > > in the schema. In the second table scan we will get all the > > partitioned table present in the schema and the relations will be > > added based on pub_partopt. I felt if we have the check we will not > > get the relations in the following case: > > create schema sch1; > > create schema sch2; > > create table sch1.p (a int) partition by list (a); > > create table sch2.c1 partition of sch1.p for values in (1); > > > > But we don't need to get the partitioned tables for the invalidations, > see the corresponding case for tables. So, I am not sure why you have > used two scans to the system table for such scenarios? The second loop is required to get the 'p' relkind relations like in the below case: create schema sch1; create schema sch2; create table sch1.p1 (a int) partition by list (a); create table sch2.c1 partition of sch1.p1 for values in (1); create table sch2.p2(a int) partition by list (a); create table sch1.c2 partition of sch2.p2 for values in (1); create publication pub1 for all tables in schema sch2; The first loop will give us sch2.c1 relation and the second loop will get sch2.p2, sch1.c2, this is required for schema publication as the schema can have both r relkind and p relkind tables and all of them need to be invalidated. This is not required in case of table publication as we can get the required relations from that particular table. > Few additional comments on > v36-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN: > ========================================================================= > 1. > @@ -3961,21 +3965,25 @@ getPublications(Archive *fout, int *numPublications) > appendPQExpBuffer(query, > "SELECT p.tableoid, p.oid, p.pubname, " > "(%s p.pubowner) AS rolname, " > - "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, > p.pubtruncate, p.pubviaroot " > + "p.puballtables, p.pubinsert, p.pubupdate, " > + "p.pubdelete, p.pubtruncate, p.pubviaroot " > "FROM pg_publication p", > username_subquery); > else if (fout->remoteVersion >= 110000) > appendPQExpBuffer(query, > "SELECT p.tableoid, p.oid, p.pubname, " > "(%s p.pubowner) AS rolname, " > - "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, > p.pubtruncate, false AS pubviaroot " > + "p.puballtables, p.pubinsert, p.pubupdate, " > + "p.pubdelete, p.pubtruncate, false AS pubviaroot " > "FROM pg_publication p", > username_subquery); > else > appendPQExpBuffer(query, > "SELECT p.tableoid, p.oid, p.pubname, " > "(%s p.pubowner) AS rolname, " > - "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, false AS > pubtruncate, false AS pubviaroot " > + "p.puballtables, p.pubinsert, p.pubupdate, " > + "p.pubdelete, false AS pubtruncate, " > + "false AS pubviaroot " > "FROM pg_publication p", > username_subquery); > > Is there a reason to change this part of the code? It is not required, I have removed it. > 2. > @@ -257,6 +257,9 @@ getSchemaData(Archive *fout, int *numTablesPtr) > pg_log_info("reading publication membership"); > getPublicationTables(fout, tblinfo, numTables); > > + pg_log_info("reading publication tables in schemas"); > + getPublicationNamespaces(fout, nspinfo, numNamespaces); > > I think for the above change, the first should be changed to "reading > publication membership of tables" and the second one should be changed > to "reading publication membership of schemas". Modified > 3. The function names getPublicationNamespaces and > dumpPublicationSchema are not in sync. Let's name the second one as > dumpPublicationNamespace. Modified > 4. It is not clear to me why the patch has introduced a new component > type object DUMP_COMPONENT_PUBSCHEMA. In particular, in the below code > if we are already setting DUMP_COMPONENT_ALL, how the additional > setting of DUMP_COMPONENT_PUBSCHEMA helps? > > @@ -1631,9 +1631,13 @@ selectDumpableNamespace(NamespaceInfo *nsinfo, > Archive *fout) > if (nsinfo->nspowner == ROLE_PG_DATABASE_OWNER) > nsinfo->dobj.dump &= ~DUMP_COMPONENT_DEFINITION; > nsinfo->dobj.dump_contains = DUMP_COMPONENT_ALL; > + nsinfo->dobj.dump |= DUMP_COMPONENT_PUBSCHEMA; > } > else > + { > nsinfo->dobj.dump_contains = nsinfo->dobj.dump = DUMP_COMPONENT_ALL; > + nsinfo->dobj.dump |= DUMP_COMPONENT_PUBSCHEMA; > + } I have removed DUMP_COMPONENT_PUBSCHEMA and used DUMP_COMPONENT_NONE to identify the publications that should be dumped. Attached v37 patch has the changes for the same. Regards, Vignesh
Attachment
- v37-0001-Added-schema-level-support-for-publication.patch
- v37-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v37-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v37-0004-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v37-0005-Implemented-pg_publication_objects-view.patch
On Tue, Oct 5, 2021 at 6:57 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Mon, Oct 4, 2021 at 4:55 AM vignesh C <vignesh21@gmail.com> wrote: > > > > Attached v36 patch has the changes for the same. > > > > I have some comments on the v36-0001 patch: > > src/backend/commands/publicationcmds.c > > (1) > GetPublicationSchemas() > > + /* Find all publications associated with the schema */ > + pubschsrel = table_open(PublicationNamespaceRelationId, AccessShareLock); > > I think the comment is not correct. It should say: > > + /* Find all schemas associated with the publication */ Modified > (2) > AlterPublicationSchemas > > I think that a comment should be added for the following lines, > something like the comment used in the similar check in > AlterPublicationTables(): > > + if (!schemaidlist && stmt->action != DEFELEM_SET) > + return; Modified > (3) > CheckAlterPublication > > Minor comment fix suggested: > > BEFORE: > + * Check if relations and schemas can be in given publication and throws > AFTER: > + * Check if relations and schemas can be in a given publication and throw Modified > (4) > LockSchemaList() > > Suggest re-word of comment, to match imperative comment style used > elsewhere in this code. > > BEFORE: > + * The schemas specified in the schema list are locked in AccessShareLock mode > AFTER: > + * Lock the schemas specified in the schema list in AccessShareLock mode Modified > > src/backend/commands/tablecmds.c > > (5) > > Code has been added to prevent a table being set (via ALTER TABLE) to > UNLOGGED if it is part of a publication, but I found that I could > still add all tables of a schema having a table that is UNLOGGED: > > postgres=# create schema sch; > CREATE SCHEMA > postgres=# create unlogged table sch.test(i int); > CREATE TABLE > postgres=# create publication pub for table sch.test; > ERROR: cannot add relation "test" to publication > DETAIL: Temporary and unlogged relations cannot be replicated. > postgres=# create publication pub for all tables in schema sch; > CREATE PUBLICATION I have changed the alter table behavior to allow setting it to an unlogged table to keep the behavior similar to "ALL TABLES" publication. I have kept the create publication behavior as it is, it will be similar to "ALL TABLES" publication i.e to allow create publication even if there are unlogged tables present. These comments are handled in the v37 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm0ON%3D012jGC3oquSVVWTWXhHG0q8yOyRROVGFR9PjWa-g%40mail.gmail.com Regards, Vignesh
On Wed, Oct 6, 2021 at 4:42 PM vignesh C <vignesh21@gmail.com> wrote: > > Attached v37 patch has the changes for the same. > A small issue I noticed is that using "\dS" in PSQL shows UNLOGGED tables as belonging to a publication, if the table belongs to a schema that was added to the publication using ALL TABLES IN SCHEMA (yet doesn't show as part of an ALL TABLES publication). Since publication of UNLOGGED tables is silently skipped in the case of ALL TABLES and ALL TABLES IN SCHEMA, it shouldn't show as belonging to the publication, right? test=# \dRp+ pub2 Publication pub2 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root -------+------------+---------+---------+---------+-----------+---------- gregn | t | t | t | t | t | f (1 row) test=# \dRp+ pub Publication pub Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root -------+------------+---------+---------+---------+-----------+---------- gregn | f | t | t | t | t | t Tables from schemas: "sch1" test=# \dS sch1.test2 Unlogged table "sch1.test2" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- i | integer | | | Publications: "pub" Regards, Greg Nancarrow Fujitsu Australia
On Wed, Oct 6, 2021 at 11:12 AM vignesh C <vignesh21@gmail.com> wrote: > > Attached v37 patch has the changes for the same. > Few comments: ============== v37-0001-Added-schema-level-support-for-publication 1. + * + * The first scan will get all the 'r' relkind tables for the specified schema, + * iterate the 'r' relkind tables and prepare a list of: + * 1) non partition table if pub_partopt is PUBLICATION_PART_ROOT + * 2) partition table and non partition table if pub_partopt is + * PUBLICATION_PART_LEAF. + * + * The second scan will get all the 'p'' relkind tables for the specified + * schema, iterate the 'p' relkind tables and prepare a list of: + * 1) partition table's child relations if pub_partopt is PUBLICATION_PART_LEAF + * 2) partition table if pub_partopt is PUBLICATION_PART_ROOT. I think these comments are redundant and not sure if they are completely correct. We don't need these as the actual code explains these conditions better. The earlier part of these comments is sufficient. v37-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN 2. + * selectDumpablePublicationObject: policy-setting subroutine + * Mark a publication as to be dumped or not * - * Publication tables have schemas, but those are ignored in decision making, + * Publications have schemas, but those are ignored in decision making, * because publications are only dumped when we are dumping everything. */ Change the above comment lines: a. "Mark a publication as to be dumped or not" to "Mark a publication object as to be dumped or not". b. "Publications have schemas, but those are ignored in decision making, .." to "A publication can have schemas and tables which have schemas, but those are ignored in decision making, .." 3. +/* + * dumpPublicationNamespace + * dump the definition of the given publication tables in schema mapping + */ Can we change the comment to: "dump the definition of the given publication schema mapping"? IT is easier to read and understand. 4. +/* + * The PublicationSchemaInfo struct is used to represent publication tables + * in schema mapping. + */ +typedef struct _PublicationSchemaInfo +{ + DumpableObject dobj; + NamespaceInfo *pubschema; + PublicationInfo *publication; +} PublicationSchemaInfo; Can we change the comment similar to the comment change in point 3? Also, let's keep PublicationInfo * before NamespaceInfo * just to be consistent with the existing structure PublicationRelInfo? 5. + printfPQExpBuffer(&buf, + "SELECT p.pubname\n" + "FROM pg_catalog.pg_publication p\n" + " JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n" + " JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid AND pc.oid = '%s'\n" I think this part of the query can be improved in multiple ways: (a) pc.oid = '%s' should be part of WHERE clause not join condition, (b) for pubname, no need to use alias name, it can be directly referred as pubname, (c) you can check if the relation is publishable. So, the formed query would look like: SELECT p.pubname FROM pg_catalog.pg_publication p JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid WHERE pc.oid = '%s' and pg_catalog.pg_relation_is_publishable('%s') 6. listSchemas() { .. + if (pub_schema_tuples > 0) + { + /* + * Allocate memory for footers. Size of footers will be 1 (for + * storing "Publications:" string) + Schema count + 1 (for + * storing NULL). + */ + footers = (char **) palloc((1 + pub_schema_tuples + 1) * sizeof(char *)); + footers[0] = pstrdup(_("Publications:")); + + /* Might be an empty set - that's ok */ + for (i = 0; i < pub_schema_tuples; i++) + { + printfPQExpBuffer(&buf, " \"%s\"", + PQgetvalue(result, i, 0)); + + footers[i + 1] = pstrdup(buf.data); + } + + footers[i + 1] = NULL; + myopt.footers = footers; + } .. } Is there a reason of not using printTableAddFooter() here similar to how we use it to print Publications in describeOneTableDetails()? 7. describePublications() { .. + /* Get the schemas for the specified publication */ + printfPQExpBuffer(&buf, + "SELECT n.nspname\n" + "FROM pg_catalog.pg_namespace n,\n" + " pg_catalog.pg_publication_namespace pn\n" + "WHERE n.oid = pn.pnnspid\n" + " AND pn.pnpubid = '%s'\n" + "ORDER BY 1", pubid); + if (!addFooterToPublicationDesc(&buf, "Tables from schemas:", + true, &cont)) + goto error_return; .. } Shouldn't we try to get schemas only when pset.sversion >= 150000? 8. +addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg, + bool singlecol, printTableContent *cont) +{ .. + termPQExpBuffer(buf); .. } It seems this buffer is freed at the caller's site, if so, no need to free it here. -- With Regards, Amit Kapila.
On Thu, Oct 7, 2021 at 5:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Oct 6, 2021 at 11:12 AM vignesh C <vignesh21@gmail.com> wrote: > > > > Attached v37 patch has the changes for the same. > > > > Few comments: > ============== > Few more comments: ==================== v37-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN 1. + else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES", "IN", "SCHEMA")) + COMPLETE_WITH_QUERY(Query_for_list_of_schemas + " UNION SELECT 'CURRENT_SCHEMA' " + "UNION SELECT 'WITH ('"); What is the need to display WITH here? It will be displayed after Schema name with the below rule: + else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES", "IN", "SCHEMA", MatchAny)) + COMPLETE_WITH("WITH ("); 2. It seems tab-completion happens incorrectly for the below case: create publication pub for all tables in schema s1, If I press the tab after above, it completes with below which is wrong because it will lead to incorrect syntax. create publication pub for all tables in schema s1, WITH ( v37-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication 3. --- a/src/bin/pg_dump/t/002_pg_dump.pl +++ b/src/bin/pg_dump/t/002_pg_dump.pl .. + 'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test' => { + create_order => 51, + create_sql => + 'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;', + regexp => qr/^ + \QALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;\E + /xm, + like => { %full_runs, section_post_data => 1, }, + unlike => { exclude_dump_test_schema => 1, }, In this test, won't it exclude the schema dump_test because of unlike? If so, then we don't have coverage for "ALL Tables In Schema" except for public schema? 4. --- a/src/test/regress/expected/publication.out +++ b/src/test/regress/expected/publication.out .. +-- fail - can't add schema to for all tables publication +ALTER PUBLICATION testpub_foralltables ADD ALL TABLES IN SCHEMA pub_test; In the above and all similar comments, it is better to either quote 'for all tables' or write in CAPS FOR ALL TABLE or both 'FOR ALL TABLE'. 5. +\dRp+ testpub1_forschema + Publication testpub1_forschema + Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root +--------------------------+------------+---------+---------+---------+-----------+---------- + regress_publication_user | f | t | t | t | t | f +Tables from schemas: + "pub_test1" + +SELECT p.pubname FROM pg_catalog.pg_publication p, pg_catalog.pg_namespace n, pg_catalog.pg_publication_namespace pn WHERE n.oid = pn.pnnspid AND p.oid = pn.pnpubid AND n.nspname = 'pub_test1' ORDER BY 1; + pubname +-------------------- + testpub1_forschema +(1 row) I don't think in the above and similar tests, we need to separately check the presence of publication via Select query, if we have tested it via psql command. Let's try to keep the meaningful tests. 6. +INSERT INTO pub_test1.tbl VALUES(1, 'test'); +-- fail +UPDATE pub_test1.tbl SET id = 2; +ERROR: cannot update table "tbl" because it does not have a replica identity and publishes updates +HINT: To enable updating the table, set REPLICA IDENTITY using ALTER TABLE. +ALTER PUBLICATION testpub1_forschema DROP ALL TABLES IN SCHEMA pub_test1; +-- success +UPDATE pub_test1.tbl SET id = 2; +ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1; +-- fail +UPDATE pub_test1.tbl SET id = 2; +ERROR: cannot update table "tbl" because it does not have a replica identity and publishes updates +HINT: To enable updating the table, set REPLICA IDENTITY using ALTER TABLE. +ALTER PUBLICATION testpub1_forschema DROP ALL TABLES IN SCHEMA pub_test1; +-- success +UPDATE pub_test1.tbl SET id = 2; +ALTER PUBLICATION testpub1_forschema ADD ALL TABLES IN SCHEMA pub_test1; +-- fail +UPDATE pub_test1.tbl SET id = 2; +ERROR: cannot update table "tbl" because it does not have a replica identity and publishes updates +HINT: To enable updating the table, set REPLICA IDENTITY using ALTER TABLE. I think here we don't need to test both SET and ADD variants, one of them is sufficient. 7. +-- verify invalidation of partition table having partition on different schema I think this comment is not very clear to me. Can we change it to: "verify invalidation of partition table having parent and child tables in different schema"? 8. +-- verify invalidation of schema having partition parent table and partition child table Similarly, let's change this to: "verify invalidation of partition tables for schema publication that has parent and child tables of different partition hierarchies". Keep comments line boundary as 80 chars, that way they look readable. 9. +++ b/src/test/subscription/t/025_rep_changes_for_schema.pl .. +# Basic logical replication test Let's change this comment to: "Logical replication tests for schema publications" -- With Regards, Amit Kapila.
> On Friday, October 8, 2021 7:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > v37-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication > 3. > --- a/src/bin/pg_dump/t/002_pg_dump.pl > +++ b/src/bin/pg_dump/t/002_pg_dump.pl > .. > + 'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test' => { > + create_order => 51, > + create_sql => > + 'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;', > + regexp => qr/^ > + \QALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;\E > + /xm, > + like => { %full_runs, section_post_data => 1, }, > + unlike => { exclude_dump_test_schema => 1, }, > > In this test, won't it exclude the schema dump_test because of unlike? > If so, then we don't have coverage for "ALL Tables In Schema" except > for public schema? > Yes, the unlike case will exclude the schema dump_test, but I think schema dump_test could be dumped in like case. I checked the log file src/bin/pg_dump/tmp_check/log/regress_log_002_pg_dump and saw some cases were described as "should dump ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test". I think in these cases schema dump_test would be dumped. Besides, a small comment on tab-complete.c: else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL")) - COMPLETE_WITH("TABLES"); - else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES") - || Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny)) + COMPLETE_WITH("TABLES", "TABLE IN SCHEMA"); COMPLETE_WITH("TABLES", "TABLE IN SCHEMA"); -> COMPLETE_WITH("TABLES", "TABLES IN SCHEMA"); Regards Tang
On Thu, Oct 7, 2021 at 12:51 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Wed, Oct 6, 2021 at 4:42 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Attached v37 patch has the changes for the same. > > > > A small issue I noticed is that using "\dS" in PSQL shows UNLOGGED > tables as belonging to a publication, if the table belongs to a schema > that was added to the publication using ALL TABLES IN SCHEMA (yet > doesn't show as part of an ALL TABLES publication). > Since publication of UNLOGGED tables is silently skipped in the case > of ALL TABLES and ALL TABLES IN SCHEMA, it shouldn't show as belonging > to the publication, right? > > test=# \dRp+ pub2 > Publication pub2 > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root > -------+------------+---------+---------+---------+-----------+---------- > gregn | t | t | t | t | t | f > (1 row) > > test=# \dRp+ pub > Publication pub > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root > -------+------------+---------+---------+---------+-----------+---------- > gregn | f | t | t | t | t | t > Tables from schemas: > "sch1" > > test=# \dS sch1.test2 > Unlogged table "sch1.test2" > Column | Type | Collation | Nullable | Default > --------+---------+-----------+----------+--------- > i | integer | | | > Publications: > "pub" Thanks for reporting the issue. The attached patch has the fix for the issue. Regards, Vignesh
Attachment
- v38-0001-Added-schema-level-support-for-publication.patch
- v38-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v38-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v38-0004-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v38-0005-Implemented-pg_publication_objects-view.patch
On Thu, Oct 7, 2021 at 5:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Oct 6, 2021 at 11:12 AM vignesh C <vignesh21@gmail.com> wrote: > > > > Attached v37 patch has the changes for the same. > > > > Few comments: > ============== > v37-0001-Added-schema-level-support-for-publication > 1. > + * > + * The first scan will get all the 'r' relkind tables for the specified schema, > + * iterate the 'r' relkind tables and prepare a list of: > + * 1) non partition table if pub_partopt is PUBLICATION_PART_ROOT > + * 2) partition table and non partition table if pub_partopt is > + * PUBLICATION_PART_LEAF. > + * > + * The second scan will get all the 'p'' relkind tables for the specified > + * schema, iterate the 'p' relkind tables and prepare a list of: > + * 1) partition table's child relations if pub_partopt is PUBLICATION_PART_LEAF > + * 2) partition table if pub_partopt is PUBLICATION_PART_ROOT. > > I think these comments are redundant and not sure if they are > completely correct. We don't need these as the actual code explains > these conditions better. The earlier part of these comments is > sufficient. Removed it. > v37-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN > 2. > + * selectDumpablePublicationObject: policy-setting subroutine > + * Mark a publication as to be dumped or not > * > - * Publication tables have schemas, but those are ignored in decision making, > + * Publications have schemas, but those are ignored in decision making, > * because publications are only dumped when we are dumping everything. > */ > > Change the above comment lines: > a. "Mark a publication as to be dumped or not" to "Mark a publication > object as to be dumped or not". > > b. "Publications have schemas, but those are ignored in decision > making, .." to "A publication can have schemas and tables which have > schemas, but those are ignored in decision making, .." Modified > 3. > +/* > + * dumpPublicationNamespace > + * dump the definition of the given publication tables in schema mapping > + */ > > Can we change the comment to: "dump the definition of the given > publication schema mapping"? IT is easier to read and understand. > > 4. > +/* > + * The PublicationSchemaInfo struct is used to represent publication tables > + * in schema mapping. > + */ > +typedef struct _PublicationSchemaInfo > +{ > + DumpableObject dobj; > + NamespaceInfo *pubschema; > + PublicationInfo *publication; > +} PublicationSchemaInfo; > > Can we change the comment similar to the comment change in point 3? > Also, let's keep PublicationInfo * before NamespaceInfo * just to be > consistent with the existing structure PublicationRelInfo? > > 5. > + printfPQExpBuffer(&buf, > + "SELECT p.pubname\n" > + "FROM pg_catalog.pg_publication p\n" > + " JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n" > + " JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid AND > pc.oid = '%s'\n" > > I think this part of the query can be improved in multiple ways: (a) > pc.oid = '%s' should be part of WHERE clause not join condition, (b) > for pubname, no need to use alias name, it can be directly referred as > pubname, (c) you can check if the relation is publishable. So, the > formed query would look like: > > SELECT p.pubname FROM pg_catalog.pg_publication p JOIN > pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid JOIN > pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid WHERE pc.oid = > '%s' and pg_catalog.pg_relation_is_publishable('%s') Modified > 6. > listSchemas() > { > .. > + if (pub_schema_tuples > 0) > + { > + /* > + * Allocate memory for footers. Size of footers will be 1 (for > + * storing "Publications:" string) + Schema count + 1 (for > + * storing NULL). > + */ > + footers = (char **) palloc((1 + pub_schema_tuples + 1) * sizeof(char *)); > + footers[0] = pstrdup(_("Publications:")); > + > + /* Might be an empty set - that's ok */ > + for (i = 0; i < pub_schema_tuples; i++) > + { > + printfPQExpBuffer(&buf, " \"%s\"", > + PQgetvalue(result, i, 0)); > + > + footers[i + 1] = pstrdup(buf.data); > + } > + > + footers[i + 1] = NULL; > + myopt.footers = footers; > + } > .. > } > > Is there a reason of not using printTableAddFooter() here similar to > how we use it to print Publications in describeOneTableDetails()? There are 2 ways to print table in psql: 1) call printTableInit, printTableAddHeader, printTableAddCell, printTableAddFooter, printTable & printTableCleanup to print the table 2) prepare the table contents and call printQuery to print the table(which will take care of handling all of the above) describeOneTableDetails uses 1st method listSchemas uses 2nd method, in case of this method since table is not initialized we cannot use printTableAddFooter. we have to prepare the footers and set the footers. > 7. > describePublications() > { > .. > + /* Get the schemas for the specified publication */ > + printfPQExpBuffer(&buf, > + "SELECT n.nspname\n" > + "FROM pg_catalog.pg_namespace n,\n" > + " pg_catalog.pg_publication_namespace pn\n" > + "WHERE n.oid = pn.pnnspid\n" > + " AND pn.pnpubid = '%s'\n" > + "ORDER BY 1", pubid); > + if (!addFooterToPublicationDesc(&buf, "Tables from schemas:", > + true, &cont)) > + goto error_return; > .. > } > > Shouldn't we try to get schemas only when pset.sversion >= 150000? Modified > 8. > +addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg, > + bool singlecol, printTableContent *cont) > +{ > .. > + termPQExpBuffer(buf); > .. > } > > It seems this buffer is freed at the caller's site, if so, no need to > free it here. > Modified These comments are fixed in the v38 patch attached. Regards, Vignesh
Attachment
- v38-0001-Added-schema-level-support-for-publication.patch
- v38-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v38-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v38-0004-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v38-0005-Implemented-pg_publication_objects-view.patch
On Fri, Oct 8, 2021 at 4:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Oct 7, 2021 at 5:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Wed, Oct 6, 2021 at 11:12 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Attached v37 patch has the changes for the same. > > > > > > > Few comments: > > ============== > > > > Few more comments: > ==================== > v37-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN > 1. > + else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", > "TABLES", "IN", "SCHEMA")) > + COMPLETE_WITH_QUERY(Query_for_list_of_schemas > + " UNION SELECT 'CURRENT_SCHEMA' " > + "UNION SELECT 'WITH ('"); > > What is the need to display WITH here? It will be displayed after > Schema name with the below rule: > + else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", > "TABLES", "IN", "SCHEMA", MatchAny)) > + COMPLETE_WITH("WITH ("); Removed it. > 2. It seems tab-completion happens incorrectly for the below case: > create publication pub for all tables in schema s1, > > If I press the tab after above, it completes with below which is wrong > because it will lead to incorrect syntax. > > create publication pub for all tables in schema s1, WITH ( Modified > v37-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication > 3. > --- a/src/bin/pg_dump/t/002_pg_dump.pl > +++ b/src/bin/pg_dump/t/002_pg_dump.pl > .. > + 'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test' => { > + create_order => 51, > + create_sql => > + 'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;', > + regexp => qr/^ > + \QALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;\E > + /xm, > + like => { %full_runs, section_post_data => 1, }, > + unlike => { exclude_dump_test_schema => 1, }, > > In this test, won't it exclude the schema dump_test because of unlike? > If so, then we don't have coverage for "ALL Tables In Schema" except > for public schema? I noticed that the tap test framework runs pg_dump tests with various combinations, these tests will be covered in the like tests as Tang suggested at [1]. In the exclude_dump_test_schema since this sql will not be present, exclude_dump_test_schema should be added in the unlike option, as the validation will fail for exclude-schema option which will be run with the following command: pg_dump --no-sync --file=/home/vignesh/postgres/src/bin/pg_dump/tmp_check/tmp_test_4i8F/exclude_dump_test_schema.sql --exclude-schema=dump_test postgres > 4. > --- a/src/test/regress/expected/publication.out > +++ b/src/test/regress/expected/publication.out > .. > +-- fail - can't add schema to for all tables publication > +ALTER PUBLICATION testpub_foralltables ADD ALL TABLES IN SCHEMA pub_test; > > In the above and all similar comments, it is better to either quote > 'for all tables' or write in CAPS FOR ALL TABLE or both 'FOR ALL > TABLE'. Modified > 5. > +\dRp+ testpub1_forschema > + Publication testpub1_forschema > + Owner | All tables | Inserts | Updates | Deletes > | Truncates | Via root > +--------------------------+------------+---------+---------+---------+-----------+---------- > + regress_publication_user | f | t | t | t > | t | f > +Tables from schemas: > + "pub_test1" > + > +SELECT p.pubname FROM pg_catalog.pg_publication p, > pg_catalog.pg_namespace n, pg_catalog.pg_publication_namespace pn > WHERE n.oid = pn.pnnspid AND p.oid = pn.pnpubid AND n.nspname = > 'pub_test1' ORDER BY 1; > + pubname > +-------------------- > + testpub1_forschema > +(1 row) > > I don't think in the above and similar tests, we need to separately > check the presence of publication via Select query, if we have tested > it via psql command. Let's try to keep the meaningful tests. Removed it. > 6. > +INSERT INTO pub_test1.tbl VALUES(1, 'test'); > +-- fail > +UPDATE pub_test1.tbl SET id = 2; > +ERROR: cannot update table "tbl" because it does not have a replica > identity and publishes updates > +HINT: To enable updating the table, set REPLICA IDENTITY using ALTER TABLE. > +ALTER PUBLICATION testpub1_forschema DROP ALL TABLES IN SCHEMA pub_test1; > +-- success > +UPDATE pub_test1.tbl SET id = 2; > +ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1; > +-- fail > +UPDATE pub_test1.tbl SET id = 2; > +ERROR: cannot update table "tbl" because it does not have a replica > identity and publishes updates > +HINT: To enable updating the table, set REPLICA IDENTITY using ALTER TABLE. > +ALTER PUBLICATION testpub1_forschema DROP ALL TABLES IN SCHEMA pub_test1; > +-- success > +UPDATE pub_test1.tbl SET id = 2; > +ALTER PUBLICATION testpub1_forschema ADD ALL TABLES IN SCHEMA pub_test1; > +-- fail > +UPDATE pub_test1.tbl SET id = 2; > +ERROR: cannot update table "tbl" because it does not have a replica > identity and publishes updates > +HINT: To enable updating the table, set REPLICA IDENTITY using ALTER TABLE. > > I think here we don't need to test both SET and ADD variants, one of > them is sufficient. Removed it > 7. > +-- verify invalidation of partition table having partition on different schema > > I think this comment is not very clear to me. Can we change it to: > "verify invalidation of partition table having parent and child tables > in different schema"? Modified > 8. > +-- verify invalidation of schema having partition parent table and > partition child table > > Similarly, let's change this to: "verify invalidation of partition > tables for schema publication that has parent and child tables of > different partition hierarchies". Keep comments line boundary as 80 > chars, that way they look readable. Modified > 9. > +++ b/src/test/subscription/t/025_rep_changes_for_schema.pl > .. > +# Basic logical replication test > > Let's change this comment to: "Logical replication tests for schema > publications" Modified These comments are handled in the v38 patch attached at [2]. [1] - https://www.postgresql.org/message-id/OS0PR01MB6113A715FB3B85907458F4E7FBB59%40OS0PR01MB6113.jpnprd01.prod.outlook.com [2] - https://www.postgresql.org/message-id/CALDaNm1TP9S0dif2QWoEUcCtNDop1xJ6Rj1xnu2vS92%3Dj9ahYw%40mail.gmail.com Regards, Vignesh
On Mon, Oct 11, 2021 at 7:46 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > > On Friday, October 8, 2021 7:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > v37-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication > > 3. > > --- a/src/bin/pg_dump/t/002_pg_dump.pl > > +++ b/src/bin/pg_dump/t/002_pg_dump.pl > > .. > > + 'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test' => { > > + create_order => 51, > > + create_sql => > > + 'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;', > > + regexp => qr/^ > > + \QALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;\E > > + /xm, > > + like => { %full_runs, section_post_data => 1, }, > > + unlike => { exclude_dump_test_schema => 1, }, > > > > In this test, won't it exclude the schema dump_test because of unlike? > > If so, then we don't have coverage for "ALL Tables In Schema" except > > for public schema? > > > > Yes, the unlike case will exclude the schema dump_test, but I think schema dump_test could be > dumped in like case. > I checked the log file src/bin/pg_dump/tmp_check/log/regress_log_002_pg_dump and > saw some cases were described as "should dump ALTER PUBLICATION pub3 ADD ALL > TABLES IN SCHEMA dump_test". I think in these cases schema dump_test would be > dumped. I agree > Besides, a small comment on tab-complete.c: > > else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL")) > - COMPLETE_WITH("TABLES"); > - else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES") > - || Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny)) > + COMPLETE_WITH("TABLES", "TABLE IN SCHEMA"); > > > COMPLETE_WITH("TABLES", "TABLE IN SCHEMA"); > -> > COMPLETE_WITH("TABLES", "TABLES IN SCHEMA"); Modified. This issue is fixed in the v38 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm1TP9S0dif2QWoEUcCtNDop1xJ6Rj1xnu2vS92%3Dj9ahYw%40mail.gmail.com Regards, Vignesh
On Monday, October 11, 2021 2:39 PM vignesh C <vignesh21@gmail.com> wrote: > > These comments are fixed in the v38 patch attached. Thanks for updating the patches. Here are a few comments on the v38-0004-Doc patch. 1. + <para> + Adding/Setting a table that is part of schema specified in + <literal>ALL TABLES IN SCHEMA</literal>, adding/setting a schema to a + publication along with same schema's table specified with + <literal>TABLE</literal>, adding/setting a schema to a publication that + already has a table that is part of specified schema or adding/setting a + table to a publication that already has a table's schema as part of + specified schema is not supported. ISTM we can remove the description "adding/setting a schema to a publication along with same schema's table specified with <literal>TABLE</literal>", because it seems the same as the first mentioned case "Adding/Setting a table that is part of schema specified in <literal>ALL TABLES IN SCHEMA</literal>" 2. +</programlisting></para> + + <para> + Add some schemas to the publication: +<programlisting> +ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing_june, sales_june; +</programlisting> + </para> + + <para> + Add some tables and schemas to the publication: ... + + <para> + Drop some schemas from the publication: ... + <para> + Set some schemas to the publication: +<programlisting> +ALTER PUBLICATION production_publication SET ALL TABLES IN SCHEMA production_september, production_october; Personally, I think we don't need the example about DROP and SET here. The example of ADD seems sufficient. 3. +</programlisting> + </para> + + <para> + Create a publication that publishes all changes for all the tables present in + the schema "production": +<programlisting> +CREATE PUBLICATION production_publication FOR ALL TABLES IN SCHEMA production; +</programlisting> + </para> ... + <para> + Create a publication that publishes all changes for all the tables present in + the schemas "marketing" and "sales": +<programlisting> +CREATE PUBLICATION sales_publication FOR ALL TABLES IN SCHEMA marketing, sales; I think the example for publishing all the tables in schemas "marketing" and "sales" is sufficient, the example for pulishing signal schema seems can be removed. Best regards, Hou zj
On Mon, Oct 11, 2021 at 5:39 PM vignesh C <vignesh21@gmail.com> wrote: > > These comments are fixed in the v38 patch attached. > Thanks for the updates. I noticed that these patches don't apply on the latest source (last seemed to apply cleanly on HEAD as at about October 6). Regards, Greg Nancarrow Fujitsu Australia
On Mon, Oct 11, 2021 at 1:21 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Mon, Oct 11, 2021 at 5:39 PM vignesh C <vignesh21@gmail.com> wrote: > > > > These comments are fixed in the v38 patch attached. > > > > Thanks for the updates. > I noticed that these patches don't apply on the latest source (last > seemed to apply cleanly on HEAD as at about October 6). I was not able to apply v37 patches, but I was able to apply the v38 version of patches on top of HEAD. Regards, Vignesh
On Mon, Oct 11, 2021 at 12:50 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Monday, October 11, 2021 2:39 PM vignesh C <vignesh21@gmail.com> wrote: > > > > These comments are fixed in the v38 patch attached. > > Thanks for updating the patches. > Here are a few comments on the v38-0004-Doc patch. > > 1. > + <para> > + Adding/Setting a table that is part of schema specified in > + <literal>ALL TABLES IN SCHEMA</literal>, adding/setting a schema to a > + publication along with same schema's table specified with > + <literal>TABLE</literal>, adding/setting a schema to a publication that > + already has a table that is part of specified schema or adding/setting a > + table to a publication that already has a table's schema as part of > + specified schema is not supported. > > ISTM we can remove the description "adding/setting a schema to a publication > along with same schema's table specified with <literal>TABLE</literal>", > because it seems the same as the first mentioned case "Adding/Setting a table > that is part of schema specified in <literal>ALL TABLES IN SCHEMA</literal>" Modified > 2. > > +</programlisting></para> > + > + <para> > + Add some schemas to the publication: > +<programlisting> > +ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing_june, sales_june; > +</programlisting> > + </para> > + > + <para> > + Add some tables and schemas to the publication: > ... > + > + <para> > + Drop some schemas from the publication: > ... > + <para> > + Set some schemas to the publication: > +<programlisting> > +ALTER PUBLICATION production_publication SET ALL TABLES IN SCHEMA production_september, production_october; > > Personally, I think we don't need the example about DROP and SET here. > The example of ADD seems sufficient. Modified > 3. > +</programlisting> > + </para> > + > + <para> > + Create a publication that publishes all changes for all the tables present in > + the schema "production": > +<programlisting> > +CREATE PUBLICATION production_publication FOR ALL TABLES IN SCHEMA production; > +</programlisting> > + </para> > ... > + <para> > + Create a publication that publishes all changes for all the tables present in > + the schemas "marketing" and "sales": > +<programlisting> > +CREATE PUBLICATION sales_publication FOR ALL TABLES IN SCHEMA marketing, sales; > > I think the example for publishing all the tables in schemas "marketing" and > "sales" is sufficient, the example for pulishing signal schema seems can be > removed. Modified The attached v39 patch has the fixes for the above issues. Regards, Vignesh
Attachment
- v39-0001-Added-schema-level-support-for-publication.patch
- v39-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v39-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v39-0004-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v39-0005-Implemented-pg_publication_objects-view.patch
On Monday, October 11, 2021 11:02 PM vignesh C <vignesh21@gmail.com> wrote: > The attached v39 patch has the fixes for the above issues. Thanks for the updates. I have a few minor suggestions about the testcases in the v39-0003-Test patch. 1) +-- alter publication drop CURRENT_SCHEMA +ALTER PUBLICATION testpub1_forschema DROP ALL TABLES IN SCHEMA CURRENT_SCHEMA; +\dRp+ testpub1_forschema Since we already tested CURRENT_SCHEMA in various CREATE PUBLICATION cases, maybe we don't need to test it again in SET/DROP/ADD cases. 2) +-- alter publication set schema +ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1; +\dRp+ testpub1_forschema + +-- alter publication set multiple schema +ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1, pub_test2; +\dRp+ testpub1_forschema + I think the multiple schemas testcase is sufficient, maybe we can remove the single schema case. 3) + +-- alter publication set it with the same schema +ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1, pub_test2; +\dRp+ testpub1_forschema ISTM, we didn't have some special code path for this case, maybe we can remove this testcase. Best regards, Hou zj
On Tue, Oct 12, 2021 at 12:24 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Monday, October 11, 2021 11:02 PM vignesh C <vignesh21@gmail.com> wrote: > > The attached v39 patch has the fixes for the above issues. > > Thanks for the updates. > I have a few minor suggestions about the testcases in the v39-0003-Test patch. > > 1) > +-- alter publication drop CURRENT_SCHEMA > +ALTER PUBLICATION testpub1_forschema DROP ALL TABLES IN SCHEMA CURRENT_SCHEMA; > +\dRp+ testpub1_forschema > > Since we already tested CURRENT_SCHEMA in various CREATE PUBLICATION cases, maybe > we don't need to test it again in SET/DROP/ADD cases. Modified > 2) > +-- alter publication set schema > +ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1; > +\dRp+ testpub1_forschema > + > +-- alter publication set multiple schema > +ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1, pub_test2; > +\dRp+ testpub1_forschema > + > > I think the multiple schemas testcase is sufficient, maybe we can remove the > single schema case. Modified > 3) > + > +-- alter publication set it with the same schema > +ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1, pub_test2; > +\dRp+ testpub1_forschema > > ISTM, we didn't have some special code path for this case, maybe we can remove > this testcase. Modified Attached v40 patch has the fix for the above comments. Regards, Vignesh
Attachment
- v40-0001-Added-schema-level-support-for-publication.patch
- v40-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v40-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v40-0004-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v40-0005-Implemented-pg_publication_objects-view.patch
On Tuesday, October 12, 2021 9:15 PM vignesh C <vignesh21@gmail.com> > Attached v40 patch has the fix for the above comments. Thanks for the update, I have some minor issues about partition related behavior. 1) Tang tested and discussed this issue with me. The testcase is: We publish a schema and there is a partition in the published schema. If publish_via_partition_root is on and the partition's parent table is not in the published schema, neither the change on the partition nor the parent table will not be published. But if we publish by FOR TABLE partition and set publish_via_partition_root to on, the change on the partition will be published. So, I think it'd be better to publish the change on partition for FOR ALL TABLES IN SCHEMA case if its parent table is not published in the same publication. It seems we should pass publication oid to the GetSchemaPublicationRelations() and add some check like the following: if (is_publishable_class(relid, relForm) && !(relForm->relispartition && pub_partopt == PUBLICATION_PART_ROOT)) result = lappend_oid(result, relid); + if (relForm->relispartition && pub_partopt == PUBLICATION_PART_ROOT) + { + bool skip = false; + List *ancestors = get_partition_ancestors(relid); + List *schemas = GetPublicationSchemas(pubid); + ListCell *lc; + foreach(lc, ancestors) + { + if (list_member_oid(schemas, get_rel_namespace(lfirst_oid(lc)))) + { + skip = true; + break; + } + } + if (!skip) + result = lappend_oid(result, relid); + } 2) + /* + * It is quite possible that some of the partitions are in a different + * schema than the parent table, so we need to get such partitions + * separately. + */ + scan = table_beginscan_catalog(classRel, keycount, key); + while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL) + { + Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple); + + if (is_publishable_class(relForm->oid, relForm)) + { + List *partitionrels = NIL; + + partitionrels = GetPubPartitionOptionRelations(partitionrels, + pub_partopt, + relForm->oid); I think a partitioned table could also be a partition which should not be appended to the list. I think we should also filter these cases here by same check in 1). Thoughts ? Best regards, Hou zj
On Wed, Oct 13, 2021 at 12:15 AM vignesh C <vignesh21@gmail.com> wrote: > > Attached v40 patch has the fix for the above comments. > [Maybe this has some overlap with what Hou-san reported, and I have not tested this against his proposed fixes] If partitions belong to a different schema than the parent partitioned table, then the current patch implementation allows the partitions to (optionally) be explicitly added to a publication that includes the parent partitioned table (and for the most part, it doesn't seem to make any difference to the publication behavior). Should this be allowed? e.g. CREATE SCHEMA sch; CREATE SCHEMA sch1; CREATE TABLE sch.sale (sale_date date not null, country_code text, product_sku text, units integer) PARTITION BY RANGE (sale_date); CREATE TABLE sch1.sale_201901 PARTITION OF sch.sale FOR VALUES FROM ('2019-01-01') TO ('2019-02-01'); CREATE TABLE sch1.sale_201902 PARTITION OF sch.sale FOR VALUES FROM ('2019-02-01') TO ('2019-03-01'); postgres=# CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch, TABLE sch1.sale_201901, TABLE sch1.sale_201902; CREATE PUBLICATION postgres=# \dRp+ Publication pub Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root -------+------------+---------+---------+---------+-----------+---------- gregn | f | t | t | t | t | f Tables: "sch1.sale_201901" "sch1.sale_201902" Tables from schemas: "sch" Also, I found the following scenario where the data is double-published: (1) PUB: CREATE PUBLICATION pub FOR TABLE sch1.sale_201901, TABLE sch1.sale_201902 WITH (publish_via_partition_root=true); (2) SUB: CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost port=5432' PUBLICATION pub; (3) PUB: INSERT INTO sch.sale VALUES('2019-01-01', 'AU', 'cpu', 5), ('2019-01-02', 'AU', 'disk', 8); (4) SUB: SELECT * FROM sch.sale; (5) PUB: ALTER PUBLICATION pub ADD ALL TABLES IN SCHEMA sch; (6) SUB: ALTER SUBSCRIPTION sub REFRESH PUBLICATION; (7) SUB: SELECT * FROM sch.sale; sale_date | country_code | product_sku | units ------------+--------------+-------------+------- 2019-01-01 | AU | cpu | 5 2019-01-02 | AU | disk | 8 2019-01-01 | AU | cpu | 5 2019-01-02 | AU | disk | 8 Regards, Greg Nancarrow Fujitsu Australia
On Wednesday, October 13, 2021 4:10 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > Also, I found the following scenario where the data is double-published: > > (1) PUB: CREATE PUBLICATION pub FOR TABLE sch1.sale_201901, TABLE > sch1.sale_201902 WITH (publish_via_partition_root=true); > (2) SUB: CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres > host=localhost port=5432' PUBLICATION pub; > (3) PUB: INSERT INTO sch.sale VALUES('2019-01-01', 'AU', 'cpu', 5), > ('2019-01-02', 'AU', 'disk', 8); > (4) SUB: SELECT * FROM sch.sale; > (5) PUB: ALTER PUBLICATION pub ADD ALL TABLES IN SCHEMA sch; > (6) SUB: ALTER SUBSCRIPTION sub REFRESH PUBLICATION; > (7) SUB: SELECT * FROM sch.sale; > > sale_date | country_code | product_sku | units > ------------+--------------+-------------+------- > 2019-01-01 | AU | cpu | 5 > 2019-01-02 | AU | disk | 8 > 2019-01-01 | AU | cpu | 5 > 2019-01-02 | AU | disk | 8 > > I changed your test step in (5) and used "ADD TABLE" command as below: ALTER PUBLICATION pub ADD TABLE sch.sale; I could get the same result on HEAD. So I think it's not a problem related to this patch. Maybe we can post this issue in a new thread. Regards Tang
On Wed, Oct 13, 2021 at 1:40 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Wed, Oct 13, 2021 at 12:15 AM vignesh C <vignesh21@gmail.com> wrote: > > > > Attached v40 patch has the fix for the above comments. > > > > [Maybe this has some overlap with what Hou-san reported, and I have > not tested this against his proposed fixes] > > If partitions belong to a different schema than the parent partitioned > table, then the current patch implementation allows the partitions to > (optionally) be explicitly added to a publication that includes the > parent partitioned table (and for the most part, it doesn't seem to > make any difference to the publication behavior). Should this be > allowed? > > e.g. > > CREATE SCHEMA sch; > CREATE SCHEMA sch1; > CREATE TABLE sch.sale (sale_date date not null, country_code text, > product_sku text, units integer) PARTITION BY RANGE (sale_date); > CREATE TABLE sch1.sale_201901 PARTITION OF sch.sale FOR VALUES FROM > ('2019-01-01') TO ('2019-02-01'); > CREATE TABLE sch1.sale_201902 PARTITION OF sch.sale FOR VALUES FROM > ('2019-02-01') TO ('2019-03-01'); > > postgres=# CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch, TABLE > sch1.sale_201901, TABLE sch1.sale_201902; > CREATE PUBLICATION > postgres=# \dRp+ > Publication pub > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root > -------+------------+---------+---------+---------+-----------+---------- > gregn | f | t | t | t | t | f > Tables: > "sch1.sale_201901" > "sch1.sale_201902" > Tables from schemas: > "sch" > I don't see any problem with this. Do you have a specific problem in mind due to this? -- With Regards, Amit Kapila.
On Thu, Oct 14, 2021 at 9:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > If partitions belong to a different schema than the parent partitioned > > table, then the current patch implementation allows the partitions to > > (optionally) be explicitly added to a publication that includes the > > parent partitioned table (and for the most part, it doesn't seem to > > make any difference to the publication behavior). Should this be > > allowed? > > > > e.g. > > > > CREATE SCHEMA sch; > > CREATE SCHEMA sch1; > > CREATE TABLE sch.sale (sale_date date not null, country_code text, > > product_sku text, units integer) PARTITION BY RANGE (sale_date); > > CREATE TABLE sch1.sale_201901 PARTITION OF sch.sale FOR VALUES FROM > > ('2019-01-01') TO ('2019-02-01'); > > CREATE TABLE sch1.sale_201902 PARTITION OF sch.sale FOR VALUES FROM > > ('2019-02-01') TO ('2019-03-01'); > > > > postgres=# CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch, TABLE > > sch1.sale_201901, TABLE sch1.sale_201902; > > CREATE PUBLICATION > > postgres=# \dRp+ > > Publication pub > > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root > > -------+------------+---------+---------+---------+-----------+---------- > > gregn | f | t | t | t | t | f > > Tables: > > "sch1.sale_201901" > > "sch1.sale_201902" > > Tables from schemas: > > "sch" > > > > I don't see any problem with this. Do you have a specific problem in > mind due to this? > I'm not sure if it's a problem as such, really just a query from me as to whether it should be allowed to also (redundantly) add partitions to the publication, in addition to the partitioned table, since the current documentation says: "When a partitioned table is added to a publication, all of its existing and future partitions are implicitly considered to be part of the publication". I guess it should be allowed, as I find I can do it in the current implementation just with TABLE. Regards, Greg Nancarrow Fujitsu Australia
On Wednesday, October 13, 2021 10:49 AM Hou, Zhijie wrote: > On Tuesday, October 12, 2021 9:15 PM vignesh C <vignesh21@gmail.com> > > Attached v40 patch has the fix for the above comments. > > Thanks for the update, I have some minor issues about partition related > behavior. > > 1) > > Tang tested and discussed this issue with me. > The testcase is: > We publish a schema and there is a partition in the published schema. If > publish_via_partition_root is on and the partition's parent table is not in the > published schema, neither the change on the partition nor the parent table will > not be published. > > But if we publish by FOR TABLE partition and set publish_via_partition_root to > on, the change on the partition will be published. So, I think it'd be better to > publish the change on partition for FOR ALL TABLES IN SCHEMA case if its parent > table > is not published in the same publication. > > It seems we should pass publication oid to the GetSchemaPublicationRelations() > and add some check like the following: > 2) > I think a partitioned table could also be a partition which should not be > appended to the list. I think we should also filter these cases here by same > check in 1). After some offline discussion with Vignesh and Amit. I found my proposed fix can be improved because it brings some overhead to functions which could be invoked many times for the same publication. Now, I think it'd be better to add this check in GetAllSchemaPublicationRelations and GetPublicationRelations. Besides, I found we misunderstood the flag PUBLICATION_PART_ROOT it means: "ROOT: only the table explicitly mentioned in the publication" We cannot use it as a flag to judge whether do the partition filtering, I think we need to pass the actual pubviaroot flag. Based on the V40 patchset, attaching the Top-up patch which try to fix the partition issue in a cleaner way. Best regards, Hou zj
Attachment
- v40-Topup-partition-fix.patch
- v40-0001-Added-schema-level-support-for-publication.patch
- v40-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v40-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
- v40-0004-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v40-0005-Implemented-pg_publication_objects-view.patch
On Fri, Oct 15, 2021 at 6:45 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Thu, Oct 14, 2021 at 9:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > If partitions belong to a different schema than the parent partitioned > > > table, then the current patch implementation allows the partitions to > > > (optionally) be explicitly added to a publication that includes the > > > parent partitioned table (and for the most part, it doesn't seem to > > > make any difference to the publication behavior). Should this be > > > allowed? > > > > > > e.g. > > > > > > CREATE SCHEMA sch; > > > CREATE SCHEMA sch1; > > > CREATE TABLE sch.sale (sale_date date not null, country_code text, > > > product_sku text, units integer) PARTITION BY RANGE (sale_date); > > > CREATE TABLE sch1.sale_201901 PARTITION OF sch.sale FOR VALUES FROM > > > ('2019-01-01') TO ('2019-02-01'); > > > CREATE TABLE sch1.sale_201902 PARTITION OF sch.sale FOR VALUES FROM > > > ('2019-02-01') TO ('2019-03-01'); > > > > > > postgres=# CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch, TABLE > > > sch1.sale_201901, TABLE sch1.sale_201902; > > > CREATE PUBLICATION > > > postgres=# \dRp+ > > > Publication pub > > > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root > > > -------+------------+---------+---------+---------+-----------+---------- > > > gregn | f | t | t | t | t | f > > > Tables: > > > "sch1.sale_201901" > > > "sch1.sale_201902" > > > Tables from schemas: > > > "sch" > > > > > > > I don't see any problem with this. Do you have a specific problem in > > mind due to this? > > > > I'm not sure if it's a problem as such, really just a query from me as > to whether it should be allowed to also (redundantly) add partitions > to the publication, in addition to the partitioned table, since the > current documentation says: "When a partitioned table is added to a > publication, all of its existing and future partitions are implicitly > considered to be part of the publication". > I guess it should be allowed, as I find I can do it in the current > implementation just with TABLE. > I have also checked the "For Table" case and it behaves similar to what the patch has for schema. So, I think it is better to retain the current behavior of patch. -- With Regards, Amit Kapila.
On Sat, Oct 16, 2021 at 4:57 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > Besides, I found we misunderstood the flag PUBLICATION_PART_ROOT it means: > "ROOT: only the table explicitly mentioned in the publication" We cannot use it > as a flag to judge whether do the partition filtering, I think we need to pass > the actual pubviaroot flag. > I agree, PUBLICATION_PART_ROOT can't be used to determine whether to do partition filtering, the "pubviaroot" flag is needed. > Based on the V40 patchset, attaching the Top-up patch which try to fix the > partition issue in a cleaner way. > A minor thing, in your "top-up patch", the test code added to publication.sql, you need to remove the last "DROP TABLE sch2.tbl1_part1;". It causes an error because the table doesn't exist and it seems to have been erroneously copied from the previous test case. Regards, Greg Nancarrow Fujitsu Australia
On Saturday, October 16, 2021 1:57 PM houzj.fnst@fujitsu.com wrote: > Based on the V40 patchset, attaching the Top-up patch which try to fix the > partition issue in a cleaner way. Attach the new version patch set which merge the partition fix into it. Besides, instead of introducing new function and parameter, just add the partition filter in pg_get_publication_tables which makes the code cleaner. Only 0001 and 0003 was changed. Best regards, Hou zj
Attachment
- v41-0004-Documentation-for-FOR-ALL-TABLES-IN-SCHEMA-publi.patch
- v41-0005-Implemented-pg_publication_objects-view.patch
- v41-0001-Added-schema-level-support-for-publication.patch
- v41-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN.patch
- v41-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication.patch
On Monday, October 18, 2021 12:04 PM Greg Nancarrow wrote: > On Sat, Oct 16, 2021 at 4:57 PM houzj.fnst@fujitsu.com wrote: > > Based on the V40 patchset, attaching the Top-up patch which try to fix > > the partition issue in a cleaner way. > > > > A minor thing, in your "top-up patch", the test code added to publication.sql, > you need to remove the last "DROP TABLE sch2.tbl1_part1;". It causes an error > because the table doesn't exist and it seems to have been erroneously copied > from the previous test case. Thanks for the comment. I have removed the last "DROP TABLE sch2.tbl1_part1;" in V41 patch set. Best regards, Hou zj
Hi, On Mon, Oct 18, 2021 at 3:14 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Saturday, October 16, 2021 1:57 PM houzj.fnst@fujitsu.com wrote: > > Based on the V40 patchset, attaching the Top-up patch which try to fix the > > partition issue in a cleaner way. > > Attach the new version patch set which merge the partition fix into it. > Besides, instead of introducing new function and parameter, just add the > partition filter in pg_get_publication_tables which makes the code cleaner. > > Only 0001 and 0003 was changed. I've reviewed 0001 and 0002 patch and here are comments: 0001 patch: +/* + * Get the list of publishable relation oids for a specified schema. + * + * Schema will be having both ordinary('r') relkind tables and partitioned('p') + * relkind tables, so two rounds of scan are required. + */ +List * +GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt) +{ + Relation classRel; + ScanKeyData key[3]; + TableScanDesc scan; I think it's enough to have key[2], not key[3]. BTW, this function does the table scan on pg_class twice in order to get OIDs of both normal tables and partitioned tables. But can't we do that by the single table scan? I think we can set a scan key for relnamespace, and check relkind inside a scan loop. --- + ObjectsInPublicationToOids(stmt->pubobjects, pstate, &relations, + &schemaidlist); + + if (list_length(relations) > 0) + { + List *rels; + + rels = OpenTableList(relations); + CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist, + PUBLICATIONOBJ_TABLE); + PublicationAddTables(puboid, rels, true, NULL); + CloseTableList(rels); + } + + if (list_length(schemaidlist) > 0) + { + /* FOR ALL TABLES IN SCHEMA requires superuser */ + if (!superuser()) + ereport(ERROR, + errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication")); + Perhaps we can do a superuser check before handling "relations"? If the user doesn't have the permission, we don't need to do anything for relations. 0002 patch: postgres(1:13619)=# create publication pub for all TABLES in schema CURRENT_SCHEMA pg_catalog public s2 information_schema pg_toast s1 Since pg_catalog and pg_toast cannot be added to the schema publication can we exclude them from the completion list? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
On Mon, Oct 18, 2021 at 2:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Hi, > > On Mon, Oct 18, 2021 at 3:14 PM houzj.fnst@fujitsu.com > <houzj.fnst@fujitsu.com> wrote: > > > > On Saturday, October 16, 2021 1:57 PM houzj.fnst@fujitsu.com wrote: > > > Based on the V40 patchset, attaching the Top-up patch which try to fix the > > > partition issue in a cleaner way. > > > > Attach the new version patch set which merge the partition fix into it. > > Besides, instead of introducing new function and parameter, just add the > > partition filter in pg_get_publication_tables which makes the code cleaner. > > > > Only 0001 and 0003 was changed. > > I've reviewed 0001 and 0002 patch and here are comments: > > 0001 patch: > > +/* > + * Get the list of publishable relation oids for a specified schema. > + * > + * Schema will be having both ordinary('r') relkind tables and partitioned('p') > + * relkind tables, so two rounds of scan are required. > + */ > +List * > +GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt) > +{ > + Relation classRel; > + ScanKeyData key[3]; > + TableScanDesc scan; > > I think it's enough to have key[2], not key[3]. Modified > BTW, this function does the table scan on pg_class twice in order to > get OIDs of both normal tables and partitioned tables. But can't we do > that by the single table scan? I think we can set a scan key for > relnamespace, and check relkind inside a scan loop. Modified > --- > + ObjectsInPublicationToOids(stmt->pubobjects, pstate, > &relations, > + > &schemaidlist); > + > + if (list_length(relations) > 0) > + { > + List *rels; > + > + rels = OpenTableList(relations); > + CheckObjSchemaNotAlreadyInPublication(rels, > schemaidlist, > + > PUBLICATIONOBJ_TABLE); > + PublicationAddTables(puboid, rels, true, NULL); > + CloseTableList(rels); > + } > + > + if (list_length(schemaidlist) > 0) > + { > + /* FOR ALL TABLES IN SCHEMA requires superuser */ > + if (!superuser()) > + ereport(ERROR, > + > errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > + errmsg("must be > superuser to create FOR ALL TABLES IN SCHEMA publication")); > + > > Perhaps we can do a superuser check before handling "relations"? If > the user doesn't have the permission, we don't need to do anything for > relations. Modified > 0002 patch: > > postgres(1:13619)=# create publication pub for all TABLES in schema > CURRENT_SCHEMA pg_catalog public s2 > information_schema pg_toast s1 > > Since pg_catalog and pg_toast cannot be added to the schema > publication can we exclude them from the completion list? Modified Thanks for the comments, the attached v42 patch has the fixes for the same. Regards, Vignesh
Attachment
- v42-0001-Add-support-for-publishing-the-tables-of-schema.patch
- v42-0002-Add-client-side-support-to-logical-replication-f.patch
- v42-0003-Add-tests-for-the-schema-publication-feature-of-.patch
- v42-0004-Add-documentation-for-the-schema-publication-fea.patch
- v42-0005-Add-new-pg_publication_objects-view-to-display-T.patch
On Monday, October 18, 2021 8:23 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for the comments, the attached v42 patch has the fixes for the same. Thanks for your new patch. I tried your patch and found that the permission check for superuser didn't work. For example: postgres=# create role r1; CREATE ROLE postgres=# grant all privileges on database postgres to r1; GRANT postgres=# set role r1; SET postgres=> create schema s1; CREATE SCHEMA postgres=> create publication pub for all tables in schema s1; CREATE PUBLICATION Role r1 is not superuser, but this role could create publication for all tables in schema successfully, I think it is related the following change. List schemaidlist was not assigned yet. I think we should check it later. @@ -165,6 +265,12 @@ CreatePublication(ParseState *pstate, CreatePublicationStmt *stmt) (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), errmsg("must be superuser to create FOR ALL TABLES publication"))); + /* FOR ALL TABLES IN SCHEMA requires superuser */ + if (list_length(schemaidlist) > 0 && !superuser()) + ereport(ERROR, + errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication")); + rel = table_open(PublicationRelationId, RowExclusiveLock); /* Check if name is used */ Regards Tang
On Tue, Oct 19, 2021 at 9:15 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Monday, October 18, 2021 8:23 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for the comments, the attached v42 patch has the fixes for the same. > > Thanks for your new patch. > > I tried your patch and found that the permission check for superuser didn't work. > > For example: > postgres=# create role r1; > CREATE ROLE > postgres=# grant all privileges on database postgres to r1; > GRANT > postgres=# set role r1; > SET > postgres=> create schema s1; > CREATE SCHEMA > postgres=> create publication pub for all tables in schema s1; > CREATE PUBLICATION > > Role r1 is not superuser, but this role could create publication for all tables in schema > successfully, I think it is related the following change. List schemaidlist was > not assigned yet. I think we should check it later. > It seems this got broken in yesterday's patch version. Do you think it is a good idea to add a test for this case? -- With Regards, Amit Kapila.
On Tuesday, October 19, 2021 12:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Oct 19, 2021 at 9:15 AM tanghy.fnst@fujitsu.com > <tanghy.fnst@fujitsu.com> wrote: > > > > On Monday, October 18, 2021 8:23 PM vignesh C <vignesh21@gmail.com> > wrote: > > > > > > Thanks for the comments, the attached v42 patch has the fixes for the same. > > > > Thanks for your new patch. > > > > I tried your patch and found that the permission check for superuser didn't work. > > > > For example: > > postgres=# create role r1; > > CREATE ROLE > > postgres=# grant all privileges on database postgres to r1; > > GRANT > > postgres=# set role r1; > > SET > > postgres=> create schema s1; > > CREATE SCHEMA > > postgres=> create publication pub for all tables in schema s1; > > CREATE PUBLICATION > > > > Role r1 is not superuser, but this role could create publication for all tables in > schema > > successfully, I think it is related the following change. List schemaidlist was > > not assigned yet. I think we should check it later. > > > > It seems this got broken in yesterday's patch version. Do you think it > is a good idea to add a test for this case? > Agreed. Thanks for your suggestion. I tried to add this test to publication.sql, a patch diff file for this test is attached. Regards Tang
Attachment
On Mon, Oct 18, 2021 at 5:53 PM vignesh C <vignesh21@gmail.com> wrote: > Few comments on latest set of patches: =============================== 1. +/* + * Filter out the partitions whose parent tables was also specified in + * the publication. + */ +static List * +filter_out_partitions(List *relids) Can we name this function as filter_partitions()? 2. + /* + * If the publication publishes partition changes via their + * respective root partitioned tables, we must exclude partitions + * in favor of including the root partitioned tables. Otherwise, + * the function could return both the child and parent tables which + * could cause the data of child table double-published in + * subscriber side. + */ Let's slightly change the last part of the line in the above comment as: "... which could cause data of the child table to be double-published on the subscriber side." 3. --- a/src/backend/catalog/pg_publication.c +++ b/src/backend/catalog/pg_publication.c .. .. @@ -38,7 +40,6 @@ #include "utils/builtins.h" #include "utils/catcache.h" #include "utils/fmgroids.h" -#include "utils/inval.h" #include "utils/lsyscache.h" Does this change belong to this patch? If not, maybe you can submit a separate patch for this. A similar change is present in publicationcmds.c as well, not sure if that is required as well. 4. --- a/src/backend/commands/publicationcmds.c +++ b/src/backend/commands/publicationcmds.c ... ... +#include "nodes/makefuncs.h" Do we need to include this file? I am able to compile without including this file. v42-0003-Add-tests-for-the-schema-publication-feature-of- 5. +-- pg_publication_tables +SET client_min_messages = 'ERROR'; +CREATE SCHEMA sch1; +CREATE SCHEMA sch2; +CREATE TABLE sch1.tbl1 (a int) PARTITION BY RANGE(a); +CREATE TABLE sch2.tbl1_part1 PARTITION OF sch1.tbl1 FOR VALUES FROM (1) to (10); +CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch2 WITH (PUBLISH_VIA_PARTITION_ROOT=1); +SELECT * FROM pg_publication_tables; + +DROP PUBLICATION pub; +CREATE PUBLICATION pub FOR TABLE sch2.tbl1_part1 WITH (PUBLISH_VIA_PARTITION_ROOT=1); +SELECT * FROM pg_publication_tables; Can we expand the above comment on the lines of: "Test the list of partitions published"? v42-0004-Add-documentation-for-the-schema-publication-fea 6. + <row> + <entry><link linkend="catalog-pg-publication-namespace"><structname>pg_publication_namespace</structname></link></entry> + <entry>schema to publication mapping</entry> + </row> + <row> <entry><link linkend="catalog-pg-publication-rel"><structname>pg_publication_rel</structname></link></entry> <entry>relation to publication mapping</entry> @@ -6238,6 +6243,67 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l </table> </sect1> + <sect1 id="catalog-pg-publication-namespace"> + <title><structname>pg_publication_namespace</structname></title> At one place, the new catalog is placed after pg_publication_rel and at another place, it is before it. Shouldn't it be before in both places as we have a place as per naming order? 7. The + <literal>ADD</literal> clause will add one or more tables/schemas to the + publication. The <literal>DROP</literal> clauses will remove one or more + tables/schemas from the publication. Isn't it better to write the above as one line: "The <literal>ADD</literal> and <literal>DROP</literal> clauses will add and remove one or more tables/schemas from the publication."? 8. + <para> + Add some schemas to the publication: +<programlisting> +ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing_june, sales_june; +</programlisting> + </para> Can we change schema names to just marketing and sales? Also, let's change the description as:"Add schemas <structname>marketing</structname> and <structname>sales</structname> to the publication <structname>sales_publication</structname>? 9. + [ FOR ALL TABLES + | FOR <replaceable class="parameter">publication object</replaceable> [, ... ] ] [ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ] + +<phrase>where <replaceable class="parameter">publication object</replaceable> is one of:</phrase> Similar to Alter Publication, here also we should use publication_object instead of publication object. 10. + <para> + Create a publication that publishes all changes for tables "users" and + "departments" and that publishes all changes for all the tables present in + the schema "production": +<programlisting> +CREATE PUBLICATION production_publication FOR TABLE users, departments, ALL TABLES IN SCHEMA production; +</programlisting> + </para> + + <para> + Create a publication that publishes all changes for all the tables present in + the schemas "marketing" and "sales": It is better to use <structname> before and </structname> after schema names in above descriptions. -- With Regards, Amit Kapila.
On Tue, Oct 19, 2021 at 9:15 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Monday, October 18, 2021 8:23 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for the comments, the attached v42 patch has the fixes for the same. > > Thanks for your new patch. > > I tried your patch and found that the permission check for superuser didn't work. > > For example: > postgres=# create role r1; > CREATE ROLE > postgres=# grant all privileges on database postgres to r1; > GRANT > postgres=# set role r1; > SET > postgres=> create schema s1; > CREATE SCHEMA > postgres=> create publication pub for all tables in schema s1; > CREATE PUBLICATION > > Role r1 is not superuser, but this role could create publication for all tables in schema > successfully, I think it is related the following change. List schemaidlist was > not assigned yet. I think we should check it later. > > @@ -165,6 +265,12 @@ CreatePublication(ParseState *pstate, CreatePublicationStmt *stmt) > (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > errmsg("must be superuser to create FOR ALL TABLES publication"))); > > + /* FOR ALL TABLES IN SCHEMA requires superuser */ > + if (list_length(schemaidlist) > 0 && !superuser()) > + ereport(ERROR, > + errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > + errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication")); > + > rel = table_open(PublicationRelationId, RowExclusiveLock); > > /* Check if name is used */ This issue got induced in the v42 version, attached v43 patch has the fixes for the same. Regards, Vignesh
Attachment
- v43-0001-Add-support-for-publishing-the-tables-of-schema.patch
- v43-0002-Add-client-side-support-to-logical-replication-f.patch
- v43-0003-Add-tests-for-the-schema-publication-feature-of-.patch
- v43-0004-Add-documentation-for-the-schema-publication-fea.patch
- v43-0005-Add-new-pg_publication_objects-view-to-display-T.patch
On Tue, Oct 19, 2021 at 11:23 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote: > > On Tuesday, October 19, 2021 12:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Oct 19, 2021 at 9:15 AM tanghy.fnst@fujitsu.com > > <tanghy.fnst@fujitsu.com> wrote: > > > > > > On Monday, October 18, 2021 8:23 PM vignesh C <vignesh21@gmail.com> > > wrote: > > > > > > > > Thanks for the comments, the attached v42 patch has the fixes for the same. > > > > > > Thanks for your new patch. > > > > > > I tried your patch and found that the permission check for superuser didn't work. > > > > > > For example: > > > postgres=# create role r1; > > > CREATE ROLE > > > postgres=# grant all privileges on database postgres to r1; > > > GRANT > > > postgres=# set role r1; > > > SET > > > postgres=> create schema s1; > > > CREATE SCHEMA > > > postgres=> create publication pub for all tables in schema s1; > > > CREATE PUBLICATION > > > > > > Role r1 is not superuser, but this role could create publication for all tables in > > schema > > > successfully, I think it is related the following change. List schemaidlist was > > > not assigned yet. I think we should check it later. > > > > > > > It seems this got broken in yesterday's patch version. Do you think it > > is a good idea to add a test for this case? > > > > Agreed. Thanks for your suggestion. > > I tried to add this test to publication.sql, a patch diff file for this test is attached. Thanks for the test case, I have merged it to the v43 version patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm2pJ49wAv%3DgEZrAP5%3D_apAzv_rgK3zjX-wfwCY%2BWWfT9w%40mail.gmail.com Regards, Vignesh
On Tue, Oct 19, 2021 at 4:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Oct 18, 2021 at 5:53 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Few comments on latest set of patches: > =============================== > 1. > +/* > + * Filter out the partitions whose parent tables was also specified in > + * the publication. > + */ > +static List * > +filter_out_partitions(List *relids) > > Can we name this function as filter_partitions()? Modified > 2. > + /* > + * If the publication publishes partition changes via their > + * respective root partitioned tables, we must exclude partitions > + * in favor of including the root partitioned tables. Otherwise, > + * the function could return both the child and parent tables which > + * could cause the data of child table double-published in > + * subscriber side. > + */ > > Let's slightly change the last part of the line in the above comment > as: "... which could cause data of the child table to be > double-published on the subscriber side." Modified > 3. > --- a/src/backend/catalog/pg_publication.c > +++ b/src/backend/catalog/pg_publication.c > .. > .. > @@ -38,7 +40,6 @@ > #include "utils/builtins.h" > #include "utils/catcache.h" > #include "utils/fmgroids.h" > -#include "utils/inval.h" > #include "utils/lsyscache.h" > > Does this change belong to this patch? If not, maybe you can submit a > separate patch for this. A similar change is present in > publicationcmds.c as well, not sure if that is required as well. I have removed these changes from this patch, I will post a patch for this separately later. > 4. > --- a/src/backend/commands/publicationcmds.c > +++ b/src/backend/commands/publicationcmds.c > ... > ... > +#include "nodes/makefuncs.h" > > Do we need to include this file? I am able to compile without > including this file. Modified > v42-0003-Add-tests-for-the-schema-publication-feature-of- > 5. > +-- pg_publication_tables > +SET client_min_messages = 'ERROR'; > +CREATE SCHEMA sch1; > +CREATE SCHEMA sch2; > +CREATE TABLE sch1.tbl1 (a int) PARTITION BY RANGE(a); > +CREATE TABLE sch2.tbl1_part1 PARTITION OF sch1.tbl1 FOR VALUES FROM > (1) to (10); > +CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch2 WITH > (PUBLISH_VIA_PARTITION_ROOT=1); > +SELECT * FROM pg_publication_tables; > + > +DROP PUBLICATION pub; > +CREATE PUBLICATION pub FOR TABLE sch2.tbl1_part1 WITH > (PUBLISH_VIA_PARTITION_ROOT=1); > +SELECT * FROM pg_publication_tables; > > Can we expand the above comment on the lines of: "Test the list of > partitions published"? Modified > v42-0004-Add-documentation-for-the-schema-publication-fea > 6. > + <row> > + <entry><link > linkend="catalog-pg-publication-namespace"><structname>pg_publication_namespace</structname></link></entry> > + <entry>schema to publication mapping</entry> > + </row> > + > <row> > <entry><link > linkend="catalog-pg-publication-rel"><structname>pg_publication_rel</structname></link></entry> > <entry>relation to publication mapping</entry> > @@ -6238,6 +6243,67 @@ SCRAM-SHA-256$<replaceable><iteration > count></replaceable>:<replaceable>&l > </table> > </sect1> > > + <sect1 id="catalog-pg-publication-namespace"> > + <title><structname>pg_publication_namespace</structname></title> > > At one place, the new catalog is placed after pg_publication_rel and > at another place, it is before it. Shouldn't it be before in both > places as we have a place as per naming order? Modified > 7. > The > + <literal>ADD</literal> clause will add one or more tables/schemas to the > + publication. The <literal>DROP</literal> clauses will remove one or more > + tables/schemas from the publication. > > Isn't it better to write the above as one line: "The > <literal>ADD</literal> and <literal>DROP</literal> clauses will add > and remove one or more tables/schemas from the publication."? Modified > 8. > + <para> > + Add some schemas to the publication: > +<programlisting> > +ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA > marketing_june, sales_june; > +</programlisting> > + </para> > > Can we change schema names to just marketing and sales? Also, let's > change the description as:"Add schemas > <structname>marketing</structname> and <structname>sales</structname> > to the publication <structname>sales_publication</structname>? Modified > 9. > + [ FOR ALL TABLES > + | FOR <replaceable class="parameter">publication > object</replaceable> [, ... ] ] > [ WITH ( <replaceable > class="parameter">publication_parameter</replaceable> [= <replaceable > class="parameter">value</replaceable>] [, ... ] ) ] > + > +<phrase>where <replaceable class="parameter">publication > object</replaceable> is one of:</phrase> > > Similar to Alter Publication, here also we should use > publication_object instead of publication object. Modified > 10. > + <para> > + Create a publication that publishes all changes for tables "users" and > + "departments" and that publishes all changes for all the tables present in > + the schema "production": > +<programlisting> > +CREATE PUBLICATION production_publication FOR TABLE users, > departments, ALL TABLES IN SCHEMA production; > +</programlisting> > + </para> > + > + <para> > + Create a publication that publishes all changes for all the tables > present in > + the schemas "marketing" and "sales": > > It is better to use <structname> before and </structname> after schema > names in above descriptions. Modified Attached v43 patch has the fixes for the same. Regards, Vignesh
Attachment
- v43-0001-Add-support-for-publishing-the-tables-of-schema.patch
- v43-0002-Add-client-side-support-to-logical-replication-f.patch
- v43-0003-Add-tests-for-the-schema-publication-feature-of-.patch
- v43-0004-Add-documentation-for-the-schema-publication-fea.patch
- v43-0005-Add-new-pg_publication_objects-view-to-display-T.patch
On Tuesday, October 19, 2021 11:42 PM vignesh C <vignesh21@gmail.com> wrote: > > This issue got induced in the v42 version, attached v43 patch has the > fixes for the same. > Thanks for your new patch. I confirmed that this issue has be fixed. All regression tests passed. I also tested V43 in some other scenarios and found no issue. So the v43 patch LGTM. Regards Tang
On Tue, Oct 19, 2021 at 9:42 PM vignesh C <vignesh21@gmail.com> wrote: > Thanks for the modified patch. I have a few more comments and suggestions: As the thread [1] is still not concluded, I suggest we fix the duplicate data case only when schemas are involved by slightly tweaking the code as per attached. This is just to give you an idea about what I have in mind, if you find a better solution then feel free to let me know. Few additional minor comments: 1. +-- Test the list of partitions published Shall we change the above comment as: "Test the list of partitions published with or without 'PUBLISH_VIA_PARTITION_ROOT' parameter"? 2. psql documentation for \dRp[+] needs to be modified. \dRp Before: " .. If + is appended to the command name, the tables associated with each publication are shown as well." After: " .. If + is appended to the command name, the tables and schemas associated with each publication are shown as well. Apart from the above, I think we should merge the first four patches as there doesn't seem to be any big problems pending. We can keep still keep tests added by 025_rep_changes_for_schema.pl as a separate patch as there might be some timing-dependent tests in that file. [1] - https://www.postgresql.org/message-id/OS0PR01MB57167F45D481F78CDC5986F794B99%40OS0PR01MB5716.jpnprd01.prod.outlook.com -- With Regards, Amit Kapila.
Attachment
On Wed, Oct 20, 2021 at 12:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Oct 19, 2021 at 9:42 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for the modified patch. I have a few more comments and suggestions: > > As the thread [1] is still not concluded, I suggest we fix the > duplicate data case only when schemas are involved by slightly > tweaking the code as per attached. This is just to give you an idea > about what I have in mind, if you find a better solution then feel > free to let me know. > > Few additional minor comments: > 1. > +-- Test the list of partitions published > > Shall we change the above comment as: "Test the list of partitions > published with or without 'PUBLISH_VIA_PARTITION_ROOT' parameter"? Modified > 2. psql documentation for \dRp[+] needs to be modified. > > \dRp > Before: > " .. If + is appended to the command name, the tables associated with > each publication are shown as well." > After: > " .. If + is appended to the command name, the tables and schemas > associated with each publication are shown as well. Modified > Apart from the above, I think we should merge the first four patches > as there doesn't seem to be any big problems pending. We can keep > still keep tests added by 025_rep_changes_for_schema.pl as a separate > patch as there might be some timing-dependent tests in that file. This version of patch retains the changes related to PublicationRelInfo, I will handle the merging of the patches in the next version so that this version of patch change related to PublicationRelInfo can be reviewed easily. Attached v44 patch as the fixes for the same. Regards, Vignesh
Attachment
- v44-0001-Allow-publishing-the-tables-of-schema.patch
- v44-0002-Add-client-side-support-to-logical-replication-f.patch
- v44-0003-Add-tests-for-the-schema-publication-feature-of-.patch
- v44-0004-Add-documentation-for-the-schema-publication-fea.patch
- v44-0005-Add-new-pg_publication_objects-view-to-display-T.patch
On Thu, Oct 21, 2021 at 3:25 AM vignesh C <vignesh21@gmail.com> wrote: > > Attached v44 patch as the fixes for the same. > Regarding the documentation, I think some minor updates are needed in: doc/src/sgml/logical-replication.sgml. For example, currently it says: Publications may currently only contain tables. Objects must be added explicitly, except when a publication is created for <literal>ALL TABLES</literal>. There is also some security-related information in this file which may need updating for ALL TABLES IN SCHEMA. Also, I'm not sure the documentation updates in the patches clearly define how partitions relate to ALL TABLES IN SCHEMA. For example, if a partitioned table belongs to a different schema to that of a child partition that belongs to a schema specified for ALL TABLES IN SCHEMA, is it implicitly included in the publication or not included? Regards, Greg Nancarrow Fujitsu Australia
On Thurs, Oct 21, 2021 12:25 AM vignesh C <vignesh21@gmail.com> wrote: > This version of patch retains the changes related to PublicationRelInfo, I will > handle the merging of the patches in the next version so that this version of > patch change related to PublicationRelInfo can be reviewed easily. Thanks for the patches, I think the change related to PublicationRelInfo looks good. Best regards, Hou zj
On Thu, Oct 21, 2021 at 3:25 AM vignesh C <vignesh21@gmail.com> wrote: > > Attached v44 patch as the fixes for the same. > In the v44-0001 patch, I have some doubts about the condition guarding the following code in pg_get_publication_tables(): + if (schemarelids) + { + /* + * If the publication publishes partition changes via their + * respective root partitioned tables, we must exclude + * partitions in favor of including the root partitioned + * tables. Otherwise, the function could return both the child + * and parent tables which could cause data of the child table + * to be double-published on the subscriber side. + * + * XXX As of now, we do this when a publication has associated + * schema or for all tables publication. See + * GetAllTablesPublicationRelations(). + */ + tables = list_concat_unique_oid(relids, schemarelids); + if (publication->pubviaroot) + tables = filter_partitions(tables); + } Shouldn't a partition be filtered out only if publication->pubviaroot and the partition belongs to a schema (i.e. ALL TABLES IN SCHEMA) included in the publication? The current code seems to filter out partitions of partitioned tables included in the publication if ANY schemas are included as part of the publication (e.g. which could be a schema not including any partitioned tables or partitions). Regards, Greg Nancarrow Fujitsu Australia
On Thu, Oct 21, 2021 at 4:41 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Thu, Oct 21, 2021 at 3:25 AM vignesh C <vignesh21@gmail.com> wrote: > > > > Attached v44 patch as the fixes for the same. > > > > Regarding the documentation, I think some minor updates are needed in: > doc/src/sgml/logical-replication.sgml. > For example, currently it says: > Publications may currently only contain tables. Objects must be > added explicitly, except when a publication is created for > <literal>ALL TABLES</literal>. > There is also some security-related information in this file which may > need updating for ALL TABLES IN SCHEMA. Modified > Also, I'm not sure the documentation updates in the patches clearly > define how partitions relate to ALL TABLES IN SCHEMA. > For example, if a partitioned table belongs to a different schema to > that of a child partition that belongs to a schema specified for ALL > TABLES IN SCHEMA, is it implicitly included in the publication or not > included? Added Thanks for the comments, the attached v45 patch has the fix for the same. Regards, Vignesh
Attachment
On Thu, Oct 21, 2021 at 3:29 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Thu, Oct 21, 2021 at 3:25 AM vignesh C <vignesh21@gmail.com> wrote: > > > > Attached v44 patch as the fixes for the same. > > > > In the v44-0001 patch, I have some doubts about the condition guarding > the following code in pg_get_publication_tables(): > > + if (schemarelids) > + { > + /* > + * If the publication publishes partition changes via their > + * respective root partitioned tables, we must exclude > + * partitions in favor of including the root partitioned > + * tables. Otherwise, the function could return both the child > + * and parent tables which could cause data of the child table > + * to be double-published on the subscriber side. > + * > + * XXX As of now, we do this when a publication has associated > + * schema or for all tables publication. See > + * GetAllTablesPublicationRelations(). > + */ > + tables = list_concat_unique_oid(relids, schemarelids); > + if (publication->pubviaroot) > + tables = filter_partitions(tables); > + } > > Shouldn't a partition be filtered out only if publication->pubviaroot > and the partition belongs to a schema (i.e. ALL TABLES IN SCHEMA) > included in the publication? > The current code seems to filter out partitions of partitioned tables > included in the publication if ANY schemas are included as part of the > publication (e.g. which could be a schema not including any > partitioned tables or partitions). I could reproduce the issue by using the following test: --- Setup create schema sch1; create schema sch2; create table sch1.tbl1 (a int) partition by range (a); create table sch2.tbl1_part1 partition of sch1.tbl1 for values from (1) to (10); create table sch2.tbl1_part2 partition of sch1.tbl1 for values from (10) to (20); create schema sch3; create table sch3.t1(c1 int); --- Publication create publication pub1 for all tables in schema sch3, table sch1.tbl1, table sch2.tbl1_part1 with ( publish_via_partition_root =on); insert into sch1.tbl1 values(1); insert into sch1.tbl1 values(11); insert into sch3.t1 values(1); ---- Subscription CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost port=5432' PUBLICATION pub1; The patch posted at [1] has the fix for the same. [1] - https://www.postgresql.org/message-id/CALDaNm1onqBEr0WE_e7%3DCNw3bURfrGRmbMjX31d-nx3FGLS10A%40mail.gmail.com Regards, Vignesh
On Fri, Oct 22, 2021 at 12:19 AM vignesh C <vignesh21@gmail.com> wrote: > > I could reproduce the issue by using the following test: > --- Setup > create schema sch1; > create schema sch2; > create table sch1.tbl1 (a int) partition by range (a); > create table sch2.tbl1_part1 partition of sch1.tbl1 for values from (1) to (10); > create table sch2.tbl1_part2 partition of sch1.tbl1 for values from > (10) to (20); > create schema sch3; > create table sch3.t1(c1 int); > > --- Publication > create publication pub1 for all tables in schema sch3, table > sch1.tbl1, table sch2.tbl1_part1 with ( publish_via_partition_root > =on); > insert into sch1.tbl1 values(1); > insert into sch1.tbl1 values(11); > insert into sch3.t1 values(1); > > ---- Subscription > CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost > port=5432' PUBLICATION pub1; > > The patch posted at [1] has the fix for the same. > [1] - https://www.postgresql.org/message-id/CALDaNm1onqBEr0WE_e7%3DCNw3bURfrGRmbMjX31d-nx3FGLS10A%40mail.gmail.com > Thanks for addressing this. I checked the updated code and did some more testing and the fix LGTM. I was also previously concerned about what the behavior should be when only including just the partitions of a partitioned table in a publication using ALL TABLES IN SCHEMA and having publish_via_partition_root=true. It seems to implicitly include the partitioned table in the publication. But I did some testing and found that this is the current behavior when only the partitions are individually included in a publication using TABLE, so it seems to be OK. Regards, Greg Nancarrow Fujitsu Australia
On Thu, Oct 21, 2021 at 6:47 PM vignesh C <vignesh21@gmail.com> wrote: > > > Thanks for the comments, the attached v45 patch has the fix for the same. > The first patch is mostly looking good to me apart from the below minor comments: 1. + <para> + The catalog <structname>pg_publication_namespace</structname> contains the + mapping between schemas and publications in the database. This is a + many-to-many mapping. There are extra spaces after mapping at the end which are not required. 2. + <literal>CREATE</literal> privilege on the database. Also, the new owner + of a <literal>FOR ALL TABLES</literal> publication must be a superuser. I think we can modify the second line as: "Also, the new owner of a <literal>FOR ALL TABLES</literal> or <literal>FOR ALL TABLES IN SCHEMA</literal> publication must be a superuser. 3. /table's schema as part of specified schema is not supported./table's schema as part of the specified schema is not supported. 4. + <para> + Create a publication that publishes all changes for tables + <structname>users</structname>, <structname>departments</structname> and + that publishes all changes for all the tables present in the schema + <structname>production</structname>: I don't think '...that publishes...' is required twice in the above sentence. 5. +static List *OpenReliIdList(List *relids); static List *OpenTableList(List *tables); static void CloseTableList(List *rels); static void PublicationAddTables(Oid pubid, List *rels, bool if_not_exists, AlterPublicationStmt *stmt); static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok); +static void LockSchemaList(List *schemalist); +static void PublicationAddSchemas(Oid pubid, List *schemas, bool if_not_exists, + AlterPublicationStmt *stmt); +static void PublicationDropSchemas(Oid pubid, List *schemas, bool missing_ok); Keep the later definitions also in this order. I suggest move LockSchemaList() just after CloseTableList() both in declaration and definition. 6. +/* + * Convert the PublicationObjSpecType list into schema oid list and rangevar + * list. + */ I think you need to say publication table instead of rangevar in the above comment. 7. + /* + * It is quite possible that for the SET case user has not specified any + * schema in which case we need to remove all the existing schemas. + */ /schema/schemas 8. +/* + * Open relations specified by a RangeVar list. /RangeVar/PublicationTable 9. +static bool +_equalPublicationObject(const PublicationObjSpec *a, + const PublicationObjSpec *b) +{ + COMPARE_SCALAR_FIELD(pubobjtype); + COMPARE_STRING_FIELD(name); + COMPARE_NODE_FIELD(pubtable); + COMPARE_LOCATION_FIELD(location); + + return true; +} + Let's define this immediately before _equalPublicationTable as all publication functions are defined there. Also, make the handling of T_PublicationObjSpec before T_PublicationTable in equal() function as that is the way nodes are defined. -- With Regards, Amit Kapila.
On Fri, Oct 22, 2021 at 12:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > I was also previously concerned about what the behavior should be when > only including just the partitions of a partitioned table in a > publication using ALL TABLES IN SCHEMA and having > publish_via_partition_root=true. It seems to implicitly include the > partitioned table in the publication. But I did some testing and found > that this is the current behavior when only the partitions are > individually included in a publication using TABLE, so it seems to be > OK. > Thinking some more about this, it still may still be confusing to the user if not explicitly stated in the ALL TABLES IN SCHEMA case. How about adding some additional explanation to the end of the following paragraph: + <para> + When a partitioned table is published via schema level publication, all + of its existing and future partitions irrespective of it being from the + publication schema or not are implicitly considered to be part of the + publication. So, even operations that are performed directly on a + partition are also published via publications that its ancestors are + part of. + </para> Something like: Similarly, if a partition is published via schema level publication and publish_via_partition_root=true, the parent partitioned table is implicitly considered to be part of the publication, irrespective of it being from the publication schema or not. Regards, Greg Nancarrow Fujitsu Australia
On Fri, Oct 22, 2021 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Oct 21, 2021 at 6:47 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Thanks for the comments, the attached v45 patch has the fix for the same. > > > > The first patch is mostly looking good to me apart from the below > minor comments: Let me share other minor comments on v45-0001 patch: > > 1. > + <para> > + The catalog <structname>pg_publication_namespace</structname> contains the > + mapping between schemas and publications in the database. This is a > + many-to-many mapping. > > There are extra spaces after mapping at the end which are not required. + <literal>ADD</literal> and <literal>DROP</literal> clauses will add and + remove one or more tables/schemas from the publication. Note that adding + tables/schemas to a publication that is already subscribed to will require a There is also an extra space after "adding". - [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...] - | FOR ALL TABLES ] + [ FOR ALL TABLES + | FOR <replaceable class="parameter">publication_object</replaceable> [, ... ] ] Similarly, after "TABLES". + + <para> + Specifying a table that is part of a schema specified by + <literal>FOR ALL TABLES IN SCHEMA</literal> is not supported. + </para> And, after "by". --- +static void +AlterPublicationSchemas(AlterPublicationStmt *stmt, + HeapTuple tup, List *schemaidlist) +{ (snip) + PublicationAddSchemas(pubform->oid, schemaidlist, true, stmt); + } + + return; +} The "return" at the end of the function is not necessary. --- + if (pubobj->name) + pubobj->pubobjtype = PUBLICATIONOBJ_REL_IN_SCHEMA; + else if (!pubobj->name && !pubobj->pubtable) + pubobj->pubobjtype = PUBLICATIONOBJ_CURRSCHEMA; + else if (!pubobj->name) + ereport(ERROR, + errcode(ERRCODE_SYNTAX_ERROR), + errmsg("invalid schema name at or near"), + parser_errposition(pubobj->location)); I think it's better to change the last "else if" to just "else". --- + + if (schemarelids) + { + /* + * If the publication publishes partition changes via their + * respective root partitioned tables, we must exclude + * partitions in favor of including the root partitioned + * tables. Otherwise, the function could return both the child + * and parent tables which could cause data of the child table + * to be double-published on the subscriber side. + * + * XXX As of now, we do this when a publication has associated + * schema or for all tables publication. See + * GetAllTablesPublicationRelations(). + */ + tables = list_concat_unique_oid(relids, schemarelids); + if (publication->pubviaroot) + tables = filter_partitions(tables, schemarelids); + } + else + tables = relids; + + } There is an extra newline after "table = relids;". The rest looks good to me. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
On Fri, Oct 22, 2021 at 10:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Oct 21, 2021 at 6:47 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Thanks for the comments, the attached v45 patch has the fix for the same. > > > > The first patch is mostly looking good to me apart from the below > minor comments: > > 1. > + <para> > + The catalog <structname>pg_publication_namespace</structname> contains the > + mapping between schemas and publications in the database. This is a > + many-to-many mapping. > > There are extra spaces after mapping at the end which are not required. Modified > 2. > + <literal>CREATE</literal> privilege on the database. Also, the new owner > + of a <literal>FOR ALL TABLES</literal> publication must be a superuser. > > I think we can modify the second line as: "Also, the new owner of a > <literal>FOR ALL TABLES</literal> or <literal>FOR ALL TABLES IN > SCHEMA</literal> publication must be a superuser. Modified > 3. > /table's schema as part of specified schema is not supported./table's > schema as part of the specified schema is not supported. Modified > 4. > + <para> > + Create a publication that publishes all changes for tables > + <structname>users</structname>, <structname>departments</structname> and > + that publishes all changes for all the tables present in the schema > + <structname>production</structname>: > > I don't think '...that publishes...' is required twice in the above sentence. Modified > 5. > +static List *OpenReliIdList(List *relids); > static List *OpenTableList(List *tables); > static void CloseTableList(List *rels); > static void PublicationAddTables(Oid pubid, List *rels, bool if_not_exists, > AlterPublicationStmt *stmt); > static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok); > +static void LockSchemaList(List *schemalist); > +static void PublicationAddSchemas(Oid pubid, List *schemas, bool if_not_exists, > + AlterPublicationStmt *stmt); > +static void PublicationDropSchemas(Oid pubid, List *schemas, bool missing_ok); > > Keep the later definitions also in this order. I suggest move > LockSchemaList() just after CloseTableList() both in declaration and > definition. Modified > 6. > +/* > + * Convert the PublicationObjSpecType list into schema oid list and rangevar > + * list. > + */ > > I think you need to say publication table instead of rangevar in the > above comment. Modified > 7. > + /* > + * It is quite possible that for the SET case user has not specified any > + * schema in which case we need to remove all the existing schemas. > + */ > > /schema/schemas Modified > 8. > +/* > + * Open relations specified by a RangeVar list. > > /RangeVar/PublicationTable Modified > 9. > +static bool > +_equalPublicationObject(const PublicationObjSpec *a, > + const PublicationObjSpec *b) > +{ > + COMPARE_SCALAR_FIELD(pubobjtype); > + COMPARE_STRING_FIELD(name); > + COMPARE_NODE_FIELD(pubtable); > + COMPARE_LOCATION_FIELD(location); > + > + return true; > +} > + > > Let's define this immediately before _equalPublicationTable as all > publication functions are defined there. Also, make the handling of > T_PublicationObjSpec before T_PublicationTable in equal() function as > that is the way nodes are defined. Modified Thanks for the comments, attached v46 patch has the fix for the same. Regards, Vignesh
Attachment
On Fri, Oct 22, 2021 at 11:59 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Fri, Oct 22, 2021 at 12:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > > > I was also previously concerned about what the behavior should be when > > only including just the partitions of a partitioned table in a > > publication using ALL TABLES IN SCHEMA and having > > publish_via_partition_root=true. It seems to implicitly include the > > partitioned table in the publication. But I did some testing and found > > that this is the current behavior when only the partitions are > > individually included in a publication using TABLE, so it seems to be > > OK. > > > > Thinking some more about this, it still may still be confusing to the > user if not explicitly stated in the ALL TABLES IN SCHEMA case. > How about adding some additional explanation to the end of the > following paragraph: > > + <para> > + When a partitioned table is published via schema level publication, all > + of its existing and future partitions irrespective of it being from the > + publication schema or not are implicitly considered to be part of the > + publication. So, even operations that are performed directly on a > + partition are also published via publications that its ancestors are > + part of. > + </para> > > Something like: > > Similarly, if a partition is published via schema level publication > and publish_via_partition_root=true, the parent partitioned table is > implicitly considered to be part of the publication, irrespective of > it being from the publication schema or not. I have made this change in the v46 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm3kBrMO5EyEgK_TyOrBuw%2BRvdJ2mJfpWb5e8FbuKg2cQw%40mail.gmail.com Regards, Vignesh
On Fri, Oct 22, 2021 at 1:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Fri, Oct 22, 2021 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Oct 21, 2021 at 6:47 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > Thanks for the comments, the attached v45 patch has the fix for the same. > > > > > > > The first patch is mostly looking good to me apart from the below > > minor comments: > > Let me share other minor comments on v45-0001 patch: > > > > > 1. > > + <para> > > + The catalog <structname>pg_publication_namespace</structname> contains the > > + mapping between schemas and publications in the database. This is a > > + many-to-many mapping. > > > > There are extra spaces after mapping at the end which are not required. Modified > + <literal>ADD</literal> and <literal>DROP</literal> clauses will add and > + remove one or more tables/schemas from the publication. Note that adding > + tables/schemas to a publication that is already subscribed to will require a > > There is also an extra space after "adding". Modified > - [ FOR TABLE [ ONLY ] <replaceable > class="parameter">table_name</replaceable> [ * ] [, ...] > - | FOR ALL TABLES ] > + [ FOR ALL TABLES > + | FOR <replaceable > class="parameter">publication_object</replaceable> [, ... ] ] > > Similarly, after "TABLES". Modified > + > + <para> > + Specifying a table that is part of a schema specified by > + <literal>FOR ALL TABLES IN SCHEMA</literal> is not supported. > + </para> > > And, after "by". Modified > --- > > +static void > +AlterPublicationSchemas(AlterPublicationStmt *stmt, > + HeapTuple tup, List > *schemaidlist) > +{ > (snip) > + PublicationAddSchemas(pubform->oid, schemaidlist, true, stmt); > + } > + > + return; > +} > > The "return" at the end of the function is not necessary. Modified > --- > + if (pubobj->name) > + pubobj->pubobjtype = > PUBLICATIONOBJ_REL_IN_SCHEMA; > + else if (!pubobj->name && !pubobj->pubtable) > + pubobj->pubobjtype = PUBLICATIONOBJ_CURRSCHEMA; > + else if (!pubobj->name) > + ereport(ERROR, > + errcode(ERRCODE_SYNTAX_ERROR), > + errmsg("invalid > schema name at or near"), > + > parser_errposition(pubobj->location)); > > I think it's better to change the last "else if" to just "else". Modified > --- > + > + if (schemarelids) > + { > + /* > + * If the publication publishes > partition changes via their > + * respective root partitioned > tables, we must exclude > + * partitions in favor of including > the root partitioned > + * tables. Otherwise, the function > could return both the child > + * and parent tables which could > cause data of the child table > + * to be double-published on the > subscriber side. > + * > + * XXX As of now, we do this when a > publication has associated > + * schema or for all tables publication. See > + * GetAllTablesPublicationRelations(). > + */ > + tables = > list_concat_unique_oid(relids, schemarelids); > + if (publication->pubviaroot) > + tables = > filter_partitions(tables, schemarelids); > + } > + else > + tables = relids; > + > + } > > There is an extra newline after "table = relids;". Removed it I have made this change in the v46 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm3kBrMO5EyEgK_TyOrBuw%2BRvdJ2mJfpWb5e8FbuKg2cQw%40mail.gmail.com Regards, Vignesh
On Fri, Oct 22, 2021 at 11:59 AM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Fri, Oct 22, 2021 at 12:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > > > I was also previously concerned about what the behavior should be when > > only including just the partitions of a partitioned table in a > > publication using ALL TABLES IN SCHEMA and having > > publish_via_partition_root=true. It seems to implicitly include the > > partitioned table in the publication. But I did some testing and found > > that this is the current behavior when only the partitions are > > individually included in a publication using TABLE, so it seems to be > > OK. > > > > Thinking some more about this, it still may still be confusing to the > user if not explicitly stated in the ALL TABLES IN SCHEMA case. > How about adding some additional explanation to the end of the > following paragraph: > > + <para> > + When a partitioned table is published via schema level publication, all > + of its existing and future partitions irrespective of it being from the > + publication schema or not are implicitly considered to be part of the > + publication. So, even operations that are performed directly on a > + partition are also published via publications that its ancestors are > + part of. > + </para> > > Something like: > > Similarly, if a partition is published via schema level publication > and publish_via_partition_root=true, the parent partitioned table is > implicitly considered to be part of the publication, irrespective of > it being from the publication schema or not. > I don't think we need to explicitly mention this in docs as this is quite similar to what is happening for the "For Table" case as well and it seems to be clarified by "publish_via_partition_root" definition in Create Publication docs [1]. [1] - https://www.postgresql.org/docs/devel/sql-createpublication.html -- With Regards, Amit Kapila.
On Fri, Oct 22, 2021 at 8:56 PM vignesh C <vignesh21@gmail.com> wrote: > I am getting a compilation error in the latest patch on HEAD. I think was relying on some variable removed by a recent commit 92316a4582a5714d4e494aaf90360860e7fec37a. While looking at that compilation error, I observed that we don't need the second and third parameters in pg_dump.c/getPublicationNamespaces() as those are not getting used. -- With Regards, Amit Kapila.
On Mon, Oct 25, 2021 at 10:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Oct 22, 2021 at 8:56 PM vignesh C <vignesh21@gmail.com> wrote: > > > > I am getting a compilation error in the latest patch on HEAD. I think > was relying on some variable removed by a recent commit > 92316a4582a5714d4e494aaf90360860e7fec37a. While looking at that > compilation error, I observed that we don't need the second and third > parameters in pg_dump.c/getPublicationNamespaces() as those are not > getting used. I have fixed this in the v47 version attached. Regards, Vignesh
Attachment
On Mon, Oct 25, 2021 at 1:11 PM vignesh C <vignesh21@gmail.com> wrote: > > On Mon, Oct 25, 2021 at 10:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Oct 22, 2021 at 8:56 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > I am getting a compilation error in the latest patch on HEAD. I think > > was relying on some variable removed by a recent commit > > 92316a4582a5714d4e494aaf90360860e7fec37a. While looking at that > > compilation error, I observed that we don't need the second and third > > parameters in pg_dump.c/getPublicationNamespaces() as those are not > > getting used. > > I have fixed this in the v47 version attached. > Thanks, the first patch in the series "Allow publishing the tables of schema." looks good to me. Unless there are more comments/bugs/objections, I am planning to commit it in a day or so. -- With Regards, Amit Kapila.
On Mon, Oct 25, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Oct 25, 2021 at 1:11 PM vignesh C <vignesh21@gmail.com> wrote: > > > > I have fixed this in the v47 version attached. > > > > Thanks, the first patch in the series "Allow publishing the tables of > schema." looks good to me. Unless there are more > comments/bugs/objections, I am planning to commit it in a day or so. > Yesterday, I have pushed the first patch. Feel free to submit the remaining patches. -- With Regards, Amit Kapila.
On Thu, Oct 28, 2021 at 8:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Oct 25, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Oct 25, 2021 at 1:11 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > I have fixed this in the v47 version attached. > > > > > > > Thanks, the first patch in the series "Allow publishing the tables of > > schema." looks good to me. Unless there are more > > comments/bugs/objections, I am planning to commit it in a day or so. > > > > Yesterday, I have pushed the first patch. Feel free to submit the > remaining patches. Thanks for committing the patch, please find the remaining patches attached. Thanks Hou Zhijie and Greg Nancarrow for sharing a few comments offline, I have fixed those in the attached patch. Regards, Vignesh
Attachment
On Thu, Oct 28, 2021 at 3:25 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for committing the patch, please find the remaining patches attached. A few comments on the v48-0002 patch: (1) The quoting of TABLE/SCHEMA looks a bit odd in the patch comment (2) src/backend/catalog/system_views.sq ON should be capitalized in the following line: + JOIN pg_catalog.pg_namespace N on N.oid = S.pnnspid (3) Some basic tests should be added for this in the publication tests. Regards, Greg Nancarrow Fujitsu Australia
On Thu, Oct 28, 2021 at 9:55 AM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for committing the patch, please find the remaining patches attached. > Thanks Hou Zhijie and Greg Nancarrow for sharing a few comments > offline, I have fixed those in the attached patch. > Pushed the first test case patch. About v48-0002-Add-new-pg_publication_objects-view-to-display-T, I think it doesn't display anything for "for all tables" publication. Instead of selecting from pg_publication_rel, you can use the existing view pg_publication_tables to solve that problem. Having said that, I am not completely sure about the value of this new view pg_publication_objects which displays all objects of publications. I see that users might want to see all the objects that the publication publishes and when we include other objects like sequences it might be more helpful. Sawada-San, others, what do you think? Is it really useful to have such a view? One point to think is if we introduce such a view then how it should behave for schema objects? Do we need to display only schemas or additionally all the tables in the schema as well? If you follow the above suggestion of mine then I think it will display both schemas published and tables in that schema that will be considered for publishing. -- With Regards, Amit Kapila.
On Friday, October 29, 2021 12:35 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Oct 28, 2021 at 9:55 AM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for committing the patch, please find the remaining patches attached. > > Thanks Hou Zhijie and Greg Nancarrow for sharing a few comments > > offline, I have fixed those in the attached patch. > > > > Pushed the first test case patch. About > v48-0002-Add-new-pg_publication_objects-view-to-display-T, I think it > doesn't display anything for "for all tables" publication. Instead of > selecting from pg_publication_rel, you can use the existing view > pg_publication_tables to solve that problem. > > Having said that, I am not completely sure about the value of this new > view pg_publication_objects which displays all objects of > publications. I see that users might want to see all the objects that > the publication publishes and when we include other objects like > sequences it might be more helpful. > > Sawada-San, others, what do you think? Is it really useful to have such a view? > > One point to think is if we introduce such a view then how it should > behave for schema objects? Do we need to display only schemas or > additionally all the tables in the schema as well? If you follow the > above suggestion of mine then I think it will display both schemas > published and tables in that schema that will be considered for > publishing. > Personally, if I want to see ALL the published objects in a publication, I would use '\dRp+' command. I think there might not be too much value to have this view. Regards Tang
Hi, On 10/28/21 04:41, Amit Kapila wrote: > On Mon, Oct 25, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> On Mon, Oct 25, 2021 at 1:11 PM vignesh C <vignesh21@gmail.com> wrote: >>> >>> I have fixed this in the v47 version attached. >>> >> >> Thanks, the first patch in the series "Allow publishing the tables of >> schema." looks good to me. Unless there are more >> comments/bugs/objections, I am planning to commit it in a day or so. >> > > Yesterday, I have pushed the first patch. Feel free to submit the > remaining patches. > I haven't been following this thread recently, but while rebasing the sequence decoding patch I noticed this adds PUBLICATIONOBJ_TABLE, /* Table type */ PUBLICATIONOBJ_REL_IN_SCHEMA, /* Relations in schema type */ Shouldn't it be PUBLICATIONOBJ_TABLE_IN_SCHEMA, or why does it use rel instead of table? I'm asking because the sequence decoding patch mimics ALTER PUBLICATION options for sequences, including ALL SEQUENCES IN SCHEMA etc. and this seems ambiguous. The same issue applies to PUBLICATIONOBJ_CURRSCHEMA, which does not specify the object type. I wonder if it'd be better to just separate the schema and object type specification, instead of mashing it into a single constant. Otherwise we'll end up with (M x N) combinations, which seems silly. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Oct 29, 2021 at 3:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Sawada-San, others, what do you think? Is it really useful to have such a view? > > One point to think is if we introduce such a view then how it should > behave for schema objects? Do we need to display only schemas or > additionally all the tables in the schema as well? If you follow the > above suggestion of mine then I think it will display both schemas > published and tables in that schema that will be considered for > publishing. > I find the proposed view useful for processing the publication structure and members in SQL, without having to piece the information together from the other pg_publication_* tables. Personally I don't think it is necessary to additionally display all tables in the schema (that information can be retrieved by pg_tables or information_schema.tables, if needed). Regards, Greg Nancarrow Fujitsu Australia
On Fri, Oct 29, 2021 at 1:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Oct 28, 2021 at 9:55 AM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for committing the patch, please find the remaining patches attached. > > Thanks Hou Zhijie and Greg Nancarrow for sharing a few comments > > offline, I have fixed those in the attached patch. > > > > Pushed the first test case patch. About > v48-0002-Add-new-pg_publication_objects-view-to-display-T, I think it > doesn't display anything for "for all tables" publication. Instead of > selecting from pg_publication_rel, you can use the existing view > pg_publication_tables to solve that problem. > > Having said that, I am not completely sure about the value of this new > view pg_publication_objects which displays all objects of > publications. I see that users might want to see all the objects that > the publication publishes and when we include other objects like > sequences it might be more helpful. > > Sawada-San, others, what do you think? Is it really useful to have such a view? I haven't followed the discussion on pg_publication_objects view but what is the primary use case of this view? If it's to list all tables published in a publication (e.g, "select * from pg_publication_objects where pubname = 'pub1'), pg_publication_objects view lacks the information of FOR ALL TABLES publications. And probably we can use pg_publication_tables instead. On the other hand, if it's to list all tables published in FOR ALL TABLES IN SCHEMA publications (e.g., "select * from pg_publication_object where objtype = 'schema'), the view doesn't show tables published in such publications. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
On Mon, Nov 1, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > I haven't followed the discussion on pg_publication_objects view but > what is the primary use case of this view? If it's to list all tables > published in a publication (e.g, "select * from pg_publication_objects > where pubname = 'pub1'), pg_publication_objects view lacks the > information of FOR ALL TABLES publications. And probably we can use > pg_publication_tables instead. On the other hand, if it's to list all > tables published in FOR ALL TABLES IN SCHEMA publications (e.g., > "select * from pg_publication_object where objtype = 'schema'), the > view doesn't show tables published in such publications. > I think that Amit originally suggested to have a view that provides information about the objects in each publication (like table, tables in schema, sequence ...). So it currently is at the granularity level of the objects that are actually added to the publication (TABLE t, ALL TABLES IN SCHEMA s) I agree that the view is currently missing ALL TABLES publication information, but I think it could be easily added. Also, currently for the "objtype" column, the type "schema" does not seem specific enough; maybe that should be instead named "all-tables-in-schema" or similar. Regards, Greg Nancarrow Fujitsu Australia
On Mon, Nov 1, 2021 at 2:48 AM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > On 10/28/21 04:41, Amit Kapila wrote: > > On Mon, Oct 25, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > >> > >> On Mon, Oct 25, 2021 at 1:11 PM vignesh C <vignesh21@gmail.com> wrote: > >>> > >>> I have fixed this in the v47 version attached. > >>> > >> > >> Thanks, the first patch in the series "Allow publishing the tables of > >> schema." looks good to me. Unless there are more > >> comments/bugs/objections, I am planning to commit it in a day or so. > >> > > > > Yesterday, I have pushed the first patch. Feel free to submit the > > remaining patches. > > > > I haven't been following this thread recently, but while rebasing the > sequence decoding patch I noticed this adds > > PUBLICATIONOBJ_TABLE, /* Table type */ > PUBLICATIONOBJ_REL_IN_SCHEMA, /* Relations in schema type */ > > Shouldn't it be PUBLICATIONOBJ_TABLE_IN_SCHEMA, or why does it use rel > instead of table? > Yeah, it should be PUBLICATIONOBJ_TABLE_IN_SCHEMA considering we have to add other objects like sequence. > I'm asking because the sequence decoding patch mimics ALTER PUBLICATION > options for sequences, including ALL SEQUENCES IN SCHEMA etc. and this > seems ambiguous. The same issue applies to PUBLICATIONOBJ_CURRSCHEMA, > which does not specify the object type. > I think we should name it PUBLICATIONOBJ_TABLE_CURRSCHEMA. Does that make sense? > I wonder if it'd be better to just separate the schema and object type > specification, instead of mashing it into a single constant. > Do you mean to say the syntax on the lines of Create Publication For Table t1, t2 Schema s1, s2;? If so, then originally the patch had the syntax on those lines but Tom pointed out that the meaning of such a syntax can change over a period of time and that can break apps [1]. I think the current syntax gives a lot of flexibility to users and we have some precedent for it as well. [1] - https://www.postgresql.org/message-id/155565.1628954580%40sss.pgh.pa.us -- With Regards, Amit Kapila.
On Mon, Nov 1, 2021 at 2:38 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Mon, Nov 1, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > I haven't followed the discussion on pg_publication_objects view but > > what is the primary use case of this view? If it's to list all tables > > published in a publication (e.g, "select * from pg_publication_objects > > where pubname = 'pub1'), pg_publication_objects view lacks the > > information of FOR ALL TABLES publications. And probably we can use > > pg_publication_tables instead. On the other hand, if it's to list all > > tables published in FOR ALL TABLES IN SCHEMA publications (e.g., > > "select * from pg_publication_object where objtype = 'schema'), the > > view doesn't show tables published in such publications. > > Both the problems mentioned can be fixed if we follow the change suggested by me in one of the emails above [1]. > > I think that Amit originally suggested to have a view that provides > information about the objects in each publication (like table, tables > in schema, sequence ...). > Right and I think as you also mentioned in your previous email it can save the effort of users if they want to know all the objects published via a publication. I am just not sure if it is worth adding such a view or we leave it to users to find that information via querying individual views or system tables for objects. [1] - https://www.postgresql.org/message-id/CAA4eK1%2BL2-6JQ174sVfE3_K%3DmjTKJ2A8-z%2B_pExDHhqdBJvb5Q%40mail.gmail.com -- With Regards, Amit Kapila.
On 11/1/21 11:18, Amit Kapila wrote: > On Mon, Nov 1, 2021 at 2:48 AM Tomas Vondra > <tomas.vondra@enterprisedb.com> wrote: >> >> On 10/28/21 04:41, Amit Kapila wrote: >>> On Mon, Oct 25, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote: >>>> >>>> On Mon, Oct 25, 2021 at 1:11 PM vignesh C <vignesh21@gmail.com> wrote: >>>>> >>>>> I have fixed this in the v47 version attached. >>>>> >>>> >>>> Thanks, the first patch in the series "Allow publishing the tables of >>>> schema." looks good to me. Unless there are more >>>> comments/bugs/objections, I am planning to commit it in a day or so. >>>> >>> >>> Yesterday, I have pushed the first patch. Feel free to submit the >>> remaining patches. >>> >> >> I haven't been following this thread recently, but while rebasing the >> sequence decoding patch I noticed this adds >> >> PUBLICATIONOBJ_TABLE, /* Table type */ >> PUBLICATIONOBJ_REL_IN_SCHEMA, /* Relations in schema type */ >> >> Shouldn't it be PUBLICATIONOBJ_TABLE_IN_SCHEMA, or why does it use rel >> instead of table? >> > > Yeah, it should be PUBLICATIONOBJ_TABLE_IN_SCHEMA considering we have > to add other objects like sequence. > >> I'm asking because the sequence decoding patch mimics ALTER PUBLICATION >> options for sequences, including ALL SEQUENCES IN SCHEMA etc. and this >> seems ambiguous. The same issue applies to PUBLICATIONOBJ_CURRSCHEMA, >> which does not specify the object type. >> > > I think we should name it PUBLICATIONOBJ_TABLE_CURRSCHEMA. Does that make sense? > >> I wonder if it'd be better to just separate the schema and object type >> specification, instead of mashing it into a single constant. >> > > Do you mean to say the syntax on the lines of Create Publication For > Table t1, t2 Schema s1, s2;? If so, then originally the patch had the > syntax on those lines but Tom pointed out that the meaning of such a > syntax can change over a period of time and that can break apps [1]. I > think the current syntax gives a lot of flexibility to users and we > have some precedent for it as well. > No, I'm not talking about the syntax at all - I'm talking about how we represent it. PUBLICATIONOBJ_TABLE_CURRSCHEMA mixes the object type and schema in the same constant, so I am wondering if we should just split that into two pieces - one determining the schema, one determining the object type. So PublicationObjSpec would have two fields instead of just pubobjtype. The advantage would be we wouldn't need a whole lot of new constants for each object type - adding sequences pretty much means adding PUBLICATIONOBJ_SEQUENCE PUBLICATIONOBJ_SEQUENCE_IN_SCHEMA PUBLICATIONOBJ_SEQUENCE_CURRSCHEMA and after splitting we'd need just the first one. But maybe it's not that bad, though. We don't expect all that many object types in publications, I guess. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Nov 1, 2021 at 7:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Nov 1, 2021 at 2:38 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > > > On Mon, Nov 1, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > I haven't followed the discussion on pg_publication_objects view but > > > what is the primary use case of this view? If it's to list all tables > > > published in a publication (e.g, "select * from pg_publication_objects > > > where pubname = 'pub1'), pg_publication_objects view lacks the > > > information of FOR ALL TABLES publications. And probably we can use > > > pg_publication_tables instead. On the other hand, if it's to list all > > > tables published in FOR ALL TABLES IN SCHEMA publications (e.g., > > > "select * from pg_publication_object where objtype = 'schema'), the > > > view doesn't show tables published in such publications. > > > > > Both the problems mentioned can be fixed if we follow the change > suggested by me in one of the emails above [1]. > > > > > I think that Amit originally suggested to have a view that provides > > information about the objects in each publication (like table, tables > > in schema, sequence ...). > > > > Right and I think as you also mentioned in your previous email it can > save the effort of users if they want to know all the objects > published via a publication. Thank you for the explanation. Given we already have pg_publication_tables view, if pg_publication_objects view also shows all tables published in FOR ALL TABLES publications or FOR ALL TABLES IN SCHEMA publications, there is essentially not much difference between pg_publication_tables and pg_publication_objects except for objtype column. Right? If so it'd be better to have one row for each FOR ALL TABLES publication and FOR ALL TABLES IN SCHEMA publication with objtype = 'database' or 'schema' etc, instead of individual tables. > I am just not sure if it is worth adding > such a view or we leave it to users to find that information via > querying individual views or system tables for objects. I've not looked at the patch for logical replication of sequences but the view becomes more useful once we support the new type of replication object? If so, we can consider this view again after the patch gets committed. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
On Tue, Nov 2, 2021 at 6:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > I am just not sure if it is worth adding > > such a view or we leave it to users to find that information via > > querying individual views or system tables for objects. > > I've not looked at the patch for logical replication of sequences but > the view becomes more useful once we support the new type of > replication object? If so, we can consider this view again after the > patch gets committed. > +1. -- With Regards, Amit Kapila.
On Mon, Nov 1, 2021 at 5:52 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > On 11/1/21 11:18, Amit Kapila wrote: > > On Mon, Nov 1, 2021 at 2:48 AM Tomas Vondra > > <tomas.vondra@enterprisedb.com> wrote: > >> I wonder if it'd be better to just separate the schema and object type > >> specification, instead of mashing it into a single constant. > >> > > > > Do you mean to say the syntax on the lines of Create Publication For > > Table t1, t2 Schema s1, s2;? If so, then originally the patch had the > > syntax on those lines but Tom pointed out that the meaning of such a > > syntax can change over a period of time and that can break apps [1]. I > > think the current syntax gives a lot of flexibility to users and we > > have some precedent for it as well. > > > > No, I'm not talking about the syntax at all - I'm talking about how we > represent it. PUBLICATIONOBJ_TABLE_CURRSCHEMA mixes the object type and > schema in the same constant, so I am wondering if we should just split > that into two pieces - one determining the schema, one determining the > object type. So PublicationObjSpec would have two fields instead of just > pubobjtype. > > The advantage would be we wouldn't need a whole lot of new constants for > each object type - adding sequences pretty much means adding > > PUBLICATIONOBJ_SEQUENCE > PUBLICATIONOBJ_SEQUENCE_IN_SCHEMA > PUBLICATIONOBJ_SEQUENCE_CURRSCHEMA > > and after splitting we'd need just the first one. > I see your point but OTOH, I think it will lead to additional checks in post-processing functions like ObjectsInPublicationToOids() as we have to always check both object type and schema to make decisions. > But maybe it's not > that bad, though. We don't expect all that many object types in > publications, I guess. > Yeah, that is also true. So maybe at this, we can just rename the few types as suggested by you and we can look at it later if we anytime have more number of objects to add. -- With Regards, Amit Kapila.
On 11/2/21 11:37 AM, Amit Kapila wrote: > On Mon, Nov 1, 2021 at 5:52 PM Tomas Vondra > <tomas.vondra@enterprisedb.com> wrote: >> >> On 11/1/21 11:18, Amit Kapila wrote: >>> On Mon, Nov 1, 2021 at 2:48 AM Tomas Vondra >>> <tomas.vondra@enterprisedb.com> wrote: >>>> I wonder if it'd be better to just separate the schema and object type >>>> specification, instead of mashing it into a single constant. >>>> >>> >>> Do you mean to say the syntax on the lines of Create Publication For >>> Table t1, t2 Schema s1, s2;? If so, then originally the patch had the >>> syntax on those lines but Tom pointed out that the meaning of such a >>> syntax can change over a period of time and that can break apps [1]. I >>> think the current syntax gives a lot of flexibility to users and we >>> have some precedent for it as well. >>> >> >> No, I'm not talking about the syntax at all - I'm talking about how we >> represent it. PUBLICATIONOBJ_TABLE_CURRSCHEMA mixes the object type and >> schema in the same constant, so I am wondering if we should just split >> that into two pieces - one determining the schema, one determining the >> object type. So PublicationObjSpec would have two fields instead of just >> pubobjtype. >> >> The advantage would be we wouldn't need a whole lot of new constants for >> each object type - adding sequences pretty much means adding >> >> PUBLICATIONOBJ_SEQUENCE >> PUBLICATIONOBJ_SEQUENCE_IN_SCHEMA >> PUBLICATIONOBJ_SEQUENCE_CURRSCHEMA >> >> and after splitting we'd need just the first one. >> > > I see your point but OTOH, I think it will lead to additional checks > in post-processing functions like ObjectsInPublicationToOids() as we > have to always check both object type and schema to make decisions. > True. >> But maybe it's not >> that bad, though. We don't expect all that many object types in >> publications, I guess. >> > > Yeah, that is also true. So maybe at this, we can just rename the few > types as suggested by you and we can look at it later if we anytime > have more number of objects to add. > +1 -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Nov 2, 2021 at 8:13 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > > > > > Yeah, that is also true. So maybe at this, we can just rename the few > > types as suggested by you and we can look at it later if we anytime > > have more number of objects to add. > > > > +1 > Apart from what you have pointed above, we are using "DO_PUBLICATION_REL_IN_SCHEMA" in pg_dump. I think we should replace that as well with "DO_PUBLICATION_TABLE_IN_SCHEMA". -- With Regards, Amit Kapila.
On Wed, Nov 3, 2021 at 8:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Nov 2, 2021 at 8:13 PM Tomas Vondra > <tomas.vondra@enterprisedb.com> wrote: > > > > > > > > > > Yeah, that is also true. So maybe at this, we can just rename the few > > > types as suggested by you and we can look at it later if we anytime > > > have more number of objects to add. > > > > > > > +1 > > > > Apart from what you have pointed above, we are using > "DO_PUBLICATION_REL_IN_SCHEMA" in pg_dump. I think we should replace > that as well with "DO_PUBLICATION_TABLE_IN_SCHEMA". Thanks for the comments, the attached patch has the changes for the same. Regards, Vignesh
Attachment
On Wed, Nov 3, 2021 12:25 PM vignesh C <vignesh21@gmail.com> wrote: > On Wed, Nov 3, 2021 at 8:30 AM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > > > On Tue, Nov 2, 2021 at 8:13 PM Tomas Vondra > > <tomas.vondra@enterprisedb.com> wrote: > > > > > > > > > > > > > > Yeah, that is also true. So maybe at this, we can just rename the > > > > few types as suggested by you and we can look at it later if we > > > > anytime have more number of objects to add. > > > > > > > > > > +1 > > > > > > > Apart from what you have pointed above, we are using > > "DO_PUBLICATION_REL_IN_SCHEMA" in pg_dump. I think we should replace > > that as well with "DO_PUBLICATION_TABLE_IN_SCHEMA". > > Thanks for the comments, the attached patch has the changes for the same. > Thanks for the patch. I have only one minor comment: + PUBLICATIONOBJ_TABLE_IN_SCHEMA, /* Relations in schema type */ I think the ' Relations' in the comments also need to be changed to 'tables'. The other part of the patch looks good to me. Best regards, Hou zj
While you are changing these, maybe also change: Before: PUBLICATIONOBJ..._CURRSCHEMA After: PUBLICATIONOBJ..._CUR_SCHEMA Which seems more readable to me. ------ Kind Regards, Peter Smith. Fujitsu Australia
On Wed, Nov 3, 2021 at 11:37 AM Peter Smith <smithpb2250@gmail.com> wrote: > > While you are changing these, maybe also change: > > Before: PUBLICATIONOBJ..._CURRSCHEMA > After: PUBLICATIONOBJ..._CUR_SCHEMA > > Which seems more readable to me. Thanks for the comment, the attached patch has the changes for the same. Regards, Vignesh
Attachment
On Wed, Nov 3, 2021 at 11:07 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Wed, Nov 3, 2021 12:25 PM vignesh C <vignesh21@gmail.com> wrote: > > On Wed, Nov 3, 2021 at 8:30 AM Amit Kapila <amit.kapila16@gmail.com> > > wrote: > > > > > > On Tue, Nov 2, 2021 at 8:13 PM Tomas Vondra > > > <tomas.vondra@enterprisedb.com> wrote: > > > > > > > > > > > > > > > > > > Yeah, that is also true. So maybe at this, we can just rename the > > > > > few types as suggested by you and we can look at it later if we > > > > > anytime have more number of objects to add. > > > > > > > > > > > > > +1 > > > > > > > > > > Apart from what you have pointed above, we are using > > > "DO_PUBLICATION_REL_IN_SCHEMA" in pg_dump. I think we should replace > > > that as well with "DO_PUBLICATION_TABLE_IN_SCHEMA". > > > > Thanks for the comments, the attached patch has the changes for the same. > > > > Thanks for the patch. > I have only one minor comment: > > + PUBLICATIONOBJ_TABLE_IN_SCHEMA, /* Relations in schema type */ > > I think the ' Relations' in the comments also need to be changed to 'tables'. > > The other part of the patch looks good to me. Thanks for the comment, the patch at [1] has the changes for the same. [1] - https://www.postgresql.org/message-id/CALDaNm3g4ZaJ8h%3D16_A%2BytbyPUdJPaAz94YzQLqkD%3DyPu%2BVPwA%40mail.gmail.com Regards, Vignesh
On Wed, Nov 3, 2021 at 11:55 AM vignesh C <vignesh21@gmail.com> wrote: > > On Wed, Nov 3, 2021 at 11:37 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > While you are changing these, maybe also change: > > > > Before: PUBLICATIONOBJ..._CURRSCHEMA > > After: PUBLICATIONOBJ..._CUR_SCHEMA > > > > Which seems more readable to me. > > Thanks for the comment, the attached patch has the changes for the same. > LGTM. I'll push this tomorrow unless there are more comments or suggestions. -- With Regards, Amit Kapila.
FYI - I found a small problem with one of the new PublicationObjSpec parser error messages that was introduced by the recent schema publication commit [1]. The error message text is assuming that the error originates from CREATE PUBLICATION, but actually that same error can also come from ALTER PUBLICATION. For example, e.g.1) Here the error came from CREATE PUBLICATION, so the message text looks OK. test_pub=# CREATE PUBLICATION p1 FOR t1; 2021-11-04 10:50:17.208 AEDT [738] ERROR: FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before the table/schema name(s) at character 27 2021-11-04 10:50:17.208 AEDT [738] STATEMENT: CREATE PUBLICATION p1 FOR t1; ERROR: FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before the table/schema name(s) LINE 1: CREATE PUBLICATION p1 FOR t1; ^ ~~ e.g.2) Here the error came from ALTER PUBLICATION, so the message text is not OK because the ALTER syntax [2] does not even have a FOR keyword. test_pub=# ALTER PUBLICATION p1 SET t1; 2021-11-04 10:51:53.912 AEDT [738] ERROR: FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before the table/schema name(s) at character 26 2021-11-04 10:51:53.912 AEDT [738] STATEMENT: ALTER PUBLICATION p1 SET t1; ERROR: FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before the table/schema name(s) LINE 1: ALTER PUBLICATION p1 SET t1; ^ ------ [1] https://github.com/postgres/postgres/commit/5a2832465fd8984d089e8c44c094e6900d987fcd [2] https://www.postgresql.org/docs/devel/sql-alterpublication.html Kind Regards, Peter Smith. Fujitsu Australia
On Thurs, Nov 4, 2021 8:12 AM Peter Smith <smithpb2250@gmail.com> wrote: > FYI - I found a small problem with one of the new PublicationObjSpec parser > error messages that was introduced by the recent schema publication commit > [1]. > > The error message text is assuming that the error originates from CREATE > PUBLICATION, but actually that same error can also come from ALTER > PUBLICATION. > e.g.2) Here the error came from ALTER PUBLICATION, so the message text is > not OK because the ALTER syntax [2] does not even have a FOR keyword. > > test_pub=# ALTER PUBLICATION p1 SET t1; > 2021-11-04 10:51:53.912 AEDT [738] ERROR: FOR TABLE/FOR ALL TABLES IN > SCHEMA should be specified before the table/schema name(s) at character 26 > 2021-11-04 10:51:53.912 AEDT [738] STATEMENT: ALTER PUBLICATION p1 SET > t1; > ERROR: FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before > the table/schema name(s) LINE 1: ALTER PUBLICATION p1 SET t1; I think it might be better to report " TABLE/ALL TABLES IN SCHEMA should be specified before ...". Best regards, Hou zj
On Thu, Nov 4, 2021 at 11:55 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Thurs, Nov 4, 2021 8:12 AM Peter Smith <smithpb2250@gmail.com> wrote: > > FYI - I found a small problem with one of the new PublicationObjSpec parser > > error messages that was introduced by the recent schema publication commit > > [1]. > > > > The error message text is assuming that the error originates from CREATE > > PUBLICATION, but actually that same error can also come from ALTER > > PUBLICATION. > > e.g.2) Here the error came from ALTER PUBLICATION, so the message text is > > not OK because the ALTER syntax [2] does not even have a FOR keyword. > > > > test_pub=# ALTER PUBLICATION p1 SET t1; > > 2021-11-04 10:51:53.912 AEDT [738] ERROR: FOR TABLE/FOR ALL TABLES IN > > SCHEMA should be specified before the table/schema name(s) at character 26 > > 2021-11-04 10:51:53.912 AEDT [738] STATEMENT: ALTER PUBLICATION p1 SET > > t1; > > ERROR: FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before > > the table/schema name(s) LINE 1: ALTER PUBLICATION p1 SET t1; > > I think it might be better to report " TABLE/ALL TABLES IN SCHEMA should be specified before ...". > +1 - Yes, I also thought the fix should just be some simple/generic change of the wording like your suggestion (rather than a different message for both cases). ------ Kind Regards, Peter Smith. Fujitsu Australia
On Thu, Nov 4, 2021 at 5:41 AM Peter Smith <smithpb2250@gmail.com> wrote: > > FYI - I found a small problem with one of the new PublicationObjSpec > parser error messages that was introduced by the recent schema > publication commit [1]. > > The error message text is assuming that the error originates from > CREATE PUBLICATION, but actually that same error can also come from > ALTER PUBLICATION. > > For example, > > e.g.1) Here the error came from CREATE PUBLICATION, so the message > text looks OK. > > test_pub=# CREATE PUBLICATION p1 FOR t1; > 2021-11-04 10:50:17.208 AEDT [738] ERROR: FOR TABLE/FOR ALL TABLES IN > SCHEMA should be specified before the table/schema name(s) at > character 27 > 2021-11-04 10:50:17.208 AEDT [738] STATEMENT: CREATE PUBLICATION p1 FOR t1; > ERROR: FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before > the table/schema name(s) > LINE 1: CREATE PUBLICATION p1 FOR t1; > ^ > ~~ > > e.g.2) Here the error came from ALTER PUBLICATION, so the message text > is not OK because the ALTER syntax [2] does not even have a FOR > keyword. > > test_pub=# ALTER PUBLICATION p1 SET t1; > 2021-11-04 10:51:53.912 AEDT [738] ERROR: FOR TABLE/FOR ALL TABLES IN > SCHEMA should be specified before the table/schema name(s) at > character 26 > 2021-11-04 10:51:53.912 AEDT [738] STATEMENT: ALTER PUBLICATION p1 SET t1; > ERROR: FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before > the table/schema name(s) > LINE 1: ALTER PUBLICATION p1 SET t1; > ^ Thanks for the comment, I changed the error message to remove the FOR keyword. The attached patch has the changes for the same. Regards, Vignesh
Attachment
On Thu, Nov 4, 2021 at 3:24 PM vignesh C <vignesh21@gmail.com> wrote: > > On Thu, Nov 4, 2021 at 5:41 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > FYI - I found a small problem with one of the new PublicationObjSpec > > parser error messages that was introduced by the recent schema > > publication commit [1]. > > > > The error message text is assuming that the error originates from > > CREATE PUBLICATION, but actually that same error can also come from > > ALTER PUBLICATION. > > > > For example, > > > > e.g.1) Here the error came from CREATE PUBLICATION, so the message > > text looks OK. > > > > test_pub=# CREATE PUBLICATION p1 FOR t1; > > 2021-11-04 10:50:17.208 AEDT [738] ERROR: FOR TABLE/FOR ALL TABLES IN > > SCHEMA should be specified before the table/schema name(s) at > > character 27 > > 2021-11-04 10:50:17.208 AEDT [738] STATEMENT: CREATE PUBLICATION p1 FOR t1; > > ERROR: FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before > > the table/schema name(s) > > LINE 1: CREATE PUBLICATION p1 FOR t1; > > ^ > > ~~ > > > > e.g.2) Here the error came from ALTER PUBLICATION, so the message text > > is not OK because the ALTER syntax [2] does not even have a FOR > > keyword. > > > > test_pub=# ALTER PUBLICATION p1 SET t1; > > 2021-11-04 10:51:53.912 AEDT [738] ERROR: FOR TABLE/FOR ALL TABLES IN > > SCHEMA should be specified before the table/schema name(s) at > > character 26 > > 2021-11-04 10:51:53.912 AEDT [738] STATEMENT: ALTER PUBLICATION p1 SET t1; > > ERROR: FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before > > the table/schema name(s) > > LINE 1: ALTER PUBLICATION p1 SET t1; > > ^ > > Thanks for the comment, I changed the error message to remove the FOR > keyword. The attached patch has the changes for the same. > LGTM. ------ Kind Regards, Peter Smith. Fujitsu Australia
> On Thu, Nov 4, 2021 at 3:24 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Thu, Nov 4, 2021 at 5:41 AM Peter Smith <smithpb2250@gmail.com> > wrote: > > > > > > FYI - I found a small problem with one of the new PublicationObjSpec > > > parser error messages that was introduced by the recent schema > > > publication commit [1]. > > > > > > The error message text is assuming that the error originates from > > > CREATE PUBLICATION, but actually that same error can also come from > > > ALTER PUBLICATION. > > > > > > For example, > > > > > > e.g.1) Here the error came from CREATE PUBLICATION, so the message > > > text looks OK. > > > > > > test_pub=# CREATE PUBLICATION p1 FOR t1; > > > 2021-11-04 10:50:17.208 AEDT [738] ERROR: FOR TABLE/FOR ALL TABLES > > > IN SCHEMA should be specified before the table/schema name(s) at > > > character 27 > > > 2021-11-04 10:50:17.208 AEDT [738] STATEMENT: CREATE PUBLICATION p1 > > > FOR t1; > > > ERROR: FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified > > > before the table/schema name(s) LINE 1: CREATE PUBLICATION p1 FOR > > > t1; > > > ^ > > > ~~ > > > > > > e.g.2) Here the error came from ALTER PUBLICATION, so the message > > > text is not OK because the ALTER syntax [2] does not even have a FOR > > > keyword. > > > > > > test_pub=# ALTER PUBLICATION p1 SET t1; > > > 2021-11-04 10:51:53.912 AEDT [738] ERROR: FOR TABLE/FOR ALL TABLES > > > IN SCHEMA should be specified before the table/schema name(s) at > > > character 26 > > > 2021-11-04 10:51:53.912 AEDT [738] STATEMENT: ALTER PUBLICATION p1 > > > SET t1; > > > ERROR: FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified > > > before the table/schema name(s) LINE 1: ALTER PUBLICATION p1 SET t1; > > > ^ > > > > Thanks for the comment, I changed the error message to remove the FOR > > keyword. The attached patch has the changes for the same. > > > > LGTM. +1 Best regards, Hou zj
FYI - I spotted a trivial SQL mistake (?) of the schema publication patch [1]. See the file describe.c, function describeOneTableDetails. The new SQL has a 3rd UNION that looks like: ... "UNION\n" "SELECT pubname\n" "FROM pg_catalog.pg_publication p\n" "WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n" "ORDER BY 1;", oid, oid, oid, oid); Notice that there is a table alias "p" but it is never used. It seems to me like it is just an accidental omission. I think it should be written like - BEFORE: "WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n" AFTER: "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n" Doing this will make it consistent with the SQL of the nearby "else" case which uses the same alias as expected. ------ [1] https://github.com/postgres/postgres/commit/5a2832465fd8984d089e8c44c094e6900d987fcd Kind Regards, Peter Smith. Fujitsu Australia
On Tue, Nov 9, 2021 at 7:20 AM Peter Smith <smithpb2250@gmail.com> wrote: > > FYI - I spotted a trivial SQL mistake (?) of the schema publication patch [1]. > > See the file describe.c, function describeOneTableDetails. > The new SQL has a 3rd UNION that looks like: > > ... > "UNION\n" > "SELECT pubname\n" > "FROM pg_catalog.pg_publication p\n" > "WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n" > "ORDER BY 1;", > oid, oid, oid, oid); > > Notice that there is a table alias "p" but it is never used. It seems > to me like it is just an accidental omission. I think it should be > written like - > > BEFORE: > "WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n" > AFTER: > "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n" > > Doing this will make it consistent with the SQL of the nearby "else" > case which uses the same alias as expected. > The above makes sense to me. So, pushed a fix for this along with Vignesh's patch to fix other comments related to this work. -- With Regards, Amit Kapila.
On Tue, Nov 9, 2021 at 2:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Nov 9, 2021 at 7:20 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > FYI - I spotted a trivial SQL mistake (?) of the schema publication patch [1]. > > > > See the file describe.c, function describeOneTableDetails. > > The new SQL has a 3rd UNION that looks like: > > > > ... > > "UNION\n" > > "SELECT pubname\n" > > "FROM pg_catalog.pg_publication p\n" > > "WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n" > > "ORDER BY 1;", > > oid, oid, oid, oid); > > > > Notice that there is a table alias "p" but it is never used. It seems > > to me like it is just an accidental omission. I think it should be > > written like - > > > > BEFORE: > > "WHERE puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n" > > AFTER: > > "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n" > > > > Doing this will make it consistent with the SQL of the nearby "else" > > case which uses the same alias as expected. > > > > The above makes sense to me. So, pushed a fix for this along with > Vignesh's patch to fix other comments related to this work. Thanks for committing the patch. I have changed the status of the Commitfest entry for this patch to Committed. Adding of the view can be handled once the Sequence patch is committed. Regards, Vignesh
I just noticed that this (commit 5a2832465fd8) added a separate catalog to store schemas which are part of a publication, side-by-side with the catalog to store relations which are part of a publication. This seems a strange way to represent publication membership: in order to find out what objects are members of a publication, you have to scan both pg_publication_rel and pg_publication_namespace. Wouldn't it make more sense to have a single catalog for both things, maybe something like pg_publication_object oid OID -- unique key (for pg_depend) prpubid OID -- of pg_publication prrelid OID -- OID of relation, or 0 if not a relation prnspid OID -- OID of namespace, or 0 if not a namespace which seems more natural to me, and pollutes the system less with weird syscaches, etc. What do you think? -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
On Fri, Dec 10, 2021 at 6:24 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > I just noticed that this (commit 5a2832465fd8) added a separate catalog > to store schemas which are part of a publication, side-by-side with the > catalog to store relations which are part of a publication. This seems > a strange way to represent publication membership: in order to find out > what objects are members of a publication, you have to scan both > pg_publication_rel and pg_publication_namespace. Wouldn't it make more > sense to have a single catalog for both things, maybe something like > > pg_publication_object > oid OID -- unique key (for pg_depend) > prpubid OID -- of pg_publication > prrelid OID -- OID of relation, or 0 if not a relation > prnspid OID -- OID of namespace, or 0 if not a namespace > > which seems more natural to me, and pollutes the system less with weird > syscaches, etc. > It will unnecessarily increase the size per row for FOR TABLE publication both because of the additional column and additional index (schema_id, pub_id) and vice versa. I think future projects (like row_filter, column_filter, etc) will make this impact much bigger as new columns for those won't be required for schema publications. Basically, as soon as we start to store additional properties for different objects, storing different objects together would start becoming more and more worse. This will also in turn increase the scan cost where we need only schema or rel publication access like where ever we are using PUBLICATIONNAMESPACEMAP. I think it will increase the cost of scanning table publications as its corresponding index will be larger. -- With Regards, Amit Kapila.
On Fri, Dec 10, 2021 at 6:24 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > I just noticed that this (commit 5a2832465fd8) added a separate catalog > to store schemas which are part of a publication, side-by-side with the > catalog to store relations which are part of a publication. This seems > a strange way to represent publication membership: in order to find out > what objects are members of a publication, you have to scan both > pg_publication_rel and pg_publication_namespace. Wouldn't it make more > sense to have a single catalog for both things, maybe something like > > pg_publication_object > oid OID -- unique key (for pg_depend) > prpubid OID -- of pg_publication > prrelid OID -- OID of relation, or 0 if not a relation > prnspid OID -- OID of namespace, or 0 if not a namespace > > which seems more natural to me, and pollutes the system less with weird > syscaches, etc. > > What do you think? I felt the existing tables are better normalized than the proposed one. Regards, Vignesh