Thread: why can't a table be part of the same publication as its schema
Apparently, you can't add a table to a publication if its schema is already part of the publication (and vice versa), e.g., =# alter publication p1 add table s1.t1; ERROR: 22023: cannot add relation "s1.t1" to publication DETAIL: Table's schema "s1" is already part of the publication or part of the specified schema list. Is there a reason for this? It looks a bit like a misfeature to me. It constrains how you can move your tables around between schemas, based on how somewhere else a publication has been constructed. It seems to me that it shouldn't be difficult to handle the case that a table is part of the publication via two different routes. (We must already handle that since a subscription can use more than one publication.)
On Thu, Sep 8, 2022 at 5:06 PM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > Apparently, you can't add a table to a publication if its schema is > already part of the publication (and vice versa), e.g., > > =# alter publication p1 add table s1.t1; > ERROR: 22023: cannot add relation "s1.t1" to publication > DETAIL: Table's schema "s1" is already part of the publication or part > of the specified schema list. > > Is there a reason for this? > Yes, because otherwise, there was confusion while dropping the objects from publication. Consider in the above case, if we would have allowed it and then the user performs ALTER PUBLICATION p1 DROP ALL TABLES IN SCHEMA s1, then (a) shall we remove both schema s1 and a table that is separately added (s1.t1) from that schema, or (b) just remove schema s1? There is a point that the user can expect that as she has added them separately, so we should allow her to drop them separately as well. OTOH, if we go that way, then it is again questionable that when the user has asked to Drop all the tables in the schema (via DROP ALL TABLES IN SCHEMA command) then why keep some tables? The other confusion from allowing publications to have schemas and tables from the same schema is that say the user has created a publication with the command CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA s1, and then she can ask to allow dropping one of the tables in the schema by ALTER PUBLICATION pub1 DROP TABLE s1.t1. Now, it will be tricky to support and I think it doesn't make sense as well. Similarly, what if the user has first created a publication with "CREATE PUBLICATION pub1 ADD TABLE s1.t1, s1.t2, ... s1.t99;" and then tries to drop all tables in one shot by ALTER PUBLICATION DROP ALL TABLES IN SCHEMA sch1;? To avoid these confusions, we have disallowed adding a table if its schema is already part of the publication and vice-versa. Also, there won't be any additional benefit to doing so. -- With Regards, Amit Kapila.
Amit Kapila <amit.kapila16@gmail.com> writes: > To avoid these confusions, we have disallowed adding a table if its > schema is already part of the publication and vice-versa. Really? Is there logic in ALTER TABLE SET SCHEMA that rejects the command dependent on the contents of the publication tables? If so, are there locks taken in both ALTER TABLE SET SCHEMA and the publication-modifying commands that are sufficient to prevent race conditions in such changes? This position sounds quite untenable from here, even if I found your arguments-in-support convincing, which I don't really. ISTM the rule should be along the lines of "table S.T should be published either if schema S is published or S.T itself is". There's no obvious need to interconnect the two conditions. regards, tom lane
On Fri, Sep 9, 2022 at 11:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Amit Kapila <amit.kapila16@gmail.com> writes: > > To avoid these confusions, we have disallowed adding a table if its > > schema is already part of the publication and vice-versa. > > Really? > > Is there logic in ALTER TABLE SET SCHEMA that rejects the command > dependent on the contents of the publication tables? > Yes, it has. For example, postgres=# create schema s1; CREATE SCHEMA postgres=# create table s1.t1(c1 int); CREATE TABLE postgres=# create schema s2; CREATE SCHEMA postgres=# create publication pub1 for all tables in schema s2, table s1.t1; CREATE PUBLICATION postgres=# Alter table s1.t1 set schema s2; ERROR: cannot move table "t1" to schema "s2" DETAIL: The schema "s2" and same schema's table "t1" cannot be part of the same publication "pub1". > If so, are > there locks taken in both ALTER TABLE SET SCHEMA and the > publication-modifying commands that are sufficient to prevent > race conditions in such changes? > Good point. I have checked it and found that ALTER TABLE SET SCHEMA takes AccessExclusiveLock on relation and AccessShareLock on the schema which it is going to set. The alter publication command takes ShareUpdateExclusiveLock on relation for dropping/adding a table to publication which will prevent any race condition with ALTER TABLE SET SCHEMA. However, the alter publication command takes AccessShareLock for dropping/adding schema which won't block with ALTER TABLE SET SCHEMA command. So, I think we need to change the lock mode for it in alter publication command. > This position sounds quite untenable from here, even if I found > your arguments-in-support convincing, which I don't really. > ISTM the rule should be along the lines of "table S.T should > be published either if schema S is published or S.T itself is". > There's no obvious need to interconnect the two conditions. > This rule is currently followed when a subscription has more than one publication. It is just that we didn't allow it in the same publication because of a fear that it may cause confusion for some of the users. The other thing to look at here is that the existing case of a "FOR ALL TABLES" publication also follows a similar rule such that it doesn't allow adding individual tables if the publication is for all tables. For example, postgres=# create publication pub1 for all tables; CREATE PUBLICATION postgres=# alter publication pub1 add table t1; ERROR: publication "pub1" is defined as FOR ALL TABLES DETAIL: Tables cannot be added to or dropped from FOR ALL TABLES publications. So, why shouldn't a "FOR ALL TABLES IN SCHEMA" publication follow a similar behavior? -- With Regards, Amit Kapila.
On Fri, Sep 9, 2022 at 5:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > So, why shouldn't a "FOR ALL TABLES IN SCHEMA" publication follow a > similar behavior? Surely that is not the same case at all. If you're publishing everything, there's no point in also having a specific list of things that you want published, but when you're publishing only some things, there is. If my wife tells me to wash everything in the laundry basket and also my nice pants, and I discover that my nice pants already happen to be in the laundry basket, I do not tell her: ERROR: my nice pants are already in the laundry basket It feels like a mistake to me that there's any catalog representation at all for a table that is published because it is part of a schema. The way a feature like this should work is that the schema should be labelled as published, and we should discover which tables are part of it at any given time as we go. We shouldn't need separate catalog entries for each table in the schema just because the schema is published. But if we do have such catalog entries, surely there should be a difference between the catalog entry that gets created when the table is individually published and the one that gets created when the containing schema is published. We have such tracking in other cases (coninhcount, conislocal; attinhcount, attislocal). In my opinion, this shouldn't have been committed working the way it does. -- Robert Haas EDB: http://www.enterprisedb.com
RE: why can't a table be part of the same publication as its schema
From
"houzj.fnst@fujitsu.com"
Date:
On Friday, September 9, 2022 9:57 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Fri, Sep 9, 2022 at 5:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > So, why shouldn't a "FOR ALL TABLES IN SCHEMA" publication follow a > > similar behavior? Hi > > It feels like a mistake to me that there's any catalog representation at all for a > table that is published because it is part of a schema. > The way a feature like this should work is that the schema should be labelled as > published, and we should discover which tables are part of it at any given time as > we go. We shouldn't need separate catalog entries for each table in the schema > just because the schema is published. IIRC, the feature currently works almost the same as you described. It doesn't create entry for tables that are published via its schema level, it only record the published schema and check which tables are part of it. Sorry, If I misunderstand your points or missed something. Best regards, Hou zj
On Fri, Sep 9, 2022 at 10:29 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > IIRC, the feature currently works almost the same as you described. It doesn't > create entry for tables that are published via its schema level, it only record > the published schema and check which tables are part of it. Oh, well if that's the case, that is great news. But then I don't understand Amit's comment from before: > Yes, because otherwise, there was confusion while dropping the objects > from publication. Consider in the above case, if we would have allowed > it and then the user performs ALTER PUBLICATION p1 DROP ALL TABLES IN > SCHEMA s1, then (a) shall we remove both schema s1 and a table that is > separately added (s1.t1) from that schema, or (b) just remove schema > s1? I believe that (b) is the correct behavior, so I assumed that this issue must be some difficulty in implementing it, like a funny catalog representation. Things might be clearer if we'd made the syntax "ALTER PUBLICATION p1 { ADD | DROP } { TABLE | SCHEMA } name". I don't understand why we used this ALL TABLES IN SCHEMA language. -- Robert Haas EDB: http://www.enterprisedb.com
> On Sep 9, 2022, at 8:18 AM, Robert Haas <robertmhaas@gmail.com> wrote: > > Things might be clearer if we'd made the syntax "ALTER PUBLICATION p1 > { ADD | DROP } { TABLE | SCHEMA } name". I don't understand why we > used this ALL TABLES IN SCHEMA language. The conversation, as I recall, was that "ADD SCHEMA foo" would only mean all tables in foo, until publication of other objecttypes became supported, at which point "ADD SCHEMA foo" would suddenly mean more than it did before. People mightfind that surprising, so the "ALL TABLES IN" was intended to future-proof against surprising behavioral changes. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Sep 9, 2022 at 8:48 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Fri, Sep 9, 2022 at 10:29 AM houzj.fnst@fujitsu.com > <houzj.fnst@fujitsu.com> wrote: > > IIRC, the feature currently works almost the same as you described. It doesn't > > create entry for tables that are published via its schema level, it only record > > the published schema and check which tables are part of it. > > Oh, well if that's the case, that is great news. > Yes, the feature works as you and Hou-San have mentioned. > But then I don't > understand Amit's comment from before: > > > Yes, because otherwise, there was confusion while dropping the objects > > from publication. Consider in the above case, if we would have allowed > > it and then the user performs ALTER PUBLICATION p1 DROP ALL TABLES IN > > SCHEMA s1, then (a) shall we remove both schema s1 and a table that is > > separately added (s1.t1) from that schema, or (b) just remove schema > > s1? > > I believe that (b) is the correct behavior, so I assumed that this > issue must be some difficulty in implementing it, like a funny catalog > representation. > No, it was because of syntax. IIRC, during development, Greg Nancarrow raised a point [1] that a user can expect the individually added tables for a schema which is also part of the publication to also get dropped when she specifies DROP ALL TABLES IN SCHEMA. IIRC, originally, the patch had a behavior (b) but then changed due to discussion around this point. But now that it seems you and others don't feel that was right, we can change back to (b) as I think that shouldn't be difficult to achieve. > Things might be clearer if we'd made the syntax "ALTER PUBLICATION p1 > { ADD | DROP } { TABLE | SCHEMA } name". I don't understand why we > used this ALL TABLES IN SCHEMA language. > It was exactly due to the reason Mark had mentioned in the email [2]. [1] - https://www.postgresql.org/message-id/CAJcOf-fTRZ3HiA5xU0-O-PT390A7wuUUkjP8uX3aQJLBsJNVmw%40mail.gmail.com [2] - https://www.postgresql.org/message-id/596EA671-66DF-4285-8560-0270DC062353%40enterprisedb.com -- With Regards, Amit Kapila.
On Fri, Sep 9, 2022 at 2:17 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > On Sep 9, 2022, at 8:18 AM, Robert Haas <robertmhaas@gmail.com> wrote: > > Things might be clearer if we'd made the syntax "ALTER PUBLICATION p1 > > { ADD | DROP } { TABLE | SCHEMA } name". I don't understand why we > > used this ALL TABLES IN SCHEMA language. > > The conversation, as I recall, was that "ADD SCHEMA foo" would only mean all tables in foo, until publication of otherobject types became supported, at which point "ADD SCHEMA foo" would suddenly mean more than it did before. Peoplemight find that surprising, so the "ALL TABLES IN" was intended to future-proof against surprising behavioral changes. If I encountered this syntax in a vacuum, that's not what I would think. I would think that ADD ALL TABLES IN SCHEMA meant add all the tables in the schema to the publication one by one as individual objects, i.e. add the tables that are currently as of this moment in that schema to the publication; and I would think that ADD SCHEMA meant remember that this schema is part of the publication and so whenever tables are created and dropped in that schema (or moved in and out) what is being published is automatically updated. The analogy here seems to be to GRANT, which actually does support both syntaxes. And if I understand correctly, GRANT ON SCHEMA gives privileges on the schema; whereas GRANT ON ALL TABLES IN SCHEMA modifies each table that is currently in that schema (never mind what happens later). -- Robert Haas EDB: http://www.enterprisedb.com
On Sat, 10 Sept 2022 at 19:18, Robert Haas <robertmhaas@gmail.com> wrote:
If I encountered this syntax in a vacuum, that's not what I would
think. I would think that ADD ALL TABLES IN SCHEMA meant add all the
tables in the schema to the publication one by one as individual
objects, i.e. add the tables that are currently as of this moment in
that schema to the publication; and I would think that ADD SCHEMA
meant remember that this schema is part of the publication and so
whenever tables are created and dropped in that schema (or moved in
and out) what is being published is automatically updated.
The analogy here seems to be to GRANT, which actually does support
both syntaxes. And if I understand correctly, GRANT ON SCHEMA gives
privileges on the schema; whereas GRANT ON ALL TABLES IN SCHEMA
modifies each table that is currently in that schema (never mind what
happens later).
Yes, except GRANT ON SCHEMA only grants access to the schema - CREATE or USAGE. You cannot write GRANT SELECT ON SCHEMA to grant access to all tables in the schema.
> On Sep 10, 2022, at 4:17 PM, Robert Haas <robertmhaas@gmail.com> wrote: > >>> I don't understand why we >>> used this ALL TABLES IN SCHEMA language. >> >> The conversation, as I recall, was that "ADD SCHEMA foo" would only mean all tables in foo, until publication of otherobject types became supported, at which point "ADD SCHEMA foo" would suddenly mean more than it did before. Peoplemight find that surprising, so the "ALL TABLES IN" was intended to future-proof against surprising behavioral changes. > > If I encountered this syntax in a vacuum, that's not what I would > think. I would think that ADD ALL TABLES IN SCHEMA meant add all the > tables in the schema to the publication one by one as individual > objects Yes, it appears the syntax was chosen to avoid one kind of confusion, but created another kind. Per the docs on this feature: FOR ALL TABLES IN SCHEMA Marks the publication as one that replicates changes for all tables in the specified list of schemas, including tablescreated in the future. Like you, I wouldn't expect that definition, given the behavior of GRANT with respect to the same grammatical construction. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
RE: why can't a table be part of the same publication as its schema
From
"houzj.fnst@fujitsu.com"
Date:
On Monday, September 12, 2022 1:08 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > > On Sep 10, 2022, at 4:17 PM, Robert Haas <robertmhaas@gmail.com> wrote: > > > >>> I don't understand why we > >>> used this ALL TABLES IN SCHEMA language. > >> > >> The conversation, as I recall, was that "ADD SCHEMA foo" would only mean > all tables in foo, until publication of other object types became supported, at > which point "ADD SCHEMA foo" would suddenly mean more than it did before. > People might find that surprising, so the "ALL TABLES IN" was intended to > future-proof against surprising behavioral changes. > > > > If I encountered this syntax in a vacuum, that's not what I would > > think. I would think that ADD ALL TABLES IN SCHEMA meant add all the > > tables in the schema to the publication one by one as individual > > objects > > Yes, it appears the syntax was chosen to avoid one kind of confusion, but created > another kind. Per the docs on this feature: > > FOR ALL TABLES IN SCHEMA > Marks the publication as one that replicates changes for all tables in the > specified list of schemas, including tables created in the future. > > Like you, I wouldn't expect that definition, given the behavior of GRANT with > respect to the same grammatical construction. I'm a bit unsure if it should be compared to GRANT. Because even if we chose "ALTER PUBLICATION p1 { ADD | DROP } SCHEMA name", it's also not consistent with the meaning of GRANT ON SCHEMA, as GRANT ON SCHEMA doesn't grant rights on the tables within schema if I understand correctly. I feel we'd better compare the syntax with the existing publication command: FOR ALL TABLES. If you create a publication FOR ALL TABLES, it means publishing all the tables in the database *including* tables created in the future. I think both the syntax and meaning of ALL TABLES IN SCHEMA are consistent with the existing FOR ALL TABLES. And the behavior is clearly documented, so personally I think it's fine. https://www.postgresql.org/docs/devel/sql-createpublication.html -- FOR ALL TABLES Marks the publication as one that replicates changes for all tables in the database, including tables created in thefuture. FOR ALL TABLES IN SCHEMA Marks the publication as one that replicates changes for all tables in the specified list of schemas, including tablescreated in the future. -- Besides, as mentioned(and suggested by Tom[1]), we might support publishing SEQUENCE(or others) in the future. It would give more flexibility to user if we have another FOR ALL SEQUENCES(or other objects) IN SCHEMA. [1] https://www.postgresql.org/message-id/155565.1628954580%40sss.pgh.pa.us Best regards, Hou zj
On Sat, 10 Sept 2022 at 07:32, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Sep 9, 2022 at 8:48 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > > On Fri, Sep 9, 2022 at 10:29 AM houzj.fnst@fujitsu.com > > <houzj.fnst@fujitsu.com> wrote: > > > IIRC, the feature currently works almost the same as you described. It doesn't > > > create entry for tables that are published via its schema level, it only record > > > the published schema and check which tables are part of it. > > > > Oh, well if that's the case, that is great news. > > > > Yes, the feature works as you and Hou-San have mentioned. > > > But then I don't > > understand Amit's comment from before: > > > > > Yes, because otherwise, there was confusion while dropping the objects > > > from publication. Consider in the above case, if we would have allowed > > > it and then the user performs ALTER PUBLICATION p1 DROP ALL TABLES IN > > > SCHEMA s1, then (a) shall we remove both schema s1 and a table that is > > > separately added (s1.t1) from that schema, or (b) just remove schema > > > s1? > > > > I believe that (b) is the correct behavior, so I assumed that this > > issue must be some difficulty in implementing it, like a funny catalog > > representation. > > > > No, it was because of syntax. IIRC, during development, Greg Nancarrow > raised a point [1] that a user can expect the individually added > tables for a schema which is also part of the publication to also get > dropped when she specifies DROP ALL TABLES IN SCHEMA. IIRC, > originally, the patch had a behavior (b) but then changed due to > discussion around this point. But now that it seems you and others > don't feel that was right, we can change back to (b) as I think that > shouldn't be difficult to achieve. I have made the changes to allow creation of publication with a schema and table of the same schema. The attached patch has the changes for the same. I'm planning to review and test the patch further. Regards, Vignesh
Attachment
At Mon, 12 Sep 2022 04:26:48 +0000, "houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com> wrote in > On Monday, September 12, 2022 1:08 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > > > On Sep 10, 2022, at 4:17 PM, Robert Haas <robertmhaas@gmail.com> wrote: > > > > > >>> I don't understand why we > > >>> used this ALL TABLES IN SCHEMA language. > > >> > > >> The conversation, as I recall, was that "ADD SCHEMA foo" would only mean > > all tables in foo, until publication of other object types became supported, at > > which point "ADD SCHEMA foo" would suddenly mean more than it did before. > > People might find that surprising, so the "ALL TABLES IN" was intended to > > future-proof against surprising behavioral changes. > > > > > > If I encountered this syntax in a vacuum, that's not what I would > > > think. I would think that ADD ALL TABLES IN SCHEMA meant add all the > > > tables in the schema to the publication one by one as individual > > > objects > > > > Yes, it appears the syntax was chosen to avoid one kind of confusion, but created > > another kind. Per the docs on this feature: > > > > FOR ALL TABLES IN SCHEMA > > Marks the publication as one that replicates changes for all tables in the > > specified list of schemas, including tables created in the future. > > > > Like you, I wouldn't expect that definition, given the behavior of GRANT with > > respect to the same grammatical construction. > > I'm a bit unsure if it should be compared to GRANT. Because even if we chose > "ALTER PUBLICATION p1 { ADD | DROP } SCHEMA name", it's also not > consistent with the meaning of GRANT ON SCHEMA, as GRANT ON SCHEMA doesn't > grant rights on the tables within schema if I understand correctly. > > I feel we'd better compare the syntax with the existing publication command: > FOR ALL TABLES. If you create a publication FOR ALL TABLES, it means publishing > all the tables in the database *including* tables created in the future. I > think both the syntax and meaning of ALL TABLES IN SCHEMA are consistent with > the existing FOR ALL TABLES. IMHO, I feel closer to Robert. "ALL TABLES IN SCHEMA" sounds like the concrete tables at the time of invocation. While I agree that it is not directly comparable to GRANT, but if I see "ALTER PUBLICATION p1 ADD SCHEMA s1", I automatically translate that into "all tables in the schema s1 at the time of using this publication". At least, it would cause less confusion when it were "ALT PUB p1 DROP SCEMA s1" aginst "DROP ALL TABLES IN SCHEMA s1". However.. > And the behavior is clearly documented, so personally I think it's fine. > https://www.postgresql.org/docs/devel/sql-createpublication.html > -- > FOR ALL TABLES > Marks the publication as one that replicates changes for all tables in the database, including tables created in thefuture. > FOR ALL TABLES IN SCHEMA > Marks the publication as one that replicates changes for all tables in the specified list of schemas, including tablescreated in the future. > -- > > Besides, as mentioned(and suggested by Tom[1]), we might support publishing > SEQUENCE(or others) in the future. It would give more flexibility to user if we > have another FOR ALL SEQUENCES(or other objects) IN SCHEMA. > > [1] https://www.postgresql.org/message-id/155565.1628954580%40sss.pgh.pa.us Fair point. Should be stupid, but how about the following? CREATE PUBLICATION p1 FOR TABLES * IN SCHEMA s1; DROP PUBLICATION p1 FOR TABLES * IN SCHEMA s1; ATLER PUBLICATION p1 ADD TABLES * IN SCHEMA s1; ALTER PUBLICATION p1 DROP TABLES * IN SCHEMA s1; This is an analog of synchronous_standby_names. But I'm not sure a bare asterisc can appear there.. We could use ANY instead? CREATE PUBLICATION p1 FOR TABLES ANY IN SCHEMA s1; ... regards. -- Kyotaro Horiguchi NTT Open Source Software Center
RE: why can't a table be part of the same publication as its schema
From
"houzj.fnst@fujitsu.com"
Date:
On Tuesday, September 13, 2022 12:40 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > At Mon, 12 Sep 2022 04:26:48 +0000, "houzj.fnst@fujitsu.com" > <houzj.fnst@fujitsu.com> wrote in > > On Monday, September 12, 2022 1:08 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > > > > On Sep 10, 2022, at 4:17 PM, Robert Haas <robertmhaas@gmail.com> wrote: > > > > > > > >>> I don't understand why we > > > >>> used this ALL TABLES IN SCHEMA language. > > > >> > > > >> The conversation, as I recall, was that "ADD SCHEMA foo" would > > > >> only mean > > > all tables in foo, until publication of other object types became > > > supported, at which point "ADD SCHEMA foo" would suddenly mean more than it did before. > > > People might find that surprising, so the "ALL TABLES IN" was > > > intended to future-proof against surprising behavioral changes. > > > > > > > > If I encountered this syntax in a vacuum, that's not what I would > > > > think. I would think that ADD ALL TABLES IN SCHEMA meant add all > > > > the tables in the schema to the publication one by one as > > > > individual objects > > > > > > Yes, it appears the syntax was chosen to avoid one kind of > > > confusion, but created another kind. Per the docs on this feature: > > > > > > FOR ALL TABLES IN SCHEMA > > > Marks the publication as one that replicates changes for all > > > tables in the specified list of schemas, including tables created in the future. > > > > > > Like you, I wouldn't expect that definition, given the behavior of > > > GRANT with respect to the same grammatical construction. > > > > I'm a bit unsure if it should be compared to GRANT. Because even if we > > chose "ALTER PUBLICATION p1 { ADD | DROP } SCHEMA name", it's also not > > consistent with the meaning of GRANT ON SCHEMA, as GRANT ON SCHEMA > > doesn't grant rights on the tables within schema if I understand correctly. > > > > I feel we'd better compare the syntax with the existing publication command: > > FOR ALL TABLES. If you create a publication FOR ALL TABLES, it means > > publishing all the tables in the database *including* tables created > > in the future. I think both the syntax and meaning of ALL TABLES IN > > SCHEMA are consistent with the existing FOR ALL TABLES. > > IMHO, I feel closer to Robert. "ALL TABLES IN SCHEMA" sounds like the > concrete tables at the time of invocation. While I agree that it is not directly > comparable to GRANT, but if I see "ALTER PUBLICATION p1 ADD SCHEMA s1", I > automatically translate that into "all tables in the schema s1 at the time of using > this publication". At least, it would cause less confusion when it were "ALT PUB > p1 DROP SCEMA s1" aginst "DROP ALL TABLES IN SCHEMA s1". > > However.. > > > And the behavior is clearly documented, so personally I think it's fine. > > https://www.postgresql.org/docs/devel/sql-createpublication.html > > -- > > FOR ALL TABLES > > Marks the publication as one that replicates changes for all tables in the > database, including tables created in the future. > > FOR ALL TABLES IN SCHEMA > > Marks the publication as one that replicates changes for all tables in the > specified list of schemas, including tables created in the future. > > -- > > > > Besides, as mentioned(and suggested by Tom[1]), we might support > > publishing SEQUENCE(or others) in the future. It would give more > > flexibility to user if we have another FOR ALL SEQUENCES(or other objects) IN > SCHEMA. > > > > [1] > > > https://www.postgresql.org/message-id/155565.1628954580%40sss.pgh.pa.u > > s > > Fair point. Should be stupid, but how about the following? > > CREATE PUBLICATION p1 FOR TABLES * IN SCHEMA s1; > DROP PUBLICATION p1 FOR TABLES * IN SCHEMA s1; > ATLER PUBLICATION p1 ADD TABLES * IN SCHEMA s1; ALTER PUBLICATION > p1 DROP TABLES * IN SCHEMA s1; > > This is an analog of synchronous_standby_names. But I'm not sure a bare > asterisc can appear there.. We could use ANY instead? > > CREATE PUBLICATION p1 FOR TABLES ANY IN SCHEMA s1; ... Thanks for the suggestions. But personally, I am not sure if this is better the current syntax as it seems syntactically inconsistent with the existing "FOR ALL TABLES". Also, the behavior to include future tables is consistent with FOR ALL TABLES. Best regards, Hou zj
RE: why can't a table be part of the same publication as its schema
From
"houzj.fnst@fujitsu.com"
Date:
On Monday, September 12, 2022 10:14 PM vignesh C <vignesh21@gmail.com> wrote: > On Sat, 10 Sept 2022 at 07:32, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Sep 9, 2022 at 8:48 PM Robert Haas <robertmhaas@gmail.com> > wrote: > > > > > > On Fri, Sep 9, 2022 at 10:29 AM houzj.fnst@fujitsu.com > > > <houzj.fnst@fujitsu.com> wrote: > > > > IIRC, the feature currently works almost the same as you > > > > described. It doesn't create entry for tables that are published > > > > via its schema level, it only record the published schema and check which > tables are part of it. > > > > > > Oh, well if that's the case, that is great news. > > > > > > > Yes, the feature works as you and Hou-San have mentioned. > > > > > But then I don't > > > understand Amit's comment from before: > > > > > > > Yes, because otherwise, there was confusion while dropping the > > > > objects from publication. Consider in the above case, if we would > > > > have allowed it and then the user performs ALTER PUBLICATION p1 > > > > DROP ALL TABLES IN SCHEMA s1, then (a) shall we remove both schema > > > > s1 and a table that is separately added (s1.t1) from that schema, > > > > or (b) just remove schema s1? > > > > > > I believe that (b) is the correct behavior, so I assumed that this > > > issue must be some difficulty in implementing it, like a funny > > > catalog representation. > > > > > > > No, it was because of syntax. IIRC, during development, Greg Nancarrow > > raised a point [1] that a user can expect the individually added > > tables for a schema which is also part of the publication to also get > > dropped when she specifies DROP ALL TABLES IN SCHEMA. IIRC, > > originally, the patch had a behavior (b) but then changed due to > > discussion around this point. But now that it seems you and others > > don't feel that was right, we can change back to (b) as I think that > > shouldn't be difficult to achieve. > > I have made the changes to allow creation of publication with a schema and > table of the same schema. The attached patch has the changes for the same. > I'm planning to review and test the patch further. Thanks for the patch. While reviewing it, I found that the column list behavior might need to be changed or confirmed after allowing the above case. After applying the patch, we support adding a table with column list along with the table's schema[1], and it will directly apply the column list in the logical replication after applying the patch. [1]-- CREATE PUBLICATION pub FOR TABLE public.test(a), FOR ALL TABLES IN SCHEMA public; ----- If from the point of view of consistency, for column list, we could report an ERROR because we currently don't allow using different column lists for a table. Maybe an ERROR like: "ERROR: cannot use column for table x when the table's schema is also in the publication" But if we want to report an ERROR for column list in above case. We might need to restrict the ALTER TABLE SET SCHEMA as well because user could move a table which is published with column list to a schema that is also published in the publication, so we might need to add some similar check(which is removed in Vignesh's patch) to tablecmd.c to disallow this case. Another option could be just ingore the column list if table's schema is also part of publication. But it seems slightly inconsistent with the rule that we disallow using different column list for a table. Best regards, Hou zj
On 14.09.22 07:10, houzj.fnst@fujitsu.com wrote: > After applying the patch, we support adding a table with column list along with > the table's schema[1], and it will directly apply the column list in the > logical replication after applying the patch. > > [1]-- > CREATE PUBLICATION pub FOR TABLE public.test(a), FOR ALL TABLES IN SCHEMA public; > ----- > > If from the point of view of consistency, for column list, we could report an > ERROR because we currently don't allow using different column lists for a > table. Maybe an ERROR like: > > "ERROR: cannot use column for table x when the table's schema is also in the publication" > > But if we want to report an ERROR for column list in above case. We might need > to restrict the ALTER TABLE SET SCHEMA as well because user could move a table > which is published with column list to a schema that is also published in the > publication, so we might need to add some similar check(which is removed in > Vignesh's patch) to tablecmd.c to disallow this case. > > Another option could be just ingore the column list if table's schema is also > part of publication. But it seems slightly inconsistent with the rule that we > disallow using different column list for a table. Ignoring things doesn't seem like a good idea. A solution might be to disallow adding any schemas to a publication if column lists on a table are specified.
RE: why can't a table be part of the same publication as its schema
From
"houzj.fnst@fujitsu.com"
Date:
On Thursday, September 15, 2022 3:37 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: Hi, > > On 14.09.22 07:10, houzj.fnst@fujitsu.com wrote: > > After applying the patch, we support adding a table with column list > > along with the table's schema[1], and it will directly apply the > > column list in the logical replication after applying the patch. > > > > [1]-- > > CREATE PUBLICATION pub FOR TABLE public.test(a), FOR ALL TABLES IN > > SCHEMA public; > > ----- > > > > If from the point of view of consistency, for column list, we could > > report an ERROR because we currently don't allow using different > > column lists for a table. Maybe an ERROR like: > > > > "ERROR: cannot use column for table x when the table's schema is also in the > publication" > > > > But if we want to report an ERROR for column list in above case. We > > might need to restrict the ALTER TABLE SET SCHEMA as well because user > > could move a table which is published with column list to a schema > > that is also published in the publication, so we might need to add > > some similar check(which is removed in Vignesh's patch) to tablecmd.c to > disallow this case. > > > > Another option could be just ingore the column list if table's schema > > is also part of publication. But it seems slightly inconsistent with > > the rule that we disallow using different column list for a table. > > Ignoring things doesn't seem like a good idea. > > A solution might be to disallow adding any schemas to a publication if column > lists on a table are specified. Thanks for the suggestion. If I understand correctly, you mean we can disallow publishing a table with column list and any schema(a schema that the table might not be part of) in the same publication[1]. something like-- [1]CREATE PUBLICATION pub FOR TABLE public.test(a), ALL TABLES IN SCHEMA s2; ERROR: "cannot add schema to publication when column list is used in the published table" -- Personally, it looks acceptable to me as user can anyway achieve the same purpose by creating serval publications and combine it and we can save the restriction at ALTER TABLE SET SCHEMA. Although it restricts some cases. I will post a top-up patch about this soon. About the row filter handling, maybe we don't need to restrict row filter like above ? Because the rule is to simply merge the row filter with 'OR' among publications, so it seems we could ignore the row filter in the publication when the table's schema is also published in the same publication(which means no filter). Best regards, Hou zj
RE: why can't a table be part of the same publication as its schema
From
"houzj.fnst@fujitsu.com"
Date:
On Thursday, September 15, 2022 10:48 AM houzj.fnst@fujitsu.com wrote: > > On Thursday, September 15, 2022 3:37 AM Peter Eisentraut > <peter.eisentraut@enterprisedb.com> wrote: > > Hi, > > > > > On 14.09.22 07:10, houzj.fnst@fujitsu.com wrote: > > > After applying the patch, we support adding a table with column list > > > along with the table's schema[1], and it will directly apply the > > > column list in the logical replication after applying the patch. > > > > > > [1]-- > > > CREATE PUBLICATION pub FOR TABLE public.test(a), FOR ALL TABLES IN > > > SCHEMA public; > > > ----- > > > > > > If from the point of view of consistency, for column list, we could > > > report an ERROR because we currently don't allow using different > > > column lists for a table. Maybe an ERROR like: > > > > > > "ERROR: cannot use column for table x when the table's schema is > > > also in the > > publication" > > > > > > But if we want to report an ERROR for column list in above case. We > > > might need to restrict the ALTER TABLE SET SCHEMA as well because > > > user could move a table which is published with column list to a > > > schema that is also published in the publication, so we might need > > > to add some similar check(which is removed in Vignesh's patch) to > > > tablecmd.c to > > disallow this case. > > > > > > Another option could be just ingore the column list if table's > > > schema is also part of publication. But it seems slightly > > > inconsistent with the rule that we disallow using different column list for a > table. > > > > Ignoring things doesn't seem like a good idea. > > > > A solution might be to disallow adding any schemas to a publication if > > column lists on a table are specified. > > Thanks for the suggestion. If I understand correctly, you mean we can disallow > publishing a table with column list and any schema(a schema that the table > might not be part of) in the same publication[1]. > > something like-- > [1]CREATE PUBLICATION pub FOR TABLE public.test(a), ALL TABLES IN SCHEMA > s2; > ERROR: "cannot add schema to publication when column list is used in the > published table" > -- > > Personally, it looks acceptable to me as user can anyway achieve the same > purpose by creating serval publications and combine it and we can save the > restriction at ALTER TABLE SET SCHEMA. Although it restricts some cases. > I will post a top-up patch about this soon. > > > About the row filter handling, maybe we don't need to restrict row filter like > above ? Because the rule is to simply merge the row filter with 'OR' among > publications, so it seems we could ignore the row filter in the publication when > the table's schema is also published in the same publication(which means no > filter). Attach the new version patch which added suggested restriction for column list and merged Vignesh's patch. Some other document might need to be updated. I will update them soon. Best regards, Hou zj
Attachment
On Thu, Sep 15, 2022 at 10:57 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > ... > Attach the new version patch which added suggested restriction for column list > and merged Vignesh's patch. > > Some other document might need to be updated. I will update them soon. > > Best regards, > Hou zj Hi Hou-san. FYI, I found your v3 patch apply was broken due to a very recent changes pushed: e.g. error: patch failed: doc/src/sgml/logical-replication.sgml:1142 ~~ PSA a rebase of your same patch (I left the version number the same) ------ Kind Regards, Peter Smith. Fujitsu Australia
Attachment
On Thu, Sep 15, 2022 at 8:18 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Thursday, September 15, 2022 3:37 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > > Another option could be just ingore the column list if table's schema > > > is also part of publication. But it seems slightly inconsistent with > > > the rule that we disallow using different column list for a table. > > > > Ignoring things doesn't seem like a good idea. > > > > A solution might be to disallow adding any schemas to a publication if column > > lists on a table are specified. > > Thanks for the suggestion. If I understand correctly, you mean we can disallow > publishing a table with column list and any schema(a schema that the table > might not be part of) in the same publication[1]. > > something like-- > [1]CREATE PUBLICATION pub FOR TABLE public.test(a), ALL TABLES IN SCHEMA s2; > ERROR: "cannot add schema to publication when column list is used in the published table" > -- > > Personally, it looks acceptable to me as user can anyway achieve the same > purpose by creating serval publications and combine it and we can save the > restriction at ALTER TABLE SET SCHEMA. Although it restricts some cases. > Yeah, I agree that it restricts more cases for how different combinations can be specified for a publication but OTOH it helps to uplift restriction in ALTER TABLE ... SET SCHEMA which seems like a good trade-off. > I will post a top-up patch about this soon. > > > About the row filter handling, maybe we don't need to restrict row filter like > above ? Because the rule is to simply merge the row filter with 'OR' among > publications, so it seems we could ignore the row filter in the publication when > the table's schema is also published in the same publication(which means no filter). > Yeah, this is what we are doing internally when combining multiple publications but let me explain with an example the case of a single publication so that if anybody has any objections to it, we can discuss the same. Case-1: When row filter is specified *without* ALL TABLES IN SCHEMA clause postgres=# create table t1(c1 int, c2 int, c3 int); CREATE TABLE postgres=# create publication pub1 for table t1 where (c1 > 10); CREATE PUBLICATION postgres=# select pubname, schemaname, tablename, rowfilter from pg_publication_tables; pubname | schemaname | tablename | rowfilter ---------+------------+-----------+----------- pub1 | public | t1 | (c1 > 10) (1 row) Case-2: When row filter is specified *with* ALL TABLES IN SCHEMA clause postgres=# create schema s1; CREATE SCHEMA postgres=# create table s1.t2(c1 int, c2 int, c3 int); CREATE TABLE postgres=# create publication pub2 for table s1.t2 where (c1 > 10), all tables in schema s1; CREATE PUBLICATION postgres=# select pubname, schemaname, tablename, rowfilter from pg_publication_tables; pubname | schemaname | tablename | rowfilter ---------+------------+-----------+----------- pub1 | public | t1 | (c1 > 10) pub2 | s1 | t2 | (2 rows) So, for case-2, the rowfilter is not considered. Note, case-2 was not possible before the patch which is discussed here and after the patch, the behavior will be the same as we have it before when we combine publications. -- With Regards, Amit Kapila.
On Thu, Sep 15, 2022 at 6:27 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > Attach the new version patch which added suggested restriction for column list > and merged Vignesh's patch. > Few comments: ============ 1. static void -CheckPubRelationColumnList(List *tables, const char *queryString, +CheckPubRelationColumnList(List *tables, bool publish_schema, + const char *queryString, bool pubviaroot) It is better to keep bool parameters together at the end. 2. /* + * Disallow using column list if any schema is in the publication. + */ + if (publish_schema) + ereport(ERROR, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("cannot use publication column list for relation \"%s.%s\"", + get_namespace_name(RelationGetNamespace(pri->relation)), + RelationGetRelationName(pri->relation)), + errdetail("Column list cannot be specified if any schema is part of the publication or specified in the list.")); I think it would be better to explain why we disallow this case. 3. + if (!heap_attisnull(coltuple, Anum_pg_publication_rel_prattrs, NULL)) + ereport(ERROR, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("cannot add schema to the publication"), + errdetail("Schema cannot be added if any table that specifies column list is already part of the publication")); A full stop is missing at the end in the errdetail message. 4. I have modified a few comments in the attached. Please check and if you like the changes then please include those in the next version. -- With Regards, Amit Kapila.
Attachment
RE: why can't a table be part of the same publication as its schema
From
"houzj.fnst@fujitsu.com"
Date:
On Friday, September 16, 2022 1:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Sep 15, 2022 at 6:27 PM houzj.fnst@fujitsu.com > <houzj.fnst@fujitsu.com> wrote: > > > > Attach the new version patch which added suggested restriction for > > column list and merged Vignesh's patch. > > > > Few comments: > ============ > 1. > static void > -CheckPubRelationColumnList(List *tables, const char *queryString, > +CheckPubRelationColumnList(List *tables, bool publish_schema, > + const char *queryString, > bool pubviaroot) > > It is better to keep bool parameters together at the end. > > 2. > /* > + * Disallow using column list if any schema is in the publication. > + */ > + if (publish_schema) > + ereport(ERROR, > + errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("cannot use publication column list for relation \"%s.%s\"", > + get_namespace_name(RelationGetNamespace(pri->relation)), > + RelationGetRelationName(pri->relation)), > + errdetail("Column list cannot be specified if any schema is part of > the publication or specified in the list.")); > > I think it would be better to explain why we disallow this case. > > 3. > + if (!heap_attisnull(coltuple, Anum_pg_publication_rel_prattrs, NULL)) > + ereport(ERROR, errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("cannot add schema to the publication"), errdetail("Schema > + cannot be added if any table that specifies column > list is already part of the publication")); > > A full stop is missing at the end in the errdetail message. > > 4. I have modified a few comments in the attached. Please check and if you like > the changes then please include those in the next version. Thanks for the comments. Attach the new version patch which addressed above comments and ran pgident. I also improved some codes and documents based on some comments given by Vignesh and Peter offlist. Best regards, Hou zj
Attachment
On Fri, Sep 16, 2022 at 1:09 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > Attach the new version patch which addressed above comments and ran pgident. > I also improved some codes and documents based on some comments > given by Vignesh and Peter offlist. > Thanks, the patch looks mostly good to me. I have made a few cosmetic changes and edited a few comments. I would like to push this to HEAD and backpatch it to 15 by Tuesday unless there are any comments. I think we should back patch this because otherwise, users will see a change in behavior in 16 but if others don't think the same way then we can consider pushing this to HEAD only. -- With Regards, Amit Kapila.
Attachment
RE: why can't a table be part of the same publication as its schema
From
"houzj.fnst@fujitsu.com"
Date:
On Saturday, September 17, 2022 11:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Sep 16, 2022 at 1:09 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> > wrote: > > > > Attach the new version patch which addressed above comments and ran > pgident. > > I also improved some codes and documents based on some comments given > > by Vignesh and Peter offlist. > > > > Thanks, the patch looks mostly good to me. I have made a few cosmetic changes > and edited a few comments. I would like to push this to HEAD and backpatch it > to 15 by Tuesday unless there are any comments. I think we should back patch > this because otherwise, users will see a change in behavior in 16 but if others > don't think the same way then we can consider pushing this to HEAD only. Thanks for the patch. I rebased it based on PG15 and here is the patch. Best regards, Hou zj
Attachment
> diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml > index 1ae3287..0ab768d 100644 > --- a/doc/src/sgml/logical-replication.sgml > +++ b/doc/src/sgml/logical-replication.sgml > @@ -1120,6 +1120,11 @@ test_sub=# SELECT * FROM child ORDER BY a; > </para> > > <para> > + Specifying a column list when the publication also publishes > + <literal>FOR ALL TABLES IN SCHEMA</literal> is not supported. > + </para> > + > + <para> > For partitioned tables, the publication parameter > <literal>publish_via_partition_root</literal> determines which column list > is used. If <literal>publish_via_partition_root</literal> is > > diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml > index 0a68c4b..0ced7da 100644 > --- a/doc/src/sgml/ref/create_publication.sgml > +++ b/doc/src/sgml/ref/create_publication.sgml > @@ -103,17 +103,17 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable> > </para> > > <para> > + Specifying a column list when the publication also publishes > + <literal>FOR ALL TABLES IN SCHEMA</literal> is not supported. > + </para> > > @@ -733,6 +694,24 @@ CheckPubRelationColumnList(List *tables, const char *queryString, > continue; > > /* > + * Disallow specifying column list if any schema is in the > + * publication. > + * > + * XXX We could instead just forbid the case when the publication > + * tries to publish the table with a column list and a schema for that > + * table. However, if we do that then we need a restriction during > + * ALTER TABLE ... SET SCHEMA to prevent such a case which doesn't > + * seem to be a good idea. > + */ > + if (publish_schema) > + ereport(ERROR, > + errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("cannot use publication column list for relation \"%s.%s\"", > + get_namespace_name(RelationGetNamespace(pri->relation)), > + RelationGetRelationName(pri->relation)), > + errdetail("Column list cannot be specified if any schema is part of the publication or specified inthe list.")); > + This seems a pretty arbitrary restriction. It feels like you're adding this restriction precisely so that you don't have to write the code to reject the ALTER .. SET SCHEMA if an incompatible configuration is detected. But we already have such checks in several cases, so I don't see why this one does not seem a good idea. The whole FOR ALL TABLES IN SCHEMA thing seems pretty weird in several aspects. Others have already commented about the syntax, which is unlike what GRANT uses; I'm also surprised that we've gotten away with it being superuser-only. Why are we building more superuser-only features in this day and age? I think not even FOR ALL TABLES should require superuser. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ Thou shalt study thy libraries and strive not to reinvent them without cause, that thy code may be short and readable and thy days pleasant and productive. (7th Commandment for C Programmers)
On 9/19/22 11:16 AM, Alvaro Herrera wrote: > >> diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml >> index 1ae3287..0ab768d 100644 >> --- a/doc/src/sgml/logical-replication.sgml >> +++ b/doc/src/sgml/logical-replication.sgml >> @@ -1120,6 +1120,11 @@ test_sub=# SELECT * FROM child ORDER BY a; >> </para> >> >> <para> >> + Specifying a column list when the publication also publishes >> + <literal>FOR ALL TABLES IN SCHEMA</literal> is not supported. >> + </para> >> + >> + <para> >> For partitioned tables, the publication parameter >> <literal>publish_via_partition_root</literal> determines which column list >> is used. If <literal>publish_via_partition_root</literal> is >> >> diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml >> index 0a68c4b..0ced7da 100644 >> --- a/doc/src/sgml/ref/create_publication.sgml >> +++ b/doc/src/sgml/ref/create_publication.sgml >> @@ -103,17 +103,17 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable> >> </para> >> >> <para> >> + Specifying a column list when the publication also publishes >> + <literal>FOR ALL TABLES IN SCHEMA</literal> is not supported. >> + </para> >> >> @@ -733,6 +694,24 @@ CheckPubRelationColumnList(List *tables, const char *queryString, >> continue; >> >> /* >> + * Disallow specifying column list if any schema is in the >> + * publication. >> + * >> + * XXX We could instead just forbid the case when the publication >> + * tries to publish the table with a column list and a schema for that >> + * table. However, if we do that then we need a restriction during >> + * ALTER TABLE ... SET SCHEMA to prevent such a case which doesn't >> + * seem to be a good idea. >> + */ >> + if (publish_schema) >> + ereport(ERROR, >> + errcode(ERRCODE_INVALID_PARAMETER_VALUE), >> + errmsg("cannot use publication column list for relation \"%s.%s\"", >> + get_namespace_name(RelationGetNamespace(pri->relation)), >> + RelationGetRelationName(pri->relation)), >> + errdetail("Column list cannot be specified if any schema is part of the publication or specifiedin the list.")); >> + > > This seems a pretty arbitrary restriction. It feels like you're adding > this restriction precisely so that you don't have to write the code to > reject the ALTER .. SET SCHEMA if an incompatible configuration is > detected. But we already have such checks in several cases, so I don't > see why this one does not seem a good idea. > > The whole FOR ALL TABLES IN SCHEMA thing seems pretty weird in several > aspects. Others have already commented about the syntax, which is > unlike what GRANT uses; I'm also surprised that we've gotten away with > it being superuser-only. Why are we building more superuser-only > features in this day and age? I think not even FOR ALL TABLES should > require superuser. FYI, I've added this to the PG15 open items as there are some open questions to resolve in this thread. Jonathan
Attachment
On Mon, Sep 19, 2022 at 8:46 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml > > index 1ae3287..0ab768d 100644 > > --- a/doc/src/sgml/logical-replication.sgml > > +++ b/doc/src/sgml/logical-replication.sgml > > @@ -1120,6 +1120,11 @@ test_sub=# SELECT * FROM child ORDER BY a; > > </para> > > > > <para> > > + Specifying a column list when the publication also publishes > > + <literal>FOR ALL TABLES IN SCHEMA</literal> is not supported. > > + </para> > > + > > + <para> > > For partitioned tables, the publication parameter > > <literal>publish_via_partition_root</literal> determines which column list > > is used. If <literal>publish_via_partition_root</literal> is > > > > diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml > > index 0a68c4b..0ced7da 100644 > > --- a/doc/src/sgml/ref/create_publication.sgml > > +++ b/doc/src/sgml/ref/create_publication.sgml > > @@ -103,17 +103,17 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable> > > </para> > > > > <para> > > + Specifying a column list when the publication also publishes > > + <literal>FOR ALL TABLES IN SCHEMA</literal> is not supported. > > + </para> > > > > @@ -733,6 +694,24 @@ CheckPubRelationColumnList(List *tables, const char *queryString, > > continue; > > > > /* > > + * Disallow specifying column list if any schema is in the > > + * publication. > > + * > > + * XXX We could instead just forbid the case when the publication > > + * tries to publish the table with a column list and a schema for that > > + * table. However, if we do that then we need a restriction during > > + * ALTER TABLE ... SET SCHEMA to prevent such a case which doesn't > > + * seem to be a good idea. > > + */ > > + if (publish_schema) > > + ereport(ERROR, > > + errcode(ERRCODE_INVALID_PARAMETER_VALUE), > > + errmsg("cannot use publication column list for relation \"%s.%s\"", > > + get_namespace_name(RelationGetNamespace(pri->relation)), > > + RelationGetRelationName(pri->relation)), > > + errdetail("Column list cannot be specified if any schema is part of the publicationor specified in the list.")); > > + > > This seems a pretty arbitrary restriction. It feels like you're adding > this restriction precisely so that you don't have to write the code to > reject the ALTER .. SET SCHEMA if an incompatible configuration is > detected. But we already have such checks in several cases, so I don't > see why this one does not seem a good idea. > I agree that we have such checks at other places as well and one somewhat similar is in ATPrepChangePersistence(). ATPrepChangePersistence() { ... ... /* * Check that the table is not part of any publication when changing to * UNLOGGED, as UNLOGGED tables can't be published. */ However, another angle to look at it is that we try to avoid adding restrictions in other DDL commands for defined publications. I am not sure but it appears to me Peter E. is not in favor of restrictions in other DDLs. I think we don't have a strict rule in this regard, so we are trying to see what makes the most sense based on feedback and do it accordingly. > The whole FOR ALL TABLES IN SCHEMA thing seems pretty weird in several > aspects. Others have already commented about the syntax, which is > unlike what GRANT uses; I'm also surprised that we've gotten away with > it being superuser-only. Why are we building more superuser-only > features in this day and age? I think not even FOR ALL TABLES should > require superuser. > The intention was to be in sync with FOR ALL TABLES. -- With Regards, Amit Kapila.
On 9/19/22 4:52 PM, Jonathan S. Katz wrote: > On 9/19/22 11:16 AM, Alvaro Herrera wrote: >> This seems a pretty arbitrary restriction. It feels like you're adding >> this restriction precisely so that you don't have to write the code to >> reject the ALTER .. SET SCHEMA if an incompatible configuration is >> detected. But we already have such checks in several cases, so I don't >> see why this one does not seem a good idea. >> >> The whole FOR ALL TABLES IN SCHEMA thing seems pretty weird in several >> aspects. Others have already commented about the syntax, which is >> unlike what GRANT uses; I'm also surprised that we've gotten away with >> it being superuser-only. Why are we building more superuser-only >> features in this day and age? I think not even FOR ALL TABLES should >> require superuser. > > FYI, I've added this to the PG15 open items as there are some open > questions to resolve in this thread. (Replying personally, not RMT). I wanted to enumerate the concerns raised in this thread in the context of the open item to understand what needs to be addressed, and also give an opinion. I did read up on the original thread to better understand context around decisions. I believe the concerns are these 3 things: 1. Allowing calls that have "ALL TABLES IN SCHEMA" that include calls to specific tables in schema 2. The syntax of the "ALL TABLES IN SCHEMA" and comparing it to similar behaviors in PostgreSQL 3. Adding on an additional "superuser-only" feature For #1 (allowing calls that have schema/table overlap...), there appears to be both a patch that allows this (reversing[8]), and a suggestion for dealing with a corner-case that is reasonable, i.e. disallowing adding schemas to a publication when specifying column-lists. Do we think we can have consensus on this prior to the RC1 freeze? For #2 ("ALL TABLES IN SCHEMA" syntax), this was heavily discussed on the original thread[1][3][4][5][7]. I thought Tom's proposal on the syntax[3] was reasonable as it "future proofs" for when we allow other schema-scoped objects to be published and give control over which ones can be published. The bigger issue seems to be around the behavior in regards to the syntax. The current behavior is that when one specifies "ALL TABLES IN SCHEMA", any future tables created in that schema are added to the publication. While folks tried to find parallels to GRANT[6], I think this actually resembles how we handle partitions that are published[9][10], i.e.: "When a partitioned table is added to a publication, all of its existing and future partitions are implicitly considered to be part of the publication."[10] Additionally, this is the behavior that is already present in "FOR ALL TABLES": "Marks the publication as one that replicates changes for all tables in the database, including tables created in the future."[10] I don't think we should change this behavior that's already in logical replication. While I understand the reasons why "GRANT ... ALL TABLES IN SCHEMA" has a different behavior (i.e. it's not applied to future objects) and do not advocate to change it, I have personally been affected where I thought a permission would be applied to all future objects, only to discover otherwise. I believe it's more intuitive to think that "ALL" applies to "everything, always." For #3 (more superuser-only), in general I do agree that we shouldn't be adding more of these. However, we have in this release, and not just to this feature. ALTER SUBSCRIPTION ... SKIP[11] requires superuser. I think it's easier for us to "relax" privileges (e.g. w/predefined roles) than to make something "superuser-only" in the future, so I don't see this being a blocker for v15. The feature will continue to work for users even if we remove "superuser-only" in the future. To summarize: 1. I do think we should fix the issue that Peter originally brought up in this thread before v15. That is an open item. 2. I don't see why we need to change the syntax/behavior, I think that will make this feature much harder to use. 3. I don't think we need to change the superuser requirement right now, but we should do that for a future release. Thanks, Jonathan [1] https://www.postgresql.org/message-id/CAFiTN-u_m0cq7Rm5Bcu9EW4gSHG94WaLuxLfibwE-o7%2BLea2GQ%40mail.gmail.com [2] https://www.postgresql.org/message-id/C4D04B90-AC4D-42A7-B93C-4799CEDDDD96%40enterprisedb.com [3] https://www.postgresql.org/message-id/155565.1628954580%40sss.pgh.pa.us [4] https://www.postgresql.org/message-id/CAHut%2BPvNwzp-EdtsDNazwrNrV4ziqCovNdLywzOJKSy52LvRjw%40mail.gmail.com [5] https://www.postgresql.org/message-id/CAHut%2BPt6Czj0KsE0ip6nMsPf4FatHgNDni-wSu2KkYNYF9mDAw%40mail.gmail.com [6] https://www.postgresql.org/message-id/CAA4eK1Lwtea0St1MV5nfSg9FrFeU04YKpHvhQ0i4W-tOBw%3D9Qw%40mail.gmail.com [7] https://www.postgresql.org/message-id/202109241325.eag5g6mpvoup@alvherre.pgsql [8] https://www.postgresql.org/message-id/CALDaNm1BEXtvg%3Dfq8FzM-FoYvETTEuvA_Gf8rCAjFr1VrB5aBA%40mail.gmail.com [9] https://www.postgresql.org/message-id/CAJcOf-fyM3075t9%2B%3DB-BSFz2FG%3D5BnDSPX4YtL8k1nnK%3DwjgWA%40mail.gmail.com [10] https://www.postgresql.org/docs/current/sql-createpublication.html [11] https://www.postgresql.org/docs/15/sql-altersubscription.html
Attachment
On 2022-Sep-20, Amit Kapila wrote: > On Mon, Sep 19, 2022 at 8:46 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > This seems a pretty arbitrary restriction. It feels like you're adding > > this restriction precisely so that you don't have to write the code to > > reject the ALTER .. SET SCHEMA if an incompatible configuration is > > detected. But we already have such checks in several cases, so I don't > > see why this one does not seem a good idea. > > > I agree that we have such checks at other places as well and one > somewhat similar is in ATPrepChangePersistence(). > > ATPrepChangePersistence() > { > ... > ... > /* > * Check that the table is not part of any publication when changing to > * UNLOGGED, as UNLOGGED tables can't be published. > */ Right, I think this is a sensible approach. > However, another angle to look at it is that we try to avoid adding > restrictions in other DDL commands for defined publications. Well, it makes sense to avoid restrictions wherever possible. But here, the consequence is that you end up with a restriction in the publication definition that is not very sensible. Imagine if you said "you can't add schema S because it contains an unlogged table". It's absurd. Maybe this can be relaxed in a future release, but it's quite odd. > The intention was to be in sync with FOR ALL TABLES. I guess we can change both (FOR ALL TABLES and IN SCHEMA) later. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
On Tue, Sep 20, 2022 at 2:57 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2022-Sep-20, Amit Kapila wrote: > > > On Mon, Sep 19, 2022 at 8:46 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > > This seems a pretty arbitrary restriction. It feels like you're adding > > > this restriction precisely so that you don't have to write the code to > > > reject the ALTER .. SET SCHEMA if an incompatible configuration is > > > detected. But we already have such checks in several cases, so I don't > > > see why this one does not seem a good idea. > > > > > I agree that we have such checks at other places as well and one > > somewhat similar is in ATPrepChangePersistence(). > > > > ATPrepChangePersistence() > > { > > ... > > ... > > /* > > * Check that the table is not part of any publication when changing to > > * UNLOGGED, as UNLOGGED tables can't be published. > > */ > > Right, I think this is a sensible approach. > > > However, another angle to look at it is that we try to avoid adding > > restrictions in other DDL commands for defined publications. > > Well, it makes sense to avoid restrictions wherever possible. But here, > the consequence is that you end up with a restriction in the publication > definition that is not very sensible. Imagine if you said "you can't > add schema S because it contains an unlogged table". It's absurd. > > Maybe this can be relaxed in a future release, but it's quite odd. > Yeah, we can relax it in a future release based on some field experience, or maybe we can keep the current restriction of not allowing to add a table when the schema of the table is part of the same publication and try to relax that in a future release based on field experience. > > The intention was to be in sync with FOR ALL TABLES. > > I guess we can change both (FOR ALL TABLES and IN SCHEMA) later. > That sounds reasonable. -- With Regards, Amit Kapila.
On Mon, Sep 19, 2022 at 11:03 PM Jonathan S. Katz <jkatz@postgresql.org> wrote: > For #1 (allowing calls that have schema/table overlap...), there appears > to be both a patch that allows this (reversing[8]), and a suggestion for > dealing with a corner-case that is reasonable, i.e. disallowing adding > schemas to a publication when specifying column-lists. Do we think we > can have consensus on this prior to the RC1 freeze? I am not sure whether we can or should rush a fix in that fast, but I agree with this direction. > For #2 ("ALL TABLES IN SCHEMA" syntax), this was heavily discussed on > the original thread[1][3][4][5][7]. I thought Tom's proposal on the > syntax[3] was reasonable as it "future proofs" for when we allow other > schema-scoped objects to be published and give control over which ones > can be published. All right, well, I still don't like it and think it's confusing, but perhaps I'm in the minority. > I don't think we should change this behavior that's already in logical > replication. While I understand the reasons why "GRANT ... ALL TABLES IN > SCHEMA" has a different behavior (i.e. it's not applied to future > objects) and do not advocate to change it, I have personally been > affected where I thought a permission would be applied to all future > objects, only to discover otherwise. I believe it's more intuitive to > think that "ALL" applies to "everything, always." Nah, there's room for multiple behaviors here. It's reasonable to want to add all the tables currently in the schema to a publication (or grant permissions on them) and it's reasonable to want to include all current and future tables in the schema in a publication (or grant permissions on them) too. The reason I don't like the ALL TABLES IN SCHEMA syntax is that it sounds like the former, but actually is the latter. Based on your link to the email from Tom, I understand now the reason why it's like that, but it's still counterintuitive to me. > For #3 (more superuser-only), in general I do agree that we shouldn't be > adding more of these. However, we have in this release, and not just to > this feature. ALTER SUBSCRIPTION ... SKIP[11] requires superuser. I > think it's easier for us to "relax" privileges (e.g. w/predefined roles) > than to make something "superuser-only" in the future, so I don't see > this being a blocker for v15. The feature will continue to work for > users even if we remove "superuser-only" in the future. Yeah, this is clearly not a release blocker, I think. -- Robert Haas EDB: http://www.enterprisedb.com
On 2022-Sep-13, Kyotaro Horiguchi wrote: > At Mon, 12 Sep 2022 04:26:48 +0000, "houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com> wrote in > > I feel we'd better compare the syntax with the existing publication command: > > FOR ALL TABLES. If you create a publication FOR ALL TABLES, it means publishing > > all the tables in the database *including* tables created in the future. I > > think both the syntax and meaning of ALL TABLES IN SCHEMA are consistent with > > the existing FOR ALL TABLES. > > IMHO, I feel closer to Robert. "ALL TABLES IN SCHEMA" sounds like the > concrete tables at the time of invocation. While I agree that it is > not directly comparable to GRANT, What if we remove the ALL keyword from there? That would leave us with "FOR TABLES IN SCHEMA", which seems to better convey that it doesn't restrict to current tables in there. > but if I see "ALTER PUBLICATION p1 ADD SCHEMA s1", I automatically > translate that into "all tables in the schema s1 at the time of using > this publication". ... but that translation is wrong if replication supports other kinds of objects, as it inevitably will in the near future. Clearly the fact that we spell out TABLES there is important. When we add support for sequences, we could have combinations ADD [ALL] TABLES IN SCHEMA s ADD [ALL] SEQUENCES IN SCHEMA s ADD [ALL] TABLES AND SEQUENCES IN SCHEMA s and at that point, the unadorned ADD SCHEMA one will become ambiguous. > At least, it would cause less confusion when it were "ALT PUB p1 DROP > SCEMA s1" aginst "DROP ALL TABLES IN SCHEMA s1". I'm not sure what you mean here. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "En las profundidades de nuestro inconsciente hay una obsesiva necesidad de un universo lógico y coherente. Pero el universo real se halla siempre un paso más allá de la lógica" (Irulan)
> On Sep 19, 2022, at 8:03 PM, Jonathan S. Katz <jkatz@postgresql.org> wrote: > > "When a partitioned table is added to a publication, all of its existing and future partitions are implicitly consideredto be part of the publication."[10] > > Additionally, this is the behavior that is already present in "FOR ALL TABLES": > > "Marks the publication as one that replicates changes for all tables in the database, including tables created in the future."[10] > > I don't think we should change this behavior that's already in logical replication. The existing behavior in logical replication doesn't have any "IN SCHEMA" qualifiers. > While I understand the reasons why "GRANT ... ALL TABLES IN SCHEMA" has a different behavior (i.e. it's not applied tofuture objects) and do not advocate to change it, I have personally been affected where I thought a permission would beapplied to all future objects, only to discover otherwise. I believe it's more intuitive to think that "ALL" applies to"everything, always." The conversation is focusing on what "ALL TABLES" means, but the ambiguous part is what "IN SCHEMA" means. In GRANT it means"currently in schema, computed now." We are about to create confusion by adding the "IN SCHEMA" phrase to publicationcommands meaning "later in schema, computed then." A user who diligently consults the documentation for one commandto discover what "IN SCHEMA" means may fairly, but wrongly, assume it means the same thing in another command. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 9/20/22 10:55 AM, Mark Dilger wrote: > > >> On Sep 19, 2022, at 8:03 PM, Jonathan S. Katz <jkatz@postgresql.org> wrote: >> >> "When a partitioned table is added to a publication, all of its existing and future partitions are implicitly consideredto be part of the publication."[10] >> >> Additionally, this is the behavior that is already present in "FOR ALL TABLES": >> >> "Marks the publication as one that replicates changes for all tables in the database, including tables created in thefuture."[10] >> >> I don't think we should change this behavior that's already in logical replication. > > The existing behavior in logical replication doesn't have any "IN SCHEMA" qualifiers. This behavior exists "FOR ALL TABLES" without the "IN SCHEMA" qualifier. This was discussed multiple times on the original thread[1]. > >> While I understand the reasons why "GRANT ... ALL TABLES IN SCHEMA" has a different behavior (i.e. it's not applied tofuture objects) and do not advocate to change it, I have personally been affected where I thought a permission would beapplied to all future objects, only to discover otherwise. I believe it's more intuitive to think that "ALL" applies to"everything, always." > > The conversation is focusing on what "ALL TABLES" means, but the ambiguous part is what "IN SCHEMA" means. In GRANT itmeans "currently in schema, computed now." We are about to create confusion by adding the "IN SCHEMA" phrase to publicationcommands meaning "later in schema, computed then." A user who diligently consults the documentation for one commandto discover what "IN SCHEMA" means may fairly, but wrongly, assume it means the same thing in another command. I tried to diligently read the sections where we talk about granting + privileges[2][3] to see what it says about "ALL * IN SCHEMA". Unless I missed it, and I read through it twice, it does not explicitly state whether or not "GRANT" applies to all objects at only that given moment, or to future objects of that type which are created in that schema. Maybe the behavior is implied or is part of the standard, but it's not currently documented. We do link to "ALTER DEFAULT PRIVILEGES" at the bottom of the GRANT[2] docs, but we don't give any indication as to why. (This is also to say we should document in GRANT that ALL * IN SCHEMA does not apply to future objects; if you need that behavior use ALTER DEFAULT PRIVILEGES. Separate thread :) I understand there is a risk of confusion of the similar grammar across commands, but the current command in logical replication has this is building on the existing behavior. Thanks, Jonathan [1] https://www.postgresql.org/message-id/flat/CALDaNm0OANxuJ6RXqwZsM1MSY4s19nuH3734j4a72etDwvBETQ%40mail.gmail.com [2] https://www.postgresql.org/docs/current/sql-grant.html [3] https://www.postgresql.org/docs/current/ddl-priv.html
Attachment
(RMT hat on, unless otherwise noted) On 9/20/22 9:42 AM, Robert Haas wrote: > On Mon, Sep 19, 2022 at 11:03 PM Jonathan S. Katz <jkatz@postgresql.org> wrote: >> For #1 (allowing calls that have schema/table overlap...), there appears >> to be both a patch that allows this (reversing[8]), and a suggestion for >> dealing with a corner-case that is reasonable, i.e. disallowing adding >> schemas to a publication when specifying column-lists. Do we think we >> can have consensus on this prior to the RC1 freeze? > > I am not sure whether we can or should rush a fix in that fast, but I > agree with this direction. The RMT met today to discuss this. We did agree that the above is an open item that should be resolved before this release. While it is an accepted pattern for us to "ERROR" on unsupported behavior and then later introduce said behavior, we do agree with Peter's original post in this thread and would like it resolved. As for the state of the fix, the patch has been iterated on and Amit felt ready to commit it[1]. We do want to hear how others feel about this, but the folks behind this feature have been working on this patch since this was reported. >> For #2 ("ALL TABLES IN SCHEMA" syntax), this was heavily discussed on >> the original thread[1][3][4][5][7]. I thought Tom's proposal on the >> syntax[3] was reasonable as it "future proofs" for when we allow other >> schema-scoped objects to be published and give control over which ones >> can be published. > > All right, well, I still don't like it and think it's confusing, but > perhaps I'm in the minority. The RMT discussed this as well. The RMT feels that there should not be any changes to syntax/behavior for this release. This doesn't preclude future work in this area (e.g. having a toggle for "all future behavior"), but based on all the discussions and existing behavior in this feature, we do not see a need to make changes or delay the release on this. >> I don't think we should change this behavior that's already in logical >> replication. While I understand the reasons why "GRANT ... ALL TABLES IN >> SCHEMA" has a different behavior (i.e. it's not applied to future >> objects) and do not advocate to change it, I have personally been >> affected where I thought a permission would be applied to all future >> objects, only to discover otherwise. I believe it's more intuitive to >> think that "ALL" applies to "everything, always." > > Nah, there's room for multiple behaviors here. It's reasonable to want > to add all the tables currently in the schema to a publication (or > grant permissions on them) and it's reasonable to want to include all > current and future tables in the schema in a publication (or grant > permissions on them) too. The reason I don't like the ALL TABLES IN > SCHEMA syntax is that it sounds like the former, but actually is the > latter. Based on your link to the email from Tom, I understand now the > reason why it's like that, but it's still counterintuitive to me. <PersonalOpinion> I understand your view on "multiple behaviors" and I do agree with your reasoning. I still think we should leave this as is, but perhaps this opens up an option we add later to specify the behavior. </PersonalOpinion> > >> For #3 (more superuser-only), in general I do agree that we shouldn't be >> adding more of these. However, we have in this release, and not just to >> this feature. ALTER SUBSCRIPTION ... SKIP[11] requires superuser. I >> think it's easier for us to "relax" privileges (e.g. w/predefined roles) >> than to make something "superuser-only" in the future, so I don't see >> this being a blocker for v15. The feature will continue to work for >> users even if we remove "superuser-only" in the future. > > Yeah, this is clearly not a release blocker, I think. The RMT concurs. We do recommend future work on "relaxing" the superuser-only requirement. Thanks, Jonathan [1] https://www.postgresql.org/message-id/CAA4eK1LDhoBM8K5uVme8PZ%2BkxNOfVpRh%3DoO42JtFdqBgBuj1bA%40mail.gmail.com
Attachment
> On Sep 20, 2022, at 12:36 PM, Jonathan S. Katz <jkatz@postgresql.org> wrote: > > This behavior exists "FOR ALL TABLES" without the "IN SCHEMA" qualifier. This was discussed multiple times on the originalthread[1]. Yes, nobody is debating that as far as I can see. And I do take your point that this stuff was discussed in other threadsquite a while back. > I tried to diligently read the sections where we talk about granting + privileges[2][3] to see what it says about "ALL* IN SCHEMA". Unless I missed it, and I read through it twice, it does not explicitly state whether or not "GRANT" appliesto all objects at only that given moment, or to future objects of that type which are created in that schema. Maybethe behavior is implied or is part of the standard, but it's not currently documented. Interesting. Thanks for that bit of research. > We do link to "ALTER DEFAULT PRIVILEGES" at the bottom of the GRANT[2] docs, but we don't give any indication as to why. > > (This is also to say we should document in GRANT that ALL * IN SCHEMA does not apply to future objects; Yes, I agree this should be documented. > if you need that behavior use ALTER DEFAULT PRIVILEGES. Separate thread :) > > I understand there is a risk of confusion of the similar grammar across commands, but the current command in logical replicationhas this is building on the existing behavior. I don't complain that it is buidling on the existing behavior. I'm *only* concerned about the keywords we're using for this. Consider the following: -- AS ADMIN CREATE USER bob NOSUPERUSER; GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA foo TO bob; SET ROLE bob; CREATE PUBLICATION bobs_pub FOR ALL TABLES IN SCHEMA foo; We're going to have that fail in pg15 because the FOR ALL TABLES IN SCHEMA option is reserved to superusers. But we agreedthat was a stop-gap solution that we'd potentially loosen in the future. Certainly we'll need wiggle room in the syntaxto perform that loosening: --- Must be superuser for this in pg15, and in subsequent releases. CREATE PUBLICATION bobs_pub FOR ALL FUTURE TABLES IN SCHEMA foo; --- Not supported in pg15, but reserved for some future pg versions to allow --- non-superusers to create publications on tables currently in schema foo, --- assuming they have sufficient privileges on those tables CREATE PUBLICATION bobs_pub FOR ALL TABLES IN SCHEMA foo; Doing it this way makes the syntax consistent between the GRANT...TO bob and the CREATE PUBLICATION bobs_pub. Surely thismakes more sense? I'm not a huge fan of the keyword "FUTURE" here, but I found a reference to another database that uses that keyword for whatI think is a similar purpose. We should choose *something* for this, though, if we want things to be rational goingforward. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[personal views, not RMT] On 9/20/22 4:06 PM, Mark Dilger wrote: > I don't complain that it is buidling on the existing behavior. I'm *only* concerned about the keywords we're using forthis. Consider the following: > > -- AS ADMIN > CREATE USER bob NOSUPERUSER; > GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA foo TO bob; > SET ROLE bob; > CREATE PUBLICATION bobs_pub FOR ALL TABLES IN SCHEMA foo; > > We're going to have that fail in pg15 because the FOR ALL TABLES IN SCHEMA option is reserved to superusers. But we agreedthat was a stop-gap solution that we'd potentially loosen in the future. Certainly we'll need wiggle room in the syntaxto perform that loosening: > > --- Must be superuser for this in pg15, and in subsequent releases. > CREATE PUBLICATION bobs_pub FOR ALL FUTURE TABLES IN SCHEMA foo; > > --- Not supported in pg15, but reserved for some future pg versions to allow > --- non-superusers to create publications on tables currently in schema foo, > --- assuming they have sufficient privileges on those tables > CREATE PUBLICATION bobs_pub FOR ALL TABLES IN SCHEMA foo; > > Doing it this way makes the syntax consistent between the GRANT...TO bob and the CREATE PUBLICATION bobs_pub. Surely thismakes more sense? When you put it that way, I see your point. However, for the lesser-privileged user though, will the behavior be that it will continue to add all future tables in a schema to the publication so long as they have sufficient privileges on those tables? Or would that mirror the current behavior with GRANT? While I understand it makes it consistent, the one concern I raise is that it means the less privileged user could have a less convenient user experience than the privileged user. Perhaps that's OK, but worth noting. > I'm not a huge fan of the keyword "FUTURE" here, but I found a reference to another database that uses that keyword forwhat I think is a similar purpose. I did try doing research on this prior, but hadn't thought to incorporate "future" into my searches. Doing so, I probably found the same database that you did that used the "FUTURE" word for adding permissions to future objects (and this is fresh, as the docs for it were published last week). That's definitely interesting. I did see some notes on a legacy database system that offered similar advice to what we do for GRANT if you're not using ALTER DEFAULT PRIVILEGES. > We should choose *something* for this, though, if we want things to be rational going forward. That all said, while I understand your point and open to the suggestion on "FUTURE", I'm not convinced on the syntax change. But I'll sleep on it. Jonathan
Attachment
RE: why can't a table be part of the same publication as its schema
From
"houzj.fnst@fujitsu.com"
Date:
On Wednesday, September 21, 2022 4:06 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > On Sep 20, 2022, at 12:36 PM, Jonathan S. Katz <jkatz@postgresql.org> > wrote: > > > > This behavior exists "FOR ALL TABLES" without the "IN SCHEMA" qualifier. > This was discussed multiple times on the original thread[1]. > > Yes, nobody is debating that as far as I can see. And I do take your point that > this stuff was discussed in other threads quite a while back. > > > I tried to diligently read the sections where we talk about granting + > privileges[2][3] to see what it says about "ALL * IN SCHEMA". Unless I missed it, > and I read through it twice, it does not explicitly state whether or not "GRANT" > applies to all objects at only that given moment, or to future objects of that > type which are created in that schema. Maybe the behavior is implied or is part > of the standard, but it's not currently documented. > > Interesting. Thanks for that bit of research. > > > We do link to "ALTER DEFAULT PRIVILEGES" at the bottom of the GRANT[2] > docs, but we don't give any indication as to why. > > > > (This is also to say we should document in GRANT that ALL * IN SCHEMA does > not apply to future objects; > > Yes, I agree this should be documented. > > > if you need that behavior use ALTER DEFAULT PRIVILEGES. Separate thread :) > > > > I understand there is a risk of confusion of the similar grammar across > commands, but the current command in logical replication has this is building > on the existing behavior. > > I don't complain that it is buidling on the existing behavior. I'm *only* > concerned about the keywords we're using for this. Consider the following: > > -- AS ADMIN > CREATE USER bob NOSUPERUSER; > GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA foo TO bob; > SET ROLE bob; > CREATE PUBLICATION bobs_pub FOR ALL TABLES IN SCHEMA foo; > > We're going to have that fail in pg15 because the FOR ALL TABLES IN SCHEMA > option is reserved to superusers. But we agreed that was a stop-gap solution > that we'd potentially loosen in the future. Certainly we'll need wiggle room in > the syntax to perform that loosening: > > --- Must be superuser for this in pg15, and in subsequent releases. > CREATE PUBLICATION bobs_pub FOR ALL FUTURE TABLES IN SCHEMA foo; > > --- Not supported in pg15, but reserved for some future pg versions to > allow > --- non-superusers to create publications on tables currently in schema foo, > --- assuming they have sufficient privileges on those tables > CREATE PUBLICATION bobs_pub FOR ALL TABLES IN SCHEMA foo; > > Doing it this way makes the syntax consistent between the GRANT...TO bob and > the CREATE PUBLICATION bobs_pub. Surely this makes more sense? Thanks for the suggestion. My concern is that I am not sure do we really want to add a feature that only publish all the current tables(not future tables). I think, if possible, it would be better to find an approach that can release the superuser restriction for both FOR ALL TABLES and FOR ALL TABLES IN SCHEMA in the future release. I think another solution might be introduce a new publication option (like: include_future). When user execute: CREATE PUBLICATION ... FOR ALL TABLES IN SCHEMA ... WITH (include_future) it means we publish all current and future tables and require superuser. We can set the default value of this option to 'true' and user can set it to false if they only want to publish the current tables and don't want to use superuser. And in this approach, we don't need to change the syntax. Best regards, Hou zj
On 2022-Sep-20, Robert Haas wrote: > > I don't think we should change this behavior that's already in logical > > replication. While I understand the reasons why "GRANT ... ALL TABLES IN > > SCHEMA" has a different behavior (i.e. it's not applied to future > > objects) and do not advocate to change it, I have personally been > > affected where I thought a permission would be applied to all future > > objects, only to discover otherwise. I believe it's more intuitive to > > think that "ALL" applies to "everything, always." > > Nah, there's room for multiple behaviors here. It's reasonable to want > to add all the tables currently in the schema to a publication (or > grant permissions on them) and it's reasonable to want to include all > current and future tables in the schema in a publication (or grant > permissions on them) too. The reason I don't like the ALL TABLES IN > SCHEMA syntax is that it sounds like the former, but actually is the > latter. Based on your link to the email from Tom, I understand now the > reason why it's like that, but it's still counterintuitive to me. I already proposed elsewhere that we remove the ALL keyword from there, which I think serves to reduce confusion (in particular it's no longer parallel to the GRANT one). As in the attached. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "No renuncies a nada. No te aferres a nada."
Attachment
On 9/21/22 10:24 AM, Alvaro Herrera wrote: > On 2022-Sep-20, Robert Haas wrote: > >>> I don't think we should change this behavior that's already in logical >>> replication. While I understand the reasons why "GRANT ... ALL TABLES IN >>> SCHEMA" has a different behavior (i.e. it's not applied to future >>> objects) and do not advocate to change it, I have personally been >>> affected where I thought a permission would be applied to all future >>> objects, only to discover otherwise. I believe it's more intuitive to >>> think that "ALL" applies to "everything, always." >> >> Nah, there's room for multiple behaviors here. It's reasonable to want >> to add all the tables currently in the schema to a publication (or >> grant permissions on them) and it's reasonable to want to include all >> current and future tables in the schema in a publication (or grant >> permissions on them) too. The reason I don't like the ALL TABLES IN >> SCHEMA syntax is that it sounds like the former, but actually is the >> latter. Based on your link to the email from Tom, I understand now the >> reason why it's like that, but it's still counterintuitive to me. > > I already proposed elsewhere that we remove the ALL keyword from there, > which I think serves to reduce confusion (in particular it's no longer > parallel to the GRANT one). As in the attached. [personal, not RMT hat] I'd be OK with this. It would still allow for "FOR SEQUENCES IN SCHEMA" etc. Jonathan
Attachment
On Wed, Sep 21, 2022 at 8:24 PM Jonathan S. Katz <jkatz@postgresql.org> wrote: > > On 9/21/22 10:24 AM, Alvaro Herrera wrote: > > On 2022-Sep-20, Robert Haas wrote: > > > >>> I don't think we should change this behavior that's already in logical > >>> replication. While I understand the reasons why "GRANT ... ALL TABLES IN > >>> SCHEMA" has a different behavior (i.e. it's not applied to future > >>> objects) and do not advocate to change it, I have personally been > >>> affected where I thought a permission would be applied to all future > >>> objects, only to discover otherwise. I believe it's more intuitive to > >>> think that "ALL" applies to "everything, always." > >> > >> Nah, there's room for multiple behaviors here. It's reasonable to want > >> to add all the tables currently in the schema to a publication (or > >> grant permissions on them) and it's reasonable to want to include all > >> current and future tables in the schema in a publication (or grant > >> permissions on them) too. The reason I don't like the ALL TABLES IN > >> SCHEMA syntax is that it sounds like the former, but actually is the > >> latter. Based on your link to the email from Tom, I understand now the > >> reason why it's like that, but it's still counterintuitive to me. > > > > I already proposed elsewhere that we remove the ALL keyword from there, > > which I think serves to reduce confusion (in particular it's no longer > > parallel to the GRANT one). As in the attached. > Thanks for working on this. > [personal, not RMT hat] > > I'd be OK with this. It would still allow for "FOR SEQUENCES IN SCHEMA" etc. > I also think this is reasonable. It can later be extended to have an option to exclude/include future tables with a publication option. Also, if we want to keep it compatible with FOR ALL TABLES syntax, we can later add ALL as an optional keyword in the syntax. -- With Regards, Amit Kapila.
On Wed, Sep 21, 2022 at 1:15 AM Jonathan S. Katz <jkatz@postgresql.org> wrote: > > (RMT hat on, unless otherwise noted) > > On 9/20/22 9:42 AM, Robert Haas wrote: > > On Mon, Sep 19, 2022 at 11:03 PM Jonathan S. Katz <jkatz@postgresql.org> wrote: > >> For #1 (allowing calls that have schema/table overlap...), there appears > >> to be both a patch that allows this (reversing[8]), and a suggestion for > >> dealing with a corner-case that is reasonable, i.e. disallowing adding > >> schemas to a publication when specifying column-lists. Do we think we > >> can have consensus on this prior to the RC1 freeze? > > > > I am not sure whether we can or should rush a fix in that fast, but I > > agree with this direction. > > The RMT met today to discuss this. > > We did agree that the above is an open item that should be resolved > before this release. While it is an accepted pattern for us to "ERROR" > on unsupported behavior and then later introduce said behavior, we do > agree with Peter's original post in this thread and would like it resolved. > As there seems to be an agreement with this direction, I think it is better to commit the patch in this release (before RC1) to avoid users seeing any behavior change in a later release. If the proposed behavior for one of the cases (disallowing adding schemas to a publication when specifying column-lists) turns out to be too restrictive for users, we can make it less restrictive in a future release. I am planning to commit the current patch [1] tomorrow unless RMT or anyone else thinks otherwise. [1] - https://www.postgresql.org/message-id/OS0PR01MB57162E862758402F978725CD944B9%40OS0PR01MB5716.jpnprd01.prod.outlook.com -- With Regards, Amit Kapila.
FWIW I put this to CI: https://cirrus-ci.com/build/5823276948652032 (master) and everything appears to be OK. If anybody has reservations about this grammar change, please speak up soon, as there's not much time before RC1. The one for 15 just started running: https://cirrus-ci.com/build/4735322423558144 -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "El miedo atento y previsor es la madre de la seguridad" (E. Burke)
> On Sep 22, 2022, at 8:02 AM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > FWIW I put this to CI: > https://cirrus-ci.com/build/5823276948652032 (master) > > and everything appears to be OK. If anybody has reservations about this > grammar change, please speak up soon, as there's not much time before RC1. > > The one for 15 just started running: > https://cirrus-ci.com/build/4735322423558144 [personal hat, not RMT] Looks like it passed. No objections. Jonathan