Re: Column Filtering in Logical Replication - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: Column Filtering in Logical Replication |
Date | |
Msg-id | CALDaNm0yN8LFRETa-0SV1EGH6_VsFJPmpx_fZ3EjSoCb1zfF-A@mail.gmail.com Whole thread Raw |
In response to | Re: Column Filtering in Logical Replication (Alvaro Herrera <alvherre@alvh.no-ip.org>) |
Responses |
Re: Column Filtering in Logical Replication
|
List | pgsql-hackers |
On Fri, Sep 24, 2021 at 6:55 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2021-Sep-23, Amit Kapila wrote: > > > Alvaro, do you have any thoughts on these proposed grammar changes? > > Yeah, I think pubobj_name remains a problem in that you don't know its > return type -- could be a String or a RangeVar, and the user of that > production can't distinguish. So you're still (unnecessarily, IMV) > stashing an object of undetermined type into ->object. > > I think you should get rid of both pubobj_name and pubobj_expr and do > somethine like this: > > /* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */ > PublicationObjSpec: TABLE ColId > { > $$ = makeNode(PublicationObjSpec); > $$->pubobjtype = PUBLICATIONOBJ_TABLE; > $$->rangevar = makeRangeVarFromQualifiedName($1, NULL, @1, yyscanner); > } > | TABLE ColId indirection > { > $$ = makeNode(PublicationObjSpec); > $$->pubobjtype = PUBLICATIONOBJ_TABLE; > $$->rangevar = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner); > } > | ALL TABLES IN_P SCHEMA ColId > { > $$ = makeNode(PublicationObjSpec); > $$->pubobjtype = PUBLICATIONOBJ_REL_IN_SCHEMA; > $$->name = $4; > } > | ALL TABLES IN_P SCHEMA CURRENT_SCHEMA /* XXX should this be "IN_P CURRENT_SCHEMA"? */ > { > $$ = makeNode(PublicationObjSpec); > $$->pubobjtype = PUBLICATIONOBJ_CURRSCHEMA; > $$->name = $4; > } > | ColId > { > $$ = makeNode(PublicationObjSpec); > $$->name = $1; > $$->pubobjtype = PUBLICATIONOBJ_CONTINUATION; > } > | ColId indirection > { > $$ = makeNode(PublicationObjSpec); > $$->rangevar = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner); > $$->pubobjtype = PUBLICATIONOBJ_CONTINUATION; > } > | CURRENT_SCHEMA > { > $$ = makeNode(PublicationObjSpec); > $$->pubobjtype = PUBLICATIONOBJ_CURRSCHEMA; > } > ; Apart from the issue that Hou San pointed, I found one issue with introduction of PUBLICATIONOBJ_CURRSCHEMA, I was not able to differentiate if it is table or schema in the following cases: CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA CURRENT_SCHEMA; CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA sch1, CURRENT_SCHEMA; CREATE PUBLICATION pub1 FOR table t1, CURRENT_SCHEMA; The differentiation is required to differentiate and add a schema or a table. I felt it was better to use PUBLICATIONOBJ_CONTINUATION in case of CURRENT_SCHEMA in multiple object cases like: PublicationObjSpec: TABLE relation_expr { $$ = makeNode(PublicationObjSpec); $$->pubobjtype = PUBLICATIONOBJ_TABLE; $$->rangevar = $2; } | ALL TABLES IN_P SCHEMA ColId { $$ = makeNode(PublicationObjSpec); $$->pubobjtype = PUBLICATIONOBJ_REL_IN_SCHEMA; $$->name = $5; $$->location = @5; } | ALL TABLES IN_P SCHEMA CURRENT_SCHEMA /* XXX should this be "IN_P CURRENT_SCHEMA"? */ { $$ = makeNode(PublicationObjSpec); $$->pubobjtype = PUBLICATIONOBJ_REL_IN_SCHEMA; $$->name = "CURRENT_SCHEMA"; $$->location = @5; } | ColId { $$ = makeNode(PublicationObjSpec); $$->name = $1; $$->pubobjtype = PUBLICATIONOBJ_CONTINUATION; $$->location = @1; } | ColId indirection { $$ = makeNode(PublicationObjSpec); $$->rangevar = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner); $$->pubobjtype = PUBLICATIONOBJ_CONTINUATION; $$->location = @1; } | CURRENT_SCHEMA { $$ = makeNode(PublicationObjSpec); $$->pubobjtype = PUBLICATIONOBJ_CONTINUATION; $$->name = "CURRENT_SCHEMA"; $$->location = @1; } | extended_relation_expr /* grammar like tablename * , ONLY tablename, ONLY ( tablename )*/ { $$ = makeNode(PublicationObjSpec); /*$$->rangevar = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner); */ $$->rangevar = $1; $$->pubobjtype = PUBLICATIONOBJ_CONTINUATION; } ; I'm ok with your suggestion along with the above proposed changes. I felt the changes proposed at [1] were also fine. Let's change it to whichever is better, easily extendable and can handle the Column filtering project, ALL TABLES IN SCHEMA, ALL SEQUENCES IN SCHEMA projects, and other projects in the future. Based on that we can check in the parser changes independently and then the remaining series of the patches can be rebased on top of it accordingly. Thoughts? > so in AlterPublicationStmt you would have stanzas like > > | ALTER PUBLICATION name ADD_P pub_obj_list > { > AlterPublicationStmt *n = makeNode(AlterPublicationStmt); > n->pubname = $3; > n->pubobjects = preprocess_pubobj_list($5); > n->action = DEFELEM_ADD; > $$ = (Node *)n; > } > > where preprocess_pubobj_list (defined right after processCASbits and > somewhat mimicking it and SplitColQualList) takes all > PUBLICATIONOBJ_CONTINUATION and turns them into either > PUBLICATIONOBJ_TABLE entries or PUBLICATIONOBJ_REL_IN_SCHEMA entries, > depending on what the previous entry was. (And of course if there is no > previous entry, raise an error immediately). Note that node > PublicationObjSpec now has two fields, one for RangeVar and another for > a plain name, and tables always use the second one, except when they are > continuations, but of course those continuations that use name are > turned into rangevars in the preprocess step. I think that would make > the code in ObjectsInPublicationToOids less messy. I agree with this. I will make the changes for this in the next version. > (I don't think using the string "CURRENT_SCHEMA" is a great solution. > Did you try having a schema named CURRENT_SCHEMA?) Here CURRENT_SCHEMA is not used for the schema name, it will be replaced with the name of the schema that is first in the search path. [1] - https://www.postgresql.org/message-id/CALDaNm1R-xbQvz4LU5OXu3KKwbWOz3uDcT_YjRU6V0R5FZDYDg%40mail.gmail.com Regards, Vignesh
pgsql-hackers by date: