Re: Column Filtering in Logical Replication - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Column Filtering in Logical Replication
Date
Msg-id 202109241325.eag5g6mpvoup@alvherre.pgsql
In response to Re: Column Filtering in Logical Replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses RE: Column Filtering in Logical Replication  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Re: Column Filtering in Logical Replication  (vignesh C <vignesh21@gmail.com>)
List pgsql-hackers
On 2021-Sep-23, Amit Kapila wrote:

> Alvaro, do you have any thoughts on these proposed grammar changes?

Yeah, I think pubobj_name remains a problem in that you don't know its
return type -- could be a String or a RangeVar, and the user of that
production can't distinguish.  So you're still (unnecessarily, IMV)
stashing an object of undetermined type into ->object.

I think you should get rid of both pubobj_name and pubobj_expr and do
something like this:

/* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */
PublicationObjSpec:    TABLE ColId
                    {
                        $$ = makeNode(PublicationObjSpec);
                        $$->pubobjtype = PUBLICATIONOBJ_TABLE;
                        $$->rangevar = makeRangeVarFromQualifiedName($2, NIL, @2, yyscanner);
                    }
            | TABLE ColId indirection
                    {
                        $$ = makeNode(PublicationObjSpec);
                        $$->pubobjtype = PUBLICATIONOBJ_TABLE;
                        $$->rangevar = makeRangeVarFromQualifiedName($2, $3, @2, yyscanner);
                    }
            | ALL TABLES IN_P SCHEMA ColId
                    {
                        $$ = makeNode(PublicationObjSpec);
                        $$->pubobjtype = PUBLICATIONOBJ_REL_IN_SCHEMA;
                        $$->name = $5;
                    }
            | ALL TABLES IN_P SCHEMA CURRENT_SCHEMA    /* XXX should this be "IN_P CURRENT_SCHEMA"? */
                    {
                        $$ = makeNode(PublicationObjSpec);
                        $$->pubobjtype = PUBLICATIONOBJ_CURRSCHEMA;
                        /* no name needed: the node type identifies the current schema */
                    }
            | ColId
                    {
                        $$ = makeNode(PublicationObjSpec);
                        $$->name = $1;
                        $$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
                    }
            | ColId indirection
                    {
                        $$ = makeNode(PublicationObjSpec);
                        $$->rangevar = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
                        $$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
                    }
            | CURRENT_SCHEMA
                    {
                        $$ = makeNode(PublicationObjSpec);
                        $$->pubobjtype = PUBLICATIONOBJ_CURRSCHEMA;
                    }
        ;

so in AlterPublicationStmt you would have stanzas like

            | ALTER PUBLICATION name ADD_P pub_obj_list
                {
                    AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
                    n->pubname = $3;
                    n->pubobjects = preprocess_pubobj_list($5);
                    n->action = DEFELEM_ADD;
                    $$ = (Node *)n;
                }

where preprocess_pubobj_list (defined right after processCASbits, and
somewhat mimicking it and SplitColQualList) takes every
PUBLICATIONOBJ_CONTINUATION entry and turns it into either a
PUBLICATIONOBJ_TABLE entry or a PUBLICATIONOBJ_REL_IN_SCHEMA entry,
depending on what the previous entry was.  (And of course, if there is no
previous entry, it raises an error immediately.)  Note that node
PublicationObjSpec now has two fields, one for a RangeVar and another for
a plain name.  Tables always use the rangevar field, except when they
arrive as continuations; those continuations that use name are turned
into rangevars in the preprocess step.  I think that would make
the code in ObjectsInPublicationToOids less messy.
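
To make the preprocessing step concrete, here is a minimal standalone
sketch of the idea in plain C.  It is not the actual patch: the types, the
function name preprocess_pubobj_list_sketch, the array-based list, and the
use of a string in place of a RangeVar are all simplifications assumed for
illustration, and CURRENT_SCHEMA handling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the parse nodes; NOT PostgreSQL's definitions. */
typedef enum
{
    PUBOBJ_TABLE,
    PUBOBJ_REL_IN_SCHEMA,
    PUBOBJ_CONTINUATION
} PubObjType;

typedef struct
{
    PubObjType  type;
    const char *name;       /* plain name: schema, or unqualified continuation */
    const char *rangevar;   /* stand-in for RangeVar *: (qualified) table name */
} PubObjSpec;

/*
 * Resolve every continuation entry to the type of the entry before it.
 * Returns true on success; on failure returns false and stores the index
 * of the offending entry in *bad_index (real code would ereport instead).
 */
bool
preprocess_pubobj_list_sketch(PubObjSpec *objs, size_t n, size_t *bad_index)
{
    bool        have_prev = false;
    PubObjType  prev = PUBOBJ_TABLE;    /* ignored until have_prev is set */

    assert(objs != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        PubObjSpec *o = &objs[i];

        if (o->type == PUBOBJ_CONTINUATION)
        {
            if (!have_prev)
            {
                /* a continuation with nothing before it is an error */
                *bad_index = i;
                return false;
            }

            if (prev == PUBOBJ_TABLE)
            {
                /* a bare name continuing a table list becomes a rangevar */
                if (o->rangevar == NULL)
                {
                    o->rangevar = o->name;
                    o->name = NULL;
                }
            }
            else if (o->name == NULL)
            {
                /* a qualified name cannot continue a schema list */
                *bad_index = i;
                return false;
            }

            o->type = prev;
        }

        prev = o->type;
        have_prev = true;
    }

    return true;
}
```

With this shape, a list such as "TABLE s1.t1, t2, ALL TABLES IN SCHEMA s2,
s3" comes out as two table entries followed by two schema entries, so the
later code only ever sees fully typed nodes.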

(I don't think using the string "CURRENT_SCHEMA" is a great solution.
Did you try having a schema named CURRENT_SCHEMA?)

I verified that bison is happy with the grammar I proposed; I also
verified that you can add opt_column_list to the stanzas for tables, and
it remains happy.

-- 
Álvaro Herrera              Valdivia, Chile  —  https://www.EnterpriseDB.com/
And a voice from the chaos spoke to me and said,
"Smile and be happy, it could be worse."
And I smiled. And I was happy.
And it got worse.


