Re: Logical Replication of sequences - Mailing list pgsql-hackers

From vignesh C
Subject Re: Logical Replication of sequences
Date
Msg-id CALDaNm0mSSrvHNRnC67f0HWMpoLW9UzxGVXimhwbRtKjE7Aa-Q@mail.gmail.com
Whole thread Raw
In response to Re: Logical Replication of sequences  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: Logical Replication of sequences
Re: Logical Replication of sequences
List pgsql-hackers
On Mon, 1 Jul 2024 at 12:57, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some review comments for the patch v20240625-0002
>
> ======
> Commit Message
>
> 1.
> This commit enhances logical replication by enabling the inclusion of all
> sequences in publications. This improvement facilitates seamless
> synchronization of sequence data during operations such as
> CREATE SUBSCRIPTION, REFRESH PUBLICATION, and REFRESH PUBLICATION SEQUENCES.
>
> ~
>
> Isn't this description getting ahead of the functionality a bit? For
> example, it talks about operations like REFRESH PUBLICATION SEQUENCES
> but AFAIK that syntax does not exist just yet.
>
> ~~~
>
> 2.
> The commit message should mention that you are only introducing new
> syntax for "FOR ALL SEQUENCES" here, but syntax for "FOR SEQUENCE" is
> being deferred to some later patch. Without such a note it is not
> clear why the gram.y syntax and docs seemed only half done.
>
> ======
> doc/src/sgml/ref/create_publication.sgml
>
> 3.
>     <varlistentry id="sql-createpublication-params-for-all-tables">
>      <term><literal>FOR ALL TABLES</literal></term>
> +    <term><literal>FOR ALL SEQUENCES</literal></term>
>      <listitem>
>       <para>
> -      Marks the publication as one that replicates changes for all tables in
> -      the database, including tables created in the future.
> +      Marks the publication as one that replicates changes for all tables or
> +      sequences in the database, including tables created in the future.
>
> It might be better here to keep descriptions for "ALL TABLES" and "ALL
> SEQUENCES" separated, otherwise the wording does not quite seem
> appropriate for sequences (e.g. where it says "including tables
> created in the future").
>
> ~~~
>
> NITPICK - missing spaces
> NITPICK - removed Oxford commas since previously there were none
>
> ~~~
>
> 4.
> +   If <literal>FOR TABLE</literal>, <literal>FOR ALL TABLES</literal>,
> +   <literal>FOR ALL SEQUENCES</literal>,or <literal>FOR TABLES IN
> SCHEMA</literal>
> +   are not specified, then the publication starts out with an empty set of
> +   tables.  That is useful if tables or schemas are to be added later.
>
> It seems like "FOR ALL SEQUENCES" is out of place since it is jammed
> between other clauses referring to TABLES. Would it be better to
> mention SEQUENCES last in the list?
>
> ~~~
>
> 5.
> +   rights on the table.  The <command>FOR ALL TABLES</command>,
> +   <command>FOR ALL SEQUENCES</command>, and
>     <command>FOR TABLES IN SCHEMA</command> clauses require the invoking
>
> ditto of #4 above.
>
> ======
> src/backend/catalog/pg_publication.c
>
> GetAllSequencesPublicationRelations:
>
> NITPICK - typo /relation/relations/
>
> ======
> src/backend/commands/publicationcmds.c
>
> 6.
> + foreach(lc, stmt->for_all_objects)
> + {
> + char    *val = strVal(lfirst(lc));
> +
> + if (strcmp(val, "tables") == 0)
> + for_all_tables = true;
> + else if (strcmp(val, "sequences") == 0)
> + for_all_sequences = true;
> + }
>
> Consider the foreach_ptr macro to slightly simplify this code.
> Actually, this whole logic seems cumbersome -- can’t the parser assign
> flags automatically. Please see my more detailed comment #10 below
> about this in gram.y
>
> ~~~
>
> 7.
>   /* FOR ALL TABLES requires superuser */
> - if (stmt->for_all_tables && !superuser())
> + if (for_all_tables && !superuser())
>   ereport(ERROR,
>   (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
>   errmsg("must be superuser to create FOR ALL TABLES publication")));
>
> + /* FOR ALL SEQUENCES requires superuser */
> + if (for_all_sequences && !superuser())
> + ereport(ERROR,
> + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> + errmsg("must be superuser to create FOR ALL SEQUENCES publication")));
> +
>
> The current code is easy to read, but I wonder if it should try harder
> to share common code, or at least a common translatable message like
> "must be superuser to create %s publication".
>
> ~~~
>
> 8.
> - else
> +
> + /*
> + * If the publication might have either tables or sequences (directly or
> + * through a schema), process that.
> + */
> + if (!for_all_tables || !for_all_sequences)
>
> I did not understand why this code cannot just say "else" like before,
> because the direct or through-schema syntax cannot be specified at the
> same time as "FOR ALL ...", so why is the more complicated condition
> necessary? Also, the similar code in AlterPublicationOptions() was not
> changed to be like this.
>
> ======
> src/backend/parser/gram.y
>
> 9. comment
>
>   *
>   * CREATE PUBLICATION FOR ALL TABLES [WITH options]
>   *
> + * CREATE PUBLICATION FOR ALL SEQUENCES [WITH options]
> + *
>   * CREATE PUBLICATION FOR pub_obj [, ...] [WITH options]
>
> The comment is not quite correct because actually you are allowing
> simultaneous FOR ALL TABLES, SEQUENCES. It should be more like:
>
> CREATE PUBLICATION FOR ALL pub_obj_type [,...] [WITH options]
>
> pub_obj_type is one of:
> TABLES
> SEQUENCES
>
> ~~~
>
> 10.
> +pub_obj_type: TABLES
> + { $$ = (Node *) makeString("tables"); }
> + | SEQUENCES
> + { $$ = (Node *) makeString("sequences"); }
> + ;
> +
> +pub_obj_type_list: pub_obj_type
> + { $$ = list_make1($1); }
> + | pub_obj_type_list ',' pub_obj_type
> + { $$ = lappend($1, $3); }
> + ;
>
> IIUC the only thing you need is a flag to say if FOR ALL TABLE is in
> effect and another flag to say if FOR ALL SEQUENCES is in effect. So,
> It seemed clunky to build up a temporary list of "tables" or
> "sequences" strings here, which is subsequently scanned by
> CreatePublication to be turned back into booleans.
>
> Can't we just change the CreatePublicationStmt field to have:
>
> A) a 'for_all_types' bitmask instead of a list:
> 0x0000 means FOR ALL is not specified
> 0x0001 means ALL TABLES
> 0x0010 means ALL SEQUENCES
>
> Or, B) have 2 boolean fields ('for_all_tables' and 'for_all_sequences')
>
> ...where the gram.y code can be written to assign the flag/s values directly?
>
> ======
> src/bin/pg_dump/pg_dump.c
>
> 11.
>   if (pubinfo->puballtables)
>   appendPQExpBufferStr(query, " FOR ALL TABLES");
>
> + if (pubinfo->puballsequences)
> + appendPQExpBufferStr(query, " FOR ALL SEQUENCES");
> +
>
> Hmm. Is that correct? It looks like a possible bug, because if both
> flags are true it will give invalid syntax like "FOR ALL TABLES FOR
> ALL SEQUENCES" instead of "FOR ALL TABLES, SEQUENCES"
>
> ======
> src/bin/pg_dump/t/002_pg_dump.pl
>
> 12.
> This could also try the test scenario of both FOR ALL being
> simultaneously set ("FOR ALL TABLES, SEQUENCES") to check for bugs
> like the suspected one in dump.c review comment #11 above.
>
> ======
> src/bin/psql/describe.c
>
> 13.
> + if (pset.sversion >= 170000)
> + printfPQExpBuffer(&buf,
> +   "SELECT pubname AS \"%s\",\n"
> +   "  pg_catalog.pg_get_userbyid(pubowner) AS \"%s\",\n"
> +   "  puballtables AS \"%s\",\n"
> +   "  puballsequences AS \"%s\",\n"
> +   "  pubinsert AS \"%s\",\n"
> +   "  pubupdate AS \"%s\",\n"
> +   "  pubdelete AS \"%s\"",
> +   gettext_noop("Name"),
> +   gettext_noop("Owner"),
> +   gettext_noop("All tables"),
> +   gettext_noop("All sequences"),
> +   gettext_noop("Inserts"),
> +   gettext_noop("Updates"),
> +   gettext_noop("Deletes"));
> + else
> + printfPQExpBuffer(&buf,
> +   "SELECT pubname AS \"%s\",\n"
> +   "  pg_catalog.pg_get_userbyid(pubowner) AS \"%s\",\n"
> +   "  puballtables AS \"%s\",\n"
> +   "  pubinsert AS \"%s\",\n"
> +   "  pubupdate AS \"%s\",\n"
> +   "  pubdelete AS \"%s\"",
> +   gettext_noop("Name"),
> +   gettext_noop("Owner"),
> +   gettext_noop("All tables"),
> +   gettext_noop("Inserts"),
> +   gettext_noop("Updates"),
> +   gettext_noop("Deletes"));
> +
>
> IMO this should be coded differently so that only the
> "puballsequences" column is guarded by the (pset.sversion >= 170000),
> and everything else is the same as before. This suggested way would
> also be consistent with the existing code version checks (e.g. for
> "pubtruncate" or for "pubviaroot").
>
> ~~~
>
> NITPICK - Add blank lines
> NITPICK - space in "ncols ++"
>
> ======
> src/bin/psql/tab-complete.c
>
> 14.
> Hmm. When I tried this, it didn't seem to be working properly.
>
> For example "CREATE PUBLICATION pub1 FOR ALL" only completes with
> "TABLES" but not "SEQUENCES".
> For example "CREATE PUBLICATION pub1 FOR ALL SEQ" doesn't complete
> "SEQUENCES" properly
>
> ======
> src/include/catalog/pg_publication.h
>
> NITPICK - move the extern to be adjacent to others like it.
>
> ======
> src/include/nodes/parsenodes.h
>
> 15.
> - bool for_all_tables; /* Special publication for all tables in db */
> + List    *for_all_objects; /* Special publication for all objects in
> + * db */
>  } CreatePublicationStmt;
>
> I felt this List logic is a bit strange. See my comment #10 in gram.y
> for more details.
>
> ~~~
>
> 16.
> - bool for_all_tables; /* Special publication for all tables in db */
> + List    *for_all_objects; /* Special publication for all objects in
> + * db */
>
> Ditto comment #15 in AlterPublicationStmt
>
> ======
> src/test/regress/sql/publication.sql
>
> 17.
> +CREATE SEQUENCE testpub_seq0;
> +CREATE SEQUENCE pub_test.testpub_seq1;
> +
> +SET client_min_messages = 'ERROR';
> +CREATE PUBLICATION testpub_forallsequences FOR ALL SEQUENCES;
> +RESET client_min_messages;
> +
> +SELECT pubname, puballtables, puballsequences FROM pg_publication
> WHERE pubname = 'testpub_forallsequences';
> +\d+ pub_test.testpub_seq1
>
> Should you also do "\d+ tespub_seq0" here? Otherwise what was the
> point of defining the seq0 sequence being in this test?
>
> ~~~
>
> 18.
> Maybe there are missing test cases for different syntax combinations like:
>
> FOR ALL TABLES, SEQUENCES
> FOR ALL SEQUENCES, TABLES
>
> Note that the current list logic of this patch even considers my
> following bogus statement syntax is OK.
>
> test_pub=# CREATE PUBLICATION pub_silly FOR ALL TABLES, SEQUENCES,
> TABLES, TABLES, TABLES, SEQUENCES;
> CREATE PUBLICATION
> test_pub=#
>
> ======
> 99.
> Please also refer to the attached nitpicks patch which implements all
> the cosmetic issues identified above as NITPICKS.

Thank you for your feedback. I have addressed all the comments in the
attached patch.

Regards,
Vignesh

Attachment

pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: Add a GUC check hook to ensure summarize_wal cannot be enabled when wal_level is minimal
Next
From: vignesh C
Date:
Subject: Re: Logical Replication of sequences