Re: Pgoutput not capturing the generated columns - Mailing list pgsql-hackers
From | Shubham Khanna |
---|---|
Subject | Re: Pgoutput not capturing the generated columns |
Date | |
Msg-id | CAHv8RjKgUQweSr+OqnTO7S5a1kHV2bAZ7+9RfoN9aWbKoyouyg@mail.gmail.com Whole thread Raw |
In response to | Re: Pgoutput not capturing the generated columns (vignesh C <vignesh21@gmail.com>) |
List | pgsql-hackers |
On Thu, Oct 17, 2024 at 3:59 PM vignesh C <vignesh21@gmail.com> wrote: > > On Wed, 16 Oct 2024 at 23:25, Shubham Khanna > <khannashubham1197@gmail.com> wrote: > > > > On Wed, Oct 9, 2024 at 9:08 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Tue, 8 Oct 2024 at 11:37, Shubham Khanna <khannashubham1197@gmail.com> wrote: > > > > > > > > On Fri, Oct 4, 2024 at 9:36 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > > > > > > > Hi Shubham, here are my review comments for v36-0001. > > > > > > > > > > ====== > > > > > 1. General - merge patches > > > > > > > > > > It is long past due when patches 0001 and 0002 should've been merged. > > > > > AFAIK the split was only because historically these parts had > > > > > different authors. But, keeping them separated is not helpful anymore. > > > > > > > > > > ====== > > > > > src/backend/catalog/pg_publication.c > > > > > > > > > > 2. > > > > > Bitmapset * > > > > > -pub_collist_validate(Relation targetrel, List *columns) > > > > > +pub_collist_validate(Relation targetrel, List *columns, bool pubgencols) > > > > > > > > > > Since you removed the WARNING, this parameter 'pubgencols' is unused > > > > > so it should also be removed. > > > > > > > > > > ====== > > > > > src/backend/replication/pgoutput/pgoutput.c > > > > > > > > > > 3. > > > > > /* > > > > > - * If the publication is FOR ALL TABLES then it is treated the same as > > > > > - * if there are no column lists (even if other publications have a > > > > > - * list). > > > > > + * To handle cases where the publish_generated_columns option isn't > > > > > + * specified for all tables in a publication, we must create a column > > > > > + * list that excludes generated columns. So, the publisher will not > > > > > + * replicate the generated columns. > > > > > */ > > > > > - if (!pub->alltables) > > > > > + if (!(pub->alltables && pub->pubgencols)) > > > > > > > > > > I still found that comment hard to understand. Does this mean to say > > > > > something like: > > > > > > > > > > ------ > > > > > Process potential column lists for the following cases: > > > > > > > > > > a. Any publication that is not FOR ALL TABLES. > > > > > > > > > > b. When the publication is FOR ALL TABLES and > > > > > 'publish_generated_columns' is false. > > > > > A FOR ALL TABLES publication doesn't have user-defined column lists, > > > > > so all columns will be replicated by default. However, if > > > > > 'publish_generated_columns' is set to false, column lists must still > > > > > be created to exclude any generated columns from being published > > > > > ------ > > > > > > > > > > ====== > > > > > src/test/regress/sql/publication.sql > > > > > > > > > > 4. > > > > > +SET client_min_messages = 'WARNING'; > > > > > +CREATE TABLE gencols (a int, gen1 int GENERATED ALWAYS AS (a * 2) STORED); > > > > > > > > > > AFAIK you don't need to keep changing 'client_min_messages', > > > > > particularly now that you've removed the WARNING message that was > > > > > previously emitted. > > > > > > > > > > ~ > > > > > > > > > > 5. > > > > > nit - minor comment changes. > > > > > > > > > > ====== > > > > > Please refer to the attachment which implements any nits from above. > > > > > > > > > > > > > I have fixed all the given comments. Also, I have created a new 0003 > > > > patch for the TAP-Tests related to the '011_generated.pl' file. I am > > > > planning to merge 0001 and 0003 patches once they will get fixed. > > > > The attached patches contain the required changes. > > > > > > Few comments: > > > 1) Since we are no longer throwing an error for generated columns, the > > > function header comments also need to be updated accordingly " Checks > > > for and raises an ERROR for any; unknown columns, system columns, > > > duplicate columns or generated columns." > > > - if (TupleDescAttr(tupdesc, attnum - 1)->attgenerated) > > > - ereport(ERROR, > > > - > > > errcode(ERRCODE_INVALID_COLUMN_REFERENCE), > > > - errmsg("cannot use generated > > > column \"%s\" in publication column list", > > > - colname)); > > > - > > > > > > 2) Tab completion missing for "PUBLISH_GENERATED_COLUMNS" option in > > > ALTER PUBLICATION ... SET ( > > > postgres=# alter publication pub2 set (PUBLISH > > > PUBLISH PUBLISH_VIA_PARTITION_ROOT > > > > > > 3) I was able to compile without this include, may be this is not required: > > > --- a/src/backend/replication/logical/tablesync.c > > > +++ b/src/backend/replication/logical/tablesync.c > > > @@ -118,6 +118,7 @@ > > > #include "utils/builtins.h" > > > #include "utils/lsyscache.h" > > > #include "utils/memutils.h" > > > +#include "utils/rel.h" > > > > > > 4) You can include "\dRp+ pubname" after each of the create/alter > > > publication to verify the columns that will be published: > > > +-- Test the 'publish_generated_columns' parameter enabled or disabled for > > > +-- different scenarios with/without generated columns in column lists. > > > +CREATE TABLE gencols (a int, gen1 int GENERATED ALWAYS AS (a * 2) STORED); > > > + > > > +-- Generated columns in column list, when 'publish_generated_columns'=false > > > +CREATE PUBLICATION pub1 FOR table gencols(a, gen1) WITH > > > (publish_generated_columns=false); > > > > > > +-- Generated columns in column list, when 'publish_generated_columns'=true > > > +CREATE PUBLICATION pub2 FOR table gencols(a, gen1) WITH > > > (publish_generated_columns=true); > > > + > > > +-- Generated columns in column list, then set > > > 'publication_generate_columns'=false > > > +ALTER PUBLICATION pub2 SET (publish_generated_columns = false); > > > + > > > +-- Remove generate columns from column list, when > > > 'publish_generated_columns'=false > > > +ALTER PUBLICATION pub2 SET TABLE gencols(a); > > > + > > > +-- Add generated columns in column list, when 'publish_generated_columns'=false > > > +ALTER PUBLICATION pub2 SET TABLE gencols(a, gen1); > > > > > > > I have fixed all the given comments. The attached patches contain the > > required changes. > > Few comments: > 1) File mode change is not required: > src/test/subscription/t/011_generated.pl | 354 +++++++++++++++++++++++ > 1 file changed, 354 insertions(+) > mode change 100644 => 100755 src/test/subscription/t/011_generated.pl > > diff --git a/src/test/subscription/t/011_generated.pl > b/src/test/subscription/t/011_generated.pl > old mode 100644 > new mode 100755 > index 8b2e5f4708..d1f2718078 > --- a/src/test/subscription/t/011_generated.pl > +++ b/src/test/subscription/t/011_generated.pl > > 2) Here copy_data=true looks obvious no need to mention again and > again in comments: > +# Create table and subscription with copy_data=true. > +$node_subscriber->safe_psql( > + 'postgres', qq( > + CREATE TABLE tab_gen_to_nogen (a int, b int); > + CREATE SUBSCRIPTION regress_sub1_gen_to_nogen CONNECTION > '$publisher_connstr' PUBLICATION regress_pub1_gen_to_nogen WITH > (copy_data = true); > +)); > + > +# Create table and subscription with copy_data=true. > +$node_subscriber->safe_psql( > + 'test_pgc_true', qq( > + CREATE TABLE tab_gen_to_nogen (a int, b int); > + CREATE SUBSCRIPTION regress_sub2_gen_to_nogen CONNECTION > '$publisher_connstr' PUBLICATION regress_pub2_gen_to_nogen WITH > (copy_data = true); > +)); > + > +# Wait for initial sync. > +$node_subscriber->wait_for_subscription_sync($node_publisher, > + 'regress_sub1_gen_to_nogen', 'postgres'); > +$node_subscriber->wait_for_subscription_sync($node_publisher, > + 'regress_sub2_gen_to_nogen', 'test_pgc_true'); > + > +# Initial sync test when publish_generated_columns=false and copy_data=true. > +# Verify that column 'b' is not replicated. > +$result = $node_subscriber->safe_psql('postgres', > + "SELECT a, b FROM tab_gen_to_nogen"); > +is( $result, qq(1| > +2| > +3|), 'tab_gen_to_nogen initial sync, when publish_generated_columns=false'); > + > +# Initial sync test when publish_generated_columns=true and copy_data=true. > +$result = $node_subscriber->safe_psql('test_pgc_true', > + "SELECT a, b FROM tab_gen_to_nogen"); > +is( $result, qq(1|2 > +2|4 > +3|6), > + 'tab_gen_to_nogen initial sync, when publish_generated_columns=true'); > > 3) The database test_pgc_true and also can be cleaned as it is not > required after this: > +# cleanup > +$node_subscriber->safe_psql('postgres', > + "DROP SUBSCRIPTION regress_sub1_gen_to_nogen"); > +$node_subscriber->safe_psql('test_pgc_true', > + "DROP SUBSCRIPTION regress_sub2_gen_to_nogen"); > +$node_publisher->safe_psql( > + 'postgres', qq( > + DROP PUBLICATION regress_pub1_gen_to_nogen; > + DROP PUBLICATION regress_pub2_gen_to_nogen; > +)); > > 4) There is no error message verification in this test, let's add the > error verification: > +# ============================================================================= > +# Misc test. > +# > +# A "normal -> generated" replication fails, reporting an error that the > +# subscriber side column is missing. > +# > +# In this test case we use DROP EXPRESSION to change the subscriber generated > +# column into a normal column, then verify replication works ok. > +# ============================================================================= > > 5) > 5.a) If possible have one regular column and one generated column in the tables > +# -------------------------------------------------- > +# Testcase: Publisher replicates the column list data including generated > +# columns even though publish_generated_columns option is false. > +# -------------------------------------------------- > + > +# Create table and publications. > +$node_publisher->safe_psql( > + 'postgres', qq( > + CREATE TABLE gen_to_nogen (a int, b int, gen1 int GENERATED > ALWAYS AS (a * 2) STORED, gen2 int GENERATED ALWAYS AS (a * 2) > STORED); > + CREATE TABLE gen_to_nogen2 (c int, d int, gen1 int GENERATED > ALWAYS AS (c * 2) STORED, gen2 int GENERATED ALWAYS AS (c * 2) > STORED); > + CREATE TABLE nogen_to_gen2 (c int, d int, gen1 int GENERATED > ALWAYS AS (c * 2) STORED, gen2 int GENERATED ALWAYS AS (c * 2) > STORED); > + CREATE PUBLICATION pub1 FOR table gen_to_nogen(a, b, gen2), > gen_to_nogen2, nogen_to_gen2(gen1) WITH > (publish_generated_columns=false); > +)); > > 5.b) Try to have same columns in all the tables > > 6) These are inserting two records: > +# Insert data to verify incremental replication > +$node_publisher->safe_psql( > + 'postgres', qq( > + INSERT INTO gen_to_nogen VALUES (2), (3); > + INSERT INTO gen_to_nogen2 VALUES (2), (3); > + INSERT INTO nogen_to_gen2 VALUES (2), (3); > +)); > > I felt you wanted this to be: > +# Insert data to verify incremental replication > +$node_publisher->safe_psql( > + 'postgres', qq( > + INSERT INTO gen_to_nogen VALUES (2, 3); > + INSERT INTO gen_to_nogen2 VALUES (2, 3); > + INSERT INTO nogen_to_gen2 VALUES (2, 3); > +)); I have fixed all the comments and posted the v40 patches for them. Please refer to the updated v40 Patches here in [1]. See [1] for the changes added. [1] https://www.postgresql.org/message-id/CAHv8RjLviXAWtB3Kcn1A1jPpqORpkNay1y2U%2B55K64sqwCdrGw%40mail.gmail.com Thanks and Regards, Shubham Khanna.
pgsql-hackers by date: