Re: Pgoutput not capturing the generated columns - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: Pgoutput not capturing the generated columns |
Date | |
Msg-id | CAHut+Pu3HGgwTwGJOauA6sgBtoMNzY70rimEMtg61Te28_M6oQ@mail.gmail.com Whole thread Raw |
In response to | Re: Pgoutput not capturing the generated columns (Shlok Kyal <shlok.kyal.oss@gmail.com>) |
Responses |
Re: Pgoutput not capturing the generated columns
|
List | pgsql-hackers |
Hi, here are my review comments for patch v7-0001. ====== 1. GENERAL - \dRs+ Shouldn't the new SUBSCRIPTION parameter be exposed via "describe" (e.g. \dRs+ mysub) the same as the other boolean parameters? ====== Commit message 2. When 'include_generated_columns' is false then the PUBLICATION col-list will ignore any generated cols even when they are present in a PUBLICATION col-list ~ Maybe you don't need to mention "PUBLICATION col-list" twice. SUGGESTION When 'include_generated_columns' is false, generated columns are not replicated, even when present in a PUBLICATION col-list. ~~~ 2. CREATE SUBSCRIPTION test1 connection 'dbname=postgres host=localhost port=9999 'publication pub1; ~ 2a. (I've questioned this one in previous reviews) What exactly is the purpose of this statement in the commit message? Was this supposed to demonstrate the usage of the 'include_generated_columns' parameter? ~ 2b. /publication/ PUBLICATION/ ~~~ 3. If the subscriber-side column is also a generated column then thisoption has no effect; the replicated data will be ignored and the subscriber column will be filled as normal with the subscriber-side computed or default data. ~ Missing space: /thisoption/this option/ ====== .../expected/decoding_into_rel.out 4. +-- When 'include-generated-columns' is not set +SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); + data +------------------------------------------------------------- + BEGIN + table public.gencoltable: INSERT: a[integer]:1 b[integer]:2 + table public.gencoltable: INSERT: a[integer]:2 b[integer]:4 + table public.gencoltable: INSERT: a[integer]:3 b[integer]:6 + COMMIT +(5 rows) Why is the default value here equivalent to 'include-generated-columns' = '1' here instead of '0'? The default for the CREATE SUBSCRIPTION parameter 'include_generated_columns' is false, and IMO it seems confusing for these 2 defaults to be different. Here I think it should default to '0' *regardless* of what the previous functionality might have done -- e.g. this is a "test decoder" so the parameter should behave sensibly. ====== .../test_decoding/sql/decoding_into_rel.sql NITPICK - wrong comments. ====== doc/src/sgml/protocol.sgml 5. + <varlistentry> + <term>include_generated_columns</term> + <listitem> + <para> + Boolean option to enable generated columns. This option controls + whether generated columns should be included in the string + representation of tuples during logical decoding in PostgreSQL. + The default is false. + </para> + </listitem> + </varlistentry> + Does the protocol version need to be bumped to support this new option and should that be mentioned on this page similar to how all other version values are mentioned? ====== doc/src/sgml/ref/create_subscription.sgml NITPICK - some missing words/sentence. NITPICK - some missing <literal> tags. NITPICK - remove duplicated sentence. NITPICK - add another <para>. ====== src/backend/commands/subscriptioncmds.c 6. #define SUBOPT_ORIGIN 0x00008000 +#define SUBOPT_include_generated_columns 0x00010000 Please use UPPERCASE for consistency with other macros. ====== .../libpqwalreceiver/libpqwalreceiver.c 7. + if (options->proto.logical.include_generated_columns && + PQserverVersion(conn->streamConn) >= 170000) + appendStringInfoString(&cmd, ", include_generated_columns 'on'"); + IMO it makes more sense to say 'true' here instead of 'on'. It seems like this was just cut/paste from the above code (where 'on' was sensible). ====== src/include/catalog/pg_subscription.h 8. @@ -98,6 +98,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW * slots) in the upstream database are enabled * to be synchronized to the standbys. */ + bool subincludegencol; /* True if generated columns must be published */ + Not fixed as claimed. This field name ought to be plural. /subincludegencol/subincludegencols/ ~~~ 9. char *origin; /* Only publish data originating from the * specified origin */ + bool includegencol; /* publish generated column data */ } Subscription; Not fixed as claimed. This field name ought to be plural. /includegencol/includegencols/ ====== src/test/subscription/t/031_column_list.pl 10. +$node_publisher->safe_psql('postgres', + "CREATE TABLE tab2 (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a * 2) STORED)" +); + +$node_publisher->safe_psql('postgres', + "CREATE TABLE tab3 (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a + 10) STORED)" +); + $node_subscriber->safe_psql('postgres', "CREATE TABLE tab1 (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a * 22) STORED, c int)" ); +$node_subscriber->safe_psql('postgres', + "CREATE TABLE tab2 (a int PRIMARY KEY, b int)" +); + +$node_subscriber->safe_psql('postgres', + "CREATE TABLE tab3 (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a + 20) STORED)" +); IMO the test needs lots more comments to describe what it is doing: For example, the setup deliberately has made: * publisher-side tab2 has generated col 'b' but subscriber-side tab2 has NON-gnerated col 'b'. * publisher-side tab3 has generated col 'b' but subscriber-side tab2 has DIFFERENT COMPUTATION generated col 'b'. So it will be better to have comments to explain all this instead of having to figure it out. ~~~ 11. # data for initial sync $node_publisher->safe_psql('postgres', "INSERT INTO tab1 (a) VALUES (1), (2), (3)"); +$node_publisher->safe_psql('postgres', + "INSERT INTO tab2 (a) VALUES (1), (2), (3)"); $node_publisher->safe_psql('postgres', - "CREATE PUBLICATION pub1 FOR ALL TABLES"); + "CREATE PUBLICATION pub1 FOR TABLE tab1"); +$node_publisher->safe_psql('postgres', + "CREATE PUBLICATION pub2 FOR TABLE tab2"); +$node_publisher->safe_psql('postgres', + "CREATE PUBLICATION pub3 FOR TABLE tab3"); + # Wait for initial sync of all subscriptions $node_subscriber->wait_for_subscription_sync; my $result = $node_subscriber->safe_psql('postgres', "SELECT a, b FROM tab1"); is( $result, qq(1|22 2|44 3|66), 'generated columns initial sync'); ~ IMO (and for completeness) it would be better to INSERT data for all the tables and alsot to validate that tables tab2 and tab3 has zero rows replicated. Yes, I know there is 'copy_data=false', but it is just easier to see all the tables instead of guessing why some are omitted, and anyway this test case will be needed after the next patch implements the COPY support for gen-cols. ~~~ 12. +$node_publisher->safe_psql('postgres', "INSERT INTO tab2 VALUES (4), (5)"); + +$node_publisher->wait_for_catchup('sub2'); + +$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab2"); +is( $result, qq(4|8 +5|10), 'generated columns replicated to non-generated column on subscriber'); + +$node_publisher->safe_psql('postgres', "INSERT INTO tab3 VALUES (4), (5)"); + +$node_publisher->wait_for_catchup('sub3'); + +$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab3"); +is( $result, qq(4|24 +5|25), 'generated columns replicated to non-generated column on subscriber'); + Here also I think there should be explicit comments about what these cases are testing, what results you are expecting, and why. The comments will look something like the message parameter of those safe_psql(...) e.g. # confirm generated columns ARE replicated when the subscriber-side column is not generated e.g. # confirm generated columns are NOT replicated when the subscriber-side column is also generated ====== 99. Please also see my nitpicks attachment patch for various other cosmetic and docs problems, and apply theseif you agree: - documentation wording/rendering - wrong comments - spacing - etc. ====== Kind Regards, Peter Smith. Fujitsu Australia
Attachment
pgsql-hackers by date: