Re: Pgoutput not capturing the generated columns - Mailing list pgsql-hackers

From vignesh C
Subject Re: Pgoutput not capturing the generated columns
Date
Msg-id CALDaNm1PMyyHSR9u9DCtytnFJenoWMbtnecREprv5SX760j-Yg@mail.gmail.com
Whole thread Raw
In response to Re: Pgoutput not capturing the generated columns  (Shubham Khanna <khannashubham1197@gmail.com>)
Responses Re: Pgoutput not capturing the generated columns
List pgsql-hackers
On Fri, 14 Jun 2024 at 15:52, Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>
> > Thanks for the updated patch, few comments:
> > 1) The option name seems wrong here:
> > In one place include_generated_column is specified and other place
> > include_generated_columns is specified:
> >
> > +               else if (IsSet(supported_opts,
> > SUBOPT_INCLUDE_GENERATED_COLUMN) &&
> > +                                strcmp(defel->defname,
> > "include_generated_column") == 0)
> > +               {
> > +                       if (IsSet(opts->specified_opts,
> > SUBOPT_INCLUDE_GENERATED_COLUMN))
> > +                               errorConflictingDefElem(defel, pstate);
> > +
> > +                       opts->specified_opts |= SUBOPT_INCLUDE_GENERATED_COLUMN;
> > +                       opts->include_generated_column = defGetBoolean(defel);
> > +               }
>
> Fixed.
>
> > diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
> > index d453e224d9..e8ff752fd9 100644
> > --- a/src/bin/psql/tab-complete.c
> > +++ b/src/bin/psql/tab-complete.c
> > @@ -3365,7 +3365,7 @@ psql_completion(const char *text, int start, int end)
> >                 COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
> >                                           "disable_on_error",
> > "enabled", "failover", "origin",
> >                                           "password_required",
> > "run_as_owner", "slot_name",
> > -                                         "streaming",
> > "synchronous_commit", "two_phase");
> > +                                         "streaming",
> > "synchronous_commit", "two_phase","include_generated_columns");
> >
> > 2) This small data table need not have a primary key column as it will
> > create an index and insertion will happen in the index too.
> > +-- check include-generated-columns option with generated column
> > +CREATE TABLE gencoltable (a int PRIMARY KEY, b int GENERATED ALWAYS
> > AS (a * 2) STORED);
> > +INSERT INTO gencoltable (a) VALUES (1), (2), (3);
> > +SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,
> > NULL, 'include-xids', '0', 'skip-empty-xacts', '1',
> > 'include-generated-columns', '1');
>
> Fixed.
>
> > 3) Please add a test case for this:
> > +          set to <literal>false</literal>. If the subscriber-side
> > column is also a
> > +          generated column then this option has no effect; the
> > replicated data will
> > +          be ignored and the subscriber column will be filled as
> > normal with the
> > +          subscriber-side computed or default data.
>
> Added the required test case.
>
> > 4) You can use a new style of ereport to remove the brackets around errcode
> > 4.a)
> > +                       else if (!parse_bool(strVal(elem->arg),
> > &data->include_generated_columns))
> > +                               ereport(ERROR,
> > +
> > (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > +                                                errmsg("could not
> > parse value \"%s\" for parameter \"%s\"",
> > +
> > strVal(elem->arg), elem->defname)));
> >
> > 4.b) similarly here too:
> > +               ereport(ERROR,
> > +                               (errcode(ERRCODE_SYNTAX_ERROR),
> > +               /*- translator: both %s are strings of the form
> > "option = value" */
> > +                                       errmsg("%s and %s are mutually
> > exclusive options",
> > +                                               "copy_data = true",
> > "include_generated_column = true")));
> >
> > 4.c) similarly here too:
> > +                       if (include_generated_columns_option_given)
> > +                               ereport(ERROR,
> > +                                               (errcode(ERRCODE_SYNTAX_ERROR),
> > +                                                errmsg("conflicting
> > or redundant options")));
>
> Fixed.
>
> > 5) These variable names can be changed to keep it smaller, something
> > like gencol or generatedcol or gencolumn, etc
> > +++ b/src/include/catalog/pg_subscription.h
> > @@ -98,6 +98,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId)
> > BKI_SHARED_RELATION BKI_ROW
> >   * slots) in the upstream database are enabled
> >   * to be synchronized to the standbys. */
> >
> > + bool subincludegeneratedcolumn; /* True if generated columns must be
> > published */
> > +
> >  #ifdef CATALOG_VARLEN /* variable-length fields start here */
> >   /* Connection string to the publisher */
> >   text subconninfo BKI_FORCE_NOT_NULL;
> > @@ -157,6 +159,7 @@ typedef struct Subscription
> >   List    *publications; /* List of publication names to subscribe to */
> >   char    *origin; /* Only publish data originating from the
> >   * specified origin */
> > + bool includegeneratedcolumn; /* publish generated column data */
> >  } Subscription;
>
> Fixed.
>
> The attached Patch contains the suggested changes.

Few comments:
1) Here tab1 and tab2 are exactly the same tables, just check if the
table tab1 itself can be used for your tests.
@@ -24,20 +24,50 @@ $node_publisher->safe_psql('postgres',
        "CREATE TABLE tab1 (a int PRIMARY KEY, b int GENERATED ALWAYS
AS (a * 2) STORED)"
 );
+$node_publisher->safe_psql('postgres',
+       "CREATE TABLE tab2 (a int PRIMARY KEY, b int GENERATED ALWAYS
AS (a * 2) STORED)"
+);

2) We can document  that the include_generate_columns option cannot be altered.

3) You can mention that include-generated-columns is true by default
and generated column data will be selected
+-- When 'include-generated-columns' is not set
+SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
+                            data
+-------------------------------------------------------------
+ BEGIN
+ table public.gencoltable: INSERT: a[integer]:1 b[integer]:2
+ table public.gencoltable: INSERT: a[integer]:2 b[integer]:4
+ table public.gencoltable: INSERT: a[integer]:3 b[integer]:6
+ COMMIT
+(5 rows)

4)  The comment seems to be wrong here, the comment says b will not be
replicated but b is being selected:
-- When 'include-generated-columns' = '1' the generated column 'b'
values will not be replicated
INSERT INTO gencoltable (a) VALUES (1), (2), (3);
SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1',
'include-generated-columns', '1');
                            data
-------------------------------------------------------------
 BEGIN
 table public.gencoltable: INSERT: a[integer]:1 b[integer]:2
 table public.gencoltable: INSERT: a[integer]:2 b[integer]:4
 table public.gencoltable: INSERT: a[integer]:3 b[integer]:6
 COMMIT
(5 rows)

5)  Similarly here too the comment seems to be wrong, the comment says
b will not replicated but b is not being selected:
INSERT INTO gencoltable (a) VALUES (4), (5), (6);
-- When 'include-generated-columns' = '0' the generated column 'b'
values will be replicated
SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1',
'include-generated-columns', '0');
                      data
------------------------------------------------
 BEGIN
 table public.gencoltable: INSERT: a[integer]:4
 table public.gencoltable: INSERT: a[integer]:5
 table public.gencoltable: INSERT: a[integer]:6
 COMMIT
(5 rows)

6) SUBOPT_include_generated_columns change it to SUBOPT_GENERATED to
keep the name consistent:
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -72,6 +72,7 @@
 #define SUBOPT_FAILOVER                                0x00002000
 #define SUBOPT_LSN                                     0x00004000
 #define SUBOPT_ORIGIN                          0x00008000
+#define SUBOPT_include_generated_columns               0x00010000

7) The comment style seems to be inconsistent, both of them can start
in lower case
+-- check include-generated-columns option with generated column
+CREATE TABLE gencoltable (a int, b int GENERATED ALWAYS AS (a * 2) STORED);
+INSERT INTO gencoltable (a) VALUES (1), (2), (3);
+-- When 'include-generated-columns' is not set
+SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
+                            data
+-------------------------------------------------------------
+ BEGIN
+ table public.gencoltable: INSERT: a[integer]:1 b[integer]:2
+ table public.gencoltable: INSERT: a[integer]:2 b[integer]:4
+ table public.gencoltable: INSERT: a[integer]:3 b[integer]:6
+ COMMIT
+(5 rows)
+
+-- When 'include-generated-columns' = '1' the generated column 'b'
values will not be replicated

8) This could be changed to remove the insert statements by using
pg_logical_slot_peek_changes:
-- When 'include-generated-columns' is not set
SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
-- When 'include-generated-columns' = '1' the generated column 'b'
values will not be replicated
INSERT INTO gencoltable (a) VALUES (1), (2), (3);
SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1',
'include-generated-columns', '1');
INSERT INTO gencoltable (a) VALUES (4), (5), (6);
-- When 'include-generated-columns' = '0' the generated column 'b'
values will be replicated
SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1',
'include-generated-columns', '0');
to:
-- When 'include-generated-columns' is not set
SELECT data FROM pg_logical_slot_peek_changes('regression_slot', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
-- When 'include-generated-columns' = '1' the generated column 'b'
values will not be replicated
SELECT data FROM pg_logical_slot_peek_changes('regression_slot', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1',
'include-generated-columns', '1');
-- When 'include-generated-columns' = '0' the generated column 'b'
values will be replicated
SELECT data FROM pg_logical_slot_peek_changes('regression_slot', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1',
'include-generated-columns', '0');

9) In commit message  the  option used is wrong
include_generated_columns should actually be
include-generated-columns:
Usage from test_decoding plugin:
SELECT data FROM pg_logical_slot_get_changes('slot2', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1',
                                      'include_generated_columns','1');

Regards,
Vignesh



pgsql-hackers by date:

Previous
From: Shlok Kyal
Date:
Subject: Re: Pgoutput not capturing the generated columns
Next
From: Amit Langote
Date:
Subject: Re: SQL/JSON query functions context_item doc entry and type requirement