Re: Pgoutput not capturing the generated columns - Mailing list pgsql-hackers

From Shubham Khanna
Subject Re: Pgoutput not capturing the generated columns
Date
Msg-id CAHv8RjKxAaa55OyP8k86ka_tDV38qGHFGxM73gF_ky-BdikZ4Q@mail.gmail.com
Whole thread Raw
In response to Re: Pgoutput not capturing the generated columns  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Tue, May 21, 2024 at 12:23 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi,
>
> AFAICT this v2-0001 patch differences from v1 is mostly about adding
> the new CREATE SUBSCRIPTION option. Specifically, I don't think it is
> addressing any of my previous review comments for patch v1. [1]. So
> these comments below are limited only to the new option code; All my
> previous review comments probably still apply.
>
> ======
> Commit message
>
> 1. (General)
> The commit message is seriously lacking background explanation to describe:
> - What is the current behaviour w.r.t. generated columns
> - What is the problem with the current behaviour?
> - What exactly is this patch doing to address that problem?

Added the information related to this inside the Patch.

> 2.
> New option generated_option is added in create subscription. Now if this
> option is specified as 'true' during create subscription, generated
> columns in the tables, present in publisher (to which this subscription is
> subscribed) can also be replicated.
>
> -
>
> 2A.
> "generated_option" is not the name of the new option.
>
> ~
>
> 2B.
> "create subscription" stmt should be UPPERCASE; will also be more
> readable if the option name is quoted.
>
> ~
>
> 2C.
> Needs more information like under what condition is this option ignored etc.

Fixed.

> ======
> doc/src/sgml/ref/create_subscription.sgml
>
> 3.
> +       <varlistentry id="sql-createsubscription-params-with-generated-column">
> +        <term><literal>generated-column</literal> (<type>boolean</type>)</term>
> +        <listitem>
> +         <para>
> +          Specifies whether the generated columns present in the tables
> +          associated with the subscription should be replicated. The default is
> +          <literal>false</literal>.
> +         </para>
> +
> +         <para>
> +          This parameter can only be set true if copy_data is set to false.
> +          This option works fine when a generated column (in
> publisher) is replicated to a
> +          non-generated column (in subscriber). Else if it is
> replicated to a generated
> +          column, it will ignore the replicated data and fill the
> column with computed or
> +          default data.
> +         </para>
> +        </listitem>
> +       </varlistentry>
>
> 3A.
> There is a typo in the name "generated-column" because we should use
> underscores (not hyphens) for the option names.
>
> ~
>
> 3B.
> This it is not a good option name because there is no verb so it
> doesn't mean anything to set it true/false -- actually there IS a verb
> "generate" but we are not saying generate = true/false, so this name
> is also quite confusing.
>
> I think "include_generated_columns" would be much better, but if
> others think that name is too long then maybe "include_generated_cols"
> or "include_gen_cols" or similar. Of course, whatever if the final
> decision should be propagated same thru all the code comments, params,
> fields, etc.
>
> ~
>
> 3C.
> copy_data and false should be marked up as <literal> fonts in the sgml
>
> ~
>
> 3D.
>
> Suggest re-word this part. Don't need to explain when it "works fine".
>
> BEFORE
> This option works fine when a generated column (in publisher) is
> replicated to a non-generated column (in subscriber). Else if it is
> replicated to a generated column, it will ignore the replicated data
> and fill the column with computed or default data.
>
> SUGGESTION
> If the subscriber-side column is also a generated column then this
> option has no effect; the replicated data will be ignored and the
> subscriber column will be filled as normal with the subscriber-side
> computed or default data.

Fixed.

> ======
> src/backend/commands/subscriptioncmds.c
>
> 4. AlterSubscription
>     SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
>     SUBOPT_PASSWORD_REQUIRED |
>     SUBOPT_RUN_AS_OWNER | SUBOPT_FAILOVER |
> -   SUBOPT_ORIGIN);
> +   SUBOPT_ORIGIN | SUBOPT_GENERATED_COLUMN);
>
> Hmm. Is this correct? If ALTER is not allowed (later in this patch
> there is a message "toggling generated_column option is not allowed."
> then why are we even saying that SUBOPT_GENERATED_COLUMN is a
> support_opt for ALTER?

Fixed.

> 5.
> + if (IsSet(opts.specified_opts, SUBOPT_GENERATED_COLUMN))
> + {
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("toggling generated_column option is not allowed.")));
> + }
>
> 5A.
> I suspect this is not even needed if the 'supported_opt' is fixed per
> the previous comment.
>
> ~
>
> 5B.
> But if this message is still needed then I think it should say "ALTER
> is not allowed" (not "toggling is not allowed") and also the option
> name should be quoted as per the new guidelines for error messages.
>
> ======
> src/backend/replication/logical/proto.c

Fixed.

> 6. logicalrep_write_tuple
>
> - if (att->attisdropped || att->attgenerated)
> + if (att->attisdropped)
>   continue;
>
>   if (!column_in_column_list(att->attnum, columns))
>   continue;
>
> + if (att->attgenerated && !publish_generated_column)
> +
>
> Calling column_in_column_list() might be a more expensive operation
> than checking just generated columns flag so maybe reverse the order
> and check the generated columns first for a tiny performance gain.

Fixed.

> 7.
> - if (att->attisdropped || att->attgenerated)
> + if (att->attisdropped)
>   continue;
>
>   if (!column_in_column_list(att->attnum, columns))
>   continue;
>
> + if (att->attgenerated && !publish_generated_column)
> + continue;
>
> ditto #6

Fixed.

> 8. logicalrep_write_attrs
>
> - if (att->attisdropped || att->attgenerated)
> + if (att->attisdropped)
>   continue;
>
>   if (!column_in_column_list(att->attnum, columns))
>   continue;
>
> + if (att->attgenerated && !publish_generated_column)
> + continue;
> +
>
> ditto #6

Fixed.

> 9.
> - if (att->attisdropped || att->attgenerated)
> + if (att->attisdropped)
>   continue;
>
>   if (!column_in_column_list(att->attnum, columns))
>   continue;
>
> + if (att->attgenerated && !publish_generated_column)
> + continue;
>
> ditto #6
>
> ======
> src/include/catalog/pg_subscription.h

Fixed.

> 10. CATALOG
>
> + bool subgeneratedcolumn; /* True if generated colums must be published */
>
> /colums/columns/
>
> ======
> src/test/regress/sql/publication.sql

Fixed.

> 11.
> --- error: generated column "d" can't be in list
> +-- ok
>
>
> Maybe change "ok" to say like "ok: generated cols can be in the list too"

Fixed.

> 12.
> GENERAL - Missing CREATE SUBSCRIPTION test?
> GENERAL - Missing ALTER SUBSCRIPTION test?
>
> How come this patch adds a new CREATE SUBSCRIPTION option but does not
> seem to include any test case for that option in either the CREATE
> SUBSCRIPTION or ALTER SUBSCRIPTION regression tests?

Added the test cases for the same.

Patch v4-0001 contains all the changes required. See [1] for the changes added.

[1] https://www.postgresql.org/message-id/CAHv8RjJcOsk%3Dy%2BvJ3y%2BvXhzR9ZUzUEURvS_90hQW3MNfJ5di7A%40mail.gmail.com

Thanks and Regards,
Shubham Khanna.



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: pgsql: Add more SQL/JSON constructor functions
Next
From: Shubham Khanna
Date:
Subject: Re: Pgoutput not capturing the generated columns