Re: Pgoutput not capturing the generated columns - Mailing list pgsql-hackers

From Shubham Khanna
Subject Re: Pgoutput not capturing the generated columns
Date
Msg-id CAHv8RjLQAUvZKZykFttUzBg0bW3pjieRGfV88-WZThw70WY-YQ@mail.gmail.com
Whole thread Raw
In response to Pgoutput not capturing the generated columns  (Rajendra Kumar Dangwal <dangwalrajendra888@gmail.com>)
List pgsql-hackers
On Wed, Oct 9, 2024 at 11:13 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi, here are my review comments for patch v37-0001.
>
> ======
> Commit message
>
> 1.
> Example usage of subscription option:
> CREATE PUBLICATION FOR TABLE tab_gencol WITH (publish_generated_columns
> = true);
>
> ~
>
> This is wrong -- it's not a "subscription option". Better to just say
> "Example usage:"
>
> ~~~
>
> 2.
> When 'copy_data' is true, during the initial sync, the data is replicated from
> the publisher to the subscriber using the COPY command. The normal COPY
> command does not copy generated columns, so when 'publish_generated_columns'
> is true...
>
> ~
>
> By only mentioning the "when ... is true" case this description does
> not cover the scenario when 'publish_generated_columns' is false when
> the publication column list has a generated column.
>
> ~~~
>
> 3.
> typo - /replication of generated column/replication of generated columns/
> typo - /filed/filled/
> typo - 'pg_publicataion' catalog
>
> ======
> src/backend/replication/logical/tablesync.c
>
> make_copy_attnamelist:
> 4.
> nit - missing word in a comment
>
> ~~~
>
> fetch_remote_table_info:
> 5.
> + appendStringInfo(&cmd,
>   "  FROM pg_catalog.pg_attribute a"
>   "  LEFT JOIN pg_catalog.pg_index i"
>   "       ON (i.indexrelid = pg_get_replica_identity_index(%u))"
>   " WHERE a.attnum > 0::pg_catalog.int2"
> - "   AND NOT a.attisdropped %s"
> + "   AND NOT a.attisdropped", lrel->remoteid);
> +
> + appendStringInfo(&cmd,
>   "   AND a.attrelid = %u"
>   " ORDER BY a.attnum",
> - lrel->remoteid,
> - (walrcv_server_version(LogRepWorkerWalRcvConn) >= 120000 ?
> -   "AND a.attgenerated = ''" : ""),
>   lrel->remoteid);
>
> Version v37-0001 has removed a condition previously between these two
> appendStringInfo's. But, that now means there is no reason to keep
> these statements separated. These should be combined now to use one
> appendStringInfo.
>
> ~
>
> 6.
> + if (server_version >= 120000)
> + remotegenlist[natt] = DatumGetBool(slot_getattr(slot, 5, &isnull));
> +
>
> Are you sure the version check for 120000 is correct? IIUC, this 5
> matches the 'attgenerated' column, but the SQL for that was
> constructed using a different condition:
> if (server_version >= 180000)
>   appendStringInfo(&cmd, ", a.attgenerated != ''");
>
> It is this 120000 versus 180000 difference that makes me suspicious of
> a potential mistake.
>
> ~~~
>
> 7.
> + /*
> + * If the column is generated and neither the generated column option
> + * is specified nor it appears in the column list, we will skip it.
> + */
> + if (remotegenlist[natt] && !has_pub_with_pubgencols &&
> + !bms_is_member(attnum, included_cols))
> + {
> + ExecClearTuple(slot);
> + continue;
> + }
>
> 7b.
> I am also suspicious about how this condition interacts with the other
> condition (shown below) that came earlier:
> /* If the column is not in the column list, skip it. */
> if (included_cols != NULL && !bms_is_member(attnum, included_cols))
>
> Something doesn't seem right. e.g. If we can only get here by passing
> the earlier condition, then it means we already know the generated
> condition was *not* a member of a column list.... in which case that
> should affect this new condition and the new comment too.
>
> ======
> src/backend/replication/pgoutput/pgoutput.c
>
> pgoutput_column_list_init:
>
> 8.
>   /*
> - * If the publication is FOR ALL TABLES then it is treated the same as
> - * if there are no column lists (even if other publications have a
> - * list).
> + * Process potential column lists for the following cases: a. Any
> + * publication that is not FOR ALL TABLES. b. When the publication is
> + * FOR ALL TABLES and 'publish_generated_columns' is false. FOR ALL
> + * TABLES publication doesn't have user-defined column lists, so all
> + * columns will be replicated by default. However, if
> + * 'publish_generated_columns' is set to false, column lists must
> + * still be created to exclude any generated columns from being
> + * published.
>   */
>
> nit - please reformat this comment so the bullets are readable
>

I have fixed all the comments and posted the v39 patches for them.
Please refer to the updated v39 Patches here in [1]. See [1] for the
changes added.

[1] https://www.postgresql.org/message-id/CAHv8RjLjb%2B98i5ZQUphivxdOZ3hSGLfq2SiWQetUvk8zGyAQwQ%40mail.gmail.com

Thanks and Regards,
Shubham Khanna.



pgsql-hackers by date:

Previous
From: Shubham Khanna
Date:
Subject: Re: Pgoutput not capturing the generated columns
Next
From: Shubham Khanna
Date:
Subject: Re: Pgoutput not capturing the generated columns