Re: Pgoutput not capturing the generated columns - Mailing list pgsql-hackers

From Shlok Kyal
Subject Re: Pgoutput not capturing the generated columns
Date
Msg-id CANhcyEWZH08g1gnxNxd-QeHESYniTmUdCULivS84qwKuK5zqTg@mail.gmail.com
Whole thread Raw
In response to RE: Pgoutput not capturing the generated columns  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
List pgsql-hackers
On Tue, 25 Jun 2024 at 18:49, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Shlok,
>
> Thanks for updating patches! Below are my comments, maybe only for 0002.
>
> 01. General
>
> IIUC, we are not discussed why ALTER SUBSCRIPTION ... SET include_generated_columns
> is prohibit. Previously, it seems okay because there are exclusive options. But now,
> such restrictions are gone. Do you have a reason in your mind? It is just not considered
> yet?
We donot support ALTER SUBSCRIPTION to alter
'include_generated_columns'. Suppose initially the user has a logical
replication setup. Publisher has
table t1 with columns (c1 int, c2 int generated always as (c1*2)) and
subscriber has table t1 with columns (c1 int, c2 int). And initially
'incude_generated_column' is true.
Now if we 'ALTER SUBSCRIPTION' to set 'include_generated_columns' as
false. Initial rows will have data for c2 on the subscriber table, but
will not have value after alter. This may be an inconsistent
behaviour.


> 02. General
>
> According to the doc, we allow to alter a column to non-generated one, by ALTER
> TABLE ... ALTER COLUMN ... DROP EXPRESSION command. Not sure, what should be
> when the command is executed on the subscriber while copying the data? Should
> we continue the copy or restart? How do you think?
COPY of data will happen in a single transaction, so if we execute
'ALTER TABLE ... ALTER COLUMN ... DROP EXPRESSION' command, It will
take place after the whole COPY command will finish. So I think it
will not create any issue.

> 03. Tes tcode
>
> IIUC, REFRESH PUBLICATION can also lead the table synchronization. Can you add
> a test for that?
Added

> 04. Test code (maybe for 0001)
>
> Please test the combination with TABLE ... ALTER COLUMN ... DROP EXPRESSION command.
Added

> 05. logicalrep_rel_open
>
> ```
> +            /*
> +             * In case 'include_generated_columns' is 'false', we should skip the
> +             * check of missing attrs for generated columns.
> +             * In case 'include_generated_columns' is 'true', we should check if
> +             * corresponding column for the generated column in publication column
> +             * list is present in the subscription table.
> +             */
> +            if (!MySubscription->includegencols && attr->attgenerated)
> +            {
> +                entry->attrmap->attnums[i] = -1;
> +                continue;
> +            }
> ```
>
> This comment is not very clear to me, because here we do not skip anything.
> Can you clarify the reason why attnums[i] is set to -1 and how will it be used?
This part of the code is removed to address some comments.

> 06. make_copy_attnamelist
>
> ```
> +    gencollist = palloc0(MaxTupleAttributeNumber * sizeof(bool));
> ```
>
> I think this array is too large. Can we reduce a size to (desc->natts * sizeof(bool))?
> Also, the free'ing should be done.
I have changed the name 'gencollist' to 'localgenlist' to make the
name more consistent. Also
size should be (rel->remoterel.natts * sizeof(bool)) as I am storing
if a column is generated like 'localgenlist[attnum] = true;'
where 'attnum' is corresponding attribute number on publisher side.

> 07. make_copy_attnamelist
>
> ```
> +    /* Loop to handle subscription table generated columns. */
> +    for (int i = 0; i < desc->natts; i++)
> ```
>
> IIUC, the loop is needed to find generated columns on the subscriber side, right?
> Can you clarify as comment?
Fixed

> 08. copy_table
>
> ```
> +    /*
> +     * Regular table with no row filter and 'include_generated_columns'
> +     * specified as 'false' during creation of subscription.
> +     */
> ```
>
> I think this comment is not correct. After patching, all tablesync command becomes
> like COPY (SELECT ...) if include_genereted_columns is set to true. Is it right?
> Can we restrict only when the table has generated ones?
Fixed

Please refer to v14 patch for the changes [1].


[1]: https://www.postgresql.org/message-id/CANhcyEW95M_usF1OJDudeejs0L0%2BYOEb%3DdXmt_4Hs-70%3DCXa-g%40mail.gmail.com

Thanks and Regards,
Shlok Kyal



pgsql-hackers by date:

Previous
From: Shlok Kyal
Date:
Subject: Re: Pgoutput not capturing the generated columns
Next
From: Daniel Gustafsson
Date:
Subject: Re: Problem while installing PostgreSQL using make