Thread: Fix pg_publication_tables to exclude generated columns

Fix pg_publication_tables to exclude generated columns

From
"shiy.fnst@fujitsu.com"
Date:
Hi hackers,

I noticed that there is a problem about system view pg_publication_tables when
looking into [1]. The column "attnames" contains generated columns when no
column list is specified, but generated columns shouldn't be included because
they are not replicated (see send_relation_and_attrs()).

I think one way to fix it is to modify pg_publication_tables query to exclude
generated columns. But in this way, we need to bump catalog version when fixing
it in back-branch. Another way is to modify function
pg_get_publication_tables()'s return value to contain all supported columns if
no column list is specified, and we don't need to change system view.

Attach the patch for HEAD, and we can ignore the changes of the system view in
PG15.

[1]
https://www.postgresql.org/message-id/OSZPR01MB631087C65BA81E1FEE5A60D2FDF59%40OSZPR01MB6310.jpnprd01.prod.outlook.com

Regards,
Shi yu



Attachment

Re: Fix pg_publication_tables to exclude generated columns

From
Amit Kapila
Date:
On Mon, Jan 9, 2023 at 5:29 PM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> I noticed that there is a problem about system view pg_publication_tables when
> looking into [1]. The column "attnames" contains generated columns when no
> column list is specified, but generated columns shouldn't be included because
> they are not replicated (see send_relation_and_attrs()).
>
> I think one way to fix it is to modify pg_publication_tables query to exclude
> generated columns. But in this way, we need to bump catalog version when fixing
> it in back-branch. Another way is to modify function
> pg_get_publication_tables()'s return value to contain all supported columns if
> no column list is specified, and we don't need to change system view.
>

That sounds like a reasonable approach to fix the issue.

-- 
With Regards,
Amit Kapila.



Re: Fix pg_publication_tables to exclude generated columns

From
Tom Lane
Date:
Amit Kapila <amit.kapila16@gmail.com> writes:
> On Mon, Jan 9, 2023 at 5:29 PM shiy.fnst@fujitsu.com
> <shiy.fnst@fujitsu.com> wrote:
>> I think one way to fix it is to modify pg_publication_tables query to exclude
>> generated columns. But in this way, we need to bump catalog version when fixing
>> it in back-branch. Another way is to modify function
>> pg_get_publication_tables()'s return value to contain all supported columns if
>> no column list is specified, and we don't need to change system view.

> That sounds like a reasonable approach to fix the issue.

We could just not fix it in the back branches.  I'd argue that this is
as much a definition change as a bug fix, so it doesn't really feel
like something to back-patch anyway.

            regards, tom lane



RE: Fix pg_publication_tables to exclude generated columns

From
"shiy.fnst@fujitsu.com"
Date:
On Mon, Jan 9, 2023 11:06 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > On Mon, Jan 9, 2023 at 5:29 PM shiy.fnst@fujitsu.com
> > <shiy.fnst@fujitsu.com> wrote:
> >> I think one way to fix it is to modify pg_publication_tables query to exclude
> >> generated columns. But in this way, we need to bump catalog version when
> fixing
> >> it in back-branch. Another way is to modify function
> >> pg_get_publication_tables()'s return value to contain all supported columns
> if
> >> no column list is specified, and we don't need to change system view.
>
> > That sounds like a reasonable approach to fix the issue.
>
> We could just not fix it in the back branches.  I'd argue that this is
> as much a definition change as a bug fix, so it doesn't really feel
> like something to back-patch anyway.
>

If this is not fixed in back-branch, in some cases we will get an error when
creating/refreshing subscription because we query pg_publication_tables in
column list check.

e.g.

-- publisher
CREATE TABLE test_mix_4 (a int PRIMARY KEY, b int, c int, d int GENERATED ALWAYS AS (a + 1) STORED);
CREATE PUBLICATION pub_mix_7 FOR TABLE test_mix_4 (a, b, c);
CREATE PUBLICATION pub_mix_8 FOR TABLE test_mix_4;

-- subscriber
CREATE TABLE test_mix_4 (a int PRIMARY KEY, b int, c int, d int);

postgres=# CREATE SUBSCRIPTION sub1 CONNECTION 'port=5432' PUBLICATION pub_mix_7, pub_mix_8;
ERROR:  cannot use different column lists for table "public.test_mix_4" in different publications

I think it might be better to fix it in back-branch. And if we fix it by
modifying pg_get_publication_tables(), we don't need to bump catalog version in
back-branch, I think this seems acceptable.

Regards,
Shi yu



Re: Fix pg_publication_tables to exclude generated columns

From
Amit Kapila
Date:
On Tue, Jan 10, 2023 at 8:38 AM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> On Mon, Jan 9, 2023 11:06 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > On Mon, Jan 9, 2023 at 5:29 PM shiy.fnst@fujitsu.com
> > > <shiy.fnst@fujitsu.com> wrote:
> > >> I think one way to fix it is to modify pg_publication_tables query to exclude
> > >> generated columns. But in this way, we need to bump catalog version when
> > fixing
> > >> it in back-branch. Another way is to modify function
> > >> pg_get_publication_tables()'s return value to contain all supported columns
> > if
> > >> no column list is specified, and we don't need to change system view.
> >
> > > That sounds like a reasonable approach to fix the issue.
> >
> > We could just not fix it in the back branches.  I'd argue that this is
> > as much a definition change as a bug fix, so it doesn't really feel
> > like something to back-patch anyway.
> >
>
> If this is not fixed in back-branch, in some cases we will get an error when
> creating/refreshing subscription because we query pg_publication_tables in
> column list check.
>
> e.g.
>
> -- publisher
> CREATE TABLE test_mix_4 (a int PRIMARY KEY, b int, c int, d int GENERATED ALWAYS AS (a + 1) STORED);
> CREATE PUBLICATION pub_mix_7 FOR TABLE test_mix_4 (a, b, c);
> CREATE PUBLICATION pub_mix_8 FOR TABLE test_mix_4;
>
> -- subscriber
> CREATE TABLE test_mix_4 (a int PRIMARY KEY, b int, c int, d int);
>
> postgres=# CREATE SUBSCRIPTION sub1 CONNECTION 'port=5432' PUBLICATION pub_mix_7, pub_mix_8;
> ERROR:  cannot use different column lists for table "public.test_mix_4" in different publications
>
> I think it might be better to fix it in back-branch. And if we fix it by
> modifying pg_get_publication_tables(), we don't need to bump catalog version in
> back-branch, I think this seems acceptable.
>

So, if we don't backpatch then it could lead to an error when it
shouldn't have which is clearly a bug. I think we should backpatch
this unless Tom or others are against it.

-- 
With Regards,
Amit Kapila.



Re: Fix pg_publication_tables to exclude generated columns

From
Tom Lane
Date:
Amit Kapila <amit.kapila16@gmail.com> writes:
>> On Mon, Jan 9, 2023 11:06 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> We could just not fix it in the back branches.  I'd argue that this is
>>> as much a definition change as a bug fix, so it doesn't really feel
>>> like something to back-patch anyway.

> So, if we don't backpatch then it could lead to an error when it
> shouldn't have which is clearly a bug. I think we should backpatch
> this unless Tom or others are against it.

This isn't a hill that I'm ready to die on ... but do we have any field
complaints about this?  If not, I still lean against a back-patch.
I think there's a significant risk of breaking case A while fixing
case B when we change this behavior, and that's something that's
better done only in a major release.

            regards, tom lane



Re: Fix pg_publication_tables to exclude generated columns

From
Amit Kapila
Date:
On Wed, Jan 11, 2023 at 10:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> >> On Mon, Jan 9, 2023 11:06 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> We could just not fix it in the back branches.  I'd argue that this is
> >>> as much a definition change as a bug fix, so it doesn't really feel
> >>> like something to back-patch anyway.
>
> > So, if we don't backpatch then it could lead to an error when it
> > shouldn't have which is clearly a bug. I think we should backpatch
> > this unless Tom or others are against it.
>
> This isn't a hill that I'm ready to die on ... but do we have any field
> complaints about this?  If not, I still lean against a back-patch.
> I think there's a significant risk of breaking case A while fixing
> case B when we change this behavior, and that's something that's
> better done only in a major release.
>

Fair enough, but note that there is a somewhat related problem for
dropped columns [1] as well. While reviewing that it occurred to me
that generated columns also have a similar problem which leads to this
thread (it would have been better if there is a mention of the same in
the initial email). Now, as symptoms are similar, I think we shouldn't
back-patch that as well, otherwise, it will appear to be partially
fixed. What do you think?

[1] -
https://www.postgresql.org/message-id/OSZPR01MB631087C65BA81E1FEE5A60D2FDF59%40OSZPR01MB6310.jpnprd01.prod.outlook.com

-- 
With Regards,
Amit Kapila.



RE: Fix pg_publication_tables to exclude generated columns

From
"shiy.fnst@fujitsu.com"
Date:
On Wed, Jan 11, 2023 2:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> On Wed, Jan 11, 2023 at 10:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Amit Kapila <amit.kapila16@gmail.com> writes:
> > >> On Mon, Jan 9, 2023 11:06 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >>> We could just not fix it in the back branches.  I'd argue that this is
> > >>> as much a definition change as a bug fix, so it doesn't really feel
> > >>> like something to back-patch anyway.
> >
> > > So, if we don't backpatch then it could lead to an error when it
> > > shouldn't have which is clearly a bug. I think we should backpatch
> > > this unless Tom or others are against it.
> >
> > This isn't a hill that I'm ready to die on ... but do we have any field
> > complaints about this?  If not, I still lean against a back-patch.
> > I think there's a significant risk of breaking case A while fixing
> > case B when we change this behavior, and that's something that's
> > better done only in a major release.
> >
> 
> Fair enough, but note that there is a somewhat related problem for
> dropped columns [1] as well. While reviewing that it occurred to me
> that generated columns also have a similar problem which leads to this
> thread (it would have been better if there is a mention of the same in
> the initial email). Now, as symptoms are similar, I think we shouldn't
> back-patch that as well, otherwise, it will appear to be partially
> fixed. What do you think?
> 
> [1] - https://www.postgresql.org/message-
> id/OSZPR01MB631087C65BA81E1FEE5A60D2FDF59%40OSZPR01MB6310.jpnpr
> d01.prod.outlook.com
> 

I agree to only fix them on HEAD.

I merged this patch and the one in [1] as they are similar problems. Please
see the attached patch.

I removed the changes in tablesync.c which simplified the query in
fetch_remote_table_info(), because it only works for publishers of v16. Those
changes are based on pg_get_publication_tables() returning all columns when no
column list is specified, but publishers of v15 return NULL in that case.

[1]
https://www.postgresql.org/message-id/OSZPR01MB631087C65BA81E1FEE5A60D2FDF59%40OSZPR01MB6310.jpnprd01.prod.outlook.com


Regards,
Shi yu

Attachment

Re: Fix pg_publication_tables to exclude generated columns

From
Amit Kapila
Date:
On Thu, Jan 12, 2023 at 12:33 PM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> On Wed, Jan 11, 2023 2:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Jan 11, 2023 at 10:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >
> > > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > >> On Mon, Jan 9, 2023 11:06 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > >>> We could just not fix it in the back branches.  I'd argue that this is
> > > >>> as much a definition change as a bug fix, so it doesn't really feel
> > > >>> like something to back-patch anyway.
> > >
> > > > So, if we don't backpatch then it could lead to an error when it
> > > > shouldn't have which is clearly a bug. I think we should backpatch
> > > > this unless Tom or others are against it.
> > >
> > > This isn't a hill that I'm ready to die on ... but do we have any field
> > > complaints about this?  If not, I still lean against a back-patch.
> > > I think there's a significant risk of breaking case A while fixing
> > > case B when we change this behavior, and that's something that's
> > > better done only in a major release.
> > >
> >
> > Fair enough, but note that there is a somewhat related problem for
> > dropped columns [1] as well. While reviewing that it occurred to me
> > that generated columns also have a similar problem which leads to this
> > thread (it would have been better if there is a mention of the same in
> > the initial email). Now, as symptoms are similar, I think we shouldn't
> > back-patch that as well, otherwise, it will appear to be partially
> > fixed. What do you think?
> >
> > [1] - https://www.postgresql.org/message-
> > id/OSZPR01MB631087C65BA81E1FEE5A60D2FDF59%40OSZPR01MB6310.jpnpr
> > d01.prod.outlook.com
> >
>
> I agree to only fix them on HEAD.
>
> I merged this patch and the one in [1] as they are similar problems. Please
> see the attached patch.
>

Pushed.

-- 
With Regards,
Amit Kapila.