Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION - Mailing list pgsql-hackers

From japin
Subject Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION
Date
Msg-id MEYP282MB16694A239E4937EA6A753D65B6A70@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
Whole thread Raw
In response to Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
On Fri, 15 Jan 2021 at 14:50, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> On Fri, Jan 15, 2021 at 11:33 AM Hou, Zhijie <houzj.fnst@cn.fujitsu.com> wrote:
>>
>> > On Thu, Jan 14, 2021 at 5:36 PM Li Japin <japinli@hotmail.com> wrote
>> > > Do we really need to access PUBLICATIONRELMAP in this patch? What if
>> > > we just set it to false in the else condition of (if (publish &&
>> > > (relkind != RELKIND_PARTITIONED_TABLE || pub->pubviaroot)))
>> > >
>> > > Thank for you review. I agree with you, it doesn’t need to access
>> > > PUBLICATIONRELMAP, since We already get the publication oid in
>> > > GetRelationPublications(relid) [1], which also access
>> > > PUBLICATIONRELMAP.  If the current pub->oid does not in pubids, the
>> > publish is false, so we do not need publish the table.
>> >
>> > +1. This is enough.
>> >
>> > > I have another question, the data->publications is a list, when did it
>> > has more then one items?
>> >
>> > IIUC, when the single table is associated with multiple publications, then
>> > data->publications will have multiple entries. Though I have not tried,
>> > we can try having two or three publications for the same table and verify
>> > that.
>> >
>> > > 0001 - change as you suggest.
>> > > 0002 - does not change.
>>
>>
>> Hi
>>
>> In 0001 patch
>>
>> The comments says " Relation is not associated with the publication anymore "
>> That comment was accurate when check is_relation_part_of_publication() in the last patch.
>>
>> But when put the code in the else branch, not only the case in the comments(Relation is dropped),
>> Some other case such as "partitioned tables" will hit else branch too.
>>
>> Do you think we need fix the comment a little to make it accurate?
>
> Thanks, that comment needs a rework, in fact the else condition
> also(because we may fall into else branch even when publish is true
> and (relkind != RELKIND_PARTITIONED_TABLE || pub->pubviaroot) is
> false, but our intention(to fix the bug reported here) is to have the
> else condition only when publish is false. And also instead of just
> relying on the publish variable which can get set even when the
> relation is not in the publication but ancestor_published is true, we
> can just have something like below to fix the bug reported here:
>
>             if (publish &&
>                 (relkind != RELKIND_PARTITIONED_TABLE || pub->pubviaroot))
>             {
>                 entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
>                 entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
>                 entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
>                 entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
>             }
>             else if (!list_member_oid(pubids, pub->oid))
>             {
>                 /*
>                  * Relation is not associated with the publication anymore i.e
>                  * it would have been dropped from the publication. So we don't
>                  * need to publish the changes for it.
>                  */
>                 entry->pubactions.pubinsert = false;
>                 entry->pubactions.pubupdate = false;
>                 entry->pubactions.pubdelete = false;
>                 entry->pubactions.pubtruncate = false;
>             }
>
> So this way, the fix only focuses on what we have reported here and it
> doesn't cause any regressions may be, and the existing comment becomes
> appropriate.
>
> Thoughts?
>

I'm sorry, the 0001 patch is totally wrong.  If we have only one publication, it works,
however, this is a special case.  If we have multiple publication in one subscription,
it doesn't work.  Here is a usecase.

Step (1)
-- On publisher
CREATE TABLE t1 (a int);
CREATE TABLE t2 (a int);
CREATE PUBLICATION mypub1 FOR TABLE t1 WITH (publish='insert');
CREATE PUBLICATION mypub2 FOR TABLE t2;

Step (2)
-- On subscriber
CREATE TABLE t1 (a int);
CREATE TABLE t2 (a int);
CREATE SUBSCRIPTION mysub CONNECTION 'host=localhost port=5432 dbname=postgres' PUBLICATION mypub1, mypub2;

Step (3)
-- On publisher
INSERT INTO t1 VALUES (1);

Step (4)
-- On subscriber
SELECT * FROM t1;

In step (4), we cannot get the records, because there has two publications in
data->publications, so we will check one by one.
The first publication is "mypub1" and entry->pubactions.pubinsert will be set
true, however, in the second round, the publication is "mypub2", and we cannot
find pub->oid in pubids (only oid of "mypub1"), so it will set all entry->pubactions
to false, and nothing will be replicate to subscriber.

My apologies!

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



pgsql-hackers by date:

Previous
From: "lchch1990@sina.cn"
Date:
Subject: Re: Wrong HINT during database recovery when occur a minimal wal.
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: Asynchronous Append on postgres_fdw nodes.