Re: [BUG] wrong refresh when ALTER SUBSCRIPTION ADD/DROP PUBLICATION - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [BUG] wrong refresh when ALTER SUBSCRIPTION ADD/DROP PUBLICATION
Date
Msg-id CAD21AoC9rqwW_kXn+XKg0pguLwGozQnFfvZc7_GFqf9o5jSNcA@mail.gmail.com
Whole thread Raw
In response to [BUG] wrong refresh when ALTER SUBSCRIPTION ADD/DROP PUBLICATION  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses RE: [BUG] wrong refresh when ALTER SUBSCRIPTION ADD/DROP PUBLICATION  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
List pgsql-hackers
Hi,

On Mon, Aug 2, 2021 at 10:52 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
>
> Hi hackers,
>
> When testing some other logical replication related patches, I found two
> unexpected behaviours about ALTER SUBSCRIPTION ADD/DROP PUBLICATION.
>
> (1)
> when I execute the following sqls[1], the data of tables that registered to
> 'pub' wasn't copied to the subscriber side which means tablesync worker didn't
> start.
>
> -----sub originally had 2 pub nodes(pub,pub2)
> ALTER SUBSCRIPTION sub drop PUBLICATION pub;
> ALTER SUBSCRIPTION sub add PUBLICATION pub;
> -----
>
> (2)
> And when I execute the following sqls, the data of table registered to 'pub2'
> are synced again.
>
> -----sub originally had 2 pub nodes(pub,pub2)
> ALTER SUBSCRIPTION sub drop PUBLICATION pub;
> ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
> -----

I could reproduce this issue by the above steps.

>
> After looking into this problem, I think the reason is the
> [alter sub add/drop publication] misused the function
> AlterSubscription_refresh().
>
> For DROP cases:
>
> Currently, in function AlterSubscription_refresh(), it will first fetch the
> target tables from the target publication, and also fetch the tables in
> subscriber side from pg_subscription_rel. Then it will check each table from
> local pg_subscription_rel, if the table does not exists in the tables fetched
> from the target publication then drop it.
>
> The logic above only works for SET PUBLICATION. However, When DROP PUBLICATION,
> the tables fetched from target publication is actually the tables that need to
> be dropped. If reuse the above logic, it will drop the wrong table which result
> in unexpected behavioud in (1) and (2).(ADD PUBLICATION have similar problem).

Yes. For example, suppose that the publisher has two publications pub1
and pub2 for table1 and table2, respecitively. And we create a
subscription subscribing to those two publications.

postgres(1:74454)=# \dRs sub
          List of subscriptions
 Name |  Owner   | Enabled | Publication
------+----------+---------+-------------
 sub  | masahiko | t       | {pub1,pub2}
(1 row)

postgres(1:74454)=# select srsubid, srrelid::regclass::text,
srsubstate, srsublsn from pg_subscription_rel;
 srsubid | srrelid | srsubstate | srsublsn
---------+---------+------------+-----------
   16394 | table1  | r          | 0/170F388
   16394 | table2  | r          | 0/170F388
(2 rows)

After dropping pub1 from the subscription, 'table2' associated with
'pub2' is removed:

postgres(1:74454)=# alter subscription sub drop publication pub1;
ALTER SUBSCRIPTION
postgres(1:74454)=# select srsubid, srrelid::regclass::text,
srsubstate, srsublsn from pg_subscription_rel;
 srsubid | srrelid | srsubstate | srsublsn
---------+---------+------------+-----------
   16394 | table1  | r          | 0/170F388
(1 row)

> So, I think we'd better fix this problem. I tried add some additional check in
> AlterSubscription_refresh() which can avoid the problem like the attached
> patch. Not sure do we need to further refactor.

I've not looked at the patch deeply yet but I think that the following
one line change seems to fix the cases you reported, although I migit
be missing something:

diff --git a/src/backend/commands/subscriptioncmds.c
b/src/backend/commands/subscriptioncmds.c
index 22ae982328..c74d6d51d6 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -1068,7 +1068,7 @@ AlterSubscription(ParseState *pstate,
AlterSubscriptionStmt *stmt,
                    PreventInTransactionBlock(isTopLevel, "ALTER
SUBSCRIPTION with refresh");

                    /* Only refresh the added/dropped list of publications. */
-                   sub->publications = stmt->publication;
+                   sub->publications = publist;

                    AlterSubscription_refresh(sub, opts.copy_data);
                }

Also, I think we need to add some regression tests for this.
Currently, subscription.sql has tests for ADD/DROP PUBLICATION but
they don't check pg_subscription_rel.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



pgsql-hackers by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: Re: Autovacuum on partitioned table (autoanalyze)
Next
From: Masahiko Sawada
Date:
Subject: Re: Failed transaction statistics to measure the logical replication progress