Re: [BUG] wrong refresh when ALTER SUBSCRIPTION ADD/DROP PUBLICATION - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [BUG] wrong refresh when ALTER SUBSCRIPTION ADD/DROP PUBLICATION
Date
Msg-id CAD21AoDvEGk2vNBQao3_mg7csGPQfdV97dS_AEsy=3Dq9PXChw@mail.gmail.com
Whole thread Raw
In response to RE: [BUG] wrong refresh when ALTER SUBSCRIPTION ADD/DROP PUBLICATION  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses Re: [BUG] wrong refresh when ALTER SUBSCRIPTION ADD/DROP PUBLICATION
List pgsql-hackers
On Thu, Aug 5, 2021 at 11:40 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Thursday, August 5, 2021 1:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote
> > I've reviewed v2 patch. Here are some comments:
> >
> > +           if (type == ALTER_SUBSCRIPTION_SET_PUBLICATION ||
> > +               type == ALTER_SUBSCRIPTION_REFRESH)
> > +               drop_table = !bsearch(&relid, pubrel_local_oids,
> > +                                     list_length(pubrel_names),
> > +                                     sizeof(Oid), oid_cmp);
> > +           else if (type == ALTER_SUBSCRIPTION_DROP_PUBLICATION)
> > +               drop_table = bsearch(&relid, pubrel_local_oids,
> > +                                    list_length(pubrel_names),
> > +                                    sizeof(Oid), oid_cmp);
> >
> > IIUC, in DROP PUBLICATION cases, we don't need to refer to the tables in the
> > publication on the publisher that we're dropping in order to check whether to
> > remove the relation. If we drop a table from the publication on the publisher
> > before dropping the publication from the subscription on the subscriber, the
> > pg_subscription_rel entry for the table remains. For example, please consider the
> > following case:
> >
> > On publisher and subscriber:
> > create table t1 (a int);
> > create table t2 (a int);
> > create table t3 (a int);
> >
> > On publisher:
> > create publication pub12 for table t1, t2; create publication pub3 for table t3;
> >
> > On subscriber:
> > create subscription sub connection 'dbname=postgres' publication pub12, pub3;
> >
> > On publisher:
> > alter publication pub12 drop table t2;
> >
> > On subscriber:
> > alter subscription sub drop publication pub12; select srsubid,
> > srrelid::regclass::text, srsubstate, srsublsn from pg_subscription_rel;
> >
> >  srsubid | srrelid | srsubstate | srsublsn
> > ---------+---------+------------+-----------
> >    16393 | t3      | r          | 0/1707CE0
> >    16393 | t2      | r          | 0/1707CE0
> >
> > With v2 patch, the pg_subscription_rel has the entry for 't2' but it should not
> > exist.
> >
> > But that doesn't mean that drop_table should always be true in DROP
> > PULIBCATION cases. Since the tables associated with two publications can
> > overlap, we cannot remove the relation from pg_subscription_rel if it is also
> > associated with the remaining publication. For example:
> >
> > On publisher and subscriber:
> > create table t1 (a int);
> > create table t2 (a int);
> > create table t3 (a int);
> >
> > On publisher:
> > create publication pub12 for table t1, t2; create publication pub13 for table t1, t3;
> >
> > On subscriber:
> > create subscription sub connection 'dbname=postgres' publication pub12,
> > pub13;
> >
> > On subscriber:
> > alter subscription sub drop publication pub12; select srsubid,
> > srrelid::regclass::text, srsubstate, srsublsn from pg_subscription_rel;  srsubid |
> > srrelid | srsubstate | srsublsn
> > ---------+---------+------------+-----------
> >    16393 | t3      | r          | 0/17080B0
> > (1 row)
> >
> > With v2 patch, the pg_subscription_rel has only one entry for t3 but it should
> > have also the one for t1 since it's still subscribing to pub13.
> >
> > To summary, I think that what we want to do in DROP SUBSCRIPTION cases is to
> > drop relations from pg_subscription_rel that are no longer included in the set of
> > after-calculated publications. In the latter example, when ALTER
> > SUBSCRIPTION ... DROP PUBLICATION pub12, we should compare the current
> > set of relations associated with {pub12, pub13} to the set of relcations associated
> > with pub13, not pub12. Then we can find out t2 is no longer associated with the
> > after-calculated publication (i.g., pub13) so remove it.
>
> Thanks for the review. You are right, I missed this point.
> Attach new version patch which fix these problems(Also add a new pl-test).
>

Thank you for updating the patch!

Here are some comments:

I think that it still could happen that DROP PULIBCATION misses
dropping relations. Please consider the following case:

On publisher and subscriber:
create table t1 (a int);
create table t2 (a int);
create table t3 (a int);

On publisher:
create publication pub12 for table t1, t2;
create publication pub3 for table t3;

On subscriber:
create subscription sub connection 'dbname=postgres' publication pub12, pub3;

On publisher:
alter publication pub3 add table t1;

On subscriber:
alter subscription sub drop publication pub12;
select srsubid, srrelid::regclass::text, srsubstate, srsublsn from
pg_subscription_rel;
 srsubid | srrelid | srsubstate | srsublsn
---------+---------+------------+-----------
   16393 | t3      | r          | 0/1707CE0
   16393 | t1      | r          | 0/1707D18

With v3 patch, pg_subscription_rel has entries for t3 and t1. But if
ALTER SUBSCRIPTION ... DROP PUBLICATION pub12 refreshes only relations
associated with pub12, t1 should be removed and be added when we
refresh pub3.

---
-AlterSubscription_refresh(Subscription *sub, bool copy_data)
+AlterSubscription_refresh(Subscription *sub, bool copy_data,
+                         AlterSubscriptionType type)

I think it's better to add comments to this function. The meaning of
*sub varies depending on 'type'. For example, only in ADD PUBLICATION
cases, 'sub' has only newly-added publications to this function. In
other cases, 'sub' has a new set of publications.

---
+           if (type != ALTER_SUBSCRIPTION_DROP_PUBLICATION)
            {
-               AddSubscriptionRelState(sub->oid, relid,
-                                       copy_data ? SUBREL_STATE_INIT
: SUBREL_STATE_READY,
-                                       InvalidXLogRecPtr);
-               ereport(DEBUG1,
-                       (errmsg_internal("table \"%s.%s\" added to
subscription \"%s\"",
-                                        rv->schemaname, rv->relname,
sub->name)));
+               /* Check for supported relkind. */
+               CheckSubscriptionRelkind(get_rel_relkind(relid),
+                                       rv->schemaname, rv->relname);

I think CheckSubscriptionRelkind() is necessary even in
non-DROP-PUBLICATION cases.

---
+       if (type != ALTER_SUBSCRIPTION_ADD_PUBLICATION)
+       {
+           qsort(pubrel_local_oids, list_length(pubrel_names),
+                 sizeof(Oid), oid_cmp);
+           ntables_to_drop = list_length(subrel_states);
+       }

I think we can break here in ADD PUBLICATION cases, which seems more
readable to me. That way, we don't need ntables_to_drop. Also, we
palloc the memory for sub_remove_rels even in ADD PUBLICATION cases
but I think we don't need to do that.

---
+# Create tables on publisher
+$node_publisher->safe_psql('postgres', "CREATE TABLE temp1 (a int)");
+$node_publisher->safe_psql('postgres', "CREATE TABLE temp2 (a int)");
+$node_publisher->safe_psql('postgres', "CREATE TABLE temp3 (a int)");
+
+# Create tables on subscriber
+$node_subscriber->safe_psql('postgres', "CREATE TABLE temp1 (a int)");
+$node_subscriber->safe_psql('postgres', "CREATE TABLE temp2 (a int)");
+$node_subscriber->safe_psql('postgres', "CREATE TABLE temp3 (a int)");

Can we execute multiple statements in one safe_psql?

---
+# check initial pg_subscription_rel
+my $result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(3),
+ 'three relation was subscribed');

s/three relation/three relations/

---
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM pg_subscription_rel");
+
+is($result, qq(2),
+ 'check one relation was removed from subscribed list');
+
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM pg_subscription_rel where
srrelid::regclass::text = 'temp1'");
+is($result, qq(0),
+ 'check relation temp1 was removed from subscribed list');

At some places, the test checks the number of total tuples and then
does an existence check. I think we can check the contents of
pg_subscription_rel instead. For example, in the above checks, we can
check if pg_subscription_rel has entries for temp2 and temp3.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: A qsort template
Next
From: Tom Lane
Date:
Subject: Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS