Re: [HACKERS] Re: Alter subscription..SET - NOTICE message is coming for table which is already removed - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [HACKERS] Re: Alter subscription..SET - NOTICE message is coming for table which is already removed
Date
Msg-id CAD21AoD6N+9WoQjH-+AGV3iq2qKNFnqS_uNtD_8sFcddGgHGNA@mail.gmail.com
In response to [HACKERS] Re: Alter subscription..SET - NOTICE message is coming for table which is already removed  (tushar <tushar.ahuja@enterprisedb.com>)
Responses Re: [HACKERS] Re: Alter subscription..SET - NOTICE message is coming for table which is already removed  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers
On Thu, May 25, 2017 at 9:54 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:
> On 05/25/2017 03:38 PM, tushar wrote:
>>
>> While running ALTER SUBSCRIPTION ... SET, I found that the NOTICE message
>> is repeated the next time, which is no longer needed.
>
> There is another example, where I am getting "ERROR: subscription table
> 16435 in subscription 16684 does not exist" in the standby log file
>
> 2017-05-25 13:51:48.825 BST [32138] NOTICE:  removed subscription for table
> public.t96
> 2017-05-25 13:51:48.825 BST [32138] NOTICE:  removed subscription for table
> public.t97
> 2017-05-25 13:51:48.826 BST [32138] NOTICE:  removed subscription for table
> public.t98
> 2017-05-25 13:51:48.826 BST [32138] NOTICE:  removed subscription for table
> public.t99
> 2017-05-25 13:51:48.826 BST [32138] NOTICE:  removed subscription for table
> public.t100
> 2017-05-25 13:51:48.827 BST [32138] LOG:  duration: 35.404 ms statement:
> alter subscription c1 set publication p1 refresh;
> 2017-05-25 13:51:49.192 BST [32347] LOG:  starting logical replication
> worker for subscription "c1"
> 2017-05-25 13:51:49.198 BST [32368] LOG:  logical replication sync for
> subscription c1, table t16 started
> 2017-05-25 13:51:49.198 BST [32368] ERROR:  subscription table 16429 in
> subscription 16684 does not exist
> 2017-05-25 13:51:49.199 BST [32347] LOG:  starting logical replication
> worker for subscription "c1"
> 2017-05-25 13:51:49.200 BST [32065] LOG:  worker process: logical
> replication worker for subscription 16684 sync 16429 (PID 32368) exited with
> exit code 1
> 2017-05-25 13:51:49.204 BST [32369] LOG:  logical replication sync for
> subscription c1, table t17 started
> 2017-05-25 13:51:49.204 BST [32369] ERROR:  subscription table 16432 in
> subscription 16684 does not exist
> 2017-05-25 13:51:49.205 BST [32347] LOG:  starting logical replication
> worker for subscription "c1"
> 2017-05-25 13:51:49.205 BST [32065] LOG:  worker process: logical
> replication worker for subscription 16684 sync 16432 (PID 32369) exited with
> exit code 1
> 2017-05-25 13:51:49.209 BST [32370] LOG:  logical replication sync for
> subscription c1, table t18 started
> 2017-05-25 13:51:49.209 BST [32370] ERROR:  subscription table 16435 in
> subscription 16684 does not exist
> 2017-05-25 13:51:49.210 BST [32347] LOG:  starting logical replication
> worker for subscription "c1"
> 2017-05-25 13:51:49.210 BST [32065] LOG:  worker process: logical
> replication worker for subscription 16684 sync 16435 (PID 32370) exited with
> exit code 1
> 2017-05-25 13:51:49.213 BST [32371] LOG:  logical replication sync for
> subscription c1, table t19 started
> 2017-05-25 13:51:49.213 BST [32371] ERROR:  subscription table 16438 in
> subscription 16684 does not exist
> 2017-05-25 13:51:49.214 BST [32347] LOG:  starting logical replication
> worker for subscription "c1"
>
>
> Steps to reproduce -
> X cluster ->
> create 100 tables, publish all tables (create publication pub for table
> t1,t2,t3..........t100;)
> create one more table (create table t101(n int);), create a publication,
> publish only that table (create publication p1 for table t101;)
>
> Y cluster ->
> create subscription (create subscription c1 connection 'host=localhost
> port=5432 user=centos ' publication pub;)
> alter subscription c1 set publication p1 refresh;
> alter subscription c1 set publication pub refresh;
> alter subscription c1 set publication p1 refresh;
>
> check the log file.
>

Thanks.

I think the cause is that ALTER SUBSCRIPTION ... REFRESH can delete the
relation status entry before the corresponding table sync worker starts.
The attached patch fixes the issues reported on this thread so far.
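
To make the ordering concrete, here is a toy model of the race in Python.
It is only a sketch under stated assumptions, not PostgreSQL code and not
what the attached patch does: a plain dict stands in for the relation
status entries, and the names start_sync_worker, refresh and simulate are
hypothetical.

```python
# Toy model only: a dict keyed by (subid, relid) stands in for the
# relation status entries; nothing here is a PostgreSQL API.

class SubscriptionTableMissing(Exception):
    """Stands in for the 'subscription table ... does not exist' error."""


def start_sync_worker(rel_states, subid, relid):
    """Look up the table's status entry, as a freshly started sync worker would."""
    if (subid, relid) not in rel_states:
        raise SubscriptionTableMissing(
            f"subscription table {relid} in subscription {subid} does not exist"
        )
    return rel_states[(subid, relid)]


def refresh(rel_states, subid, published_relids):
    """Drop entries for tables no longer published; return the removed relids."""
    removed = [r for (s, r) in list(rel_states)
               if s == subid and r not in published_relids]
    for relid in removed:
        del rel_states[(subid, relid)]
    return removed


def simulate():
    subid = 16684
    rel_states = {(subid, 16429): "init", (subid, 16432): "init"}
    # Step 1: decide which tables need a sync worker (snapshot of the entries).
    pending = [relid for (_, relid) in rel_states]
    # Step 2: the refresh removes the entries before any worker has started.
    refresh(rel_states, subid, published_relids=set())
    # Step 3: workers start from the stale snapshot and fail their lookup.
    errors = []
    for relid in pending:
        try:
            start_sync_worker(rel_states, subid, relid)
        except SubscriptionTableMissing as exc:
            errors.append(str(exc))
    return errors
```

Calling simulate() collects one error per stale table, in the same shape
as the messages in the log above.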

However, there is one more problem here: if the relation status entry
is deleted while the corresponding table sync worker is waiting for its
status to be changed, the table sync worker can be orphaned in the
waiting state. In this case, should the table sync worker check the
relation status and exit if the relation status record has been removed?
Or should ALTER SUBSCRIPTION update the status of the table sync worker
to UNKNOWN?
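
As a rough illustration of the first option only (the worker rechecks
its entry while waiting), here is a toy polling loop in Python. It is a
sketch under assumptions, not the real wait logic: the function name, the
check_entry flag, the "catchup" label and the bounded poll count are all
hypothetical.

```python
# Toy model only: a dict keyed by (subid, relid) stands in for the
# relation status entries; the loop is bounded so the sketch terminates.

def wait_for_state_change(rel_states, key, wanted, max_polls, check_entry=True):
    """Poll until the entry's state equals `wanted`.

    Returns "done", "exit: entry removed", or "still waiting".
    """
    for _ in range(max_polls):
        state = rel_states.get(key)
        if state is None and check_entry:
            # The status record is gone, so give up instead of waiting on.
            return "exit: entry removed"
        if state == wanted:
            return "done"
    # Without the check, a worker whose entry was removed never sees a change.
    return "still waiting"
```

With an empty dict, check_entry=False models the orphaned worker (it
never leaves the waiting state), while the default makes the worker exit
on its first poll.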

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


