Re: [HACKERS] Get stuck when dropping a subscription during synchronizing table - Mailing list pgsql-hackers

From Petr Jelinek
Subject Re: [HACKERS] Get stuck when dropping a subscription during synchronizing table
Msg-id 64f76401-156f-7e16-0bf6-fc8e4daf272b@2ndquadrant.com
In response to Re: [HACKERS] Get stuck when dropping a subscription during synchronizing table  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers
On 13/06/17 21:49, Peter Eisentraut wrote:
> On 6/13/17 02:33, Noah Misch wrote:
>>> Steps to reproduce -
>>> X cluster -> create 100 tables, publish all tables (create publication pub
>>> for all tables;)
>>> Y cluster -> create 100 tables, create subscription (create subscription sub
>>> connection 'user=centos host=localhost' publication pub;)
>>> Y cluster -> drop subscription (drop subscription sub;)
>>>
>>> check the log file on Y cluster.
>>>
>>> Sometimes I have seen this error on the psql prompt, and the drop
>>> subscription operation failed at the first attempt.
>>>
>>> postgres=# drop subscription sub;
>>> ERROR:  tuple concurrently updated
>>> postgres=# drop subscription sub;
>>> NOTICE:  dropped replication slot "sub" on publisher
>>> DROP SUBSCRIPTION
>>
>> [Action required within three days.  This is a generic notification.]
> 
> It's being worked on.  Let's see by Thursday.
> 

The attached patch fixes it (it was mostly about the order of calls). I
also split SetSubscriptionRelState into two separate interfaces while I
was changing it, because now that the update_only bool has been added, it
has become quite strange to have a single interface for what is
basically two separate functions.
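
To illustrate the shape of that refactoring, here is a minimal standalone
sketch, not the code from the patch. The in-memory table and the names
add_rel_state and update_rel_state are hypothetical stand-ins; the real
functions operate on a system catalog. The point is only that "insert a
new row" and "update an existing row" get their own entry points instead
of one function steered by a boolean flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a catalog of per-table sync states (hypothetical). */
#define MAX_RELS 8

typedef struct RelState
{
    unsigned relid;     /* table identifier */
    char     state;     /* illustrative state code, e.g. 'i' or 'r' */
} RelState;

typedef struct RelStateTable
{
    RelState rows[MAX_RELS];
    size_t   nrows;
} RelStateTable;

/* Return the row for relid, or NULL if there is none. */
static RelState *
find_rel_state(RelStateTable *t, unsigned relid)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->nrows; i++)
        if (t->rows[i].relid == relid)
            return &t->rows[i];
    return NULL;
}

/* Insert only: fails if the row already exists or the table is full. */
bool
add_rel_state(RelStateTable *t, unsigned relid, char state)
{
    if (find_rel_state(t, relid) != NULL || t->nrows == MAX_RELS)
        return false;
    t->rows[t->nrows].relid = relid;
    t->rows[t->nrows].state = state;
    t->nrows++;
    return true;
}

/* Update only: fails if the row is missing and never creates one. */
bool
update_rel_state(RelStateTable *t, unsigned relid, char state)
{
    RelState *row = find_rel_state(t, relid);

    if (row == NULL)
        return false;
    row->state = state;
    return true;
}
```

With two entry points, each caller states its intent directly, and the
"row is missing" case is handled in exactly one place.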

There are still a couple of remaining issues from this thread, though.
Namely, the AccessExclusiveLock on the pg_subscription catalog, which is
not very pretty, but we need a way to block the launcher from accessing
the subscription that is being dropped and to make sure it will not
start new workers for it afterwards. The question is how, though, since
by the time the launcher can lock an individual subscription, it is
already processing it. So it looks to me like we'd need to reread the
catalog with a new snapshot after the lock was acquired, which seems a
bit wasteful (I wonder if we could just AcceptInvalidationMessages and
refetch from syscache). Any better ideas?
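
The "recheck after taking the lock" idea can be sketched in a few lines
of standalone C. This is a single-threaded toy model under my own
assumptions, not PostgreSQL code: Catalog, Sub, drop_sub,
launcher_try_start and the before_lock hook are all hypothetical, and
the hook merely simulates a DROP SUBSCRIPTION that lands between the
launcher's initial scan and its lock acquisition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SUBS 4

typedef struct Sub
{
    unsigned id;
    bool     exists;    /* false once the subscription is dropped */
    bool     locked;    /* toy per-subscription lock flag */
} Sub;

typedef struct Catalog
{
    Sub    subs[MAX_SUBS];
    size_t nsubs;
} Catalog;

/* Hook simulating concurrent activity before the lock is granted. */
typedef void (*concurrent_fn) (Catalog *cat, unsigned id);

/* Find the slot for id, whether or not it has been dropped. */
static Sub *
lookup(Catalog *cat, unsigned id)
{
    for (size_t i = 0; i < cat->nsubs; i++)
        if (cat->subs[i].id == id)
            return &cat->subs[i];
    return NULL;
}

/* Mark a subscription as dropped (stand-in for DROP SUBSCRIPTION). */
void
drop_sub(Catalog *cat, unsigned id)
{
    Sub *s = lookup(cat, id);

    if (s != NULL)
        s->exists = false;
}

/* Return true only if a worker may still be started for this id. */
bool
launcher_try_start(Catalog *cat, unsigned id, concurrent_fn before_lock)
{
    Sub *s = lookup(cat, id);       /* 1: possibly stale view */
    bool ok;

    if (s == NULL || !s->exists)
        return false;

    if (before_lock != NULL)
        before_lock(cat, id);       /* window in which a drop may run */

    assert(!s->locked);
    s->locked = true;               /* 2: acquire the lock */
    ok = s->exists;                 /* 3: recheck current state */
    s->locked = false;              /* release the lock */
    return ok;
}
```

Calling launcher_try_start(&cat, 1, drop_sub) returns false here even
though the first lookup saw the subscription, which is the behaviour the
recheck is meant to guarantee. How cheaply the real launcher could do
step 3 (new snapshot versus invalidation plus syscache refetch) is the
open question above.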

The other related problem is locking of subscriptions during operations
on them; AlterSubscription especially seems like it should lock the
subscription itself. I did that in 0002.

-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

