On Thu, Jun 15, 2017 at 11:49 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
> On 15/06/17 15:52, Peter Eisentraut wrote:
>> On 6/15/17 02:41, Petr Jelinek wrote:
>>> Hmm, forcibly stopping currently running table sync is not what was
>>> intended, I'll have to look into it. We should not be forcibly stopping
>>> anything except the main apply worker during drop subscription (and we
>>> do that only because we can't drop the remote replication slot otherwise).
>>
>> The change being complained about was specifically to address the
>> problem described in the commit message:
>>
>> Stop table sync workers when subscription relation entry is removed
>>
>> When a table sync worker is in waiting state and the subscription table
>> entry is removed because of a concurrent subscription refresh, the
>> worker could be left orphaned. To avoid that, explicitly stop the
>> worker when the pg_subscription_rel entry is removed.
>>
>>
>> Maybe that wasn't the best solution. Alternatively, the tablesync
>> worker has to check itself whether the subscription relation entry has
>> disappeared, or we need a post-commit check to remove orphaned workers.
>>
>
> Ah I missed that commit/discussion, otherwise I would have probably
> complained. I don't like killing workers in the DDL command, as I said
> the only reason we do it in DropSubscription is to free the slot for
> dropping.
>
> I think that either of the options you suggested now would be better.
> We'll need that for stopping the tablesync which is in progress during
> DropSubscription as well as those will currently still keep running. I
> guess we could simply just register syscache callback, the only problem
> with that is we'd need to AcceptInvalidationMessages regularly while we
> do the COPY which is not exactly free so maybe killing at the end of
> transaction would be better (both for refresh and drop)?
>
Attached patch makes table sync worker check its relation subscription
state at the end of COPY and exits if it has disappeared, instead of
killing table sync worker in DDL. There is still a problem that a
table sync worker for a large table can hold a slot for a long time
even after its state is deleted. But it would be for PG11 item.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers