[HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI |
---|---|
Subject | [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION |
Date | |
Msg-id | 20170201.173623.66249355.horiguchi.kyotaro@lab.ntt.co.jp Whole thread Raw |
Responses |
Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION
|
List | pgsql-hackers |
Hello, while looking another bug, I found that standby cannot shutdown after DROP SUBSCRIPTION. standby=# CREATE SUBSCRPTION sub1 ... standby=# .... standby=# DROP SUBSCRIPTION sub1; Ctrl-C to the standby fails to work. ApplyLauncherMain is waiting LogicalRepLauncherLock forever. The culprit is DropSbuscription. It acquires LogicalRepLauncherLock but never releases. The attached patch fixes it. Most part of the fucntion is now enclosed by PG_TRY-CATCH since some functions can throw exceptions. regards, -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c index 5de9999..f07143e 100644 --- a/src/backend/commands/subscriptioncmds.c +++ b/src/backend/commands/subscriptioncmds.c @@ -511,51 +511,67 @@ DropSubscription(DropSubscriptionStmt *stmt) /* Protect against launcher restarting the worker.*/ LWLockAcquire(LogicalRepLauncherLock, LW_EXCLUSIVE); - /* Kill the apply worker so that the slot becomes accessible. */ - logicalrep_worker_stop(subid); - - /* Remove the origin tracking if exists. */ - snprintf(originname, sizeof(originname), "pg_%u", subid); - originid = replorigin_by_name(originname, true); - if (originid != InvalidRepOriginId) - replorigin_drop(originid); - - /* If the user asked to not drop the slot, we are done mow.*/ - if (!stmt->drop_slot) - { - heap_close(rel, NoLock); - return; - } - /* - * Otherwise drop the replication slot at the publisher node using - * the replication connection. + * replorigin_drop can throw an exception. we must release + * LogicalRepLauncherLock for the case. */ - load_file("libpqwalreceiver", false); + PG_TRY(); + { + /* Kill the apply worker so that the slot becomes accessible. */ + logicalrep_worker_stop(subid); - initStringInfo(&cmd); - appendStringInfo(&cmd, "DROP_REPLICATION_SLOT \"%s\"", slotname); + /* Remove the origin tracking if exists. */ + snprintf(originname, sizeof(originname), "pg_%u", subid); + originid = replorigin_by_name(originname, true); + if (originid != InvalidRepOriginId) + replorigin_drop(originid); - wrconn = walrcv_connect(conninfo, true, subname, &err); - if (wrconn == NULL) - ereport(ERROR, - (errmsg("could not connect to publisher when attempting to " - "drop the replication slot \"%s\"", slotname), - errdetail("The error was: %s", err))); + /* If the user asked to not drop the slot, we are done now.*/ + if (stmt->drop_slot) + { + /* + * Otherwise drop the replication slot at the publisher node using + * the replication connection. + */ + load_file("libpqwalreceiver", false); + + initStringInfo(&cmd); + appendStringInfo(&cmd, "DROP_REPLICATION_SLOT \"%s\"", slotname); + + wrconn = walrcv_connect(conninfo, true, subname, &err); + if (wrconn == NULL || !walrcv_command(wrconn, cmd.data, &err)) + { + if (wrconn == NULL) + ereport(ERROR, + (errmsg("could not connect to publisher when " + "attempting to drop the " + "replication slot \"%s\"", slotname), + errdetail("The error was: %s", err))); - if (!walrcv_command(wrconn, cmd.data, &err)) - ereport(ERROR, - (errmsg("could not drop the replication slot \"%s\" on publisher", - slotname), - errdetail("The error was: %s", err))); - else - ereport(NOTICE, - (errmsg("dropped replication slot \"%s\" on publisher", - slotname))); + ereport(ERROR, + (errmsg("could not drop the replication slot \"%s\" on publisher", + slotname), + errdetail("The error was: %s", err))); + } - walrcv_disconnect(wrconn); + ereport(NOTICE, + (errmsg("dropped replication slot \"%s\" on publisher", + slotname))); - pfree(cmd.data); + pfree(cmd.data); + } + } + PG_CATCH(); + { + LWLockRelease(LogicalRepLauncherLock); + PG_RE_THROW(); + } + PG_END_TRY(); + + LWLockRelease(LogicalRepLauncherLock); + + if (wrconn) + walrcv_disconnect(wrconn); heap_close(rel, NoLock);}
pgsql-hackers by date: