On Thu, 17 Jul 2025 at 00:36, PG Bug reporting form
<noreply@postgresql.org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference: 18988
> Logged by: Alexander Lakhin
> Email address: exclusion@gmail.com
> PostgreSQL version: 18beta1
> Operating system: Ubuntu 24.04
> Description:
>
> The following script:
> createdb ndb
> echo "
> CREATE SUBSCRIPTION testsub CONNECTION 'dbname=ndb' PUBLICATION testpub WITH
> (connect = false);
> " | psql
>
> echo "
> DROP SUBSCRIPTION testsub
> " | psql &
> sleep 1
> timeout 30 psql ndb -c "SELECT 1" || echo "TIMEOUT"
>
> makes DROP SUBSCRIPTION stuck on waiting for a connection to drop a slot,
> while this connection is waiting for a lock for relation 6100
> (pg_subscription), locked by DROP SUBSCRIPTION:
> law 1545967 1545946 0 21:10 ? 00:00:00 postgres: law regression
> [local] DROP SUBSCRIPTION
> law 1545968 1545946 0 21:10 ? 00:00:00 postgres: walsender law
> ndb [local] startup waiting
>
> With debug_discard_caches = 1 or under some lucky circumstances (I
> encountered these), this leads to inability to connect to any database.
>
> Reproduced on REL_13_STABLE .. master.

Thanks, I was able to reproduce the issue using the steps provided.
The problem occurs because DROP SUBSCRIPTION takes an
AccessExclusiveLock on the pg_subscription system catalog to prevent
the launcher from restarting the worker. While holding that lock, it
also attempts to connect to the publisher in order to drop the
replication slot. Here the publisher is a newly created database on
the same server, so its catalog caches may not have been initialized
yet. As part of backend startup, the new connection builds them via:

RelationCacheInitializePhase3 → InitCatalogCachePhase2 →
InitCatCachePhase2

This cache initialization requires an AccessShareLock on
pg_subscription, since it is one of the syscache-backed catalogs.
pg_subscription is a shared catalog, so that request conflicts with
the AccessExclusiveLock already held by the dropping backend, even
though the two backends are in different databases. As a result, the
new backend hangs waiting for the lock, while DROP SUBSCRIPTION waits
for that same backend to finish starting up, so neither can proceed.
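
To confirm the lock wait while the hang is in progress, a query against
pg_locks along the following lines should show one backend holding
AccessExclusiveLock on pg_subscription and another waiting for
AccessShareLock. This is only a sketch: the function name lock_query is
made up for illustration, and it assumes the query is run from a session
that can still connect (for example one opened before the DROP, or one
in a database where startup does not block).

```shell
#!/bin/sh
# Hypothetical helper (not part of PostgreSQL): prints a query that lists
# every relation-level lock on pg_subscription, granted locks first.
lock_query() {
    cat <<'SQL'
SELECT pid, mode, granted
FROM pg_locks
WHERE locktype = 'relation'
  AND relation = 'pg_subscription'::regclass
ORDER BY granted DESC, pid;
SQL
}

# The function only prints SQL; pipe it to psql yourself, e.g.:
#   lock_query | psql postgres
```

A row with granted = f and mode AccessShareLock would correspond to the
walsender shown as "startup waiting" in the report.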

In this specific case, no replication slot was created when the
subscription was created, because connect = false was specified.
Therefore, I believe DROP SUBSCRIPTION should skip connecting to the
publisher in this case. I've attached a patch that addresses this
behavior. Thoughts?
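
Independently of the patch (this is not what the patch does, just the
documented way to drop a subscription without contacting the
publisher), disassociating the subscription from its slot first should
avoid the connection attempt on unpatched servers. A minimal sketch in
the style of the reproduction script; the function name drop_sub_sql is
hypothetical, and it assumes the subscription is disabled, which is the
case when it was created with connect = false:

```shell
#!/bin/sh
# Hypothetical helper: prints the statements that drop subscription $1
# without connecting to the publisher. It does not quote the name, so
# pass a plain identifier.
drop_sub_sql() {
    sub="$1"
    # With slot_name = NONE, DROP SUBSCRIPTION has no remote slot to drop.
    printf 'ALTER SUBSCRIPTION %s SET (slot_name = NONE);\n' "$sub"
    printf 'DROP SUBSCRIPTION %s;\n' "$sub"
}

# The function only prints SQL; pipe it to psql yourself, e.g.:
#   drop_sub_sql testsub | psql
```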
Regards,
Vignesh