Re: CREATE SUBSCRIPTION hangs indefinitely - Mailing list pgsql-admin

From Kouber Saparev
Subject Re: CREATE SUBSCRIPTION hangs indefinitely
Date
Msg-id CAN4RuQt5tTd+wphb7so7n1W6upEC=Z_oo628oWMW51Hk2R3xNQ@mail.gmail.com
Whole thread Raw
In response to CREATE SUBSCRIPTION hangs indefinitely  (Kouber Saparev <kouber@gmail.com>)
List pgsql-admin
We reproduced the same behaviour after upgrading to PostgreSQL 17.3 yesterday. Creating a new subscription pointing to a streaming replica requires restart of the primary database, which is reducing the flexibility of logical replication a lot.

На ср, 12.02.2025 г. в 20:19 Kouber Saparev <kouber@gmail.com> написа:
Hi folks,

We have a streaming replication setup running PostgreSQL 17.2 as follows:
- Server A (primary)
- Server B (replica)

We also have another separate PostgreSQL 17.2 (Server C), that is replicating some of the tables from the cluster above logically.

When we try to CREATE SUBSCRIPTION from Server C to Server B (streaming replica), the command hangs indefinitely. Eventually we tried to create the replication slot manually on Server B, but pg_create_logical_replication_slot() also hangs. With the log_min_messages set to debug5 we get the output below.

db=# SELECT pg_create_logical_replication_slot('xxx', 'pgoutput');
DEBUG:  StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
DEBUG:  xmin required by slots: data 900062771, catalog 900062771
DEBUG:  find_in_dynamic_libpath: trying "/usr/lib/postgresql/17/lib/pgoutput"
DEBUG:  find_in_dynamic_libpath: trying "/usr/lib/postgresql/17/lib/pgoutput.so"
DEBUG:  searching for logical decoding starting point, starting at 1C64/B9B92760
DEBUG:  switched to timeline 1 valid until 0/0

At this stage the command is hanging. Now, as we did our own research through the available resources online, we suspected that the command could be blocked by some long lasting write transaction. Given that Server B is a read-only replica, the only "writing" part could be the WAL stream coming from the primary. And indeed, when we restarted the PostgreSQL service on the primary Server A, the slot creation was let through.

LOG:  logical decoding found consistent point at 1C64/B9B927D8
DETAIL:  There are no running transactions.
DEBUG:  CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0
LOG:  duration: 26063.058 ms  statement: select pg_create_logical_replication_slot('xxx', 'pgoutput');
 pg_create_logical_replication_slot
────────────────────────────────────
 (xxx,1C64/B9B92810)
(1 row)

Is there some better way to "unhang" the slot creation other than restarting our primary server? We also tried pg_wal_replay_pause(), without luck.

Note that none of this behaviour is described in the documentation for CREATE SUBSCRIPTION.

Regards,
--
Kouber Saparev

pgsql-admin by date:

Previous
From: Ron Johnson
Date:
Subject: Feature Request: multithreaded reindexdb --concurrently of a single table
Next
From: Ram Pratap Maurya
Date:
Subject: PG15 DB Error :::could not receive data from client: Connection reset by peer