Re: Subscription stuck at initialize state - Mailing list pgsql-general
From | Vijaykumar Jain |
---|---|
Subject | Re: Subscription stuck at initialize state |
Date | |
Msg-id | CAM+6J962X11n9Pic_3izg5GsUtvqAdUQAVCebpbAQBuHvB+fwQ@mail.gmail.com Whole thread Raw |
In response to | Re: Subscription stuck at initialize state (Abhishek Bhola <abhishek.bhola@japannext.co.jp>) |
Responses |
Re: Subscription stuck at initialize state
|
List | pgsql-general |
trimming the email, to avoid noise.
I spent a lot of time trying multiple options/combinations and finally managed to replicate your debug output.
this is when i have ,
postgres=# show max_logical_replication_workers;
max_logical_replication_workers
---------------------------------
2
(1 row)
on publisher
postgres@controller:~$ psql -p 5001
psql (14.1 (Ubuntu 14.1-2.pgdg20.04+1))
Type "help" for help.
postgres=# create table t_bytea(id int primary key, somebytea bytea);
CREATE TABLE
postgres=# create table t1(id int);
CREATE TABLE
postgres=# create table t2(id int);
CREATE TABLE
postgres=# create table t3(id int);
CREATE TABLE
postgres=# create publication mypub1 for table t1;
CREATE PUBLICATION
postgres=# create publication mypub2 for table t2;
CREATE PUBLICATION
postgres=# create publication mypub3 for table t3;
CREATE PUBLICATION
postgres=# create publication mypub4 for table t3;
CREATE PUBLICATION
postgres=# create publication mypub5 for table t_bytea;
CREATE PUBLICATION
postgres=# insert into t_bytea select x,repeat(repeat('xxx', 100), 1000)::bytea from generate_series(1, 1000) x;
INSERT 0 1000
on subscriber
postgres@controller:~$ psql -p 5002
psql (14.1 (Ubuntu 14.1-2.pgdg20.04+1))
Type "help" for help.
postgres=# create table t_bytea(id int primary key, somebytea bytea);
CREATE TABLE
postgres=# create table t1(id int);
CREATE TABLE
postgres=# create table t2(id int);
CREATE TABLE
postgres=# create table t3(id int);
CREATE TABLE
postgres=# create table t4(id int);
CREATE TABLE
postgres=# create table t5(id int);
CREATE TABLE
postgres=# create subscription mysub1 connection 'port=5001' publication mypub1;
NOTICE: created replication slot "mysub1" on publisher
CREATE SUBSCRIPTION
postgres=# create subscription mysub2 connection 'port=5001' publication mypub2;
NOTICE: created replication slot "mysub2" on publisher
CREATE SUBSCRIPTION
postgres=# create subscription mysub3 connection 'port=5001' publication mypub3;
NOTICE: created replication slot "mysub3" on publisher
CREATE SUBSCRIPTION
postgres=# create subscription mysub4 connection 'port=5001' publication mypub4;
NOTICE: created replication slot "mysub4" on publisher
CREATE SUBSCRIPTION
postgres=# create subscription mysub5 connection 'port=5001' publication mypub5;
NOTICE: created replication slot "mysub5" on publisher
CREATE SUBSCRIPTION
postgres=# select count(1) from t_bytea;
count
-------
0
(1 row)
postgres=# table pg_subscription_rel;
srsubid | srrelid | srsubstate | srsublsn
---------+---------+------------+-----------
16406 | 16391 | r | 0/1722838
16407 | 16394 | i |
16408 | 16397 | i |
16409 | 16397 | i |
16410 | 16384 | i |
(5 rows)
# as expected no data in t_bytea (as it could not get any worker)
postgres=# select count(1) from t_bytea;
count
-------
0
(1 row)
but logs clearly state the problem
2022-02-03 23:18:31.107 IST [3430] LOG: logical replication table synchronization worker for subscription "mysub1", table "t1" has started
2022-02-03 23:18:31.138 IST [3430] LOG: logical replication table synchronization worker for subscription "mysub1", table "t1" has finished
2022-02-03 23:18:40.730 IST [3433] LOG: logical replication apply worker for subscription "mysub2" has started
2022-02-03 23:18:40.737 IST [3433] WARNING: out of logical replication worker slots
2022-02-03 23:18:40.737 IST [3433] HINT: You might need to increase max_logical_replication_workers.
2022-02-03 23:18:45.865 IST [3433] WARNING: out of logical replication worker slots
#publisher logs
2022-02-03 23:18:31.096 IST [3427] STATEMENT: CREATE_REPLICATION_SLOT "mysub1" LOGICAL pgoutput NOEXPORT_SNAPSHOT
2022-02-03 23:18:31.106 IST [3429] LOG: starting logical decoding for slot "mysub1"
2022-02-03 23:18:31.106 IST [3429] DETAIL: Streaming transactions committing after 0/1722800, reading WAL from 0/17227C8.
2022-02-03 23:18:31.106 IST [3429] STATEMENT: START_REPLICATION SLOT "mysub1" LOGICAL 0/0 (proto_version '2', publication_names '"mypub1"')
2022-02-03 23:18:31.106 IST [3429] LOG: logical decoding found consistent point at 0/17227C8
2022-02-03 23:18:31.106 IST [3429] DETAIL: There are no running transactions.
2022-02-03 23:18:31.106 IST [3429] STATEMENT: START_REPLICATION SLOT "mysub1" LOGICAL 0/0 (proto_version '2', publication_names '"mypub1"')
2022-02-03 23:18:31.129 IST [3431] LOG: logical decoding found consistent point at 0/1722800
2022-02-03 23:18:31.129 IST [3431] DETAIL: There are no running transactions.
2022-02-03 23:18:31.129 IST [3431] STATEMENT: CREATE_REPLICATION_SLOT "pg_16406_sync_16391_7060540926182153512" LOGICAL pgoutput USE_SNAPSHOT
2022-02-03 23:18:31.135 IST [3431] LOG: starting logical decoding for slot "pg_16406_sync_16391_7060540926182153512"
2022-02-03 23:18:31.135 IST [3431] DETAIL: Streaming transactions committing after 0/1722838, reading WAL from 0/1722800.
2022-02-03 23:18:31.135 IST [3431] STATEMENT: START_REPLICATION SLOT "pg_16406_sync_16391_7060540926182153512" LOGICAL 0/1722838 (proto_version '2', publication_names '"mypub1"')
2022-02-03 23:18:35.718 IST [3432] LOG: logical decoding found consistent point at 0/1722838
2022-02-03 23:18:35.718 IST [3432] DETAIL: There are no running transactions.
2022-02-03 23:18:35.718 IST [3432] STATEMENT: CREATE_REPLICATION_SLOT "mysub2" LOGICAL pgoutput NOEXPORT_SNAPSHOT
2022-02-03 23:18:40.737 IST [3434] LOG: starting logical decoding for slot "mysub2"
2022-02-03 23:18:40.737 IST [3434] DETAIL: Streaming transactions committing after 0/1722870, reading WAL from 0/1722838.
2022-02-03 23:18:40.737 IST [3434] STATEMENT: START_REPLICATION SLOT "mysub2" LOGICAL 0/0 (proto_version '2', publication_names '"mypub2"')
2022-02-03 23:18:40.737 IST [3434] LOG: logical decoding found consistent point at 0/1722838
2022-02-03 23:18:40.737 IST [3434] DETAIL: There are no running transactions.
2022-02-03 23:18:40.737 IST [3434] STATEMENT: START_REPLICATION SLOT "mysub2" LOGICAL 0/0 (proto_version '2', publication_names '"mypub2"')
2022-02-03 23:18:40.857 IST [3435] LOG: logical decoding found consistent point at 0/1722870
2022-02-03 23:18:40.857 IST [3435] DETAIL: There are no running transactions.
2022-02-03 23:18:40.857 IST [3435] STATEMENT: CREATE_REPLICATION_SLOT "mysub3" LOGICAL pgoutput NOEXPORT_SNAPSHOT
2022-02-03 23:18:46.091 IST [3437] LOG: logical decoding found consistent point at 0/17228A8
2022-02-03 23:18:46.091 IST [3437] DETAIL: There are no running transactions.
2022-02-03 23:18:46.091 IST [3437] STATEMENT: CREATE_REPLICATION_SLOT "mysub4" LOGICAL pgoutput NOEXPORT_SNAPSHOT
2022-02-03 23:18:51.941 IST [3439] LOG: logical decoding found consistent point at 0/17228E0
2022-02-03 23:18:51.941 IST [3439] DETAIL: There are no running transactions.
2022-02-03 23:18:51.941 IST [3439] STATEMENT: CREATE_REPLICATION_SLOT "mysub5" LOGICAL pgoutput NOEXPORT_SNAPSHOT
# solution
so i bump the max_logical_replication_workers = 10 and restart the db servers.
as there are workers available, replication catches up and tables in sync.
# on subscriber
psql (14.1 (Ubuntu 14.1-2.pgdg20.04+1))
Type "help" for help.
postgres=# select count(1) from t_bytea;
count
-------
0
(1 row)
postgres=# table pg_subscription_rel;
srsubid | srrelid | srsubstate | srsublsn
---------+---------+------------+-----------
16406 | 16391 | r | 0/1722838
16407 | 16394 | r | 0/2000110
16408 | 16397 | r | 0/2000148
16409 | 16397 | r | 0/2000148
16410 | 16384 | r | 0/20001B8
(5 rows)
postgres=# select count(1) from t_bytea;
count
-------
1000
(1 row)
so two things,
can you check logs for warning messages of out of logical replication worker slots on subscribers ?
can you try bumping them, and check if it catches up (srsubstate should either be r(READY) or d (INITIAL COPY))
also monitor logs for both publisher and subscriber.
I also figured out all the debugging steps I requested earlier were useless.
Thanks,
Vijay
pgsql-general by date: