Re: Broken logical replication - Mailing list pgsql-general

From Игорь Выскорко
Subject Re: Broken logical replication
Date
Msg-id afc348be-2e24-7d90-379b-f55852e4f1eb@yandex.ru
List pgsql-general
Hi experts!
I didn't notice the HTML tags in my previous message. Sorry for that. I
hope that was the only reason it got no answers :)

The original message was:
Given two PostgreSQL servers:
1. Master -  PostgreSQL 10.16
2. Slave - PostgreSQL 13.5
Logical replication was configured between them and worked fine.
At some point the software that uses the master DB (1C, a Russian
accounting package) decided to run some database-related tasks. I only
know about a full reindex, in which all indexes were rebuilt, but there
was probably more DDL...
When it finished, I found that replication was broken:
1. The publication object still exists, but all of its tables were
dropped from it. I mean not the tables themselves, but:
select * from pg_publication_tables returned 0 rows. I added the tables
to the publication again, but that didn't help.
2. In the log I found the following group of records repeating:
янв 20 23:33:29 testpg postgres[6291]: [9-1] 2022-01-20 16:33:29.087 UTC 
[6291] ERROR:  replication slot "subscr" is active for PID 3984
янв 20 23:33:34 testpg postgres[6295]: [7-1] 2022-01-20 16:33:34.101 UTC 
[6295] LOG:  connection received: host=192.168.7.225 port=43428
янв 20 23:33:34 testpg postgres[6295]: [8-1] 2022-01-20 16:33:34.102 UTC 
[6295] LOG:  replication connection authorized: user=rep_user
янв 20 23:33:34 testpg postgres[6295]: [9-1] 2022-01-20 16:33:34.104 UTC 
[6295] ERROR:  replication slot "subscr" is active for PID 3984
янв 20 23:33:39 testpg postgres[6298]: [7-1] 2022-01-20 16:33:39.117 UTC 
[6298] LOG:  connection received: host=192.168.7.225 port=43470
янв 20 23:33:39 testpg postgres[6298]: [8-1] 2022-01-20 16:33:39.118 UTC 
[6298] LOG:  replication connection authorized: user=rep_user
янв 20 23:33:39 testpg postgres[6298]: [9-1] 2022-01-20 16:33:39.537 UTC 
[6298] LOG:  starting logical decoding for slot "subscr"
янв 20 23:33:39 testpg postgres[6298]: [9-2] 2022-01-20 16:33:39.537 UTC 
[6298] DETAIL:  streaming transactions committing after 1E0/3449C020, 
reading WAL from 1D9/EEAD19E8
янв 20 23:33:39 testpg postgres[6298]: [10-1] 2022-01-20 16:33:39.538 
UTC [6298] LOG:  logical decoding found consistent point at 1D9/EEAD19E8
янв 20 23:33:39 testpg postgres[6298]: [10-2] 2022-01-20 16:33:39.538 
UTC [6298] DETAIL:  There are no running transactions.
3. Replication status:
db=# select * from pg_stat_replication ;
-[ RECORD 1 ]----+------------------------------
pid              | 6298
usesysid         | 16384
usename          | rep_user
application_name | subscr
client_addr      | 192.168.7.225
client_hostname  |
client_port      | 43470
backend_start    | 2022-01-20 16:33:39.117019+00
backend_xmin     |
state            | catchup
sent_lsn         | 1DE/E849BFD8
write_lsn        | 1E0/3449C020
flush_lsn        | 1E0/3449C020
replay_lsn       | 1E0/3449C020
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0
sync_state       | async
Note that sent_lsn is lower than write_lsn. Is that even legal? :)
4. The pg_wal directory is bloated. I suppose that is expected because
of the open replication slot.
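
To put a number on the sent_lsn / write_lsn oddity in item 3, here is a
minimal sketch in Python. An LSN printed as X/Y is a 64-bit WAL position
with X as the high 32 bits and Y as the low 32 bits, both in hex. The
helper name lsn_to_int is made up for illustration; the two LSN values
are copied from the pg_stat_replication output above. This only measures
the gap, it does not explain it.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '1E0/3449C020' to a 64-bit integer."""
    high, low = lsn.split("/")
    # High part is the upper 32 bits, low part is the lower 32 bits.
    return (int(high, 16) << 32) | int(low, 16)


# Values taken from the pg_stat_replication record in item 3.
sent_lsn = lsn_to_int("1DE/E849BFD8")
write_lsn = lsn_to_int("1E0/3449C020")

# Positive result means sent_lsn is behind write_lsn.
gap_bytes = write_lsn - sent_lsn
print(f"sent_lsn is behind write_lsn by {gap_bytes} bytes "
      f"({gap_bytes / 1024**3:.2f} GiB)")
```

On the server itself, the built-in function pg_wal_lsn_diff performs the
same subtraction.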
So, my question is: is it possible to fix replication without starting
over from scratch (truncating the tables on the slave and copying all
the data again)?
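
In case it helps with reading longer logs, the repeated "slot is active"
errors from item 2 can be tallied per slot and holding PID with a small
sketch like this one. The function name count_slot_conflicts and the
trimmed sample are illustrative assumptions; the sample lines are cut
down from the log excerpt above, and the regular expression only matches
that exact English error text.

```python
import re
from collections import Counter

# Matches the error text seen in the log excerpt in item 2.
SLOT_BUSY = re.compile(r'replication slot "([^"]+)" is active for PID (\d+)')


def count_slot_conflicts(log_text: str) -> Counter:
    """Count 'slot is active' errors per (slot name, holding PID)."""
    # Collapse line wraps and repeated spaces so wrapped records still match.
    normalized = " ".join(log_text.split())
    return Counter(
        (slot, int(pid)) for slot, pid in SLOT_BUSY.findall(normalized)
    )


# Trimmed sample based on the log lines quoted above.
sample = """
[6291] ERROR:  replication slot "subscr" is active for PID 3984
[6295] LOG:  replication connection authorized: user=rep_user
[6295] ERROR:  replication slot "subscr" is active
for PID 3984
"""

conflicts = count_slot_conflicts(sample)
for (slot, pid), hits in conflicts.items():
    print(f'slot "{slot}" was reported busy {hits} time(s), held by PID {pid}')
```

The PID it reports (3984 here) is the backend on the master that was
holding the slot while the new connections were being rejected.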


