On Fri, 2020-11-13 at 15:24 +0200, Radoslav Nedyalkov wrote:
> On a very busy master-standby setup which runs typical olap processing -
> long living , massive writes statements, we're getting on the standby:
>
> ERROR: canceling statement due to conflict with recovery
> FATAL: terminating connection due to conflict with recovery
>
> The weird thing is that cancellations happen usually after standby has experienced
> some huge delay(2h), still not at the allowed maximum(3h). Even recently run statements
> got cancelled when the delay is already at zero.
>
> Sometimes the situation got relaxed after an hour or so.
> Restarting the server instantly helps.
>
> It is pg11.8, centos7, hugepages, shared_buffers 196G from 748G.
>
> What phenomenon could we be facing?

Hard to say. Perhaps an unusual kind of replication conflict?
What is in "pg_stat_database_conflicts" on the standby server?
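
A minimal sketch of such a check, to be run on the standby. It assumes the standard columns of the "pg_stat_database_conflicts" view as they exist in PostgreSQL 11; the ordering and the "total_conflicts" alias are only for readability:

```sql
-- Run on the standby: cancelled queries per database, split by conflict type.
SELECT datname,
       confl_tablespace,   -- dropped tablespaces
       confl_lock,         -- lock timeouts
       confl_snapshot,     -- old snapshots (rows removed by vacuum on the primary)
       confl_bufferpin,    -- pinned buffers
       confl_deadlock,     -- deadlocks
       confl_tablespace + confl_lock + confl_snapshot
         + confl_bufferpin + confl_deadlock AS total_conflicts
FROM pg_stat_database_conflicts
ORDER BY total_conflicts DESC;
```

Whichever counter keeps growing while the cancellations occur should show which kind of conflict is involved.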