Hi Team,
We are using Postgresql 16.9 in production and with large database about 25TB of size. We have patroni setup with replica instance and DR patroni setup with patroni streaming.
We have high volume and frequent commit in the database. There are few large tables for which we asked client to execute queries on DR/Replica instances but these queries are start getting failed with "canceling statement due to conflict with recovery" and "terminating statement due to conflict with recovery" error.
As I understand the behavior is correct but we need to get rid of this issue.
I gone through the old posts and some documentation and got to know that below parameters can help to reduce this error.
max_standby_streaming_delay
max_standby_archive_delay
hot_standby_feedback = off
Our queries are running for long period that makes me to set this value to some minutes/hours (lets set 900s) which is not feasible for production as it will start impacting the replication lag. Also, the queries will fail if it reaches to mentioned thresholds.
If I set these parameters to "-1" (disable) then there will be direct impact on replication lag which will impact further queries on replica node and DR cluster.
Can you please guide If any other better solution present for such scenario?
Thanks & Regards,
Ishan Joshi