Thread: PostgreSQL 9.2 high replication lag
This message is encrypted. Both the Public Key and the GPG encrypted message are included in this email so that you can verify its origin.
Attachment
This message is encrypted. Both the Public Key and the GPG encrypted message are included in this email so that you can verify its origin.
On Wednesday, July 14th, 2021 at 1:09 PM, Lucas <root@sud0.nz> wrote:
Hello all,I have a cluster of PostgreSQL 9.2 servers (yes, I know... we're working on a migration to PG 13) that has one master and one slave.I know that the streaming replication is expected to have delays, but this is getting worse and worse everyday. You can see the replication gap is reaching >10 minutes quite often.The servers are hosted in AWS as EC2 instances and they're both in different AZs. I also know that the streaming replication was first introduced in PG 9.0, so we're using a very old version of it.. I'm sure it has improved over the years...What can I look for to make this lag not so high? Is there anything I could change in my postgresql.conf files? Any tips?Thanks!---Regards,LucasThis message is encrypted. Both the Public Key and the GPG encrypted message are included in this email so that you can verify its origin.
Attachment
According to the documentation, max_standby_streaming_delay is a configuration parameter determining how long a standby server should wait before canceling queries that conflict with pending WAL entries received via streaming replication.My question then is: Which queries, if the slave can only receive SELECTs?
On Tue, 2021-07-13 at 20:14 -0700, David G. Johnston wrote: > On Tue, Jul 13, 2021 at 8:01 PM Lucas <root@sud0.nz> wrote: > > According to the documentation, max_standby_streaming_delay is a configuration parameter > > determining how long a standby server should wait before canceling queries that conflict > > with pending WAL entries received via streaming replication. > > > > My question then is: Which queries, if the slave can only receive SELECTs? > > The select queries... > > Follow the link in the docs for max_stanbdy_steaming_delay to: > > https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-CONFLICT > > for the details. Perhaps you might find the following article interesting, where I tried to discuss this topic in some depth: https://www.cybertec-postgresql.com/en/streaming-replication-conflicts-in-postgresql/ Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
On Tue, 2021-07-13 at 20:14 -0700, David G. Johnston wrote:
On Tue, Jul 13, 2021 at 8:01 PM Lucas root@sud0.nz wrote:
According to the documentation, max_standby_streaming_delay is a configuration parameter
determining how long a standby server should wait before canceling queries that conflict
with pending WAL entries received via streaming replication.
My question then is: Which queries, if the slave can only receive SELECTs?
The select queries...
Follow the link in the docs for max_stanbdy_steaming_delay to:
https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-CONFLICT
for the details.
Perhaps you might find the following article interesting, where I tried to
discuss this topic in some depth:
https://www.cybertec-postgresql.com/en/streaming-replication-conflicts-in-postgresql/
Attachment
‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐On Wednesday, July 14th, 2021 at 5:30 PM, Laurenz Albe laurenz.albe@cybertec.at wrote:On Tue, 2021-07-13 at 20:14 -0700, David G. Johnston wrote:
On Tue, Jul 13, 2021 at 8:01 PM Lucas root@sud0.nz wrote:
According to the documentation, max_standby_streaming_delay is a configuration parameter
determining how long a standby server should wait before canceling queries that conflict
with pending WAL entries received via streaming replication.
My question then is: Which queries, if the slave can only receive SELECTs?
The select queries...
Follow the link in the docs for max_stanbdy_steaming_delay to:
https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-CONFLICT
for the details.
Perhaps you might find the following article interesting, where I tried to
discuss this topic in some depth:
https://www.cybertec-postgresql.com/en/streaming-replication-conflicts-in-postgresql/
Thanks!I have changed max_standby_streaming_delay from -1 to 120s. However, I can still see the replication being delayed for up to 3 minutes quite frequent.I will try to see if I can find what kind of query is causing that... Maybe via pg_stat_database_conflicts?
Attachment
On Fri, Jul 16, 2021 at 12:38 AM Lucas <root@sud0.nz> wrote: > However, it hasn't changed and the replication lag is still high all the time. > Since the documentation states that max_standby_streaming_delay "[...]it is the maximum total time allowed to apply WAL data once it has been received from the primary server[...]", could it be the trivial explaination that the master is serving WALs with an high delay? Luca
On Fri, Jul 16, 2021 at 12:38 AM Lucas <root@sud0.nz> wrote:
> However, it hasn't changed and the replication lag is still high all the time.
>
Since the documentation states that max_standby_streaming_delay
"[...]it is the maximum total time allowed to apply WAL data once it
has been received from the primary server[...]", could it be the
trivial explaination that the master is serving WALs with an high
delay?
Luca
Version 9.2 is very old and has a lot of issues related to streaming replication. Additionally, it is already EOL.
Furthermore, max_standby_streaming_delay has no relation with the streaming lag. It's associated with queries conflicting with data. This parameter will not speed up the WAL apply operation.
Kindly consider upgrading to a supported version.
Attachment
This message is encrypted. Both the Public Key and the GPG encrypted message are included in this email so that you can verify its origin.
Version 9.2 is very old and has a lot of issues related to streaming replication. Additionally, it is already EOL.Agreed.Furthermore, max_standby_streaming_delay has no relation with the streaming lag. It's associated with queries conflicting with data. This parameter will not speed up the WAL apply operation.Yes, I know. But this parameter determines how long the standby server should wait before canceling standby queries that conflict with about-to-be-applied WAL entries. So, there could be lots of queries conflicting with about-to-be-applied WAL entries.Again, I was not sure about what is causing the delay so I needed to try any possibility.Kindly consider upgrading to a supported version.Yes. We have a project in progress to upgrade.On top of that, I can say that I have restarted the slave and the replication is now normal.You can see above that the replication delay is not passing 3 minutes, which is the max_standby_streaming_delay, but the actual average is 5 seconds. So, looks like rebooting the EC2 instance resolved the issue, but I do not know why.
Attachment
Hello all,I have a cluster of PostgreSQL 9.2 servers (yes, I know... we're working on a migration to PG 13) that has one master and one slave.I know that the streaming replication is expected to have delays, but this is getting worse and worse everyday. You can see the replication gap is reaching >10 minutes quite often.The servers are hosted in AWS as EC2 instances and they're both in different AZs. I also know that the streaming replication was first introduced in PG 9.0, so we're using a very old version of it.. I'm sure it has improved over the years...What can I look for to make this lag not so high? Is there anything I could change in my postgresql.conf files? Any tips?Thanks!---Regards,LucasThis message is encrypted. Both the Public Key and the GPG encrypted message are included in this email so that you can verify its origin.
Attachment
On Fri, 2021-08-20 at 01:33 +0000, Lucas wrote: > After setting max_standby_streaming_delay to 120s it got a lot better. > But the replication delay is still happening quite often, except this time goes up to 120s only. That's exactly what this parameter should do. If you don't want the delays, either reduce the value (and get more canceled queries) or try to reduce the number of conflicts, for example by setting "hot_standby_feedback = on". Note that you will never be able to completely get rid of replication colflicts; for example, there are buffer pin conflicts or lock conflicts caused by autovacuum truncation. See this article for more: https://www.cybertec-postgresql.com/en/streaming-replication-conflicts-in-postgresql/ If you want a standby that has no apply delays and no canceled queries is usually not possible. Consider using two standby servers for these two purposes. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
I realise you may have other complications, but we have upgraded numerous v9.6 servers (on AWS) to v12 in the last year, with basically no problems in the upgrade, and many less problems with streaming. It might take less time to complete the upgrade??
Attachment
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Fri, 2021-08-20 at 01:33 +0000, Lucas wrote:
After setting max_standby_streaming_delay to 120s it got a lot better.
But the replication delay is still happening quite often, except this time goes up to 120s only.
That's exactly what this parameter should do.
If you don't want the delays, either reduce the value (and get more canceled queries)
or try to reduce the number of conflicts, for example by setting "hot_standby_feedback = on".
Note that you will never be able to completely get rid of replication colflicts;
for example, there are buffer pin conflicts or lock conflicts caused by autovacuum
truncation.
See this article for more:
https://www.cybertec-postgresql.com/en/streaming-replication-conflicts-in-postgresql/
If you want a standby that has no apply delays and no canceled queries is usually
not possible. Consider using two standby servers for these two purposes.