Thread: synchronous streaming replication
I have a question. Its stays at the intersection of software engineering and PostgreSQL.
I have configured streaming synchronous replication and whit setting "synchronous_commit=remote_apply" to make sure that the slave will always respond the same as the MASTER (this is a developers' requirement that the MASTER always responds the same as SLAVE). I set "hot_standby_feedback=on" and "max_standby_streaming_delay=-1",
max_standby_streaming_delay set to -1 to make MASTER wait indefinitely before SELECT conflicts on the SLAVE will end.
Here's where the problem arises, because not long after the replication has been started some serious delays occur in the form of "replay_lag" - which rather indicates the appearance of conflicts; the replication stops working properly.
From the server logs it appears that UPDATE (select for update) has occurred on the MASTER, and SELECT queries are in progress on SLAVE causing replication conflicts, with setting "max_standby_streaming_delay=-1" they never ends and there are huge lags.
From the findings with the developers it emerged that they do not want me to set max_standby_streaming_delay to a value after which the queries conflicted with replication will be canceled.
- So I'm wondering if, in this configuration, it can work properly at all without setting, for example, "max_standby_streaming_delay=30" ?
- On the other hand I wonder if the application should not be developed in such a way to support replication of PostgreSQL configured as a streaming synchronous replication cluster with "synchronous_commit=remote_apply" ?
- Or perhaps "synchronous streaming replication" is a bad choice, maybe logical replication would be better ?
- What are the best practices?
- Perhaps you just need to force/teach applications to work with synchronous replication in such a way that when the SELECT causes conflicts with replication such queries are canceled and the application should resend/repeat query ?
- I also think that after setting, for example, "max_standby_streaming_delay=30" queries (addressed) to the database should be very well optimized, so that too long queries are not canceled too frequently?
- Do you know any books focused on applications adapted to work in postgresql synchronous streaming replication environment i.e. High Availability?
On Sat, 2022-10-22 at 17:28 +0200, Joseph Kennedy wrote: > Hi,I have a question. Its stays at the intersection of software engineering and PostgreSQL. > > I have configured streaming synchronous replication and whit setting "synchronous_commit=remote_apply" > to make sure that the slave will always respond the same as the MASTER (this is a developers' > requirement that the MASTER always responds the same as SLAVE). I set "hot_standby_feedback=on" > and "max_standby_streaming_delay=-1", > max_standby_streaming_delay set to -1 to make MASTER wait indefinitely before SELECT conflicts on the SLAVE will end. > > Here's where the problem arises, because not long after the replication has been started some serious > delays occur in the form of "replay_lag" - which rather indicates the appearance of conflicts; > the replication stops working properly. > > From the server logs it appears that UPDATE (select for update) has occurred on the MASTER, and SELECT > queries are in progress on SLAVE causing replication conflicts, with setting "max_standby_streaming_delay=-1" > they never ends and there are huge lags. > > From the findings with the developers it emerged that they do not want me to set max_standby_streaming_delay > to a value after which the queries conflicted with replication will be canceled. > * So I'm wondering if, in this configuration, it can work properly at all without setting, for example, > "max_standby_streaming_delay=30" ? > * On the other hand I wonder if the application should not be developed in such a way to support replication > of PostgreSQL configured as a streaming synchronous replication cluster with "synchronous_commit=remote_apply" ? > * Or perhaps "synchronous streaming replication" is a bad choice, maybe logical replication would be better ? > * What are the best practices? > * Perhaps you just need to force/teach applications to work with synchronous replication in such a way that > when the SELECT causes conflicts with replication such queries are canceled and the application should resend/repeatquery ? > * I also think that after setting, for example, "max_standby_streaming_delay=30" queries (addressed) to the > database should be very well optimized, so that too long queries are not canceled too frequently? > * Do you know any books focused on applications adapted to work in postgresql synchronous streaming > replication environment i.e. High Availability? This can never work properly. If you have synchronous replication with "synchronous_commit = remote_apply", COMMIT on the primary will wait until the information has been replayed on the standby. If you set "max_standby_streaming_delay = -1", replication can be delayed indefinitely long in the event of a replication conflict, so COMMIT can take arbitrarily long. You can reduce replication conflicts (by setting "hot_standby_feedback = on" and by altering all tables to set "vacuum_truncate = off"), but you will never get rid of them completely. You will either have to accept stale ready on the standby (by setting "synchronous_commit" to something lower) or you have to accept canceled queries on the standby (by lowering "max_standby_streaming_delay"). Yours, Laurenz Albe
On Sat, 2022-10-22 at 17:28 +0200, Joseph Kennedy wrote:
> Hi,I have a question. Its stays at the intersection of software engineering and PostgreSQL.
>
> I have configured streaming synchronous replication and whit setting "synchronous_commit=remote_apply"
> to make sure that the slave will always respond the same as the MASTER (this is a developers'
> requirement that the MASTER always responds the same as SLAVE). I set "hot_standby_feedback=on"
> and "max_standby_streaming_delay=-1",
> max_standby_streaming_delay set to -1 to make MASTER wait indefinitely before SELECT conflicts on the SLAVE will end.
>
> Here's where the problem arises, because not long after the replication has been started some serious
> delays occur in the form of "replay_lag" - which rather indicates the appearance of conflicts;
> the replication stops working properly.
>
> From the server logs it appears that UPDATE (select for update) has occurred on the MASTER, and SELECT
> queries are in progress on SLAVE causing replication conflicts, with setting "max_standby_streaming_delay=-1"
> they never ends and there are huge lags.
>
> From the findings with the developers it emerged that they do not want me to set max_standby_streaming_delay
> to a value after which the queries conflicted with replication will be canceled.
> * So I'm wondering if, in this configuration, it can work properly at all without setting, for example,
> "max_standby_streaming_delay=30" ?
> * On the other hand I wonder if the application should not be developed in such a way to support replication
> of PostgreSQL configured as a streaming synchronous replication cluster with "synchronous_commit=remote_apply" ?
> * Or perhaps "synchronous streaming replication" is a bad choice, maybe logical replication would be better ?
> * What are the best practices?
> * Perhaps you just need to force/teach applications to work with synchronous replication in such a way that
> when the SELECT causes conflicts with replication such queries are canceled and the application should resend/repeat query ?
> * I also think that after setting, for example, "max_standby_streaming_delay=30" queries (addressed) to the
> database should be very well optimized, so that too long queries are not canceled too frequently?
> * Do you know any books focused on applications adapted to work in postgresql synchronous streaming
> replication environment i.e. High Availability?
This can never work properly. If you have synchronous replication with "synchronous_commit = remote_apply",
COMMIT on the primary will wait until the information has been replayed on the standby. If you set
"max_standby_streaming_delay = -1", replication can be delayed indefinitely long in the event of a replication
conflict, so COMMIT can take arbitrarily long.
You can reduce replication conflicts (by setting "hot_standby_feedback = on" and by altering all tables to
set "vacuum_truncate = off"), but you will never get rid of them completely.
You will either have to accept stale ready on the standby (by setting "synchronous_commit" to something lower)
or you have to accept canceled queries on the standby (by lowering "max_standby_streaming_delay").
Yours,
Laurenz Albe