Re: Throttling WAL inserts when the standby falls behind more than the configured replica_lag_in_bytes - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Throttling WAL inserts when the standby falls behind more than the configured replica_lag_in_bytes
Date
Msg-id CALj2ACUVeUg1DjXzRADfZT431snzEEQmmC4Ns_oRe0f0NDm9vA@mail.gmail.com
Whole thread Raw
List pgsql-hackers
On Thu, Dec 23, 2021 at 5:53 AM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:
>
> Hi Hackers,
>
> I am considering implementing RPO (recovery point objective) enforcement feature for Postgres where the WAL writes on
theprimary are stalled when the WAL distance between the primary and standby exceeds the configured
(replica_lag_in_bytes)threshold. This feature is useful particularly in the disaster recovery setups where primary and
standbyare in different regions and synchronous replication can't be set up for latency and performance reasons yet
requiressome level of RPO enforcement. 

+1 for the idea in general. However, blocking writes on primary seems
an extremely radical idea. The replicas can fall behind transiently at
times and blocking writes on the primary may stop applications failing
for these transient times. This is not a problem if the applications
have retry logic for the writes. How about blocking writes on primary
if the replicas fall behind the primary for a certain period of time?

> The idea here is to calculate the lag between the primary and the standby (Async?) server during XLogInsert and block
thecaller until the lag is less than the threshold value. We can calculate the max lag by iterating over
ReplicationSlotCtl->replication_slots.

The "falling behind" can also be quantified by the number of
write-transactions on the primary. I think it's good to have the users
choose what the "falling behind" means for them. We can have something
like the "recovery_target" param with different options name, xid,
time, lsn.

> If this is not something we don't want to do in the core, at least adding a hook for XlogInsert is of great value.

IMHO, this feature may not be needed by everyone, the hook-way seems
reasonable so that the postgres vendors can provide different
implementations (for instance they can write an extension that
implements this hook which can block writes on primary, write some log
messages, inform some service layer of the replicas falling behind the
primary etc.). If we were to have the hook in XLogInsert which gets
called so frequently or XLogInsert is a hot-path, the hook really
should do as little work as possible, otherwise the write operations
latency may increase.

> A few other scenarios I can think of with the hook are:
>
> Enforcing RPO as described above
> Enforcing rate limit and slow throttling when sync standby is falling behind (could be flush lag or replay lag)
> Transactional log rate governance - useful for cloud providers to provide SKU sizes based on allowed WAL writes.
>
> Thoughts?

The hook can help to achieve the above objectives but where to place
it and what parameters it should take as input (or what info it should
emit out of the server via the hook) are important too.

Having said all, the RPO feature can also be implemented outside of
the postgres, a simple implementation could be - get the primary
current wal lsn using pg_current_wal_lsn and all the replicas
restart_lsn using pg_replication_slot, if they differ by certain
amount, then issue ALTER SYSTEM SET READ ONLY command [1] on the
primary, this requires the connections to the server and proper access
rights. This feature can also be implemented as an extension (without
the hook) which doesn't require any connections to the server yet can
access the required info primary current_wal_lsn, restart_lsn of the
replication slots etc, but the RPO enforcement may not be immediate as
the server doesn't have any hooks in XLogInsert or some other area.

[1] - https://www.postgresql.org/message-id/CAAJ_b967uKBiW6gbHr5aPzweURYjEGv333FHVHxvJmMhanwHXA%40mail.gmail.com

Regards,
Bharath Rupireddy.



pgsql-hackers by date:

Previous
From: "wangw.fnst@fujitsu.com"
Date:
Subject: RE: Failed transaction statistics to measure the logical replication progress
Next
From: Andrey Lepikhov
Date:
Subject: Re: Add index scan progress to pg_stat_progress_vacuum