Re: Time delayed LR (WAS Re: logical replication restrictions) - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Time delayed LR (WAS Re: logical replication restrictions)
Msg-id CALj2ACXePMrQF894xZH3zy4i-3VK-ufxvEdUAMRGg=iUcJ348w@mail.gmail.com
In response to RE: Time delayed LR (WAS Re: logical replication restrictions)  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Responses Re: Time delayed LR (WAS Re: logical replication restrictions)
List pgsql-hackers
On Fri, Apr 28, 2023 at 2:35 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear hackers,
>
> I rebased and refined my PoC. Followings are the changes:

Thanks.

Apologies for being late here. Please bear with me if I'm repeating
any of the discussed points.

I'm mainly trying to understand the production-level use case behind
this feature and, for that matter, behind recovery_min_apply_delay.
AFAIK, people try to keep replication lag as low as possible, i.e.
near zero, to avoid serious problems on production servers: WAL file
growth, blocked vacuum, crashes and downtime.

The proposed feature's commit message and the existing docs about
recovery_min_apply_delay justify the feature as 'offering opportunities
to correct data loss errors'. If someone wants to enable
recovery_min_apply_delay/min_apply_delay on production servers, I'm
guessing the values will be in hours, not minutes, for the simple
reason that when data loss occurs, the people/infrastructure monitoring
postgres first need to notice it and then need time to respond with
corrective actions to recover from it. While these parameters are
set, the primary server mustn't generate too much WAL, or it risks an
eventual crash/downtime. Who would really want to be so defensive
against somebody who may or may not accidentally cause data loss that
they enable these features on production servers (especially when
these can take down the primary server) and live happily with the
induced replication lag?

AFAIK, PITR is what people use for recovering from data loss errors in
production.

IMO, before we even go and implement the apply delay feature for
logical replication, it's worth understanding whether induced
replication lag has any production-level significance. We can also
debate whether providing apply delay hooks, with simple out-of-the-box
extensions built on them, is better than the core providing these
features.

Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


