Re: min_safe_lsn column in pg_replication_slots view - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: min_safe_lsn column in pg_replication_slots view
Date
Msg-id 20200619.121356.2101874112165807899.horikyota.ntt@gmail.com
In response to Re: min_safe_lsn column in pg_replication_slots view  (Michael Paquier <michael@paquier.xyz>)
Responses Re: min_safe_lsn column in pg_replication_slots view
List pgsql-hackers
At Fri, 19 Jun 2020 10:39:58 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> On Fri, Jun 19, 2020 at 10:02:54AM +0900, Kyotaro Horiguchi wrote:
> > At Thu, 18 Jun 2020 18:18:37 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in 
> >> It is a little unclear to me how this or any proposed patch will solve
> >> the original problem reported by Fujii-San?  Basically, the problem
> >> arises because we don't have an interlock between when the checkpoint
> >> removes the WAL segment and the view tries to acquire the same.  Am, I
> >> missing something?
> 
> The proposed patch fetches the computation of the minimum LSN across
> all slots before taking ReplicationSlotControlLock so its value gets
> more lossy, and potentially older than what the slots actually
> include.  So it is an attempt to take the safest spot possible.

The minimum LSN (lastRemovedSegNo) is not protected by that lock, so
that makes no difference.

> Honestly, I find a bit silly the design to compute and use the same
> minimum LSN value for all the tuples returned by
> pg_get_replication_slots, and you can actually get a pretty good

I see it as silly.  I think I said upthread that it was originally the
distance to the point where the slot loses a segment.  That was
rejected, but simply removing it would leave us unable to estimate the
distance, so the column is there.

> estimate of that by emulating ReplicationSlotsComputeRequiredLSN()
> directly with what pg_replication_slot provides as we have a min()
> aggregate for pg_lsn.

min(lastRemovedSegNo) is the earliest value. It is enough to read it
once at the start and then use it for all slots.
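
To illustrate the two ideas in this exchange, here is a rough sketch, not
taken from any patch. It assumes restart_lsn values have already been
fetched from pg_replication_slots as text, and that a single reference LSN
(called oldest_kept_lsn here, a hypothetical name) was read once before
looping over the slots. The helper names are made up for illustration.

```python
from typing import Dict, Optional


def parse_lsn(text: str) -> int:
    """Convert a pg_lsn string such as '0/3000060' to a 64-bit byte position."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def format_lsn(value: int) -> str:
    """Render a byte position in the usual 'hi/lo' hexadecimal pg_lsn form."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


def min_restart_lsn(slots: Dict[str, Optional[str]]) -> Optional[str]:
    """Emulate min(restart_lsn) over the view; slots with no LSN are skipped."""
    lsns = [parse_lsn(lsn) for lsn in slots.values() if lsn is not None]
    return format_lsn(min(lsns)) if lsns else None


def distances(slots: Dict[str, Optional[str]], oldest_kept_lsn: str) -> Dict[str, int]:
    """Bytes between each slot's restart_lsn and one reference LSN.

    oldest_kept_lsn is a hypothetical input: the reference is parsed once
    and reused for every slot rather than being recomputed per slot.
    """
    base = parse_lsn(oldest_kept_lsn)
    return {
        name: parse_lsn(lsn) - base
        for name, lsn in slots.items()
        if lsn is not None
    }


# Example data is invented for illustration only.
slots = {"a": "0/3000060", "b": "1/10", "c": None}
lowest = min_restart_lsn(slots)
per_slot = distances(slots, "0/3000000")
```

Because the reference value is not protected by any lock, the distances
are only an estimate; the sketch shows the arithmetic, not the locking.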

> For these reasons, I think that we should remove for now this
> information from the view, and reconsider this part more carefully for
> 14~ with a clear definition of how much lossiness we are ready to
> accept for the information provided here, if necessary.  We could for
> example just have a separate SQL function that just grabs this value
> (or a more global SQL view for XLogCtl data that includes this data).

I think we need at least one of the "distance" above or min_safe_lsn
somewhere reachable by users.

> > I'm not sure, but I don't get the point of blocking WAL segment
> > removal until the view is completed.
> 
> We should really not do that anyway for a monitoring view.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


