Re: min_safe_lsn column in pg_replication_slots view - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: min_safe_lsn column in pg_replication_slots view
Date
Msg-id 3518948c-c175-c028-c2a4-78d85bd4c41f@oss.nttdata.com
Whole thread Raw
In response to Re: min_safe_lsn column in pg_replication_slots view  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: min_safe_lsn column in pg_replication_slots view
List pgsql-hackers

On 2020/06/23 10:10, Kyotaro Horiguchi wrote:
> At Mon, 22 Jun 2020 22:02:51 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>>
>>
>> On 2020/06/22 21:01, Amit Kapila wrote:
>>> On Mon, Jun 22, 2020 at 11:19 AM Michael Paquier <michael@paquier.xyz>
>>> wrote:
>>>>
>>>> On Sat, Jun 20, 2020 at 03:53:54PM +0900, Michael Paquier wrote:
>>>>> On Sat, Jun 20, 2020 at 09:45:52AM +0530, Amit Kapila wrote:
>>>>>> Isn't this information specific to checkpoints, so maybe better to
>>>>>> display in view pg_stat_bgwriter?
>>>>>
>>>>> Not sure that's a good match.  If we decide to expose that, a separate
>>>>> function returning a LSN based on the segment number from
>>>>> XLogGetLastRemovedSegno() sounds fine to me, like
>>>>> pg_wal_last_recycled_lsn().  Perhaps somebody has a better name in
>>>>> mind?
>>>>
>>>> I was thinking on this one for the last couple of days, and came up
>>>> with the name pg_wal_oldest_lsn(), as per the attached, traking the
>>>> oldest WAL location still available.
>>
>> Thanks for the patch!
>>
>> + <literal>NULL</literal> if no WAL segments have been removed since
>> +        startup.
>>
>> Isn't this confusing? I think that we should store the last removed
>> WAL segment to somewhere (e.g., pg_control) and restore it at
>> the startup, so that we can see the actual value even after the
>> startup.
>> Or we should scan pg_wal directory and find the "minimal" WAL segment
>> and return its LSN.
> 
> Running a separate scan on pg_wal at startup or first time the oldest
> WAL segno is referenced is something that was rejected before. But
> with the current behavior we don't find the last removed segment until
> any slot loses a segment if all WAL files are retained by a slot.

Because scanning pg_wal can be heavy operation especially when
max_wal_size is high and there are lots of WAL files? If so, it might
be better to save the value in pg_control as I told upthread.

However I'm not sure the use case of this function yet...

> FWIW
> I recently proposed a patch to find the oldest WAL file while trying
> removing old WAL files.
> 
>>> I feel such a function is good to have but I am not sure if there is a
>>> need to tie it with the removal of min_safe_lsn column.
>>
>> We should expose the LSN calculated from
>> "the current WAL LSN - max(wal_keep_segments * 16MB,
>> max_slot_wal_keep_size)"?
>> This indicates the minimum LSN of WAL files that are guaraneed to be
>> currently retained by wal_keep_segments and max_slot_wal_keep_size.
>> That is, if checkpoint occurs when restart_lsn of replication slot is
>> smaller than that minimum LSN, some required WAL files may be removed.
>> So DBAs can periodically monitor and compare restart_lsn and that
>> minimum
>> LSN. If they see frequently that difference of those LSN is very
>> small,
>> they can decide to increase wal_keep_segments or
>> max_slot_wal_keep_size,
>> to prevent required WAL files from being removed. Thought?
> 
> I'm not sure about the consensus here about showing that number in the
> view. It is similar to "remain" in the earlier versions of this patch
> but a bit simpler. It would be usable in a similar way. I can live
> with either numbers.

It's useless to display this value in each replication slot in the view.
I'm thinking to expose it as a function.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: min_safe_lsn column in pg_replication_slots view
Next
From: David Rowley
Date:
Subject: Re: Parallel Seq Scan vs kernel read ahead