Thread: min_safe_lsn column in pg_replication_slots view

min_safe_lsn column in pg_replication_slots view

From

Fujii Masao

Date:

15 June 2020, 03:40:03

Hi,

Per the docs, pg_replication_slots.min_safe_lsn inedicates "the minimum
LSN currently available for walsenders". When I executed pg_walfile_name()
with min_safe_lsn, the function returned the name of the last removed
WAL file instead of minimum available WAL file name. This happens because
min_safe_lsn actually indicates the ending position (the boundary byte)
of the last removed WAL file.

I guess that some users would want to calculate the minimum available
WAL file name from min_safe_lsn by using pg_walfile_name(), but the result
would be incorrect. Isn't this confusing? min_safe_lsn should indicate
the bondary byte + 1, instead?

BTW, I just wonder why each row in pg_replication_slots needs to have
min_safe_lsn column? Basically min_safe_lsn should be the same between
every replication slots.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: min_safe_lsn column in pg_replication_slots view

From

Michael Paquier

Date:

15 June 2020, 04:44:31

On Mon, Jun 15, 2020 at 12:40:03PM +0900, Fujii Masao wrote:
> BTW, I just wonder why each row in pg_replication_slots needs to have
> min_safe_lsn column? Basically min_safe_lsn should be the same between
> every replication slots.

Indeed, that's confusing in its current shape.  I would buy putting
this value into pg_replication_slots if there were some consistency of
the data to worry about because of locking issues, but here this data
is controlled within info_lck, which is out of the the repslot LW
lock.  So I think that it is incorrect to put this data in this view
and that we should remove it, and that instead we had better push for
a system view that maps with the contents of XLogCtl.
--
Michael

Attachment

signature.asc

Re: min_safe_lsn column in pg_replication_slots view

From

Kyotaro Horiguchi

Date:

15 June 2020, 07:35:22

At Mon, 15 Jun 2020 13:44:31 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> On Mon, Jun 15, 2020 at 12:40:03PM +0900, Fujii Masao wrote:
> > BTW, I just wonder why each row in pg_replication_slots needs to have
> > min_safe_lsn column? Basically min_safe_lsn should be the same between
> > every replication slots.
> 
> Indeed, that's confusing in its current shape.  I would buy putting
> this value into pg_replication_slots if there were some consistency of
> the data to worry about because of locking issues, but here this data
> is controlled within info_lck, which is out of the the repslot LW
> lock.  So I think that it is incorrect to put this data in this view
> and that we should remove it, and that instead we had better push for
> a system view that maps with the contents of XLogCtl.

It was once the difference from the safe_lsn to restart_lsn which is
distinct among slots. Then it was changed to the safe_lsn. I agree to
the discussion above, but it is needed anywhere since no one can know
the margin until the slot goes to the "lost" state without it.  (Note
that currently even wal_status and min_safe_lsn can be inconsistent in
a line.)

Just for the need for table-consistency and in-line consistency, we
could just remember the result of XLogGetLastRemovedSegno() around
taking info_lock in the function.  That doesn't make a practical
difference but makes the view look consistent.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d6fe205eb4..f57d67a900 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9498,7 +9498,7 @@ CreateRestartPoint(int flags)
  * * WALAVAIL_INVALID_LSN means the slot hasn't been set to reserve WAL.
  */
 WALAvailability
-GetWALAvailability(XLogRecPtr targetLSN)
+GetWALAvailability(XLogRecPtr targetLSN, XLogSegNo last_removed_seg)
 {
     XLogRecPtr    currpos;        /* current write LSN */
     XLogSegNo    currSeg;        /* segid of currpos */
@@ -9524,7 +9524,7 @@ GetWALAvailability(XLogRecPtr targetLSN)
      * the first WAL segment file since startup, which causes the status being
      * wrong under certain abnormal conditions but that doesn't actually harm.
      */
-    oldestSeg = XLogGetLastRemovedSegno() + 1;
+    oldestSeg = last_removed_seg + 1;
 
     /* calculate oldest segment by max_wal_size and wal_keep_segments */
     XLByteToSeg(currpos, currSeg, wal_segment_size);
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 1b929a603e..bd2e3e84ed 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -243,6 +243,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
     MemoryContext per_query_ctx;
     MemoryContext oldcontext;
     int            slotno;
+    XLogSegNo    last_removed_seg;
 
     /* check to see if caller supports us returning a tuplestore */
     if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
@@ -272,6 +273,14 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
     rsinfo->setResult = tupstore;
     rsinfo->setDesc = tupdesc;
 
+    /*
+     * Remember the last removed segment at this point for the consistency in
+     * this table. Since there's no interlock between slot data and
+     * checkpointer, the segment can be removed in-between, but that doesn't
+     * make any practical difference.
+     */
+    last_removed_seg = XLogGetLastRemovedSegno();
+
     MemoryContextSwitchTo(oldcontext);
 
     LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
@@ -282,7 +291,6 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
         Datum        values[PG_GET_REPLICATION_SLOTS_COLS];
         bool        nulls[PG_GET_REPLICATION_SLOTS_COLS];
         WALAvailability walstate;
-        XLogSegNo    last_removed_seg;
         int            i;
 
         if (!slot->in_use)
@@ -342,7 +350,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
         else
             nulls[i++] = true;
 
-        walstate = GetWALAvailability(slot_contents.data.restart_lsn);
+        walstate = GetWALAvailability(slot_contents.data.restart_lsn,
+                                      last_removed_seg);
 
         switch (walstate)
         {
@@ -368,7 +377,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
 
         if (max_slot_wal_keep_size_mb >= 0 &&
             (walstate == WALAVAIL_NORMAL || walstate == WALAVAIL_RESERVED) &&
-            ((last_removed_seg = XLogGetLastRemovedSegno()) != 0))
+            (last_removed_seg != 0))
         {
             XLogRecPtr    min_safe_lsn;
 
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index e917dfe92d..c05a18148d 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -326,7 +326,8 @@ extern void ShutdownXLOG(int code, Datum arg);
 extern void InitXLOGAccess(void);
 extern void CreateCheckPoint(int flags);
 extern bool CreateRestartPoint(int flags);
-extern WALAvailability GetWALAvailability(XLogRecPtr restart_lsn);
+extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN,
+                                          XLogSegNo last_removed_seg);
 extern XLogRecPtr CalculateMaxmumSafeLSN(void);
 extern void XLogPutNextOid(Oid nextOid);
 extern XLogRecPtr XLogRestorePoint(const char *rpName);

Re: min_safe_lsn column in pg_replication_slots view

From

Fujii Masao

Date:

17 June 2020, 12:37:55

On 2020/06/15 16:35, Kyotaro Horiguchi wrote:
> At Mon, 15 Jun 2020 13:44:31 +0900, Michael Paquier <michael@paquier.xyz> wrote in
>> On Mon, Jun 15, 2020 at 12:40:03PM +0900, Fujii Masao wrote:
>>> BTW, I just wonder why each row in pg_replication_slots needs to have
>>> min_safe_lsn column? Basically min_safe_lsn should be the same between
>>> every replication slots.
>>
>> Indeed, that's confusing in its current shape.  I would buy putting
>> this value into pg_replication_slots if there were some consistency of
>> the data to worry about because of locking issues, but here this data
>> is controlled within info_lck, which is out of the the repslot LW
>> lock.  So I think that it is incorrect to put this data in this view
>> and that we should remove it, and that instead we had better push for
>> a system view that maps with the contents of XLogCtl.

Agreed. But as you know it's too late to do that for v13...
So firstly I'd like to fix the issues in pg_replication_slots view,
and then we can improve the things later for v14 if necessary.

> It was once the difference from the safe_lsn to restart_lsn which is
> distinct among slots. Then it was changed to the safe_lsn. I agree to
> the discussion above, but it is needed anywhere since no one can know
> the margin until the slot goes to the "lost" state without it.  (Note
> that currently even wal_status and min_safe_lsn can be inconsistent in
> a line.)
> 
> Just for the need for table-consistency and in-line consistency, we
> could just remember the result of XLogGetLastRemovedSegno() around
> taking info_lock in the function.  That doesn't make a practical
> difference but makes the view look consistent.

Agreed. Thanks for the patch. Here are the review comments:

Not only the last removed segment but also current write position
should be obtain at the beginning of pg_get_replication_slots()
and should be given to GetWALAvailability(), for the consistency?

Even after applying your patch, min_safe_lsn is calculated for
each slot even though the calculated result is always the same.
Which is a bit waste of cycle. We should calculate min_safe_lsn
at the beginning and use it for each slot?

            XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 0,

Isn't it better to use 1 as the second argument of the above,
in order to address the issue that I reported upthread?
Otherwise, the WAL file name that pg_walfile_name(min_safe_lsn) returns
would be confusing.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: min_safe_lsn column in pg_replication_slots view

From

Kyotaro Horiguchi

Date:

18 June 2020, 06:22:16

Thanks for the comments.

At Wed, 17 Jun 2020 21:37:55 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> On 2020/06/15 16:35, Kyotaro Horiguchi wrote:
> > At Mon, 15 Jun 2020 13:44:31 +0900, Michael Paquier
> > <michael@paquier.xyz> wrote in
> >> On Mon, Jun 15, 2020 at 12:40:03PM +0900, Fujii Masao wrote:
> >>> BTW, I just wonder why each row in pg_replication_slots needs to have
> >>> min_safe_lsn column? Basically min_safe_lsn should be the same between
> >>> every replication slots.
> >>
> >> Indeed, that's confusing in its current shape.  I would buy putting
> >> this value into pg_replication_slots if there were some consistency of
> >> the data to worry about because of locking issues, but here this data
> >> is controlled within info_lck, which is out of the the repslot LW
> >> lock.  So I think that it is incorrect to put this data in this view
> >> and that we should remove it, and that instead we had better push for
> >> a system view that maps with the contents of XLogCtl.
> 
> Agreed. But as you know it's too late to do that for v13...
> So firstly I'd like to fix the issues in pg_replication_slots view,
> and then we can improve the things later for v14 if necessary.
> 
> 
> > It was once the difference from the safe_lsn to restart_lsn which is
> > distinct among slots. Then it was changed to the safe_lsn. I agree to
> > the discussion above, but it is needed anywhere since no one can know
> > the margin until the slot goes to the "lost" state without it.  (Note
> > that currently even wal_status and min_safe_lsn can be inconsistent in
> > a line.)
> > Just for the need for table-consistency and in-line consistency, we
> > could just remember the result of XLogGetLastRemovedSegno() around
> > taking info_lock in the function.  That doesn't make a practical
> > difference but makes the view look consistent.
> 
> Agreed. Thanks for the patch. Here are the review comments:
> 
> 
> Not only the last removed segment but also current write position
> should be obtain at the beginning of pg_get_replication_slots()
> and should be given to GetWALAvailability(), for the consistency?

You are right.  Though I faintly thought that I didn't need that since
WriteRecPtr doesn't move by so wide steps as removed_segment, actually
it moves.

> Even after applying your patch, min_safe_lsn is calculated for
> each slot even though the calculated result is always the same.
> Which is a bit waste of cycle. We should calculate min_safe_lsn
> at the beginning and use it for each slot?

Agreed. That may results in a wastful calculation but it's better than
repeated wasteful calculations.

>             XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 0,
> 
> Isn't it better to use 1 as the second argument of the above,
> in order to address the issue that I reported upthread?
> Otherwise, the WAL file name that pg_walfile_name(min_safe_lsn)
> returns
> would be confusing.

Mmm. pg_walfile_name seems too specialize to
pg_stop_backup(). (pg_walfile_name_offset() returns wrong result for
segment boundaries.)  I'm not willing to do that only to follow such
suspicious(?) specification, but surely it would practically be better
doing that. Please find the attached first patch.  I found that
there's no reason to hide min_safe_lsn when wal_status has certain
values. So I changed it to be shown always.

By the way, I noticed that when a replication slot reserves all
existing WAL segments, checkpoint cannot remove a file and
lastRemovedSegment is left being 0. The second attached forces
RemoveOldXlogFiles to initialize the variable even when no WAL
segments are removed.  It puts no additional loads on file system
since the directory is scanned anyway.  My old proposal to
unconditionally initialize it separately from checkpoint was rejected,
but I think this is acceptable.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 6742e3b31ac53ef54458d64e34feeac7d4c710d2 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Thu, 18 Jun 2020 13:36:23 +0900
Subject: [PATCH v2 1/2] Make pg_stat_replication view consistent

min_safe_lsn is not consistent within the result of a query. Make it
consistent. On the way fixing that min_safe_lsn is changed to show the
second byte of a segment instead of first byte in order to users can
get the intended segment file name from pg_walfile_name.
---
 src/backend/access/transam/xlog.c   | 15 +++++++------
 src/backend/replication/slotfuncs.c | 35 ++++++++++++++++++++---------
 src/include/access/xlog.h           |  4 +++-
 3 files changed, 35 insertions(+), 19 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 55cac186dc..940f5fcb18 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9484,6 +9484,9 @@ CreateRestartPoint(int flags)
  * Report availability of WAL for the given target LSN
  *        (typically a slot's restart_lsn)
  *
+ * currWriteLSN and lastRemovedSeg is the results of XLogGetLastRemovedSegno()
+ * and GetXLogWriteRecPtr() respectively at the referring time.
+ *
  * Returns one of the following enum values:
  * * WALAVAIL_NORMAL means targetLSN is available because it is in the range
  *   of max_wal_size.
@@ -9498,9 +9501,9 @@ CreateRestartPoint(int flags)
  * * WALAVAIL_INVALID_LSN means the slot hasn't been set to reserve WAL.
  */
 WALAvailability
-GetWALAvailability(XLogRecPtr targetLSN)
+GetWALAvailability(XLogRecPtr targetLSN, XLogRecPtr currWriteLSN,
+                   XLogSegNo lastRemovedSeg)
 {
-    XLogRecPtr    currpos;        /* current write LSN */
     XLogSegNo    currSeg;        /* segid of currpos */
     XLogSegNo    targetSeg;        /* segid of targetLSN */
     XLogSegNo    oldestSeg;        /* actual oldest segid */
@@ -9513,21 +9516,19 @@ GetWALAvailability(XLogRecPtr targetLSN)
     if (XLogRecPtrIsInvalid(targetLSN))
         return WALAVAIL_INVALID_LSN;
 
-    currpos = GetXLogWriteRecPtr();
-
     /* calculate oldest segment currently needed by slots */
     XLByteToSeg(targetLSN, targetSeg, wal_segment_size);
-    KeepLogSeg(currpos, &oldestSlotSeg);
+    KeepLogSeg(currWriteLSN, &oldestSlotSeg);
 
     /*
      * Find the oldest extant segment file. We get 1 until checkpoint removes
      * the first WAL segment file since startup, which causes the status being
      * wrong under certain abnormal conditions but that doesn't actually harm.
      */
-    oldestSeg = XLogGetLastRemovedSegno() + 1;
+    oldestSeg = lastRemovedSeg + 1;
 
     /* calculate oldest segment by max_wal_size and wal_keep_segments */
-    XLByteToSeg(currpos, currSeg, wal_segment_size);
+    XLByteToSeg(currWriteLSN, currSeg, wal_segment_size);
     keepSegs = ConvertToXSegs(Max(max_wal_size_mb, wal_keep_segments),
                               wal_segment_size) + 1;
 
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 1b929a603e..063c6dd435 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -243,6 +243,9 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
     MemoryContext per_query_ctx;
     MemoryContext oldcontext;
     int            slotno;
+    XLogSegNo    last_removed_seg;
+    XLogRecPtr    curr_write_lsn;
+    XLogRecPtr    min_safe_lsn = 0;
 
     /* check to see if caller supports us returning a tuplestore */
     if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
@@ -272,6 +275,24 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
     rsinfo->setResult = tupstore;
     rsinfo->setDesc = tupdesc;
 
+    /*
+     * Remember the last removed segment and current WAL write LSN at this
+     * point for the consistency in this table. Since there's no interlock
+     * between slot data and checkpointer, the segment can be removed
+     * in-between, but that doesn't make any practical difference.  Also we
+     * calculate safe_min_lsn here as the same value is shown for all slots.
+     */
+    last_removed_seg = XLogGetLastRemovedSegno();
+    curr_write_lsn = GetXLogWriteRecPtr();
+
+    /*
+     * Show the next byte after segment boundary as min_safe_lsn so that
+     * pg_walfile_name() returns the correct file name for the value.
+     */
+    if (max_slot_wal_keep_size_mb >= 0 && last_removed_seg != 0)
+        XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 1,
+                                wal_segment_size, min_safe_lsn);
+
     MemoryContextSwitchTo(oldcontext);
 
     LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
@@ -282,7 +303,6 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
         Datum        values[PG_GET_REPLICATION_SLOTS_COLS];
         bool        nulls[PG_GET_REPLICATION_SLOTS_COLS];
         WALAvailability walstate;
-        XLogSegNo    last_removed_seg;
         int            i;
 
         if (!slot->in_use)
@@ -342,7 +362,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
         else
             nulls[i++] = true;
 
-        walstate = GetWALAvailability(slot_contents.data.restart_lsn);
+        walstate = GetWALAvailability(slot_contents.data.restart_lsn,
+                                      curr_write_lsn, last_removed_seg);
 
         switch (walstate)
         {
@@ -366,16 +387,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
                 elog(ERROR, "invalid walstate: %d", (int) walstate);
         }
 
-        if (max_slot_wal_keep_size_mb >= 0 &&
-            (walstate == WALAVAIL_NORMAL || walstate == WALAVAIL_RESERVED) &&
-            ((last_removed_seg = XLogGetLastRemovedSegno()) != 0))
-        {
-            XLogRecPtr    min_safe_lsn;
-
-            XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 0,
-                                    wal_segment_size, min_safe_lsn);
+        if (min_safe_lsn != 0)
             values[i++] = Int64GetDatum(min_safe_lsn);
-        }
         else
             nulls[i++] = true;
 
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index e917dfe92d..bbe00e7195 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -326,7 +326,9 @@ extern void ShutdownXLOG(int code, Datum arg);
 extern void InitXLOGAccess(void);
 extern void CreateCheckPoint(int flags);
 extern bool CreateRestartPoint(int flags);
-extern WALAvailability GetWALAvailability(XLogRecPtr restart_lsn);
+extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN,
+                                          XLogRecPtr currWriteLSN,
+                                          XLogSegNo last_removed_seg);
 extern XLogRecPtr CalculateMaxmumSafeLSN(void);
 extern void XLogPutNextOid(Oid nextOid);
 extern XLogRecPtr XLogRestorePoint(const char *rpName);
-- 
2.18.4

From 10425b3c062ed3972c0ad067adb7c14c5dca43c8 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Thu, 18 Jun 2020 14:49:48 +0900
Subject: [PATCH v2 2/2] Forcibly initialize lastRemovedSegNo at the first
 checkpoint

If we have a replication slot retaining all existing WAL segments, a
checkpoint doesn't initialize lastRemovedSegNo. That means
pg_replication_slots can not show min_safe_lsn until any slot loses a
segment. Forcibly initialize it even if no WAL segments are removed at
the first checkpoint.
---
 src/backend/access/transam/xlog.c | 33 +++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 940f5fcb18..282bac32ab 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4025,6 +4025,16 @@ RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr lastredoptr, XLogRecPtr endptr)
     DIR           *xldir;
     struct dirent *xlde;
     char        lastoff[MAXFNAMELEN];
+    char        oldest[MAXFNAMELEN];
+    bool        remember_oldest = false;
+
+    /*
+     * If we haven't set the initial last removed segno, force to set it even
+     * if we wouldn't have removed any segments.
+     */
+    oldest[0] = 0;
+    if (XLogGetLastRemovedSegno() == 0)
+        remember_oldest = true;
 
     /*
      * Construct a filename of the last segment to be kept. The timeline ID
@@ -4062,10 +4072,33 @@ RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr lastredoptr, XLogRecPtr endptr)
             {
                 /* Update the last removed location in shared memory first */
                 UpdateLastRemovedPtr(xlde->d_name);
+                remember_oldest = false;
 
                 RemoveXlogFile(xlde->d_name, lastredoptr, endptr);
             }
         }
+        /* Remember the oldest file left in the directotry */
+        else if(remember_oldest &&
+                (oldest[0] == 0 || strcmp(xlde->d_name + 8, oldest + 8) < 0))
+            strncpy(oldest, xlde->d_name, MAXFNAMELEN);
+    }
+
+    /*
+     * We haven't initialize the pointer, initialize it.
+     * The last removed pointer is the oldest existing segment minus 1.
+     */
+    if (remember_oldest && oldest[0] != 0)
+    {
+        uint32        tli;
+        XLogSegNo    segno;
+
+        XLogFromFileName(oldest, &tli, &segno, wal_segment_size);
+
+        if (segno > 1)
+        {
+            XLogFileName(oldest, tli, segno - 1, wal_segment_size);
+            UpdateLastRemovedPtr(oldest);
+        }
     }
 
     FreeDir(xldir);
-- 
2.18.4

Re: min_safe_lsn column in pg_replication_slots view

From

Amit Kapila

Date:

18 June 2020, 12:48:37

On Thu, Jun 18, 2020 at 11:52 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Wed, 17 Jun 2020 21:37:55 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
> > On 2020/06/15 16:35, Kyotaro Horiguchi wrote:
> > Isn't it better to use 1 as the second argument of the above,
> > in order to address the issue that I reported upthread?
> > Otherwise, the WAL file name that pg_walfile_name(min_safe_lsn)
> > returns
> > would be confusing.
>
> Mmm. pg_walfile_name seems too specialize to
> pg_stop_backup(). (pg_walfile_name_offset() returns wrong result for
> segment boundaries.)  I'm not willing to do that only to follow such
> suspicious(?) specification, but surely it would practically be better
> doing that. Please find the attached first patch.
>

It is a little unclear to me how this or any proposed patch will solve
the original problem reported by Fujii-San?  Basically, the problem
arises because we don't have an interlock between when the checkpoint
removes the WAL segment and the view tries to acquire the same.  Am, I
missing something?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: min_safe_lsn column in pg_replication_slots view

From

Kyotaro Horiguchi

Date:

19 June 2020, 01:02:54

At Thu, 18 Jun 2020 18:18:37 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in 
> On Thu, Jun 18, 2020 at 11:52 AM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> >
> > At Wed, 17 Jun 2020 21:37:55 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
> > > On 2020/06/15 16:35, Kyotaro Horiguchi wrote:
> > > Isn't it better to use 1 as the second argument of the above,
> > > in order to address the issue that I reported upthread?
> > > Otherwise, the WAL file name that pg_walfile_name(min_safe_lsn)
> > > returns
> > > would be confusing.
> >
> > Mmm. pg_walfile_name seems too specialize to
> > pg_stop_backup(). (pg_walfile_name_offset() returns wrong result for
> > segment boundaries.)  I'm not willing to do that only to follow such
> > suspicious(?) specification, but surely it would practically be better
> > doing that. Please find the attached first patch.
> >
> 
> It is a little unclear to me how this or any proposed patch will solve
> the original problem reported by Fujii-San?  Basically, the problem
> arises because we don't have an interlock between when the checkpoint
> removes the WAL segment and the view tries to acquire the same.  Am, I
> missing something?

I'm not sure, but I don't get the point of blocking WAL segment
removal until the view is completed. The said columns of the view are
just for monitoring, which needs an information snapshot seemingly
taken at a certain time. And InvalidateObsoleteReplicationSlots kills
walsenders using lastRemovedSegNo of a different time.  The two are
independent each other.

Also the patch changes min_safe_lsn to show an LSN at segment boundary
+ 1.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: min_safe_lsn column in pg_replication_slots view

From

Michael Paquier

Date:

19 June 2020, 01:39:58

On Fri, Jun 19, 2020 at 10:02:54AM +0900, Kyotaro Horiguchi wrote:
> At Thu, 18 Jun 2020 18:18:37 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
>> It is a little unclear to me how this or any proposed patch will solve
>> the original problem reported by Fujii-San?  Basically, the problem
>> arises because we don't have an interlock between when the checkpoint
>> removes the WAL segment and the view tries to acquire the same.  Am, I
>> missing something?

The proposed patch fetches the computation of the minimum LSN across
all slots before taking ReplicationSlotControlLock so its value gets
more lossy, and potentially older than what the slots actually
include.  So it is an attempt to take the safest spot possible.

Honestly, I find a bit silly the design to compute and use the same
minimum LSN value for all the tuples returned by
pg_get_replication_slots, and you can actually get a pretty good
estimate of that by emulating ReplicationSlotsComputeRequiredLSN()
directly with what pg_replication_slot provides as we have a min()
aggregate for pg_lsn.

For these reasons, I think that we should remove for now this
information from the view, and reconsider this part more carefully for
14~ with a clear definition of how much lossiness we are ready to
accept for the information provided here, if necessary.  We could for
example just have a separate SQL function that just grabs this value
(or a more global SQL view for XLogCtl data that includes this data).

> I'm not sure, but I don't get the point of blocking WAL segment
> removal until the view is completed.

We should really not do that anyway for a monitoring view.
--
Michael

Attachment

signature.asc

Re: min_safe_lsn column in pg_replication_slots view

From

Kyotaro Horiguchi

Date:

19 June 2020, 03:13:56

At Fri, 19 Jun 2020 10:39:58 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> On Fri, Jun 19, 2020 at 10:02:54AM +0900, Kyotaro Horiguchi wrote:
> > At Thu, 18 Jun 2020 18:18:37 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in 
> >> It is a little unclear to me how this or any proposed patch will solve
> >> the original problem reported by Fujii-San?  Basically, the problem
> >> arises because we don't have an interlock between when the checkpoint
> >> removes the WAL segment and the view tries to acquire the same.  Am, I
> >> missing something?
> 
> The proposed patch fetches the computation of the minimum LSN across
> all slots before taking ReplicationSlotControlLock so its value gets
> more lossy, and potentially older than what the slots actually
> include.  So it is an attempt to take the safest spot possible.

Minimum LSN (lastRemovedSegNo) is not protected by the lock. That
makes no defference.

> Honestly, I find a bit silly the design to compute and use the same
> minimum LSN value for all the tuples returned by
> pg_get_replication_slots, and you can actually get a pretty good

I see it as silly.  I think I said upthread that it was the distance
to the point where the slot loses a segment, and it was rejected but
just removing it makes us unable to estimate the distance so it is
there.

> estimate of that by emulating ReplicationSlotsComputeRequiredLSN()
> directly with what pg_replication_slot provides as we have a min()
> aggregate for pg_lsn.

min(lastRemovedSegNo) is the earliest value. It is enough to read it
at the first then use it in all slots.

> For these reasons, I think that we should remove for now this
> information from the view, and reconsider this part more carefully for
> 14~ with a clear definition of how much lossiness we are ready to
> accept for the information provided here, if necessary.  We could for
> example just have a separate SQL function that just grabs this value
> (or a more global SQL view for XLogCtl data that includes this data).

I think, we need at least one of the "distance" above or min_safe_lsn
in anywhere reachable from users.

> > I'm not sure, but I don't get the point of blocking WAL segment
> > removal until the view is completed.
> 
> We should really not do that anyway for a monitoring view.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: min_safe_lsn column in pg_replication_slots view

From

Amit Kapila

Date:

19 June 2020, 03:29:48

On Fri, Jun 19, 2020 at 6:32 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Thu, 18 Jun 2020 18:18:37 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
> > On Thu, Jun 18, 2020 at 11:52 AM Kyotaro Horiguchi
> > <horikyota.ntt@gmail.com> wrote:
> > >
> > > At Wed, 17 Jun 2020 21:37:55 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
> > > > On 2020/06/15 16:35, Kyotaro Horiguchi wrote:
> > > > Isn't it better to use 1 as the second argument of the above,
> > > > in order to address the issue that I reported upthread?
> > > > Otherwise, the WAL file name that pg_walfile_name(min_safe_lsn)
> > > > returns
> > > > would be confusing.
> > >
> > > Mmm. pg_walfile_name seems too specialize to
> > > pg_stop_backup(). (pg_walfile_name_offset() returns wrong result for
> > > segment boundaries.)  I'm not willing to do that only to follow such
> > > suspicious(?) specification, but surely it would practically be better
> > > doing that. Please find the attached first patch.
> > >
> >
> > It is a little unclear to me how this or any proposed patch will solve
> > the original problem reported by Fujii-San?  Basically, the problem
> > arises because we don't have an interlock between when the checkpoint
> > removes the WAL segment and the view tries to acquire the same.  Am, I
> > missing something?
>
> I'm not sure, but I don't get the point of blocking WAL segment
> removal until the view is completed.
>

I am not suggesting to do that.

> The said columns of the view are
> just for monitoring, which needs an information snapshot seemingly
> taken at a certain time. And InvalidateObsoleteReplicationSlots kills
> walsenders using lastRemovedSegNo of a different time.  The two are
> independent each other.
>
> Also the patch changes min_safe_lsn to show an LSN at segment boundary
> + 1.
>

But aren't we doing last_removed_seg+1 even without the patch?  See code below

- {
- XLogRecPtr min_safe_lsn;
-
- XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 0,
- wal_segment_size, min_safe_lsn);



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: min_safe_lsn column in pg_replication_slots view

From

Amit Kapila

Date:

19 June 2020, 03:39:03

On Fri, Jun 19, 2020 at 8:44 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Fri, 19 Jun 2020 10:39:58 +0900, Michael Paquier <michael@paquier.xyz> wrote in
> > Honestly, I find a bit silly the design to compute and use the same
> > minimum LSN value for all the tuples returned by
> > pg_get_replication_slots, and you can actually get a pretty good
>
> I see it as silly.  I think I said upthread that it was the distance
> to the point where the slot loses a segment, and it was rejected but
> just removing it makes us unable to estimate the distance so it is
> there.
>

IIUC, the value of min_safe_lsn will lesser than restart_lsn, so one
can compute the difference of those to see how much ahead the
replication slot's restart_lsn is from min_safe_lsn but still it is
not clear how user will make any use of it.  Can you please explain
how the distance you are talking about is useful to users or anyone?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: min_safe_lsn column in pg_replication_slots view

From

Kyotaro Horiguchi

Date:

19 June 2020, 05:35:11

At Fri, 19 Jun 2020 09:09:03 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in 
> On Fri, Jun 19, 2020 at 8:44 AM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> >
> > At Fri, 19 Jun 2020 10:39:58 +0900, Michael Paquier <michael@paquier.xyz> wrote in
> > > Honestly, I find a bit silly the design to compute and use the same
> > > minimum LSN value for all the tuples returned by
> > > pg_get_replication_slots, and you can actually get a pretty good
> >
> > I see it as silly.  I think I said upthread that it was the distance
> > to the point where the slot loses a segment, and it was rejected but
> > just removing it makes us unable to estimate the distance so it is
> > there.
> >
> 
> IIUC, the value of min_safe_lsn will lesser than restart_lsn, so one
> can compute the difference of those to see how much ahead the
> replication slot's restart_lsn is from min_safe_lsn but still it is
> not clear how user will make any use of it.  Can you please explain
> how the distance you are talking about is useful to users or anyone?

When max_slot_wal_keep_size is set, the slot may retain up to as many
as that amount of old WAL segments then suddenly loses the oldest
segments.  *I* thought that I would use it in an HA cluster tool to
inform users about the remaining time (not literally, of course) a
disconnected standy is allowed diconnected.  Of course even if some
segments have been lost, they could be copied from the primary's
archive so that's not critical in theory.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: min_safe_lsn column in pg_replication_slots view

From

Kyotaro Horiguchi

Date:

19 June 2020, 05:42:45

At Fri, 19 Jun 2020 08:59:48 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in 
> > > > Mmm. pg_walfile_name seems too specialize to
> > > > pg_stop_backup(). (pg_walfile_name_offset() returns wrong result for
> > > > segment boundaries.)  I'm not willing to do that only to follow such
> > > > suspicious(?) specification, but surely it would practically be better
> > > > doing that. Please find the attached first patch.
> > > >
> > >
> > > It is a little unclear to me how this or any proposed patch will solve
> > > the original problem reported by Fujii-San?  Basically, the problem
> > > arises because we don't have an interlock between when the checkpoint
> > > removes the WAL segment and the view tries to acquire the same.  Am, I
> > > missing something?
> >
> > I'm not sure, but I don't get the point of blocking WAL segment
> > removal until the view is completed.
> >
> 
> I am not suggesting to do that.
> 
> > The said columns of the view are
> > just for monitoring, which needs an information snapshot seemingly
> > taken at a certain time. And InvalidateObsoleteReplicationSlots kills
> > walsenders using lastRemovedSegNo of a different time.  The two are
> > independent each other.
> >
> > Also the patch changes min_safe_lsn to show an LSN at segment boundary
> > + 1.
> >
> 
> But aren't we doing last_removed_seg+1 even without the patch?  See code below
> 
> - {
> - XLogRecPtr min_safe_lsn;
> -
> - XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 0,
> - wal_segment_size, min_safe_lsn);

It is at the beginning byte of the *next* segment. Fujii-san told that
it should be the next byte of it, namely
"XLogSegNoOffsetToRecPtr(last_removed_seg + 1, *1*,", and the patch
calculates as that. It adds the follows instead.

+    if (max_slot_wal_keep_size_mb >= 0 && last_removed_seg != 0)
+        XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 1,
+                                wal_segment_size, min_safe_lsn);

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: min_safe_lsn column in pg_replication_slots view

From

Fujii Masao

Date:

19 June 2020, 07:13:27


On 2020/06/19 10:39, Michael Paquier wrote:
> On Fri, Jun 19, 2020 at 10:02:54AM +0900, Kyotaro Horiguchi wrote:
>> At Thu, 18 Jun 2020 18:18:37 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
>>> It is a little unclear to me how this or any proposed patch will solve
>>> the original problem reported by Fujii-San?  Basically, the problem
>>> arises because we don't have an interlock between when the checkpoint
>>> removes the WAL segment and the view tries to acquire the same.  Am, I
>>> missing something?
> 
> The proposed patch fetches the computation of the minimum LSN across
> all slots before taking ReplicationSlotControlLock so its value gets
> more lossy, and potentially older than what the slots actually
> include.  So it is an attempt to take the safest spot possible.
> 
> Honestly, I find a bit silly the design to compute and use the same
> minimum LSN value for all the tuples returned by
> pg_get_replication_slots, and you can actually get a pretty good
> estimate of that by emulating ReplicationSlotsComputeRequiredLSN()
> directly with what pg_replication_slot provides as we have a min()
> aggregate for pg_lsn.
> 
> For these reasons, I think that we should remove for now this
> information from the view, and reconsider this part more carefully for
> 14~ with a clear definition of how much lossiness we are ready to
> accept for the information provided here, if necessary.

Agreed. But isn't it too late to remove the columns (i.e., change
the catalog) for v13? Because v13 beta1 was already released.
IIUC the catalog should not be changed since beta1 release so that
users can upgrade PostgreSQL without initdb.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: min_safe_lsn column in pg_replication_slots view

From

Michael Paquier

Date:

19 June 2020, 07:36:09

On Fri, Jun 19, 2020 at 04:13:27PM +0900, Fujii Masao wrote:
> Agreed. But isn't it too late to remove the columns (i.e., change
> the catalog) for v13? Because v13 beta1 was already released.
> IIUC the catalog should not be changed since beta1 release so that
> users can upgrade PostgreSQL without initdb.

Catalog bumps have happened in the past between beta versions:
git log -p REL_12_BETA1..REL_12_BETA2 src/include/catalog/catversion.h
git log -p REL_11_BETA1..REL_11_BETA2 src/include/catalog/catversion.h
git log -p REL_10_BETA1..REL_10_BETA2 src/include/catalog/catversion.h

So we usually avoid to do that between betas, but my take here is that
a catalog bump is better than regretting a change we may have to live
with after the release is sealed.
--
Michael

Attachment

signature.asc

Re: min_safe_lsn column in pg_replication_slots view

From

Kyotaro Horiguchi

Date:

19 June 2020, 07:43:04

At Fri, 19 Jun 2020 16:36:09 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> On Fri, Jun 19, 2020 at 04:13:27PM +0900, Fujii Masao wrote:
> > Agreed. But isn't it too late to remove the columns (i.e., change
> > the catalog) for v13? Because v13 beta1 was already released.
> > IIUC the catalog should not be changed since beta1 release so that
> > users can upgrade PostgreSQL without initdb.
> 
> Catalog bumps have happened in the past between beta versions:
> git log -p REL_12_BETA1..REL_12_BETA2 src/include/catalog/catversion.h
> git log -p REL_11_BETA1..REL_11_BETA2 src/include/catalog/catversion.h
> git log -p REL_10_BETA1..REL_10_BETA2 src/include/catalog/catversion.h
> 
> So we usually avoid to do that between betas, but my take here is that
> a catalog bump is better than regretting a change we may have to live
> with after the release is sealed.

FWIW if we decide that it is really useless, I agree to remove it now.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: min_safe_lsn column in pg_replication_slots view

From

Fujii Masao

Date:

19 June 2020, 08:34:01


On 2020/06/19 16:43, Kyotaro Horiguchi wrote:
> At Fri, 19 Jun 2020 16:36:09 +0900, Michael Paquier <michael@paquier.xyz> wrote in
>> On Fri, Jun 19, 2020 at 04:13:27PM +0900, Fujii Masao wrote:
>>> Agreed. But isn't it too late to remove the columns (i.e., change
>>> the catalog) for v13? Because v13 beta1 was already released.
>>> IIUC the catalog should not be changed since beta1 release so that
>>> users can upgrade PostgreSQL without initdb.
>>
>> Catalog bumps have happened in the past between beta versions:
>> git log -p REL_12_BETA1..REL_12_BETA2 src/include/catalog/catversion.h
>> git log -p REL_11_BETA1..REL_11_BETA2 src/include/catalog/catversion.h
>> git log -p REL_10_BETA1..REL_10_BETA2 src/include/catalog/catversion.h
>>
>> So we usually avoid to do that between betas, but my take here is that
>> a catalog bump is better than regretting a change we may have to live
>> with after the release is sealed.

Sounds reasonable.

Regards,


-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: min_safe_lsn column in pg_replication_slots view

From

Michael Paquier

Date:

19 June 2020, 12:15:52

On Fri, Jun 19, 2020 at 05:34:01PM +0900, Fujii Masao wrote:
> On 2020/06/19 16:43, Kyotaro Horiguchi wrote:
>> At Fri, 19 Jun 2020 16:36:09 +0900, Michael Paquier <michael@paquier.xyz> wrote in
>>> So we usually avoid to do that between betas, but my take here is that
>>> a catalog bump is better than regretting a change we may have to live
>>> with after the release is sealed.
>
> Sounds reasonable.

If we want to make this happen, I am afraid that the time is short as
beta2 is planned for next week.  As the version will be likely tagged
by Monday US time, it would be good to get this addressed within 48
hours to give some room to the buildfarm to react.  Attached is a
straight-forward proposal of patch.  Any thoughts?

(The change in catversion.h is a self-reminder.)
--
Michael

Attachment

Re: min_safe_lsn column in pg_replication_slots view

From

Kyotaro Horiguchi

Date:

19 June 2020, 12:53:29

At Fri, 19 Jun 2020 21:15:52 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> On Fri, Jun 19, 2020 at 05:34:01PM +0900, Fujii Masao wrote:
> > On 2020/06/19 16:43, Kyotaro Horiguchi wrote:
> >> At Fri, 19 Jun 2020 16:36:09 +0900, Michael Paquier <michael@paquier.xyz> wrote in
> >>> So we usually avoid to do that between betas, but my take here is that
> >>> a catalog bump is better than regretting a change we may have to live
> >>> with after the release is sealed.
> > 
> > Sounds reasonable.
> 
> If we want to make this happen, I am afraid that the time is short as
> beta2 is planned for next week.  As the version will be likely tagged
> by Monday US time, it would be good to get this addressed within 48
> hours to give some room to the buildfarm to react.  Attached is a
> straight-forward proposal of patch.  Any thoughts?
> 
> (The change in catversion.h is a self-reminder.)

Thanks for the patch.

As a whole it contains all needed for ripping off the min_safe_lsn.
Some items in the TAP test gets coarse but none of them lose
significance. Compiles almost cleanly and passes all tests including
TAP test.

The variable last_removed_seg in slotfuncs.c:285 is left alone but no
longer used after applying this patch. It should be removed as well.

Other than that the patch looks good to me.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: min_safe_lsn column in pg_replication_slots view

From

Fujii Masao

Date:

20 June 2020, 00:39:13


On 2020/06/19 21:15, Michael Paquier wrote:
> On Fri, Jun 19, 2020 at 05:34:01PM +0900, Fujii Masao wrote:
>> On 2020/06/19 16:43, Kyotaro Horiguchi wrote:
>>> At Fri, 19 Jun 2020 16:36:09 +0900, Michael Paquier <michael@paquier.xyz> wrote in
>>>> So we usually avoid to do that between betas, but my take here is that
>>>> a catalog bump is better than regretting a change we may have to live
>>>> with after the release is sealed.
>>
>> Sounds reasonable.
> 
> If we want to make this happen, I am afraid that the time is short as
> beta2 is planned for next week.  As the version will be likely tagged
> by Monday US time, it would be good to get this addressed within 48
> hours to give some room to the buildfarm to react.  Attached is a
> straight-forward proposal of patch.  Any thoughts?

It's better if we can do that. But I think that we should hear Alvaro's opinion
about this before rushing to push the patch. Even if we miss beta2 as the result
of that, I'm ok. We would be able to push something better into beta3.
So, CC Alvaro.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

20 June 2020, 01:18:55

On 2020-Jun-20, Fujii Masao wrote:

> It's better if we can do that. But I think that we should hear Alvaro's opinion
> about this before rushing to push the patch. Even if we miss beta2 as the result
> of that, I'm ok. We would be able to push something better into beta3.
> So, CC Alvaro.

Uh, I was not aware of this thread.  I'll go over it now and let you
know.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

20 June 2020, 01:42:23

On 2020-Jun-19, Michael Paquier wrote:

> If we want to make this happen, I am afraid that the time is short as
> beta2 is planned for next week.  As the version will be likely tagged
> by Monday US time, it would be good to get this addressed within 48
> hours to give some room to the buildfarm to react.  Attached is a
> straight-forward proposal of patch.  Any thoughts?

I don't disagree with removing the LSN column, but at the same time we
need to provide *some* way for users to monitor this, so let's add a
function to extract the value they need for that.  It seems simple
enough.

I cannot implement it myself now, though.  I've reached the end of my
week and I'm not sure I'll be able to work on it during the weekend.

I agree with Kyotaro's opinion that the pg_walfile_name() function seems
too single-minded ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Amit Kapila

Date:

20 June 2020, 04:15:52

On Sat, Jun 20, 2020 at 7:12 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2020-Jun-19, Michael Paquier wrote:
>
> > If we want to make this happen, I am afraid that the time is short as
> > beta2 is planned for next week.  As the version will be likely tagged
> > by Monday US time, it would be good to get this addressed within 48
> > hours to give some room to the buildfarm to react.  Attached is a
> > straight-forward proposal of patch.  Any thoughts?
>
> I don't disagree with removing the LSN column, but at the same time we
> need to provide *some* way for users to monitor this, so let's add a
> function to extract the value they need for that.  It seems simple
> enough.
>

Isn't this information specific to checkpoints, so maybe better to
display in view pg_stat_bgwriter?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: min_safe_lsn column in pg_replication_slots view

From

Michael Paquier

Date:

20 June 2020, 06:53:54

On Sat, Jun 20, 2020 at 09:45:52AM +0530, Amit Kapila wrote:
> On Sat, Jun 20, 2020 at 7:12 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>> I don't disagree with removing the LSN column, but at the same time we
>> need to provide *some* way for users to monitor this, so let's add a
>> function to extract the value they need for that.  It seems simple
>> enough.
>
> Isn't this information specific to checkpoints, so maybe better to
> display in view pg_stat_bgwriter?

Not sure that's a good match.  If we decide to expose that, a separate
function returning a LSN based on the segment number from
XLogGetLastRemovedSegno() sounds fine to me, like
pg_wal_last_recycled_lsn().  Perhaps somebody has a better name in
mind?
--
Michael

Attachment

signature.asc

Re: min_safe_lsn column in pg_replication_slots view

From

Michael Paquier

Date:

22 June 2020, 05:49:50

On Sat, Jun 20, 2020 at 03:53:54PM +0900, Michael Paquier wrote:
> On Sat, Jun 20, 2020 at 09:45:52AM +0530, Amit Kapila wrote:
>> Isn't this information specific to checkpoints, so maybe better to
>> display in view pg_stat_bgwriter?
>
> Not sure that's a good match.  If we decide to expose that, a separate
> function returning a LSN based on the segment number from
> XLogGetLastRemovedSegno() sounds fine to me, like
> pg_wal_last_recycled_lsn().  Perhaps somebody has a better name in
> mind?

I was thinking on this one for the last couple of days, and came up
with the name pg_wal_oldest_lsn(), as per the attached, traking the
oldest WAL location still available.  That's unfortunately too late
for beta2, but let's continue the discussion.
--
Michael

Attachment

Re: min_safe_lsn column in pg_replication_slots view

From

Amit Kapila

Date:

22 June 2020, 12:01:16

On Mon, Jun 22, 2020 at 11:19 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sat, Jun 20, 2020 at 03:53:54PM +0900, Michael Paquier wrote:
> > On Sat, Jun 20, 2020 at 09:45:52AM +0530, Amit Kapila wrote:
> >> Isn't this information specific to checkpoints, so maybe better to
> >> display in view pg_stat_bgwriter?
> >
> > Not sure that's a good match.  If we decide to expose that, a separate
> > function returning a LSN based on the segment number from
> > XLogGetLastRemovedSegno() sounds fine to me, like
> > pg_wal_last_recycled_lsn().  Perhaps somebody has a better name in
> > mind?
>
> I was thinking on this one for the last couple of days, and came up
> with the name pg_wal_oldest_lsn(), as per the attached, traking the
> oldest WAL location still available.
>

I feel such a function is good to have but I am not sure if there is a
need to tie it with the removal of min_safe_lsn column.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: min_safe_lsn column in pg_replication_slots view

From

Fujii Masao

Date:

22 June 2020, 13:02:51

On 2020/06/22 21:01, Amit Kapila wrote:
> On Mon, Jun 22, 2020 at 11:19 AM Michael Paquier <michael@paquier.xyz> wrote:
>>
>> On Sat, Jun 20, 2020 at 03:53:54PM +0900, Michael Paquier wrote:
>>> On Sat, Jun 20, 2020 at 09:45:52AM +0530, Amit Kapila wrote:
>>>> Isn't this information specific to checkpoints, so maybe better to
>>>> display in view pg_stat_bgwriter?
>>>
>>> Not sure that's a good match.  If we decide to expose that, a separate
>>> function returning a LSN based on the segment number from
>>> XLogGetLastRemovedSegno() sounds fine to me, like
>>> pg_wal_last_recycled_lsn().  Perhaps somebody has a better name in
>>> mind?
>>
>> I was thinking on this one for the last couple of days, and came up
>> with the name pg_wal_oldest_lsn(), as per the attached, traking the
>> oldest WAL location still available.

Thanks for the patch!

+        <literal>NULL</literal> if no WAL segments have been removed since
+        startup.

Isn't this confusing? I think that we should store the last removed
WAL segment to somewhere (e.g., pg_control) and restore it at
the startup, so that we can see the actual value even after the startup.
Or we should scan pg_wal directory and find the "minimal" WAL segment
and return its LSN.

> I feel such a function is good to have but I am not sure if there is a
> need to tie it with the removal of min_safe_lsn column.

We should expose the LSN calculated from
"the current WAL LSN - max(wal_keep_segments * 16MB, max_slot_wal_keep_size)"?
This indicates the minimum LSN of WAL files that are guaraneed to be
currently retained by wal_keep_segments and max_slot_wal_keep_size.
That is, if checkpoint occurs when restart_lsn of replication slot is
smaller than that minimum LSN, some required WAL files may be removed.

So DBAs can periodically monitor and compare restart_lsn and that minimum
LSN. If they see frequently that difference of those LSN is very small,
they can decide to increase wal_keep_segments or max_slot_wal_keep_size,
to prevent required WAL files from being removed. Thought?

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: min_safe_lsn column in pg_replication_slots view

From

Kyotaro Horiguchi

Date:

23 June 2020, 01:10:37

At Mon, 22 Jun 2020 22:02:51 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> 
> 
> On 2020/06/22 21:01, Amit Kapila wrote:
> > On Mon, Jun 22, 2020 at 11:19 AM Michael Paquier <michael@paquier.xyz>
> > wrote:
> >>
> >> On Sat, Jun 20, 2020 at 03:53:54PM +0900, Michael Paquier wrote:
> >>> On Sat, Jun 20, 2020 at 09:45:52AM +0530, Amit Kapila wrote:
> >>>> Isn't this information specific to checkpoints, so maybe better to
> >>>> display in view pg_stat_bgwriter?
> >>>
> >>> Not sure that's a good match.  If we decide to expose that, a separate
> >>> function returning a LSN based on the segment number from
> >>> XLogGetLastRemovedSegno() sounds fine to me, like
> >>> pg_wal_last_recycled_lsn().  Perhaps somebody has a better name in
> >>> mind?
> >>
> >> I was thinking on this one for the last couple of days, and came up
> >> with the name pg_wal_oldest_lsn(), as per the attached, traking the
> >> oldest WAL location still available.
> 
> Thanks for the patch!
> 
> + <literal>NULL</literal> if no WAL segments have been removed since
> +        startup.
> 
> Isn't this confusing? I think that we should store the last removed
> WAL segment to somewhere (e.g., pg_control) and restore it at
> the startup, so that we can see the actual value even after the
> startup.
> Or we should scan pg_wal directory and find the "minimal" WAL segment
> and return its LSN.

Running a separate scan on pg_wal at startup or first time the oldest
WAL segno is referenced is something that was rejected before. But
with the current behavior we don't find the last removed segment until
any slot loses a segment if all WAL files are retained by a slot. FWIW
I recently proposed a patch to find the oldest WAL file while trying
removing old WAL files.

> > I feel such a function is good to have but I am not sure if there is a
> > need to tie it with the removal of min_safe_lsn column.
> 
> We should expose the LSN calculated from
> "the current WAL LSN - max(wal_keep_segments * 16MB,
> max_slot_wal_keep_size)"?
> This indicates the minimum LSN of WAL files that are guaraneed to be
> currently retained by wal_keep_segments and max_slot_wal_keep_size.
> That is, if checkpoint occurs when restart_lsn of replication slot is
> smaller than that minimum LSN, some required WAL files may be removed.
> So DBAs can periodically monitor and compare restart_lsn and that
> minimum
> LSN. If they see frequently that difference of those LSN is very
> small,
> they can decide to increase wal_keep_segments or
> max_slot_wal_keep_size,
> to prevent required WAL files from being removed. Thought?

I'm not sure about the consensus here about showing that number in the
view. It is similar to "remain" in the earlier versions of this patch
but a bit simpler. It would be usable in a similar way. I can live
with either numbers.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: min_safe_lsn column in pg_replication_slots view

From

Michael Paquier

Date:

23 June 2020, 02:16:33

On Tue, Jun 23, 2020 at 10:10:37AM +0900, Kyotaro Horiguchi wrote:
> At Mon, 22 Jun 2020 22:02:51 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>> Isn't this confusing? I think that we should store the last removed
>> WAL segment to somewhere (e.g., pg_control) and restore it at
>> the startup, so that we can see the actual value even after the
>> startup.
>> Or we should scan pg_wal directory and find the "minimal" WAL segment
>> and return its LSN.
>
> Running a separate scan on pg_wal at startup or first time the oldest
> WAL segno is referenced is something that was rejected before. But
> with the current behavior we don't find the last removed segment until
> any slot loses a segment if all WAL files are retained by a slot. FWIW
> I recently proposed a patch to find the oldest WAL file while trying
> removing old WAL files.

Hmm.  I agree that the approach I previously sent may be kind of
confusing without a clear initialization point, which would actually
be (checkPointCopy.redo + checkPointCopy.ThisTimeLineID) from the
control file with an extra computation depending on any replication
slot data present on disk?  So one could do the maths cleanly after
StartupReplicationSlots() is called in the startup process.  My point
is: it does not seem really obvious to me that we need to change the
control file to track that.

> I'm not sure about the consensus here about showing that number in the
> view. It is similar to "remain" in the earlier versions of this patch
> but a bit simpler. It would be usable in a similar way. I can live
> with either numbers.

Anyway, here is my take.  We are discussing a design issue here, we
are moving the discussion into having a different design, and
discussing new designs is never a good sign post-beta (some open items
tend to move towards this direction every year).  So I'd like to think
that the best thing we can do here is just to drop min_safe_lsn from
pg_replication_slots, and just reconsider this part for 14~ with
something we think is better.

By the way, I have added a separate open item for this thread.
--
Michael

Attachment

signature.asc

Re: min_safe_lsn column in pg_replication_slots view

From

Fujii Masao

Date:

23 June 2020, 02:17:15


On 2020/06/23 10:10, Kyotaro Horiguchi wrote:
> At Mon, 22 Jun 2020 22:02:51 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>>
>>
>> On 2020/06/22 21:01, Amit Kapila wrote:
>>> On Mon, Jun 22, 2020 at 11:19 AM Michael Paquier <michael@paquier.xyz>
>>> wrote:
>>>>
>>>> On Sat, Jun 20, 2020 at 03:53:54PM +0900, Michael Paquier wrote:
>>>>> On Sat, Jun 20, 2020 at 09:45:52AM +0530, Amit Kapila wrote:
>>>>>> Isn't this information specific to checkpoints, so maybe better to
>>>>>> display in view pg_stat_bgwriter?
>>>>>
>>>>> Not sure that's a good match.  If we decide to expose that, a separate
>>>>> function returning a LSN based on the segment number from
>>>>> XLogGetLastRemovedSegno() sounds fine to me, like
>>>>> pg_wal_last_recycled_lsn().  Perhaps somebody has a better name in
>>>>> mind?
>>>>
>>>> I was thinking on this one for the last couple of days, and came up
>>>> with the name pg_wal_oldest_lsn(), as per the attached, traking the
>>>> oldest WAL location still available.
>>
>> Thanks for the patch!
>>
>> + <literal>NULL</literal> if no WAL segments have been removed since
>> +        startup.
>>
>> Isn't this confusing? I think that we should store the last removed
>> WAL segment to somewhere (e.g., pg_control) and restore it at
>> the startup, so that we can see the actual value even after the
>> startup.
>> Or we should scan pg_wal directory and find the "minimal" WAL segment
>> and return its LSN.
> 
> Running a separate scan on pg_wal at startup or first time the oldest
> WAL segno is referenced is something that was rejected before. But
> with the current behavior we don't find the last removed segment until
> any slot loses a segment if all WAL files are retained by a slot.

Because scanning pg_wal can be heavy operation especially when
max_wal_size is high and there are lots of WAL files? If so, it might
be better to save the value in pg_control as I told upthread.

However I'm not sure the use case of this function yet...

> FWIW
> I recently proposed a patch to find the oldest WAL file while trying
> removing old WAL files.
> 
>>> I feel such a function is good to have but I am not sure if there is a
>>> need to tie it with the removal of min_safe_lsn column.
>>
>> We should expose the LSN calculated from
>> "the current WAL LSN - max(wal_keep_segments * 16MB,
>> max_slot_wal_keep_size)"?
>> This indicates the minimum LSN of WAL files that are guaraneed to be
>> currently retained by wal_keep_segments and max_slot_wal_keep_size.
>> That is, if checkpoint occurs when restart_lsn of replication slot is
>> smaller than that minimum LSN, some required WAL files may be removed.
>> So DBAs can periodically monitor and compare restart_lsn and that
>> minimum
>> LSN. If they see frequently that difference of those LSN is very
>> small,
>> they can decide to increase wal_keep_segments or
>> max_slot_wal_keep_size,
>> to prevent required WAL files from being removed. Thought?
> 
> I'm not sure about the consensus here about showing that number in the
> view. It is similar to "remain" in the earlier versions of this patch
> but a bit simpler. It would be usable in a similar way. I can live
> with either numbers.

It's useless to display this value in each replication slot in the view.
I'm thinking to expose it as a function.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: min_safe_lsn column in pg_replication_slots view

From

Amit Kapila

Date:

23 June 2020, 06:20:34

On Mon, Jun 22, 2020 at 6:32 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
> On 2020/06/22 21:01, Amit Kapila wrote:
> > On Mon, Jun 22, 2020 at 11:19 AM Michael Paquier <michael@paquier.xyz> wrote:
> >>
> >> On Sat, Jun 20, 2020 at 03:53:54PM +0900, Michael Paquier wrote:
> >>> On Sat, Jun 20, 2020 at 09:45:52AM +0530, Amit Kapila wrote:
> >>>> Isn't this information specific to checkpoints, so maybe better to
> >>>> display in view pg_stat_bgwriter?
> >>>
> >>> Not sure that's a good match.  If we decide to expose that, a separate
> >>> function returning a LSN based on the segment number from
> >>> XLogGetLastRemovedSegno() sounds fine to me, like
> >>> pg_wal_last_recycled_lsn().  Perhaps somebody has a better name in
> >>> mind?
> >>
> >> I was thinking on this one for the last couple of days, and came up
> >> with the name pg_wal_oldest_lsn(), as per the attached, traking the
> >> oldest WAL location still available.
>
> Thanks for the patch!
>
> +        <literal>NULL</literal> if no WAL segments have been removed since
> +        startup.
>
> Isn't this confusing? I think that we should store the last removed
> WAL segment to somewhere (e.g., pg_control) and restore it at
> the startup, so that we can see the actual value even after the startup.
> Or we should scan pg_wal directory and find the "minimal" WAL segment
> and return its LSN.
>
>
> > I feel such a function is good to have but I am not sure if there is a
> > need to tie it with the removal of min_safe_lsn column.
>
> We should expose the LSN calculated from
> "the current WAL LSN - max(wal_keep_segments * 16MB, max_slot_wal_keep_size)"?
> This indicates the minimum LSN of WAL files that are guaraneed to be
> currently retained by wal_keep_segments and max_slot_wal_keep_size.
> That is, if checkpoint occurs when restart_lsn of replication slot is
> smaller than that minimum LSN, some required WAL files may be removed.
>
> So DBAs can periodically monitor and compare restart_lsn and that minimum
> LSN. If they see frequently that difference of those LSN is very small,
> they can decide to increase wal_keep_segments or max_slot_wal_keep_size,
> to prevent required WAL files from being removed. Thought?
>

+1.  This sounds like a good and useful stat for users.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: min_safe_lsn column in pg_replication_slots view

From

Amit Kapila

Date:

23 June 2020, 06:27:30

On Tue, Jun 23, 2020 at 7:47 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
> On 2020/06/23 10:10, Kyotaro Horiguchi wrote:
> > At Mon, 22 Jun 2020 22:02:51 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
> >>
> >>> I feel such a function is good to have but I am not sure if there is a
> >>> need to tie it with the removal of min_safe_lsn column.
> >>
> >> We should expose the LSN calculated from
> >> "the current WAL LSN - max(wal_keep_segments * 16MB,
> >> max_slot_wal_keep_size)"?
> >> This indicates the minimum LSN of WAL files that are guaraneed to be
> >> currently retained by wal_keep_segments and max_slot_wal_keep_size.
> >> That is, if checkpoint occurs when restart_lsn of replication slot is
> >> smaller than that minimum LSN, some required WAL files may be removed.
> >> So DBAs can periodically monitor and compare restart_lsn and that
> >> minimum
> >> LSN. If they see frequently that difference of those LSN is very
> >> small,
> >> they can decide to increase wal_keep_segments or
> >> max_slot_wal_keep_size,
> >> to prevent required WAL files from being removed. Thought?
> >
> > I'm not sure about the consensus here about showing that number in the
> > view. It is similar to "remain" in the earlier versions of this patch
> > but a bit simpler. It would be usable in a similar way. I can live
> > with either numbers.
>
> It's useless to display this value in each replication slot in the view.
> I'm thinking to expose it as a function.
>

Having a separate function for this seems like a good idea but can we
consider displaying it in a view like pg_stat_replication_slots as we
are discussing a nearby thread to have such a view for other things.
I think ultimately this information is required to check whether some
slot can be invalidated or not, so having it displayed along with
other slot information might not be a bad idea.

[1] - https://www.postgresql.org/message-id/CAA4eK1Jyh4qgdnxzV4fYuk9GiXLb%3DUz-6o19E2RfiN8MPmUu3A%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: min_safe_lsn column in pg_replication_slots view

From

Kyotaro Horiguchi

Date:

23 June 2020, 08:08:14

At Tue, 23 Jun 2020 11:50:34 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in 
> On Mon, Jun 22, 2020 at 6:32 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> >
> > On 2020/06/22 21:01, Amit Kapila wrote:
> > > On Mon, Jun 22, 2020 at 11:19 AM Michael Paquier <michael@paquier.xyz> wrote:
> > >>
> > >> On Sat, Jun 20, 2020 at 03:53:54PM +0900, Michael Paquier wrote:
> > >>> On Sat, Jun 20, 2020 at 09:45:52AM +0530, Amit Kapila wrote:
> > >>>> Isn't this information specific to checkpoints, so maybe better to
> > >>>> display in view pg_stat_bgwriter?
> > >>>
> > >>> Not sure that's a good match.  If we decide to expose that, a separate
> > >>> function returning a LSN based on the segment number from
> > >>> XLogGetLastRemovedSegno() sounds fine to me, like
> > >>> pg_wal_last_recycled_lsn().  Perhaps somebody has a better name in
> > >>> mind?
> > >>
> > >> I was thinking on this one for the last couple of days, and came up
> > >> with the name pg_wal_oldest_lsn(), as per the attached, traking the
> > >> oldest WAL location still available.
> >
> > Thanks for the patch!
> >
> > +        <literal>NULL</literal> if no WAL segments have been removed since
> > +        startup.
> >
> > Isn't this confusing? I think that we should store the last removed
> > WAL segment to somewhere (e.g., pg_control) and restore it at
> > the startup, so that we can see the actual value even after the startup.
> > Or we should scan pg_wal directory and find the "minimal" WAL segment
> > and return its LSN.
> >
> >
> > > I feel such a function is good to have but I am not sure if there is a
> > > need to tie it with the removal of min_safe_lsn column.
> >
> > We should expose the LSN calculated from
> > "the current WAL LSN - max(wal_keep_segments * 16MB, max_slot_wal_keep_size)"?
> > This indicates the minimum LSN of WAL files that are guaraneed to be
> > currently retained by wal_keep_segments and max_slot_wal_keep_size.
> > That is, if checkpoint occurs when restart_lsn of replication slot is
> > smaller than that minimum LSN, some required WAL files may be removed.
> >
> > So DBAs can periodically monitor and compare restart_lsn and that minimum
> > LSN. If they see frequently that difference of those LSN is very small,
> > they can decide to increase wal_keep_segments or max_slot_wal_keep_size,
> > to prevent required WAL files from being removed. Thought?
> >
> 
> +1.  This sounds like a good and useful stat for users.

+1 for showing a number that is not involving lastRemovedSegNo. It is
like returning to the initial version of this patch. It showed a
number like ((the suggested above) minus restart_lsn). The number is
different for each slot so they fit in the view.

The number is usable for the same purpose so I'm ok with it.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

23 June 2020, 23:39:38

On 2020-Jun-23, Kyotaro Horiguchi wrote:

> At Tue, 23 Jun 2020 11:50:34 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in 
> > On Mon, Jun 22, 2020 at 6:32 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

> > > We should expose the LSN calculated from
> > > "the current WAL LSN - max(wal_keep_segments * 16MB, max_slot_wal_keep_size)"?
> > > This indicates the minimum LSN of WAL files that are guaraneed to be
> > > currently retained by wal_keep_segments and max_slot_wal_keep_size.
> > > That is, if checkpoint occurs when restart_lsn of replication slot is
> > > smaller than that minimum LSN, some required WAL files may be removed.
> > >
> > > So DBAs can periodically monitor and compare restart_lsn and that minimum
> > > LSN. If they see frequently that difference of those LSN is very small,
> > > they can decide to increase wal_keep_segments or max_slot_wal_keep_size,
> > > to prevent required WAL files from being removed. Thought?
> > 
> > +1.  This sounds like a good and useful stat for users.
> 
> +1 for showing a number that is not involving lastRemovedSegNo. It is
> like returning to the initial version of this patch. It showed a
> number like ((the suggested above) minus restart_lsn). The number is
> different for each slot so they fit in the view.
> 
> The number is usable for the same purpose so I'm ok with it.

I think we should publish the value from wal_keep_segments separately
from max_slot_wal_keep_size.  ISTM that the user might decide to change
or remove wal_keep_segments and be suddenly at risk of losing slots
because of overlooking that it was wal_keep_segments, not
max_slot_wal_keep_size, that was protecting them.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Fujii Masao

Date:

24 June 2020, 09:07:45


On 2020/06/23 15:27, Amit Kapila wrote:
> On Tue, Jun 23, 2020 at 7:47 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>
>> On 2020/06/23 10:10, Kyotaro Horiguchi wrote:
>>> At Mon, 22 Jun 2020 22:02:51 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>>>>
>>>>> I feel such a function is good to have but I am not sure if there is a
>>>>> need to tie it with the removal of min_safe_lsn column.
>>>>
>>>> We should expose the LSN calculated from
>>>> "the current WAL LSN - max(wal_keep_segments * 16MB,
>>>> max_slot_wal_keep_size)"?
>>>> This indicates the minimum LSN of WAL files that are guaraneed to be
>>>> currently retained by wal_keep_segments and max_slot_wal_keep_size.
>>>> That is, if checkpoint occurs when restart_lsn of replication slot is
>>>> smaller than that minimum LSN, some required WAL files may be removed.
>>>> So DBAs can periodically monitor and compare restart_lsn and that
>>>> minimum
>>>> LSN. If they see frequently that difference of those LSN is very
>>>> small,
>>>> they can decide to increase wal_keep_segments or
>>>> max_slot_wal_keep_size,
>>>> to prevent required WAL files from being removed. Thought?
>>>
>>> I'm not sure about the consensus here about showing that number in the
>>> view. It is similar to "remain" in the earlier versions of this patch
>>> but a bit simpler. It would be usable in a similar way. I can live
>>> with either numbers.
>>
>> It's useless to display this value in each replication slot in the view.
>> I'm thinking to expose it as a function.
>>
> 
> Having a separate function for this seems like a good idea but can we
> consider displaying it in a view like pg_stat_replication_slots as we
> are discussing a nearby thread to have such a view for other things.
> I think ultimately this information is required to check whether some
> slot can be invalidated or not, so having it displayed along with
> other slot information might not be a bad idea.

"the current WAL LSN - max(wal_keep_segments * 16MB, max_slot_wal_keep_size)"
is the same value between all the replication slots. But you think it's better
to display that same value for every slots in the view?

Or you're thinking to display the difference of that LSN value and
restart_lsn as Horiguchi-san suggested? That diff varies each replication slot,
so it seems ok to display it for every rows.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: min_safe_lsn column in pg_replication_slots view

From

Fujii Masao

Date:

24 June 2020, 09:09:06


On 2020/06/24 8:39, Alvaro Herrera wrote:
> On 2020-Jun-23, Kyotaro Horiguchi wrote:
> 
>> At Tue, 23 Jun 2020 11:50:34 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
>>> On Mon, Jun 22, 2020 at 6:32 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> 
>>>> We should expose the LSN calculated from
>>>> "the current WAL LSN - max(wal_keep_segments * 16MB, max_slot_wal_keep_size)"?
>>>> This indicates the minimum LSN of WAL files that are guaraneed to be
>>>> currently retained by wal_keep_segments and max_slot_wal_keep_size.
>>>> That is, if checkpoint occurs when restart_lsn of replication slot is
>>>> smaller than that minimum LSN, some required WAL files may be removed.
>>>>
>>>> So DBAs can periodically monitor and compare restart_lsn and that minimum
>>>> LSN. If they see frequently that difference of those LSN is very small,
>>>> they can decide to increase wal_keep_segments or max_slot_wal_keep_size,
>>>> to prevent required WAL files from being removed. Thought?
>>>
>>> +1.  This sounds like a good and useful stat for users.
>>
>> +1 for showing a number that is not involving lastRemovedSegNo. It is
>> like returning to the initial version of this patch. It showed a
>> number like ((the suggested above) minus restart_lsn). The number is
>> different for each slot so they fit in the view.
>>
>> The number is usable for the same purpose so I'm ok with it.
> 
> I think we should publish the value from wal_keep_segments separately
> from max_slot_wal_keep_size.  ISTM that the user might decide to change
> or remove wal_keep_segments and be suddenly at risk of losing slots
> because of overlooking that it was wal_keep_segments, not
> max_slot_wal_keep_size, that was protecting them.

You mean to have two functions that returns

1. "current WAL LSN - wal_keep_segments * 16MB"
2. "current WAL LSN - max_slot_wal_keep_size"

Right?

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: min_safe_lsn column in pg_replication_slots view

From

Amit Kapila

Date:

24 June 2020, 12:49:46

On Wed, Jun 24, 2020 at 2:37 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
> On 2020/06/23 15:27, Amit Kapila wrote:
> >
> > Having a separate function for this seems like a good idea but can we
> > consider displaying it in a view like pg_stat_replication_slots as we
> > are discussing a nearby thread to have such a view for other things.
> > I think ultimately this information is required to check whether some
> > slot can be invalidated or not, so having it displayed along with
> > other slot information might not be a bad idea.
>
> "the current WAL LSN - max(wal_keep_segments * 16MB, max_slot_wal_keep_size)"
> is the same value between all the replication slots. But you think it's better
> to display that same value for every slots in the view?
>
> Or you're thinking to display the difference of that LSN value and
> restart_lsn as Horiguchi-san suggested?
>

I see value in Horiguchi-San's proposal.  IIUC, it will tell help
DBAs/Users to know if any particular slot will get invalidated soon.

> That diff varies each replication slot,
> so it seems ok to display it for every rows.
>

Yes.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

24 June 2020, 15:15:14

On 2020-Jun-24, Fujii Masao wrote:

> On 2020/06/24 8:39, Alvaro Herrera wrote:

> > I think we should publish the value from wal_keep_segments separately
> > from max_slot_wal_keep_size.  ISTM that the user might decide to change
> > or remove wal_keep_segments and be suddenly at risk of losing slots
> > because of overlooking that it was wal_keep_segments, not
> > max_slot_wal_keep_size, that was protecting them.
> 
> You mean to have two functions that returns
> 
> 1. "current WAL LSN - wal_keep_segments * 16MB"
> 2. "current WAL LSN - max_slot_wal_keep_size"

Hmm, but all the values there are easily findable.  What would be the
point in repeating it?

Maybe we should disregard this line of thinking and go back to
Horiguchi-san's original proposal, to wit use the "distance to
breakage", as also supported now by Amit Kapila[1] (unless I
misunderstand him).

[1] https://postgr.es/m/CAA4eK1L2oJ7T1cESdc5w4J9L3Q_hhvWqTigdAXKfnsJy4=v13w@mail.gmail.com

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Amit Kapila

Date:

25 June 2020, 10:20:27

On Wed, Jun 24, 2020 at 8:45 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2020-Jun-24, Fujii Masao wrote:
>
> > On 2020/06/24 8:39, Alvaro Herrera wrote:
>
> > > I think we should publish the value from wal_keep_segments separately
> > > from max_slot_wal_keep_size.  ISTM that the user might decide to change
> > > or remove wal_keep_segments and be suddenly at risk of losing slots
> > > because of overlooking that it was wal_keep_segments, not
> > > max_slot_wal_keep_size, that was protecting them.
> >
> > You mean to have two functions that returns
> >
> > 1. "current WAL LSN - wal_keep_segments * 16MB"
> > 2. "current WAL LSN - max_slot_wal_keep_size"
>
> Hmm, but all the values there are easily findable.  What would be the
> point in repeating it?
>
> Maybe we should disregard this line of thinking and go back to
> Horiguchi-san's original proposal, to wit use the "distance to
> breakage", as also supported now by Amit Kapila[1] (unless I
> misunderstand him).
>

+1.  I also think let's drop the idea of exposing a function for this
value and revert the min_safe_lsn part of the work as proposed by
Michael above [1] excluding the function pg_wal_oldest_lsn() in that
patch.  Then, we can expose this as a new stat for PG14.  I feel it
would be better to display this stat in a new view (something like
pg_stat_replication_slots) as discussed in another thread [2].  Does
that make sense?

[1] - https://www.postgresql.org/message-id/20200622054950.GC50978%40paquier.xyz
[2] - https://www.postgresql.org/message-id/CA%2Bfd4k5_pPAYRTDrO2PbtTOe0eHQpBvuqmCr8ic39uTNmR49Eg%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

25 June 2020, 15:24:27

On 2020-Jun-25, Amit Kapila wrote:

> +1.  I also think let's drop the idea of exposing a function for this
> value and revert the min_safe_lsn part of the work as proposed by
> Michael above [1] excluding the function pg_wal_oldest_lsn() in that
> patch.  Then, we can expose this as a new stat for PG14.  I feel it
> would be better to display this stat in a new view (something like
> pg_stat_replication_slots) as discussed in another thread [2].  Does
> that make sense?

I don't understand the proposal.  Michael posted a patch that adds
pg_wal_oldest_lsn(), and you say we should apply the patch except the
part that adds that function -- so what part would be applying?

If the proposal is to apply just the hunk in pg_get_replication_slots
that removes min_safe_lsn, and do nothing else in pg13, then I don't like
it.  The feature exposes a way to monitor slots w.r.t. the maximum slot
size; I'm okay if you prefer to express that in a different way, but I
don't like the idea of shipping pg13 without any way to monitor it.

As reported by Masao-san, the current min_safe_lsn has a definitional
problem when used with pg_walfile_name(), but we've established that
that's because pg_walfile_name() has a special-case definition, not
because min_safe_lsn itself is bogus.  If we're looking for a minimal
change that can fix this problem, let's increment one byte, which should
fix that issue, no?

I also see that some people complain that all slots return the same
value and therefore this column is redundant.  To that argument I say
that it's not unreasonable that we'll add a slot-specific size limit;
and if we do, we'll be happy we had slot-specific min safe LSN; see e.g.
https://postgr.es/m/20170301160610.wc7ez3vihmialntd@alap3.anarazel.de

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Michael Paquier

Date:

25 June 2020, 22:53:50

On Thu, Jun 25, 2020 at 11:24:27AM -0400, Alvaro Herrera wrote:
> I don't understand the proposal.  Michael posted a patch that adds
> pg_wal_oldest_lsn(), and you say we should apply the patch except the
> part that adds that function -- so what part would be applying?

I have sent last week a patch about only the removal of min_safe_lsn:
https://www.postgresql.org/message-id/20200619121552.GH453547@paquier.xyz
So this applies to this part.

> If the proposal is to apply just the hunk in pg_get_replication_slots
> that removes min_safe_lsn, and do nothing else in pg13, then I don't like
> it.  The feature exposes a way to monitor slots w.r.t. the maximum slot
> size; I'm okay if you prefer to express that in a different way, but I
> don't like the idea of shipping pg13 without any way to monitor it.

From what I can see, it seems to me that we have a lot of views of how
to tackle the matter.  That gives an idea that we are not really happy
with the current state of things, and usually a sign that we may want
to redesign it, going back to this issue for v14.

My 2c.
--
Michael

Attachment

signature.asc

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

25 June 2020, 23:24:17

On 2020-Jun-26, Michael Paquier wrote:

> On Thu, Jun 25, 2020 at 11:24:27AM -0400, Alvaro Herrera wrote:
> > I don't understand the proposal.  Michael posted a patch that adds
> > pg_wal_oldest_lsn(), and you say we should apply the patch except the
> > part that adds that function -- so what part would be applying?
> 
> I have sent last week a patch about only the removal of min_safe_lsn:
> https://www.postgresql.org/message-id/20200619121552.GH453547@paquier.xyz
> So this applies to this part.

Well, I oppose that because it leaves us with no way to monitor slot
limits.  In his opening email, Masao-san proposed to simply change the
value by adding 1.  How you go from adding 1 to a column to removing
the column completely with no recourse, is beyond me.

Let me summarize the situation and possible ways forward as I see them.
If I'm mistaken, please correct me.

Problems:
i)  pg_replication_slot.min_safe_lsn has a weird definition in that all
    replication slots show the same value
ii) min_safe_lsn cannot be used with pg_walfile_name, because it returns
    the name of the previous segment.

Proposed solutions:

a) Do nothing -- keep the min_safe_lsn column as is.  Warn users that
   pg_walfile_name should not be used with this column.
b) Redefine min_safe_lsn to be lsn+1, so that pg_walfile_name can be used
   and return a useful value.
c) Remove min_safe_lsn; add functions that expose the same value
d) Remove min_safe_lsn; add a new view that exposes the same value and
   possibly others
e) Replace min_safe_lsn with a "distance" column, which reports
   restart_lsn - oldest valid LSN
   (Note that you no longer have an LSN in this scenario, so you can't
   call pg_walfile_name.)

The original patch implemented (e); it was changed to its current
definition because of this[1] comment.  My proposal is to put it back.

[1] https://postgr.es/m/20171106132050.6apzynxrqrzghb4r@alap3.anarazel.de

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Amit Kapila

Date:

26 June 2020, 04:45:08

On Fri, Jun 26, 2020 at 4:54 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2020-Jun-26, Michael Paquier wrote:
>
> > On Thu, Jun 25, 2020 at 11:24:27AM -0400, Alvaro Herrera wrote:
> > > I don't understand the proposal.  Michael posted a patch that adds
> > > pg_wal_oldest_lsn(), and you say we should apply the patch except the
> > > part that adds that function -- so what part would be applying?
> >
> > I have sent last week a patch about only the removal of min_safe_lsn:
> > https://www.postgresql.org/message-id/20200619121552.GH453547@paquier.xyz
> > So this applies to this part.
>
> Well, I oppose that because it leaves us with no way to monitor slot
> limits.  In his opening email, Masao-san proposed to simply change the
> value by adding 1.  How you go from adding 1 to a column to removing
> the column completely with no recourse, is beyond me.
>
> Let me summarize the situation and possible ways forward as I see them.
> If I'm mistaken, please correct me.
>
> Problems:
> i)  pg_replication_slot.min_safe_lsn has a weird definition in that all
>     replication slots show the same value
>

It is also not clear how the user can make use of that value?

> ii) min_safe_lsn cannot be used with pg_walfile_name, because it returns
>     the name of the previous segment.
>
> Proposed solutions:
>
> a) Do nothing -- keep the min_safe_lsn column as is.  Warn users that
>    pg_walfile_name should not be used with this column.
> b) Redefine min_safe_lsn to be lsn+1, so that pg_walfile_name can be used
>    and return a useful value.
> c) Remove min_safe_lsn; add functions that expose the same value
> d) Remove min_safe_lsn; add a new view that exposes the same value and
>    possibly others
>
> e) Replace min_safe_lsn with a "distance" column, which reports
>    restart_lsn - oldest valid LSN
>    (Note that you no longer have an LSN in this scenario, so you can't
>    call pg_walfile_name.)
>

Can we consider an option to "Remove min_safe_lsn; document how a user
can monitor the distance"?  We have a function to get current WAL
insert location and other things required are available either via
view or as guc variable values.  The reason I am thinking of this
option is that it might be better to get some more feedback on what is
the most appropriate value to display.  However, I am okay if we can
reach a consensus on one of the above options.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: min_safe_lsn column in pg_replication_slots view

From

Fujii Masao

Date:

30 June 2020, 08:07:03


On 2020/06/26 13:45, Amit Kapila wrote:
> On Fri, Jun 26, 2020 at 4:54 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>>
>> On 2020-Jun-26, Michael Paquier wrote:
>>
>>> On Thu, Jun 25, 2020 at 11:24:27AM -0400, Alvaro Herrera wrote:
>>>> I don't understand the proposal.  Michael posted a patch that adds
>>>> pg_wal_oldest_lsn(), and you say we should apply the patch except the
>>>> part that adds that function -- so what part would be applying?
>>>
>>> I have sent last week a patch about only the removal of min_safe_lsn:
>>> https://www.postgresql.org/message-id/20200619121552.GH453547@paquier.xyz
>>> So this applies to this part.
>>
>> Well, I oppose that because it leaves us with no way to monitor slot
>> limits.  In his opening email, Masao-san proposed to simply change the
>> value by adding 1.  How you go from adding 1 to a column to removing
>> the column completely with no recourse, is beyond me.
>>
>> Let me summarize the situation and possible ways forward as I see them.
>> If I'm mistaken, please correct me.
>>
>> Problems:
>> i)  pg_replication_slot.min_safe_lsn has a weird definition in that all
>>      replication slots show the same value
>>
> 
> It is also not clear how the user can make use of that value?
> 
>> ii) min_safe_lsn cannot be used with pg_walfile_name, because it returns
>>      the name of the previous segment.
>>
>> Proposed solutions:
>>
>> a) Do nothing -- keep the min_safe_lsn column as is.  Warn users that
>>     pg_walfile_name should not be used with this column.
>> b) Redefine min_safe_lsn to be lsn+1, so that pg_walfile_name can be used
>>     and return a useful value.
>> c) Remove min_safe_lsn; add functions that expose the same value
>> d) Remove min_safe_lsn; add a new view that exposes the same value and
>>     possibly others
>>
>> e) Replace min_safe_lsn with a "distance" column, which reports
>>     restart_lsn - oldest valid LSN
>>     (Note that you no longer have an LSN in this scenario, so you can't
>>     call pg_walfile_name.)

I like (e).

> 
> Can we consider an option to "Remove min_safe_lsn; document how a user
> can monitor the distance"?  We have a function to get current WAL
> insert location and other things required are available either via
> view or as guc variable values.  The reason I am thinking of this
> option is that it might be better to get some more feedback on what is
> the most appropriate value to display.  However, I am okay if we can
> reach a consensus on one of the above options.

Yes, that's an idea. But it might not be easy to calculate that distance
manually by subtracting max_slot_wal_keep_size from the current LSN.
Because we've not supported -(pg_lsn, numeric) operator yet. I'm
proposing that operator, but it's for v14.

Regards,


-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: min_safe_lsn column in pg_replication_slots view

From

Fujii Masao

Date:

30 June 2020, 14:23:30


On 2020/06/30 17:07, Fujii Masao wrote:
> 
> 
> On 2020/06/26 13:45, Amit Kapila wrote:
>> On Fri, Jun 26, 2020 at 4:54 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>>>
>>> On 2020-Jun-26, Michael Paquier wrote:
>>>
>>>> On Thu, Jun 25, 2020 at 11:24:27AM -0400, Alvaro Herrera wrote:
>>>>> I don't understand the proposal.  Michael posted a patch that adds
>>>>> pg_wal_oldest_lsn(), and you say we should apply the patch except the
>>>>> part that adds that function -- so what part would be applying?
>>>>
>>>> I have sent last week a patch about only the removal of min_safe_lsn:
>>>> https://www.postgresql.org/message-id/20200619121552.GH453547@paquier.xyz
>>>> So this applies to this part.
>>>
>>> Well, I oppose that because it leaves us with no way to monitor slot
>>> limits.  In his opening email, Masao-san proposed to simply change the
>>> value by adding 1.  How you go from adding 1 to a column to removing
>>> the column completely with no recourse, is beyond me.
>>>
>>> Let me summarize the situation and possible ways forward as I see them.
>>> If I'm mistaken, please correct me.
>>>
>>> Problems:
>>> i)  pg_replication_slot.min_safe_lsn has a weird definition in that all
>>>      replication slots show the same value
>>>
>>
>> It is also not clear how the user can make use of that value?
>>
>>> ii) min_safe_lsn cannot be used with pg_walfile_name, because it returns
>>>      the name of the previous segment.
>>>
>>> Proposed solutions:
>>>
>>> a) Do nothing -- keep the min_safe_lsn column as is.  Warn users that
>>>     pg_walfile_name should not be used with this column.
>>> b) Redefine min_safe_lsn to be lsn+1, so that pg_walfile_name can be used
>>>     and return a useful value.
>>> c) Remove min_safe_lsn; add functions that expose the same value
>>> d) Remove min_safe_lsn; add a new view that exposes the same value and
>>>     possibly others
>>>
>>> e) Replace min_safe_lsn with a "distance" column, which reports
>>>     restart_lsn - oldest valid LSN
>>>     (Note that you no longer have an LSN in this scenario, so you can't
>>>     call pg_walfile_name.)
> 
> I like (e).
> 
>>
>> Can we consider an option to "Remove min_safe_lsn; document how a user
>> can monitor the distance"?  We have a function to get current WAL
>> insert location and other things required are available either via
>> view or as guc variable values.  The reason I am thinking of this
>> option is that it might be better to get some more feedback on what is
>> the most appropriate value to display.  However, I am okay if we can
>> reach a consensus on one of the above options.
> 
> Yes, that's an idea. But it might not be easy to calculate that distance
> manually by subtracting max_slot_wal_keep_size from the current LSN.
> Because we've not supported -(pg_lsn, numeric) operator yet. I'm
> proposing that operator, but it's for v14.

Sorry this is not true. That distance can be calculated without those operators.
For example,

SELECT restart_lsn - pg_current_wal_lsn() + (SELECT setting::numeric * 1024 * 1024 FROM pg_settings WHERE name =
'max_slot_wal_keep_size')distance FROM pg_replication_slots;
 

If the calculated distance is small or negative value, which means that
we may lose some required WAL files. So in this case it's worth considering
to increase max_slot_wal_keep_size.

I still think it's better and more helpful to display something like
that distance in pg_replication_slots rather than making each user
calculate it...

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

30 June 2020, 18:09:22

On 2020-Jun-30, Fujii Masao wrote:

> Sorry this is not true. That distance can be calculated without those operators.
> For example,
> 
> SELECT restart_lsn - pg_current_wal_lsn() + (SELECT setting::numeric * 1024 * 1024 FROM pg_settings WHERE name =
'max_slot_wal_keep_size')distance FROM pg_replication_slots;
 
> 
> If the calculated distance is small or negative value, which means that
> we may lose some required WAL files. So in this case it's worth considering
> to increase max_slot_wal_keep_size.

... OK, but you're forgetting wal_keep_segments.

> I still think it's better and more helpful to display something like
> that distance in pg_replication_slots rather than making each user
> calculate it...

Agreed.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Kyotaro Horiguchi

Date:

01 July 2020, 01:32:59

At Tue, 30 Jun 2020 23:23:30 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
> >> Can we consider an option to "Remove min_safe_lsn; document how a user
> >> can monitor the distance"?  We have a function to get current WAL
> >> insert location and other things required are available either via
> >> view or as guc variable values.  The reason I am thinking of this
> >> option is that it might be better to get some more feedback on what is
> >> the most appropriate value to display.  However, I am okay if we can
> >> reach a consensus on one of the above options.
> > Yes, that's an idea. But it might not be easy to calculate that
> > distance
> > manually by subtracting max_slot_wal_keep_size from the current LSN.
> > Because we've not supported -(pg_lsn, numeric) operator yet. I'm
> > proposing that operator, but it's for v14.
>
> Sorry this is not true. That distance can be calculated without those
> operators.
> For example,
>
> SELECT restart_lsn - pg_current_wal_lsn() + (SELECT setting::numeric *
> 1024 * 1024 FROM pg_settings WHERE name = 'max_slot_wal_keep_size')
> distance FROM pg_replication_slots;

It's an approximation with accuracy of segment size. The calculation
would be not that simple because of the unit of the calculation.  The
formula for the exact calculateion (ignoring wal_keep_segments) is:

distance = (seg_floor(restart_lsn) +
             seg_floor(max_slot_wal_keep_size) + 1) * wal_segment_size -
            current_lsn

where seg_floor is floor() by wal_segment_size.

regards.

> If the calculated distance is small or negative value, which means
> that
> we may lose some required WAL files. So in this case it's worth
> considering
> to increase max_slot_wal_keep_size.
>
> I still think it's better and more helpful to display something like
> that distance in pg_replication_slots rather than making each user
> calculate it...

Agreed.  The attached replaces min_safe_lsn with "distance".

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
From 7e078c9ff8e0594ce8f4e95c7db84ea0a31e9775 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Tue, 30 Jun 2020 21:09:19 +0900
Subject: [PATCH v1] Replace min_safe_lsn with "distance" in
 pg_replication_slots

pg_replication_slot.min_safe_lsn, which shows the oldest LSN kept in
pg_wal, is doubtful in usability for monitoring. Change it to
distance, which shows how many bytes the server can advance before the
slot loses required segments.
---
 src/backend/access/transam/xlog.c         | 39 +++++++++++++++++++++++
 src/backend/catalog/system_views.sql      |  2 +-
 src/backend/replication/slotfuncs.c       | 19 +++++------
 src/include/access/xlog.h                 |  1 +
 src/include/catalog/pg_proc.dat           |  4 +--
 src/test/recovery/t/019_replslot_limit.pl | 20 ++++++------
 src/test/regress/expected/rules.out       |  4 +--
 7 files changed, 62 insertions(+), 27 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index fd93bcfaeb..1f27639912 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9570,6 +9570,45 @@ GetWALAvailability(XLogRecPtr targetLSN)
     return WALAVAIL_REMOVED;
 }
 
+/*
+ * Calculate how many bytes we can advance from currptr until the targetLSN is
+ * removed.
+ *
+ * Returns 0 if the distance is invalid.
+ */
+uint64
+DistanceToWALHorizon(XLogRecPtr targetLSN, XLogRecPtr currptr)
+{
+    XLogSegNo    targetSeg;
+    XLogSegNo    keepSegs;
+    XLogSegNo    failSeg;
+    XLogRecPtr    horizon;
+
+    XLByteToSeg(targetLSN, targetSeg, wal_segment_size);
+    keepSegs = 0;
+
+    /* no limit if max_slot_wal_keep_size is invalid */
+    if (max_slot_wal_keep_size_mb < 0)
+        return 0;
+
+    /* How many segments slots can keep? */
+    keepSegs = ConvertToXSegs(max_slot_wal_keep_size_mb, wal_segment_size);
+
+    /* override by wal_keep_segments if needed */
+    if (wal_keep_segments > keepSegs)
+        keepSegs = wal_keep_segments;
+
+    /* calculate the LSN where targetLSN is lost when currpos reaches */
+    failSeg = targetSeg + keepSegs + 1;
+    XLogSegNoOffsetToRecPtr(failSeg, 0, wal_segment_size, horizon);
+
+    /* If currptr already beyond the horizon, return zero. */
+    if (currptr > horizon)
+        return 0;
+
+    /* return the distance from currptr to the horizon */
+    return horizon - currptr;
+}
 
 /*
  * Retreat *logSegNo to the last segment that we need to retain because of
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5314e9348f..b9847a9f92 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -879,7 +879,7 @@ CREATE VIEW pg_replication_slots AS
             L.restart_lsn,
             L.confirmed_flush_lsn,
             L.wal_status,
-            L.min_safe_lsn
+            L.distance
     FROM pg_get_replication_slots() AS L
             LEFT JOIN pg_database D ON (L.datoid = D.oid);
 
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 88033a79b2..532b3c5826 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -242,6 +242,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
     Tuplestorestate *tupstore;
     MemoryContext per_query_ctx;
     MemoryContext oldcontext;
+    XLogRecPtr    currlsn;
     int            slotno;
 
     /* check to see if caller supports us returning a tuplestore */
@@ -274,6 +275,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
 
     MemoryContextSwitchTo(oldcontext);
 
+    currlsn = GetXLogWriteRecPtr();
+
     LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
     for (slotno = 0; slotno < max_replication_slots; slotno++)
     {
@@ -282,7 +285,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
         Datum        values[PG_GET_REPLICATION_SLOTS_COLS];
         bool        nulls[PG_GET_REPLICATION_SLOTS_COLS];
         WALAvailability walstate;
-        XLogSegNo    last_removed_seg;
+        uint32        distance;
         int            i;
 
         if (!slot->in_use)
@@ -398,16 +401,10 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
                 break;
         }
 
-        if (max_slot_wal_keep_size_mb >= 0 &&
-            (walstate == WALAVAIL_RESERVED || walstate == WALAVAIL_EXTENDED) &&
-            ((last_removed_seg = XLogGetLastRemovedSegno()) != 0))
-        {
-            XLogRecPtr    min_safe_lsn;
-
-            XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 0,
-                                    wal_segment_size, min_safe_lsn);
-            values[i++] = Int64GetDatum(min_safe_lsn);
-        }
+        distance =
+            DistanceToWALHorizon(slot_contents.data.restart_lsn, currlsn);
+        if (distance > 0)
+            values[i++] = Int64GetDatum(distance);
         else
             nulls[i++] = true;
 
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 77ac4e785f..1ec448c5d5 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -329,6 +329,7 @@ extern void InitXLOGAccess(void);
 extern void CreateCheckPoint(int flags);
 extern bool CreateRestartPoint(int flags);
 extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
+extern uint64 DistanceToWALHorizon(XLogRecPtr targetLSN, XLogRecPtr currptr);
 extern XLogRecPtr CalculateMaxmumSafeLSN(void);
 extern void XLogPutNextOid(Oid nextOid);
 extern XLogRecPtr XLogRestorePoint(const char *rpName);
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 61f2c2f5b4..199fd994bd 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -10063,9 +10063,9 @@
   proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
   proretset => 't', provolatile => 's', prorettype => 'record',
   proargtypes => '',
-  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,pg_lsn}',
+  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8}',
   proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o}',
-  proargnames =>
'{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,min_safe_lsn}',
+  proargnames =>
'{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,distance}',
   prosrc => 'pg_get_replication_slots' },
 { oid => '3786', descr => 'set up a logical replication slot',
   proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 7d22ae5720..1c76d2d9e9 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -28,7 +28,7 @@ $node_master->safe_psql('postgres',
 
 # The slot state and remain should be null before the first connection
 my $result = $node_master->safe_psql('postgres',
-    "SELECT restart_lsn IS NULL, wal_status is NULL, min_safe_lsn is NULL FROM pg_replication_slots WHERE slot_name =
'rep1'"
+    "SELECT restart_lsn IS NULL, wal_status is NULL, distance is NULL FROM pg_replication_slots WHERE slot_name =
'rep1'"
 );
 is($result, "t|t|t", 'check the state of non-reserved slot is "unknown"');
 
@@ -52,9 +52,9 @@ $node_master->wait_for_catchup($node_standby, 'replay', $start_lsn);
 # Stop standby
 $node_standby->stop;
 
-# Preparation done, the slot is the state "normal" now
+# Preparation done, the slot is the state "reserved" now
 $result = $node_master->safe_psql('postgres',
-    "SELECT wal_status, min_safe_lsn is NULL FROM pg_replication_slots WHERE slot_name = 'rep1'"
+    "SELECT wal_status, distance is NULL FROM pg_replication_slots WHERE slot_name = 'rep1'"
 );
 is($result, "reserved|t", 'check the catching-up state');
 
@@ -64,7 +64,7 @@ $node_master->safe_psql('postgres', "CHECKPOINT;");
 
 # The slot is always "safe" when fitting max_wal_size
 $result = $node_master->safe_psql('postgres',
-    "SELECT wal_status, min_safe_lsn is NULL FROM pg_replication_slots WHERE slot_name = 'rep1'"
+    "SELECT wal_status, distance is NULL FROM pg_replication_slots WHERE slot_name = 'rep1'"
 );
 is($result, "reserved|t",
     'check that it is safe if WAL fits in max_wal_size');
@@ -74,7 +74,7 @@ $node_master->safe_psql('postgres', "CHECKPOINT;");
 
 # The slot is always "safe" when max_slot_wal_keep_size is not set
 $result = $node_master->safe_psql('postgres',
-    "SELECT wal_status, min_safe_lsn is NULL FROM pg_replication_slots WHERE slot_name = 'rep1'"
+    "SELECT wal_status, distance is NULL FROM pg_replication_slots WHERE slot_name = 'rep1'"
 );
 is($result, "reserved|t", 'check that slot is working');
 
@@ -94,9 +94,7 @@ max_slot_wal_keep_size = ${max_slot_wal_keep_size_mb}MB
 ));
 $node_master->reload;
 
-# The slot is in safe state. The distance from the min_safe_lsn should
-# be as almost (max_slot_wal_keep_size - 1) times large as the segment
-# size
+# The slot is in safe state.
 
 $result = $node_master->safe_psql('postgres',
     "SELECT wal_status FROM pg_replication_slots WHERE slot_name = 'rep1'");
@@ -110,7 +108,7 @@ $node_master->safe_psql('postgres', "CHECKPOINT;");
 $result = $node_master->safe_psql('postgres',
     "SELECT wal_status FROM pg_replication_slots WHERE slot_name = 'rep1'");
 is($result, "reserved",
-    'check that min_safe_lsn gets close to the current LSN');
+    'check that distance gets close to the current LSN');
 
 # The standby can reconnect to master
 $node_standby->start;
@@ -154,7 +152,7 @@ advance_wal($node_master, 1);
 
 # Slot gets into 'unreserved' state
 $result = $node_master->safe_psql('postgres',
-    "SELECT wal_status, min_safe_lsn is NULL FROM pg_replication_slots WHERE slot_name = 'rep1'"
+    "SELECT wal_status, distance is NULL FROM pg_replication_slots WHERE slot_name = 'rep1'"
 );
 is($result, "unreserved|t",
     'check that the slot state changes to "unreserved"');
@@ -186,7 +184,7 @@ ok( find_in_log(
 
 # This slot should be broken
 $result = $node_master->safe_psql('postgres',
-    "SELECT slot_name, active, restart_lsn IS NULL, wal_status, min_safe_lsn FROM pg_replication_slots WHERE slot_name
='rep1'"
 
+    "SELECT slot_name, active, restart_lsn IS NULL, wal_status, distance FROM pg_replication_slots WHERE slot_name =
'rep1'"
 );
 is($result, "rep1|f|t|lost|",
     'check that the slot became inactive and the state "lost" persists');
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index b813e32215..392eab12dd 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1464,8 +1464,8 @@ pg_replication_slots| SELECT l.slot_name,
     l.restart_lsn,
     l.confirmed_flush_lsn,
     l.wal_status,
-    l.min_safe_lsn
-   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin,
catalog_xmin,restart_lsn, confirmed_flush_lsn, wal_status, min_safe_lsn)
 
+    l.distance
+   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin,
catalog_xmin,restart_lsn, confirmed_flush_lsn, wal_status, distance)
 
      LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
 pg_roles| SELECT pg_authid.rolname,
     pg_authid.rolsuper,
-- 
2.18.4

Re: min_safe_lsn column in pg_replication_slots view

From

Amit Kapila

Date:

01 July 2020, 08:53:13

On Tue, Jun 30, 2020 at 7:53 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
> On 2020/06/30 17:07, Fujii Masao wrote:
> >
> >
> > On 2020/06/26 13:45, Amit Kapila wrote:
> >>
> >> Can we consider an option to "Remove min_safe_lsn; document how a user
> >> can monitor the distance"?  We have a function to get current WAL
> >> insert location and other things required are available either via
> >> view or as guc variable values.  The reason I am thinking of this
> >> option is that it might be better to get some more feedback on what is
> >> the most appropriate value to display.  However, I am okay if we can
> >> reach a consensus on one of the above options.
> >
> > Yes, that's an idea. But it might not be easy to calculate that distance
> > manually by subtracting max_slot_wal_keep_size from the current LSN.
> > Because we've not supported -(pg_lsn, numeric) operator yet. I'm
> > proposing that operator, but it's for v14.
>
> Sorry this is not true. That distance can be calculated without those operators.
> For example,
>
> SELECT restart_lsn - pg_current_wal_lsn() + (SELECT setting::numeric * 1024 * 1024 FROM pg_settings WHERE name =
'max_slot_wal_keep_size')distance FROM pg_replication_slots;
 
>
> If the calculated distance is small or negative value, which means that
> we may lose some required WAL files. So in this case it's worth considering
> to increase max_slot_wal_keep_size.
>
> I still think it's better and more helpful to display something like
> that distance in pg_replication_slots rather than making each user
> calculate it...
>

Okay, but do we think it is better to display this in
pg_replication_slots or some new view like pg_stat_*_slots as being
discussed in [1]?  It should not happen that we later decide to move
this or similar stats to that view.

[1] - https://www.postgresql.org/message-id/CA%2Bfd4k5_pPAYRTDrO2PbtTOe0eHQpBvuqmCr8ic39uTNmR49Eg%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

01 July 2020, 15:14:21

On 2020-Jul-01, Amit Kapila wrote:

> Okay, but do we think it is better to display this in
> pg_replication_slots or some new view like pg_stat_*_slots as being
> discussed in [1]?  It should not happen that we later decide to move
> this or similar stats to that view.

It seems that the main motivation for having some counters in another
view is the ability to reset them; and resetting this distance value
makes no sense, so I think it's better to have it in
pg_replication_slots.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

michael@paquier.xyz

Date:

02 July 2020, 01:38:45

On Wed, Jul 01, 2020 at 11:14:21AM -0400, Alvaro Herrera wrote:
> On 2020-Jul-01, Amit Kapila wrote:
>> Okay, but do we think it is better to display this in
>> pg_replication_slots or some new view like pg_stat_*_slots as being
>> discussed in [1]?  It should not happen that we later decide to move
>> this or similar stats to that view.
>
> It seems that the main motivation for having some counters in another
> view is the ability to reset them; and resetting this distance value
> makes no sense, so I think it's better to have it in
> pg_replication_slots.

pg_replication_slots would make sense to me than a stat view for a
distance column.  Now, I have to admit that I am worried when seeing
design discussions on this thread for 13 after beta2 has been shipped,
so my vote would still be to remove for now the column in 13, document
an equivalent query to do this work (I actually just do that in a
bgworker monitoring repslot bloat now in some stuff I maintain
internally), and resend a patch in v14 to give the occasion for this
feature to go through one extra round of review.  My 2c.
--
Michael

Attachment

signature.asc

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

02 July 2020, 01:46:01

On 2020-Jul-02, michael@paquier.xyz wrote:

> pg_replication_slots would make sense to me than a stat view for a
> distance column.  Now, I have to admit that I am worried when seeing
> design discussions on this thread for 13 after beta2 has been shipped,

We already had this discussion and one of the things we said before
beta2 was "we're still in beta2, there's time".  I see no need to panic.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Amit Kapila

Date:

04 July 2020, 06:18:47

On Wed, Jul 1, 2020 at 8:44 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2020-Jul-01, Amit Kapila wrote:
>
> > Okay, but do we think it is better to display this in
> > pg_replication_slots or some new view like pg_stat_*_slots as being
> > discussed in [1]?  It should not happen that we later decide to move
> > this or similar stats to that view.
>
> It seems that the main motivation for having some counters in another
> view is the ability to reset them; and resetting this distance value
> makes no sense, so I think it's better to have it in
> pg_replication_slots.
>

Fair enough.  It would be good if we can come up with something better
than 'distance' for this column.  Some ideas safe_wal_limit,
safe_wal_size?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

06 July 2020, 15:29:42

On 2020-Jul-04, Amit Kapila wrote:

> Fair enough.  It would be good if we can come up with something better
> than 'distance' for this column.  Some ideas safe_wal_limit,
> safe_wal_size?

Hmm, I like safe_wal_size.

I've been looking at this intermittently since late last week and I
intend to get it done in the next couple of days.

Thanks!

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

07 July 2020, 00:54:36

On 2020-Jul-06, Alvaro Herrera wrote:

> Hmm, I like safe_wal_size.
> 
> I've been looking at this intermittently since late last week and I
> intend to get it done in the next couple of days.

I propose the attached.  This is pretty much what was proposed by
Kyotaro, but I made a couple of changes.  Most notably, I moved the
calculation to the view code itself rather than creating a function in
xlog.c, mostly because it seemed to me that the new function was
creating an abstraction leakage without adding any value; also, if we
add per-slot size limits later, it would get worse.

The other change was to report negative values when the slot becomes
unreserved, rather than zero.  It shows how much beyond safety your
slots are getting, so it seems useful.  Clamping at zero seems to serve
no purpose.

I also made it report null immediately when slots are in state lost.
But beware of slots that appear lost but fall in the unreserved category
because they advanced before checkpointer signalled them.  (This case
requires a debugger to hit ...)

One thing that got my attention while going over this is that the error
message we throw when making a slot invalid is not very helpful; it
doesn't say what the current insertion LSN was at that point.  Maybe we
should add that?  (As a separate patch, of couse.)

Any more thoughts?  If not, I'll get this pushed tomorrow finally.

Thanks

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

0001-Morph-pg_replication_slots.min_safe_lsn-to-safe_wal_.patch

Re: min_safe_lsn column in pg_replication_slots view

From

Kyotaro Horiguchi

Date:

07 July 2020, 01:55:10

Thanks!
At Mon, 6 Jul 2020 20:54:36 -0400, Alvaro Herrera <alvherre@2ndquadrant.com> wrote in 
> On 2020-Jul-06, Alvaro Herrera wrote:
> 
> > Hmm, I like safe_wal_size.

I agree to the name, too.

> > I've been looking at this intermittently since late last week and I
> > intend to get it done in the next couple of days.
> 
> I propose the attached.  This is pretty much what was proposed by
> Kyotaro, but I made a couple of changes.  Most notably, I moved the
> calculation to the view code itself rather than creating a function in
> xlog.c, mostly because it seemed to me that the new function was
> creating an abstraction leakage without adding any value; also, if we
> add per-slot size limits later, it would get worse.

I'm not sure that detailed WAL segment calculation fits slotfuncs.c
but I don't object to the change.  However if we do that:

+            /* determine how many segments slots can be kept by slots ... */
+            keepSegs = max_slot_wal_keep_size_mb / (wal_segment_size / (1024 * 1024));

Couldn't we move ConvertToXSegs from xlog.c to xlog_ingernals.h and
use it intead of the bare expression?


> The other change was to report negative values when the slot becomes
> unreserved, rather than zero.  It shows how much beyond safety your
> slots are getting, so it seems useful.  Clamping at zero seems to serve
> no purpose.

The reason for the clamping is the signedness of the values, or
integral promotion.  However, I believe the calculation cannot go
beyond the range of signed long so the signedness conversion in the
patch looks fine.

> I also made it report null immediately when slots are in state lost.
> But beware of slots that appear lost but fall in the unreserved category
> because they advanced before checkpointer signalled them.  (This case
> requires a debugger to hit ...)

Oh! Okay, that change seems right to me.

> One thing that got my attention while going over this is that the error
> message we throw when making a slot invalid is not very helpful; it
> doesn't say what the current insertion LSN was at that point.  Maybe we
> should add that?  (As a separate patch, of couse.)

It sounds helpful to me. (I remember that I sometime want to see
checkpoint LSNs in server log..)

> Any more thoughts?  If not, I'll get this pushed tomorrow finally.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

07 July 2020, 03:01:33

On 2020-Jul-07, Kyotaro Horiguchi wrote:

> At Mon, 6 Jul 2020 20:54:36 -0400, Alvaro Herrera <alvherre@2ndquadrant.com> wrote in 

> > I propose the attached.  This is pretty much what was proposed by
> > Kyotaro, but I made a couple of changes.  Most notably, I moved the
> > calculation to the view code itself rather than creating a function in
> > xlog.c, mostly because it seemed to me that the new function was
> > creating an abstraction leakage without adding any value; also, if we
> > add per-slot size limits later, it would get worse.
> 
> I'm not sure that detailed WAL segment calculation fits slotfuncs.c
> but I don't object to the change.  However if we do that:
> 
> +            /* determine how many segments slots can be kept by slots ... */
> +            keepSegs = max_slot_wal_keep_size_mb / (wal_segment_size / (1024 * 1024));
> 
> Couldn't we move ConvertToXSegs from xlog.c to xlog_ingernals.h and
> use it intead of the bare expression?

I was of two minds about that, and the only reason I didn't do it is
that we'll need to give it a better name if we do it ...  I'm open to
suggestions.

> > The other change was to report negative values when the slot becomes
> > unreserved, rather than zero.  It shows how much beyond safety your
> > slots are getting, so it seems useful.  Clamping at zero seems to serve
> > no purpose.
> 
> The reason for the clamping is the signedness of the values, or
> integral promotion.  However, I believe the calculation cannot go
> beyond the range of signed long so the signedness conversion in the
> patch looks fine.

Yeah, I think the negative values are useful to see.  I think if you
ever get close to 2^62, you're in much more serious trouble anyway :-)
But I don't deny that the math there could be subject of overflow
issues.  If you want to verify, please be my guest ...

> > One thing that got my attention while going over this is that the error
> > message we throw when making a slot invalid is not very helpful; it
> > doesn't say what the current insertion LSN was at that point.  Maybe we
> > should add that?  (As a separate patch, of couse.)
> 
> It sounds helpful to me. (I remember that I sometime want to see
> checkpoint LSNs in server log..)

Hmm, ... let's do that for pg14!

Thanks for looking,

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

07 July 2020, 17:48:00

On 2020-Jul-06, Alvaro Herrera wrote:

> On 2020-Jul-07, Kyotaro Horiguchi wrote:

> > Couldn't we move ConvertToXSegs from xlog.c to xlog_ingernals.h and
> > use it intead of the bare expression?
> 
> I was of two minds about that, and the only reason I didn't do it is
> that we'll need to give it a better name if we do it ...  I'm open to
> suggestions.

In absence of other suggestions I gave this the name XLogMBVarToSegs,
and redefined ConvertToXSegs to use that.  Didn't touch callers in
xlog.c to avoid pointless churn.  Pushed to both master and 13.

I hope this satisfies everyone ... Masao-san, thanks for reporting the
problem, and thanks Horiguchi-san for providing the fix.  (Also thanks
to Amit and Michael for discussion.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Tom Lane

Date:

08 July 2020, 17:55:17

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> In absence of other suggestions I gave this the name XLogMBVarToSegs,
> and redefined ConvertToXSegs to use that.  Didn't touch callers in
> xlog.c to avoid pointless churn.  Pushed to both master and 13.

The buildfarm's sparc64 members seem unhappy with this.

            regards, tom lane

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

08 July 2020, 23:07:57

On 2020-Jul-08, Tom Lane wrote:

> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > In absence of other suggestions I gave this the name XLogMBVarToSegs,
> > and redefined ConvertToXSegs to use that.  Didn't touch callers in
> > xlog.c to avoid pointless churn.  Pushed to both master and 13.
> 
> The buildfarm's sparc64 members seem unhappy with this.

Hmm.  Some of them are, yeah, but it's not universal.  For example
mussurana and ibisbill are not showing failures.

Anyway the error is pretty strange: only GetWALAvailability is showing a
problem, but the size calculation in the view function def is returning
a negative number, as expected.

So looking at the code in GetWALAvailability, what happens is that
targetSeg >= oldestSlotSeg, but we expect the opposite.  I'd bet for
targetSeg to be correct, since its input is just the slot LSN -- pretty
easy.  But for oldestSlotSeg, we have KeepLogSeg involved.

No immediate ideas ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Tom Lane

Date:

08 July 2020, 23:24:50

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2020-Jul-08, Tom Lane wrote:
>> The buildfarm's sparc64 members seem unhappy with this.

> Hmm.  Some of them are, yeah, but it's not universal.  For example
> mussurana and ibisbill are not showing failures.

Ah, right, I was thinking they hadn't run since this commit, but they
have.

> Anyway the error is pretty strange: only GetWALAvailability is showing a
> problem, but the size calculation in the view function def is returning
> a negative number, as expected.

We've previously noted what seem to be compiler optimization bugs on
both sparc32 and sparc64; the latest thread about that is
https://www.postgresql.org/message-id/flat/f28f842d-e82b-4e30-a81a-2a1f9fa4a8e1%40www.fastmail.com

This is looking uncomfortably like the same thing.  Tom, could you
experiment with different -O levels on those animals?

            regards, tom lane

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

08 July 2020, 23:35:26

On 2020-Jul-08, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:

> > Anyway the error is pretty strange: only GetWALAvailability is showing a
> > problem, but the size calculation in the view function def is returning
> > a negative number, as expected.
> 
> We've previously noted what seem to be compiler optimization bugs on
> both sparc32 and sparc64; the latest thread about that is
> https://www.postgresql.org/message-id/flat/f28f842d-e82b-4e30-a81a-2a1f9fa4a8e1%40www.fastmail.com
> 
> This is looking uncomfortably like the same thing.

Ouch.  So 12 builds with -O0 but 13 does not?  Did we do something to
sequence.c to work around this problem?  I cannot find anything.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Tom Lane

Date:

09 July 2020, 00:35:49

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2020-Jul-08, Tom Lane wrote:
>> We've previously noted what seem to be compiler optimization bugs on
>> both sparc32 and sparc64; the latest thread about that is
>> https://www.postgresql.org/message-id/flat/f28f842d-e82b-4e30-a81a-2a1f9fa4a8e1%40www.fastmail.com
>> This is looking uncomfortably like the same thing.

> Ouch.  So 12 builds with -O0 but 13 does not?

Unless Tom's changed the animal's config since that thread, yes.

> Did we do something to
> sequence.c to work around this problem?  I cannot find anything.

We did not.  If it's a compiler bug, and one as phase-of-the-moon-
dependent as this seems to be, I'd have zero confidence that any
specific source code change would fix it (barring someone confidently
explaining exactly what the compiler bug is, anyway).  The best we
can do for now is hope that backing off the -O level avoids the bug.

            regards, tom lane

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

09 July 2020, 00:53:32

On 2020-Jul-08, Tom Lane wrote:

> Alvaro Herrera <alvherre@2ndquadrant.com> writes:

> > Ouch.  So 12 builds with -O0 but 13 does not?
> 
> Unless Tom's changed the animal's config since that thread, yes.

I verified the configs in branches 12 and 13 in one of the failing
animals, and yes that's still the case.

> > Did we do something to
> > sequence.c to work around this problem?  I cannot find anything.
> 
> We did not.  If it's a compiler bug, and one as phase-of-the-moon-
> dependent as this seems to be, I'd have zero confidence that any
> specific source code change would fix it (barring someone confidently
> explaining exactly what the compiler bug is, anyway).  The best we
> can do for now is hope that backing off the -O level avoids the bug.

An easy workaround might be to add -O0 for that platform in that
directory's Makefile.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Tom Lane

Date:

09 July 2020, 01:03:55

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2020-Jul-08, Tom Lane wrote:
>> We did not.  If it's a compiler bug, and one as phase-of-the-moon-
>> dependent as this seems to be, I'd have zero confidence that any
>> specific source code change would fix it (barring someone confidently
>> explaining exactly what the compiler bug is, anyway).  The best we
>> can do for now is hope that backing off the -O level avoids the bug.

> An easy workaround might be to add -O0 for that platform in that
> directory's Makefile.

"Back off the -O level in one directory" seems about as misguided as
"back off the -O level in one branch", if you ask me.  There's no
reason to suppose that the problem won't bite us somewhere else next
week.

The previous sparc32 bug that we'd made some effort to run to ground
is described here:
https://www.postgresql.org/message-id/15142.1498165769@sss.pgh.pa.us
We really don't know what aspects of the source code trigger that.
I'm slightly suspicious that we might be seeing the same bug in the
sparc64 builds, but it's just a guess.

            regards, tom lane

Re: min_safe_lsn column in pg_replication_slots view

From

Tom Lane

Date:

09 July 2020, 21:56:06

... or on the other hand, maybe these animals are just showing more
sensitivity than others to an actual code bug.  skink is showing
valgrind failures in this very area, on both HEAD and v13:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2020-07-08%2021%3A13%3A02

==3166208== VALGRINDERROR-BEGIN
==3166208== Conditional jump or move depends on uninitialised value(s)
==3166208==    at 0x28618D: KeepLogSeg (xlog.c:9627)
==3166208==    by 0x296AC5: GetWALAvailability (xlog.c:9533)
==3166208==    by 0x4FFECB: pg_get_replication_slots (slotfuncs.c:356)
==3166208==    by 0x3C762F: ExecMakeTableFunctionResult (execSRF.c:234)
==3166208==    by 0x3D9A9A: FunctionNext (nodeFunctionscan.c:95)
==3166208==    by 0x3C81D6: ExecScanFetch (execScan.c:133)
==3166208==    by 0x3C81D6: ExecScan (execScan.c:199)
==3166208==    by 0x3D99A9: ExecFunctionScan (nodeFunctionscan.c:270)
==3166208==    by 0x3C5072: ExecProcNodeFirst (execProcnode.c:450)
==3166208==    by 0x3BD35E: ExecProcNode (executor.h:245)
==3166208==    by 0x3BD35E: ExecutePlan (execMain.c:1646)
==3166208==    by 0x3BE039: standard_ExecutorRun (execMain.c:364)
==3166208==    by 0x3BE102: ExecutorRun (execMain.c:308)
==3166208==    by 0x559669: PortalRunSelect (pquery.c:912)
==3166208==  Uninitialised value was created by a stack allocation
==3166208==    at 0x296A84: GetWALAvailability (xlog.c:9523)
==3166208==
==3166208== VALGRINDERROR-END

and even the most cursory look at the code confirms that there's a
real bug here.  KeepLogSeg expects *logSegNo to be defined on entry,
but GetWALAvailability hasn't bothered to initialize oldestSlotSeg.
It is not clear to me which one is in the wrong; the comment for
KeepLogSeg isn't particularly clear on this.

            regards, tom lane

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

09 July 2020, 22:20:04

On 2020-Jul-09, Tom Lane wrote:

> and even the most cursory look at the code confirms that there's a
> real bug here.  KeepLogSeg expects *logSegNo to be defined on entry,
> but GetWALAvailability hasn't bothered to initialize oldestSlotSeg.
> It is not clear to me which one is in the wrong; the comment for
> KeepLogSeg isn't particularly clear on this.

Oh, so I introduced the bug when I removed the initialization in this
fix.  That one was using the wrong datatype, but evidently it achieved
the right effect.  And KeepLogSeg is using the wrong datatype Invalid
macro also.

I think we should define InvalidXLogSegNo to be ~((uint64)0) and add a
macro to test for that.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

10 July 2020, 23:40:37

On 2020-Jul-09, Alvaro Herrera wrote:

> I think we should define InvalidXLogSegNo to be ~((uint64)0) and add a
> macro to test for that.

That's overkill really.  I just used zero.  Running
contrib/test_decoding under valgrind, this now passes.

I think I'd rather do away with the compare to zero, and initialize to
something else in GetWALAvailability, though.  What we're doing seems
unclean and unclear.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

0001-fix-uninitialized-value.patch

Re: min_safe_lsn column in pg_replication_slots view

From

Tom Lane

Date:

11 July 2020, 14:27:53

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2020-Jul-09, Alvaro Herrera wrote:
>> I think we should define InvalidXLogSegNo to be ~((uint64)0) and add a
>> macro to test for that.

> That's overkill really.  I just used zero.  Running
> contrib/test_decoding under valgrind, this now passes.

> I think I'd rather do away with the compare to zero, and initialize to
> something else in GetWALAvailability, though.  What we're doing seems
> unclean and unclear.

Is zero really not a valid segment number?

            regards, tom lane

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

11 July 2020, 21:26:30

On 2020-Jul-11, Tom Lane wrote:

> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > On 2020-Jul-09, Alvaro Herrera wrote:
> >> I think we should define InvalidXLogSegNo to be ~((uint64)0) and add a
> >> macro to test for that.
> 
> > That's overkill really.  I just used zero.  Running
> > contrib/test_decoding under valgrind, this now passes.
> 
> > I think I'd rather do away with the compare to zero, and initialize to
> > something else in GetWALAvailability, though.  What we're doing seems
> > unclean and unclear.
> 
> Is zero really not a valid segment number?

No, but you cannot retreat from that ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

13 July 2020, 17:23:56

A much more sensible answer is to initialize the segno to the segment
currently being written, as in the attached.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

v3-0001-Fix-uninitialized-value-in-segno-calculation.patch

Re: min_safe_lsn column in pg_replication_slots view

From

Alvaro Herrera

Date:

13 July 2020, 17:52:12

On 2020-Jul-13, Alvaro Herrera wrote:

> A much more sensible answer is to initialize the segno to the segment
> currently being written, as in the attached.

Ran the valgrind test locally and it passes.  Pushed it now.

Thanks,

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: min_safe_lsn column in pg_replication_slots view

From

Kyotaro Horiguchi

Date:

14 July 2020, 04:38:09

At Mon, 13 Jul 2020 13:52:12 -0400, Alvaro Herrera <alvherre@2ndquadrant.com> wrote in 
> On 2020-Jul-13, Alvaro Herrera wrote:
> 
> > A much more sensible answer is to initialize the segno to the segment
> > currently being written, as in the attached.
> 
> Ran the valgrind test locally and it passes.  Pushed it now.

-    if (XLogRecPtrIsInvalid(*logSegNo) || segno < *logSegNo)
+    if (segno < *logSegNo)

Oops! Thank you for fixing it!

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center