Don't try fetching future segment of a TLI. - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Don't try fetching future segment of a TLI.
Date
Msg-id 20200129.120222.1476610231001551715.horikyota.ntt@gmail.com
Responses Re: Don't try fetching future segment of a TLI.  (David Steele <david@pgmasters.net>)
List pgsql-hackers
Hello. I have added -hackers (moving this thread there).

At Tue, 28 Jan 2020 19:13:32 +0300, Pavel Suderevsky <psuderevsky@gmail.com> wrote in 
> But for me it still seems that PostgreSQL has enough information to check
> that no WALs exist for the new timeline to omit searching all the
> possibly-existing WALs.
> 
> It can just look through the first received new-timeline's WAL and ensure
> timeline switch occurred in this WAL. Finally, it can check archive for the
> only one possibly-existing previous WAL.

Right. The timeline history file tells where a timeline ends.
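
For illustration only (not part of the patch): a minimal standalone
sketch of how history entries bound each timeline. The type and
function names here are hypothetical stand-ins; they only mirror the
tli/begin/end fields of TimeLineHistoryEntry, with "begin" treated as
inclusive and "end" as exclusive, and 0 standing in for
InvalidXLogRecPtr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;            /* hypothetical stand-in for XLogRecPtr */
#define LSN_UNSET ((Lsn) 0)      /* hypothetical stand-in for InvalidXLogRecPtr */

/* Simplified model of one timeline history entry. */
typedef struct
{
    uint32_t tli;     /* timeline ID */
    Lsn      begin;   /* inclusive start; LSN_UNSET = from the very beginning */
    Lsn      end;     /* exclusive end; LSN_UNSET = timeline is still open */
} HistoryEntry;

/* Made-up sample history, newest timeline first. */
static const HistoryEntry sample_history[] = {
    {3, 0x30000000, LSN_UNSET},
    {2, 0x10000000, 0x30000000},
    {1, LSN_UNSET, 0x10000000},
};

/* Return the timeline that owns lsn, or 0 if no entry covers it. */
static uint32_t
tli_of_lsn(const HistoryEntry *hist, size_t n, Lsn lsn)
{
    assert(hist != NULL);

    for (size_t i = 0; i < n; i++)
    {
        int after_begin = hist[i].begin == LSN_UNSET || hist[i].begin <= lsn;
        int before_end = hist[i].end == LSN_UNSET || lsn < hist[i].end;

        if (after_begin && before_end)
            return hist[i].tli;
    }
    return 0;
}
```

With this sample data, an LSN just below 0x10000000 maps to TLI 1,
while 0x10000000 itself already belongs to TLI 2.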

> Regarding influence: issue is not about the large amount of WALs to apply
> but in searching for the non-existing WALs on the remote storage, each such
> search can take 5-10 seconds while obtaining existing WAL takes
> milliseconds.

Wow. I didn't know of a file system that takes that many seconds to
look up non-existent files. I still think this is not a bug, but
avoiding those lookups would be a big win on such systems.

After some thought, I think it's safe and feasible to let
XLogFileReadAnyTLI() refrain from trying WAL segments of too-high
TLIs.  Some garbage archive files outside the range of a timeline
might be seen, for example, after reusing an archive directory without
clearing its files.  However, fetching such garbage just to fail
doesn't contribute to durability or reliability at all, I think.

The attached patch does that.
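
To make the idea concrete outside of xlog.c, here is a minimal
standalone sketch (not the patch itself). It counts how many timelines
would actually be probed for a given segment when TLIs that begin in a
later segment are skipped. All names and the sample data are
hypothetical; the segment number is computed as LSN divided by segment
size, which is my understanding of what XLByteToSeg() does, and 0
stands in for InvalidXLogRecPtr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;     /* hypothetical stand-in for XLogRecPtr */
typedef uint64_t SegNo;   /* hypothetical stand-in for XLogSegNo */

#define SEG_SIZE ((uint64_t) 16 * 1024 * 1024)   /* assumed 16MB segments */

typedef struct
{
    uint32_t tli;     /* timeline ID */
    Lsn      begin;   /* where this timeline starts; 0 = unknown/oldest */
} TliEntry;

/* Made-up sample: TLI 3 starts in segment 48, TLI 2 in segment 16. */
static const TliEntry sample_tles[] = {
    {3, 0x30000000},
    {2, 0x10000000},
    {1, 0},
};

/*
 * Count the timelines that would be tried for segno.  Entries must be
 * ordered newest first, as in the loop the patch modifies.
 */
static int
count_probes(const TliEntry *tles, size_t n, SegNo segno,
             uint32_t cur_tli, uint64_t seg_size)
{
    int probes = 0;

    assert(seg_size > 0);

    for (size_t i = 0; i < n; i++)
    {
        if (tles[i].tli < cur_tli)
            break;              /* too-old timelines: stop scanning */

        /* Skip timelines whose first segment lies after segno. */
        if (tles[i].begin != 0 && segno < tles[i].begin / seg_size)
            continue;

        probes++;               /* here the real code would try to open the file */
    }
    return probes;
}
```

With the sample data, count_probes(sample_tles, 3, 20, 1, SEG_SIZE)
tries only TLI 2 and TLI 1, since TLI 3 does not begin until segment
48. The segment in which a switch happens is still tried on the newer
timeline, because the check skips only segments strictly before it.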

Any thoughts?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 1b9211740175b7f9cb6810c822a67d4065ca9cf0 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Date: Wed, 29 Jan 2020 11:17:56 +0900
Subject: [PATCH v1] Don't try fetching out-of-timeline segments.

XLogFileReadAnyTLI scans known TLIs in descending order, starting from
the largest one, while searching for the target segment. Even if we
know that the segment belongs to a lower TLI, it tries opening the
segment on the larger TLIs just to fail. Under certain circumstances,
accessing non-existent files takes a long time and makes recovery
significantly longer.

Although a segment beyond the end of a TLI suggests that the
XLOG/archive files may be broken, we can safely ignore such files as
long as recovery proceeds.
---
 src/backend/access/transam/xlog.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6e09ded597..415288f50d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3738,11 +3738,27 @@ XLogFileReadAnyTLI(XLogSegNo segno, int emode, int source)
 
     foreach(cell, tles)
     {
-        TimeLineID    tli = ((TimeLineHistoryEntry *) lfirst(cell))->tli;
+        TimeLineHistoryEntry *hent = (TimeLineHistoryEntry *) lfirst(cell);
+        TimeLineID    tli = hent->tli;
 
         if (tli < curFileTLI)
             break;                /* don't bother looking at too-old TLIs */
 
+        /* Skip segments not belonging to the TLI */
+        if (hent->begin != InvalidXLogRecPtr)
+        {
+            XLogSegNo    beginseg = 0;
+
+            XLByteToSeg(hent->begin, beginseg, wal_segment_size);
+
+            /*
+             * We are scanning TLIs in descending order. It is sufficient to
+             * check only the upper boundary.
+             */
+            if (segno < beginseg)
+                continue;        /* don't bother looking at future TLIs */
+        }
+
         if (source == XLOG_FROM_ANY || source == XLOG_FROM_ARCHIVE)
         {
             fd = XLogFileRead(segno, emode, tli,
-- 
2.18.2

