Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least9.5)? - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least9.5)?
Date
Msg-id 89e33d4f-5c75-0738-3dcb-44c4df59e920@iki.fi
Whole thread Raw
In response to Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least9.5)?  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least9.5)?
List pgsql-hackers
On 18/01/18 20:54, Kyotaro HORIGUCHI wrote:
> At Thu, 18 Jan 2018 11:52:52 -0800, Andres Freund <andres@anarazel.de> wrote in
<20180118195252.hyxgkb3krcqjzfjm@alap3.anarazel.de>
>> On 2018-01-18 20:58:27 +0900, Michael Paquier wrote:
>>> b) The second patch that I would like to mention is doing things on the
>>> standby side by being able to change a WAL source when fetching a single
>>> record so as it is possible to fetch the beginning of a cross-segment
>>> record from say an archive or the local xlog, and then request the rest
>>> on the primary. This patch can be found in
>>> https://www.postgresql.org/message-id/20171026.190551.208996945.horiguchi.kyotaro%40lab.ntt.co.jp
>>> and the diff in WaitForWALToBecomeAvailable() scares me a lot because,
>>> it seems to me that this actually breaks timeline switch
>>> consistency. The concept of switching a WAL source in the middle of a
>>> WAL segment is risky anyway, because we perform sanity checks while
>>> switching sources. The bug we are dealing about here is very rare, and
>>> seeing a problem like that for a timeline switch is even more rare, but
>>> the risk is not zero and we may finish with a corrupted instance, so we
>>> should not in my opinion take this approach. Also this actually achieves
>>> point 2) above, not completely though, but not 1).
>>
>> I don't agree with the sentiment about the approach, but I agree there
>> might be weaknesses in the implementation (which I have not reviewed). I
>> think it's perfectly sensible to require fetching one segment from
>> pg_xlog and the next one via another method. Currently the inability to
>> do so often leads to us constantly refetching the same segment.
> 
> Though I'm still not fully confident, if reading a set of
> continuation records from two (or more) sources can lead to
> inconsistency, two successve (complete) records are facing the
> same risk.

This seems like the best approach to me as well.

Looking at the patch linked above 
(https://www.postgresql.org/message-id/20171026.190551.208996945.horiguchi.kyotaro%40lab.ntt.co.jp):

> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -11693,6 +11693,10 @@ retry:
>      Assert(reqLen <= readLen);
>  
>      *readTLI = curFileTLI;
> +
> +    if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf))
> +        goto next_record_is_invalid;
> +
>      return readLen;
>  
>  next_record_is_invalid:

What is the point of adding this XLogReaderValidatePageHeader() call? 
The caller of this callback function, ReadPageInternal(), checks the 
page header anyway. Earlier in this thread, you said:

> The rest to do is let XLogPageRead retry other sources
> immediately. To do this I made ValidXLogPageHeader(at)xlogreader(dot)c
> public (and renamed to XLogReaderValidatePageHeader).

but I don't understand that.

- Heikki


pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: BGWORKER_BYPASS_ALLOWCONN used nowhere (infra part of on-linechecksum switcher)
Next
From: Andrew Gierth
Date:
Subject: Re: Toast issues with OldestXmin going backwards