Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least9.5)? - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least9.5)? |
Date | |
Msg-id | 20180424.195712.56235724.horiguchi.kyotaro@lab.ntt.co.jp Whole thread Raw |
In response to | Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least9.5)? (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses |
Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least9.5)?
|
List | pgsql-hackers |
Thank you very much for looking this! At Mon, 23 Apr 2018 03:41:47 -0400, Heikki Linnakangas <hlinnaka@iki.fi> wrote in <89e33d4f-5c75-0738-3dcb-44c4df59e920@iki.fi> > On 18/01/18 20:54, Kyotaro HORIGUCHI wrote: > > At Thu, 18 Jan 2018 11:52:52 -0800, Andres Freund <andres@anarazel.de> > > wrote in <20180118195252.hyxgkb3krcqjzfjm@alap3.anarazel.de> > >> On 2018-01-18 20:58:27 +0900, Michael Paquier wrote: > >>> b) The second patch that I would like to mention is doing things on > >>> the > >>> standby side by being able to change a WAL source when fetching a > >>> single > >>> record so as it is possible to fetch the beginning of a cross-segment > >>> record from say an archive or the local xlog, and then request the > >>> rest > >>> on the primary. This patch can be found in > >>> https://www.postgresql.org/message-id/20171026.190551.208996945.horiguchi.kyotaro%40lab.ntt.co.jp > >>> and the diff in WaitForWALToBecomeAvailable() scares me a lot because, > >>> it seems to me that this actually breaks timeline switch > >>> consistency. The concept of switching a WAL source in the middle of a > >>> WAL segment is risky anyway, because we perform sanity checks while > >>> switching sources. The bug we are dealing about here is very rare, and > >>> seeing a problem like that for a timeline switch is even more rare, > >>> but > >>> the risk is not zero and we may finish with a corrupted instance, so > >>> we > >>> should not in my opinion take this approach. Also this actually > >>> achieves > >>> point 2) above, not completely though, but not 1). > >> > >> I don't agree with the sentiment about the approach, but I agree there > >> might be weaknesses in the implementation (which I have not > >> reviewed). I > >> think it's perfectly sensible to require fetching one segment from > >> pg_xlog and the next one via another method. Currently the inability > >> to > >> do so often leads to us constantly refetching the same segment. > > Though I'm still not fully confident, if reading a set of > > continuation records from two (or more) sources can lead to > > inconsistency, two successve (complete) records are facing the > > same risk. > > This seems like the best approach to me as well. > > Looking at the patch linked above > (https://www.postgresql.org/message-id/20171026.190551.208996945.horiguchi.kyotaro%40lab.ntt.co.jp): > > > --- a/src/backend/access/transam/xlog.c > > +++ b/src/backend/access/transam/xlog.c > > @@ -11693,6 +11693,10 @@ retry: > > Assert(reqLen <= readLen); > > *readTLI = curFileTLI; > > + > > + if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, > > readBuf)) > > + goto next_record_is_invalid; > > + > > return readLen; > > next_record_is_invalid: > > What is the point of adding this XLogReaderValidatePageHeader() call? > The caller of this callback function, ReadPageInternal(), checks the > page header anyway. Earlier in this thread, you said: Without the lines, server actually fails to start replication. (I try to remember the details...) The difference is whether the function can cause retry for the same portion of a set of continued records (without changing the plugin API). XLogPageRead can do that. On the other hand all ReadPageInternal can do is just letting the caller ReadRecords retry from the beginning of the set of continued records since the caller side handles only complete records. In the failure case, in XLogPageRead, when read() fails, it can try the next source midst of a continued records. On the other hand if the segment was read but it was recycled one, it passes "success" to ReadPageInternal and leads to retring from the beginning of the recrod. Infinite loop. Calling XLogReaderValidatePageHeader in ReadPageInternal is redundant, but removing it may be interface change of xlogreader plugin and I am not sure that is allowed. > > The rest to do is let XLogPageRead retry other sources > > immediately. To do this I made ValidXLogPageHeader(at)xlogreader(dot)c > > public (and renamed to XLogReaderValidatePageHeader). > > but I don't understand that. It is exposed so that XLogPageRead can validate the page header using it. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
pgsql-hackers by date: