Re: prevent immature WAL streaming - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: prevent immature WAL streaming
Date
Msg-id 20210824.120357.1673176579644397801.horikyota.ntt@gmail.com
In response to prevent immature WAL streaming  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: prevent immature WAL streaming  ("Bossart, Nathan" <bossartn@amazon.com>)
Re: prevent immature WAL streaming  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
At Mon, 23 Aug 2021 18:52:17 -0400, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in 
> Included 蔡梦娟 and Jakub Wartak because they've expressed interest on
> this topic -- notably [2] ("Bug on update timing of walrcv->flushedUpto
> variable").
> 
> As mentioned in the course of thread [1], we're missing a fix for
> streaming replication to avoid sending records that the primary hasn't
> fully flushed yet.  This patch is a first attempt at fixing that problem
> by retreating the LSN reported as FlushPtr whenever a segment is
> registered, based on the understanding that if no registration exists
> then the LogwrtResult.Flush pointer can be taken at face value; but if a
> registration exists, then we have to stream only till the start LSN of
> that registered entry.
> 
> This patch is probably incomplete.  First, I'm not sure that logical
> replication is affected by this problem.  I think it isn't, because
> logical replication will halt until the record can be read completely --
> maybe I'm wrong and there is a way for things to go wrong with logical
> replication as well.  But also, I need to look at the other uses of
> GetFlushRecPtr() and see if those need to change to the new function too
> or they can remain what they are now.
> 
> I'd also like to have tests.  That seems moderately hard, but if we had
> WAL-molasses that could be used in walreceiver, it could be done. (It
> sounds easier to write tests with a molasses-archive_command.)
> 
> 
> [1] https://postgr.es/m/CBDDFA01-6E40-46BB-9F98-9340F4379505@amazon.com
> [2] https://postgr.es/m/3f9c466d-d143-472c-a961-66406172af96.mengjuan.cmj@alibaba-inc.com

(I'm not sure what "WAL-molasses" above means. Is it the same as "sugar"?)

For reference, this issue is related to commit 0668719801, which
makes XLogPageRead restart reading a (continued or segment-spanning)
record when switching sources.  (In that thread, I modified the code
to cause a server crash in the situation in question.)
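
As a rough illustration of the rule described in the quoted proposal
(this is not the actual patch): if no registration exists, the flush
pointer is taken at face value; otherwise streaming stops at the start
LSN of the registered entry.  The struct, its field names, and
safe_flush_ptr() are hypothetical stand-ins invented for this sketch;
only the name XLogRecPtr mirrors PostgreSQL, and it is redefined here
so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles on its own (not PostgreSQL headers). */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Hypothetical state: what has been flushed, and the registered entry. */
typedef struct WalState
{
    XLogRecPtr flush;            /* plays the role of LogwrtResult.Flush */
    XLogRecPtr registered_start; /* start LSN of the registered entry,
                                  * or InvalidXLogRecPtr if none exists */
} WalState;

/*
 * Return the LSN up to which it would be safe to stream under the
 * rule above: never past the start of a registered entry.
 */
static XLogRecPtr
safe_flush_ptr(const WalState *st)
{
    assert(st != NULL);

    /* No registration: take the flush pointer at face value. */
    if (st->registered_start == InvalidXLogRecPtr)
        return st->flush;

    /* Registration exists: retreat to its start LSN if that is earlier. */
    if (st->registered_start < st->flush)
        return st->registered_start;

    return st->flush;
}
```

With flush = 1000 and a registered entry starting at 800, this returns
800; with no registration it returns 1000.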

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
