Re: prevent immature WAL streaming - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: prevent immature WAL streaming
Date
Msg-id 20210826.174834.1618367955913954817.horikyota.ntt@gmail.com
In response to Re: prevent immature WAL streaming  ("Bossart, Nathan" <bossartn@amazon.com>)
List pgsql-hackers
At Thu, 26 Aug 2021 03:24:48 +0000, "Bossart, Nathan" <bossartn@amazon.com> wrote in 
> On 8/25/21, 5:40 PM, "Kyotaro Horiguchi" <horikyota.ntt@gmail.com> wrote:
> > At Wed, 25 Aug 2021 18:18:59 +0000, "Bossart, Nathan" <bossartn@amazon.com> wrote in
> >> Let's say we have the following situation (F = flush, E = earliest
> >> registered boundary, and L = latest registered boundary), and let's
> >> assume that each segment has a cross-segment record that ends in the
> >> next segment.
> >>
> >>         F     E                                         L
> >>         |-----|-----|-----|-----|-----|-----|-----|-----|
> >>            1     2     3     4     5     6     7     8
> >>
> >> Then, we write out WAL to disk and create .ready files as needed.  If
> >> we didn't flush beyond the latest registered boundary, the latest
> >> registered boundary now becomes the earliest boundary.
> >>
> >>                           F                             E
> >>         |-----|-----|-----|-----|-----|-----|-----|-----|
> >>            1     2     3     4     5     6     7     8
> >>
> >> At this point, the earliest segment boundary past the flush point is
> >> before the "earliest" boundary we are tracking.
> >
> > We know we have some cross-segment records in the region [E L], so we
> > cannot add a .ready file if flush is in that region. I haven't looked
> > at the latest patch (or I may misunderstand the discussion here), but I
> > think we shouldn't move E until F passes the previous L (the L in the
> > first picture above).  Things are done that way in my ancient proposal
> > in [1].
> 
> The strategy in place ensures that we track a boundary that doesn't
> change until the flush position passes it as well as the latest
> registered boundary.  I think it is important that any segment
> boundary tracking mechanism does at least those two things.  I don't
> see how we could do that if we didn't update E until F passed both E
> and L.

(Sorry, I may not have understood you correctly, so the discussion
below might be beside the point.)

The ancient patch did:

If a flush doesn't reach E, we can archive finished segments.

If a flush ends between E and L, we shouldn't archive finished segments
at all. While in this state, L can move further forward, but E stays at
the same location.

Once a flush passes L, we can archive all finished segments and can
erase both E and L.
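
The three rules above can be sketched as a small state function. This is
only an illustration, not code from the ancient patch: the function names,
the use of plain integers for WAL positions, and the exact strictness of
the comparisons (flush < E counts as "didn't reach E", flush >= L counts
as "passes L") are all assumptions.

```python
from typing import Optional, Tuple

Bounds = Tuple[Optional[int], Optional[int]]  # (E, L), both None when unset


def register_boundary(earliest: Optional[int], latest: Optional[int],
                      pos: int) -> Bounds:
    """Record a new boundary position (hypothetical helper).

    E is set once and then stays put; only L moves forward.
    """
    if earliest is None or latest is None:
        return pos, pos
    return earliest, max(latest, pos)


def archive_decision(flush: int, earliest: Optional[int],
                     latest: Optional[int]) -> Tuple[bool, Optional[int], Optional[int]]:
    """Return (may_archive, new_E, new_L) for a flush ending at `flush`."""
    if earliest is None or latest is None:
        return True, None, None          # nothing tracked: archive freely
    if flush < earliest:
        return True, earliest, latest    # flush didn't reach E
    if flush < latest:
        return False, earliest, latest   # flush is between E and L: hold back
    return True, None, None              # flush passed L: erase both E and L
```

For example, with E = 10 and L = 30, a flush ending at 20 archives
nothing and leaves both positions in place, while a flush ending at 30
allows archiving and clears them.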

A drawback of this strategy is that the region [E L] can contain gaps
(that is, segment boundaries that are not bridged by a continuation
record), so archiving can be delayed excessively.  Perhaps if flush
lags behind the write head by more than two segments, the probability
of creating such gaps would be higher.
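
To make the gap problem concrete, here is a rough sketch that lists the
boundaries inside the tracked region that no continuation record crosses.
It assumes segment boundaries are numbered with small integers and that
the set of crossed boundaries is known; the function name and this
representation are hypothetical, chosen only for illustration.

```python
from typing import List, Set


def gap_boundaries(crossed: Set[int]) -> List[int]:
    """List boundaries strictly inside (E, L) that no record crosses.

    `crossed` holds the numbers of the segment boundaries that are
    crossed by a continuation record; E and L are its minimum and maximum.
    """
    if not crossed:
        return []
    earliest, latest = min(crossed), max(crossed)
    return [b for b in range(earliest + 1, latest) if b not in crossed]
```

If only boundaries 2 and 8 are crossed, boundaries 3 through 7 are gaps:
the segments ending there are complete as soon as they are flushed, yet
they would still wait until the flush passes L.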

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


