At Thu, 26 Aug 2021 03:24:48 +0000, "Bossart, Nathan" <bossartn@amazon.com> wrote in
> On 8/25/21, 5:40 PM, "Kyotaro Horiguchi" <horikyota.ntt@gmail.com> wrote:
> > At Wed, 25 Aug 2021 18:18:59 +0000, "Bossart, Nathan" <bossartn@amazon.com> wrote in
> >> Let's say we have the following situation (F = flush, E = earliest
> >> registered boundary, and L = latest registered boundary), and let's
> >> assume that each segment has a cross-segment record that ends in the
> >> next segment.
> >>
> >> F E L
> >> |-----|-----|-----|-----|-----|-----|-----|-----|
> >> 1 2 3 4 5 6 7 8
> >>
> >> Then, we write out WAL to disk and create .ready files as needed. If
> >> we didn't flush beyond the latest registered boundary, the latest
> >> registered boundary now becomes the earliest boundary.
> >>
> >> F E
> >> |-----|-----|-----|-----|-----|-----|-----|-----|
> >> 1 2 3 4 5 6 7 8
> >>
> >> At this point, the earliest segment boundary past the flush point is
> >> before the "earliest" boundary we are tracking.
> >
> > We know we have some cross-segment records in the region [E L], so we
> > cannot add a .ready file if flush is in the region. I haven't looked
> > at the latest patch (or I may misunderstand the discussion here) but I
> > think we shouldn't move E before F exceeds previous (or in the first
> > picture above) L. Things are done that way in my ancient proposal in
> > [1].
>
> The strategy in place ensures that we track a boundary that doesn't
> change until the flush position passes it as well as the latest
> registered boundary. I think it is important that any segment
> boundary tracking mechanism does at least those two things. I don't
> see how we could do that if we didn't update E until F passed both E
> and L.

(Sorry, I may not have understood you correctly, so the discussion
below might be beside the point.)

The ancient patch did the following:

- If a flush doesn't reach E, we can archive finished segments.

- If a flush ends between E and L, we shouldn't archive finished
  segments at all. L can move further forward while in this state,
  but E stays at the same location.

- Once a flush passes L, we can archive all finished segments and
  erase both E and L.

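
To make the three rules above concrete, here is a rough sketch in C. It
is not the code from the patch in [1]. BoundaryRange, register_boundary()
and can_archive_after_flush() are hypothetical names, XLogRecPtr is only
a stand-in typedef, and treating "between E and L" as E <= flush < L is
my assumption about the exact boundary conditions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL definitions, so the sketch is self-contained. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Hypothetical tracker for the region [E L] that holds cross-segment records. */
typedef struct BoundaryRange
{
	XLogRecPtr	earliest;		/* E: fixed until the region is erased */
	XLogRecPtr	latest;			/* L: may keep moving forward */
} BoundaryRange;

/* Register the end of a cross-segment record: E is set once, L only advances. */
static void
register_boundary(BoundaryRange *r, XLogRecPtr end)
{
	if (r->earliest == InvalidXLogRecPtr)
		r->earliest = end;
	if (end > r->latest)
		r->latest = end;
}

/*
 * Returns true if finished segments may be archived after a flush up to
 * 'flush'.  Assumed boundary handling: flush < E is "didn't reach E",
 * E <= flush < L is "between E and L", flush >= L is "passed L".
 */
static bool
can_archive_after_flush(BoundaryRange *r, XLogRecPtr flush)
{
	/* Nothing tracked, or the flush didn't reach E: archiving is safe. */
	if (r->earliest == InvalidXLogRecPtr || flush < r->earliest)
		return true;

	/* The flush ended inside [E L): archive nothing, keep E where it is. */
	if (flush < r->latest)
		return false;

	/* The flush passed L: erase both E and L, then archive. */
	r->earliest = InvalidXLogRecPtr;
	r->latest = InvalidXLogRecPtr;
	return true;
}
```

For example, after registering boundaries at 200 and 400, a flush to 150
may archive, a flush to 250 may not, and a flush to 400 erases the region
and may archive again. If another boundary is registered at 600 while the
flush sits at 250, E stays at 200 and archiving stays blocked until the
flush reaches 600.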
A drawback of this strategy is that the region [E L] can contain gaps
(that is, segment boundaries that are not crossed by a continuation
record), so archiving can be delayed excessively. Perhaps if flush
falls behind the write head by more than two segments, the probability
of creating such gaps would be higher.

regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center