Re: archive status ".ready" files may be created too early - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: archive status ".ready" files may be created too early
Date
Msg-id 20210806.164138.1233981515805339727.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: archive status ".ready" files may be created too early  ("Bossart, Nathan" <bossartn@amazon.com>)
Responses Re: archive status ".ready" files may be created too early  ("Bossart, Nathan" <bossartn@amazon.com>)
List pgsql-hackers
At Fri, 6 Aug 2021 00:21:34 +0000, "Bossart, Nathan" <bossartn@amazon.com> wrote in 
> On 8/5/21, 12:39 AM, "Kyotaro Horiguchi" <horikyota.ntt@gmail.com> wrote:
> > Honestly I don't like having this initialization in XLogWrite.  We
> > should and I think can initialize it earlier.  It seems to me the most
> > appropriate timing to initialize the variable is just before running
> > the end-of-recovery checkpoint).  Since StartupXLOG knows the first
> > segment to write .  If it were set to 0, that doesn't matter at all.
> > We can get rid of the new symbol by doing this.
> 
> This seems like a good idea to me.  I made this change in v5.  I
> performed some basic testing, and it seems to reliably initialize
> lastNotifiedSeg correctly.
> 
> > +    /*
> > +     * EndOfLog resides on the next segment of the last finished one. Set the
> > +     * last finished segment as lastNotifiedSeg now.  In the case where the
> > +     * last crash has left the last several segments not being marked as
> > +     * .ready, the checkpoint just after does that for all finished segments.
> > +     * There's a corner case where the checkpoint advances segment, but that
> > +     * ends up at most with a duplicate archive notification.
> > +     */
> 
> I'm not quite following the corner case you've described here.  Is it
> possible that the segment that EndOfLog points to will be eligible for
> removal after the checkpoint?

Archiving doesn't immediately mean removal.  A finished segment is
ought to be archived right away.  Since the EndOfLog segment must not
get marked .ready, setting lastNotifiedSeg to the previous segment is
quite right, but if the end-of-recovery checkpoint advances segment,
EndOfLog is marked .ready at the XLogFlush just after.  But, sorry,
what I forgot at the time was the checkpoint also moves
lastNotifiedSeg.  So, sorry, that corner case does not exist.

> In v5 of the patch, I've also added an extra call to
> NotifySegmentsReadyForArchive() in the same place we previously
> created the .ready files.  I think this helps notify archiver sooner
> in certain cases (e.g., asynchronous commit).

In v5, NotifySegmentsReadyForArchive() still holds ArchNotifyLock
including .ready file creations. Since the notification loop doesn't
need the hash itself, the loop can be took out of the lock section?

current:
    LWLockAcquire(ArchNotifyLock, LW_EXCLUSIVE);
    last_notified = GetLastNotifiedSegment();
    latest_boundary_seg = GetLatestRecordBoundarySegment(last_notified, flushed, &found);

    if (found)
    {
        SetLastNotifiedSegment(latest_boundary_seg - 1);
        for (seg = last_notified + 1; seg < latest_boundary_seg; seg++)
            XLogArchiveNotifySeg(seg, false);

        RemoveRecordBoundariesUpTo(latest_boundary_seg);

        PgArchWakeup();
    }
    LWLockRelease(ArchNotifyLock);

But we can release the lock earlier.

    LWLockAcquire(ArchNotifyLock, LW_EXCLUSIVE);
    last_notified = GetLastNotifiedSegment();
    latest_boundary_seg = GetLatestRecordBoundarySegment(last_notified, flushed, &found);

    if (found)
    {
        SetLastNotifiedSegment(latest_boundary_seg - 1);
        RemoveRecordBoundariesUpTo(latest_boundary_seg);
    }
    LWLockRelease(ArchNotifyLock);

    if (found)
    {
        for (seg = last_notified + 1; seg < latest_boundary_seg; seg++)
            XLogArchiveNotifySeg(seg, false);

        PgArchWakeup();
    }

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: [BUG] wrong refresh when ALTER SUBSCRIPTION ADD/DROP PUBLICATION
Next
From: "houzj.fnst@fujitsu.com"
Date:
Subject: RE: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)