Re: Possible missing segments in archiving on standby - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Possible missing segments in archiving on standby
Msg-id f045a90c-5971-aa6b-7e32-c9b4f64d91b8@oss.nttdata.com
In response to Re: Possible missing segments in archiving on standby  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Possible missing segments in archiving on standby
List pgsql-hackers

On 2021/09/03 14:56, Kyotaro Horiguchi wrote:
> +                if (readSource == XLOG_FROM_STREAM &&
> +                    record->xl_rmid == RM_XLOG_ID &&
> +                    (record->xl_info & ~XLR_INFO_MASK) == XLOG_SWITCH)
> 
> readSource is the source at the time startup reads it and it could be
> different from the source at the time the record was written. We
> cannot know where the record came from there, so does the readSource
> condition work as expected?  If we had some trouble streaming just
> before, readSource at the time is likely to be XLOG_FROM_PG_WAL.

Yes.


> +                        if (XLogArchivingAlways())
> +                            XLogArchiveNotify(xlogfilename, true);
> +                        else
> +                            XLogArchiveForceDone(xlogfilename);
> 
> The path is used both for crash and archive recovery. If we pass there
> while crash recovery on a primary with archive_mode=on, the file could
> be marked .done before actually archived. On the other hand when
> archive_mode=always, the file could be re-marked .ready even after it
> has been already archived.  Why isn't it XLogArchiveCheckDone?

Yeah, you're right. ISTM what we should do is just call
XLogArchiveCheckDone() for the segment that includes the XLOG_SWITCH record,
i.e., create a .ready file if the segment has no archive notification file yet
and archive_mode is set to 'always'. Even if we don't do this when we reach
the XLOG_SWITCH record, subsequent restartpoints will eventually call
XLogArchiveCheckDone() for such segments.

One issue with this approach is that the WAL segment that includes the
XLOG_SWITCH record may be archived before its previous segments are. That is,
the notification file of the current segment is created when it's replayed
because it includes XLOG_SWITCH, but the notification files of
its previous segments will be created only by subsequent restartpoints
because they don't include XLOG_SWITCH. Probably we should avoid this?

If yes, one approach to this issue is to call XLogArchiveCheckDone() for
not only the segment that includes XLOG_SWITCH but also all the segments
older than that. Thoughts?
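
To make the ordering concern concrete, here is a minimal toy model in C. It is
not PostgreSQL code: ToyArchive, toy_check_done() and toy_on_xlog_switch() are
hypothetical names, and toy_check_done() only imitates the behavior described
above (create .ready when the segment has no notification file yet and
archive_mode is 'always'). Segments are plain integers, and the oldest segment
to consider is passed in as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in PostgreSQL. */
enum { MAX_SEGS = 16 };

typedef enum { NOTIFY_NONE, NOTIFY_READY, NOTIFY_DONE } NotifyState;

typedef struct
{
    bool        archive_always;         /* stands in for archive_mode = always */
    NotifyState state[MAX_SEGS];        /* notification file per segment */
    int         ready_order[MAX_SEGS];  /* order in which .ready was created */
    size_t      n_ready;
} ToyArchive;

/*
 * Imitates the check described in the message: returns true if nothing
 * needs to be archived for this segment; otherwise makes sure a .ready
 * marker exists (creating it only if there is no marker yet) and
 * returns false.
 */
bool
toy_check_done(ToyArchive *a, int seg)
{
    assert(seg >= 0 && seg < MAX_SEGS);

    if (!a->archive_always)
        return true;            /* archiving not wanted in this mode */
    if (a->state[seg] == NOTIFY_DONE)
        return true;            /* already archived, never re-marked */
    if (a->state[seg] == NOTIFY_NONE)
    {
        a->state[seg] = NOTIFY_READY;
        a->ready_order[a->n_ready++] = seg;
    }
    return false;
}

/*
 * On replaying XLOG_SWITCH in switch_seg, check every segment from
 * oldest_seg up to and including switch_seg, oldest first, so that no
 * older segment gets its .ready marker later than switch_seg does.
 */
void
toy_on_xlog_switch(ToyArchive *a, int oldest_seg, int switch_seg)
{
    for (int seg = oldest_seg; seg <= switch_seg; seg++)
        (void) toy_check_done(a, seg);
}
```

In this model, a segment already marked .done is left alone, a segment with no
marker gets .ready, and markers are created in ascending segment order. How the
real code would find the oldest segment to check is left open here.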


Anyway, I extracted the walreceiver changes from the patch,
because they are self-contained and can be applied separately.
Patch attached.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
