Re: Possible missing segments in archiving on standby - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: Possible missing segments in archiving on standby |
Date | |
Msg-id | 20210908.104526.1076005617766473945.horikyota.ntt@gmail.com Whole thread Raw |
In response to | Re: Possible missing segments in archiving on standby (Fujii Masao <masao.fujii@oss.nttdata.com>) |
Responses |
Re: Possible missing segments in archiving on standby
|
List | pgsql-hackers |
At Tue, 7 Sep 2021 17:03:06 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > + if (XLogArchivingAlways()) > > + XLogArchiveNotify(xlogfilename, true); > > + else > > + XLogArchiveForceDone(xlogfilename); > > The path is used both for crash and archive recovery. If we pass there > > while crash recovery on a primary with archive_mode=on, the file could > > be marked .done before actually archived. On the other hand when > > archive_mode=always, the file could be re-marked .ready even after it > > has been already archived. Why isn't it XLogArchiveCheckDone? > > Yeah, you're right. ISTM what we should do is to just call > XLogArchiveCheckDone() for the segment including XLOG_SWITCH record, > i.e., to create .ready file if the segment has no archive notification > file yet > and archive_mode is set to 'always'. Even if we don't do this when we > reach > XLOG_SWITCH record, subsequent restartpoints eventually will call > XLogArchiveCheckDone() for such segments. > > One issue of this approach is that the WAL segment including > XLOG_SWITCH > record may be archived before its previous segments are. That is, > the notification file of current segment is created when it's replayed > because it includes XLOG_SWIATCH, but the notification files of > its previous segments will be created by subsequent restartpoints > because they don't have XLOG_SWITCH. Probably we should avoid this? Anyway there's no guarantee on the archive ordering. As discussed in the nearby thread [1], newer segment is often archived earlier. I agree that that happens mainly on busy servers, though. The archiver is designed to handle such "gaps" and/or out-of-order segment notifications. We could impose a strict ordering on archiving but I think we would take total performance than such strictness. In short, no. > If yes, one approach for this issue is to call XLogArchiveCheckDone() > for > not only the segment including XLOG_SWITCH but also all the segments > older than that. Thought? At least currently, recovery fetches segments by a single process and every file is marked immediately after being filled-up, so all files other than the latest one in pg_wal including history files should have been marked for sure unless file system gets into a trouble. So I think we don't need to do that even if we want the strictness. Addition to that that takes too long time when many segments reside in pg_wal so I think we never want to run such a scan at every segment end that recovery passes. If I remember correctly, the reason we don't fix archive status at start up but at checkpoint is we avoided extra startup time. > Anyway, I extracted the changes in walreceiver from the patch, > because it's self-contained and can be applied separately. > Patch attached. I'm not sure I like that XLogWalRcvClose hides the segment-switch condition. If we do that check in the function, I'd like to name the function XLogWalRcvCloseIfSwitched or something indicates the condition. I'd like to invert the condition to reduce indentation, too. Why don't we call it just after writing data, as my first proposal did? There's no difference in functionality between doing that and the patch. If we do so, recvFile>=0 is always true and that condition can be removed but that would be optional. Anyway, by doing that, no longer need to call the function twice or we can even live without the new function. [1] https://www.postgresql.org/message-id/20210504042755.ehuaoz5blcnjw5yk%40alap3.anarazel.de regards. -- Kyotaro Horiguchi NTT Open Source Software Center
pgsql-hackers by date: