Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding
Date
Msg-id 2271.1584655492@sss.pgh.pa.us
Whole thread Raw
In response to Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding  (Noah Misch <noah@leadboat.com>)
Responses Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
Noah Misch <noah@leadboat.com> writes:
> On Sun, Jan 05, 2020 at 01:33:55AM +0100, Tomas Vondra wrote:
>> It's a bit unfortunate that a patch for a data corruption / loss issue
>> (even if a low-probability one) fell through multiple commitfests.

> Thanks for investing in steps to fix that.

Yeah, this patch has been waiting in the queue for way too long :-(.

I spent some time studying this, and I have a few comments/questions:

1. It seems like this discussion is conflating two different issues.
The original complaint was about a seemingly-bogus log message "could
not truncate directory "pg_xact": apparent wraparound".  Your theory
about that, IIUC, is that SimpleLruTruncate's initial round-back of
cutoffPage to a segment boundary moved us from a state where
shared->latest_page_number doesn't logically precede the cutoffPage to
one where it does, triggering the message.  So removing the initial
round-back, and then doing whatever's needful to compensate for that in
the later processing, is a reasonable fix to prevent the bogus warning.
However, you're also discussing whether or not an SLRU segment file that
is close to the wraparound boundary should get removed or not.  As far
as I can see that's 100% independent of issuance of the log message, no?
This might not affect the code substance of the patch at all; but it
seems like we need to be clear about it in our discussion, and maybe the
comments need to change too.

2. I wonder whether we have an issue even with rounding back to the
SLRU page boundary, as is done by each caller before we ever get to
SimpleLruTruncate.  I'm pretty sure that the actual anti-wraparound
protections are all precise to the XID level, so that there's some
daylight between what SimpleLruTruncate is told and the range of
data that the higher-level code thinks is of interest.  Maybe we
should restructure things so that we keep the original cutoff XID
(or equivalent) all the way through, and compare the start/end
positions of a segment file directly to that.

3. It feels like the proposed test of cutoff position against both
ends of a segment is a roundabout way of fixing the problem.  I
wonder whether we ought not pass *both* the cutoff and the current
endpoint (latest_page_number) down to the truncation logic, and
have it compare against both of those values.

To try to clarify this in my head, I thought about an image of
the modular XID space as an octagon, where each side would correspond to
a segment file if we chose numbers such that there were only 8 possible
segment files.  Let's say that nextXID is somewhere in the bottommost
octagon segment.  The oldest possible value for the truncation cutoff
XID is a bit less than "halfway around" from nextXID; so it could be
in the topmost octagon segment, if the minimum permitted daylight-
till-wraparound is less than the SLRU segment size (which it is).
Then, if we round the cutoff XID "back" to a segment boundary, most
of the current (bottommost) segment is now less than halfway around
from that cutoff point, and in particular the current segment's
starting page is exactly halfway around.  Because of the way that
TransactionIdPrecedes works, the current segment will be considered to
precede that cutoff point (the int32 difference comes out as exactly
2^31 which is negative).  Boom, data loss, because we'll decide the
current segment is removable.

I think that your proposed patch does fix this, but I'm not quite
sold that the corner cases (where the cutoff XID is itself exactly
at a page boundary) are right.  In any case, I think it'd be more
robust to be comparing explicitly against a notion of the latest
in-use page number, instead of backing into it from an assumption
that the cutoff XID itself is less than halfway around.

I wonder if we ought to dodge the problem by having a higher minimum
value of the required daylight-before-wraparound, so that the cutoff
point couldn't be in the diametrically-opposite-current segment but
would have to be at least one segment before that.  In the end,
I believe that all of this logic was written under an assumption
that we should never get into a situation where we are so close
to the wraparound threshold that considerations like these would
manifest.  Maybe we can get it right, but I don't have huge
faith in it.

It also bothers me that some of the callers of SimpleLruTruncate
have explicit one-count backoffs of the cutoff point and others
do not.  There's no obvious reason for the difference, so I wonder
if that isn't something we should have across-the-board, or else
adjust SimpleLruTruncate to do the equivalent thing internally.

I haven't thought much yet about your second point about race
conditions arising from nextXID possibly moving before we
finish the deletion scan.  Maybe we could integrate a fix for
that issue, along the lines of (1) see an SLRU segment file,
(2) determine that it appears to precede the cutoff XID, if so
(3) acquire the control lock and fetch latest_page_number,
compare against that to verify that the segment file is old
and not new, then (4) unlink if that still holds.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: Improve errors when setting incorrect bounds for SSL protocols
Next
From: Daniel Gustafsson
Date:
Subject: Re: Add PostgreSQL home page to --help output