Re: Assert() failures during RI checks - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Assert() failures during RI checks
Date
Msg-id 20200322211828.gqcwl7gspce764u6@alap3.anarazel.de
Whole thread Raw
In response to Re: Assert() failures during RI checks  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Assert() failures during RI checks  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2020-03-22 15:00:51 -0400, Tom Lane wrote:
> Antonin Houska <ah@cybertec.at> writes:
> > I was trying to figure out what exactly the "crosscheck snapshot" does in the
> > RI checks, and hit some assertion failures:
> 
> Yeah, your example reproduces for me.
> 
> > I'm not familiar enough with this code but I wonder if it's only about
> > incorrect assertions.
> 
> Mmm ... not really.  I dug around in the history a bit, and there are
> a few moving parts here.  The case that's problematic is where the
> "crosscheck" stanza starting at heapam.c:2639 fires, and replaces
> result = TM_Ok with result = TM_Updated.  That causes the code to
> go into the next stanza, which is all about trying to follow a ctid
> update link so it can report a valid TID to the caller along with
> the failure result.  But in this case, the tuple at hand has never
> been updated, so the assertions fire.
> 
> The crosscheck stanza dates back to 55d85f42a, and at the time,
> just setting result = TM_Updated was sufficient to make the
> then-far-simpler next stanza do the right thing --- basically,
> we wanted to just release whatever locks we hold and return the
> failure result.  However, later patches added assertions to verify
> that that next stanza was doing something sane ... and really, it
> isn't in this case.
>
> Another thing that is very peculiar in this area is that the initial
> assertion in the second stanza allows the case of result == TM_Deleted.
> It makes *no* sense to allow that if what we think we're doing is
> chasing a ctid update link.  That alternative was not allowed until
> really recently (Andres' commit 5db6df0c0, which appears otherwise
> to be just refactoring), and I wonder how carefully it was really
> thought through.

Before that commit HeapTupleUpdated represented both deletions and
updates. When splitting them, and where it wasn't obvious only one of
the two could apply, I opted to continue to allow both. After all
previously both were allowed too. IOW, I think it was actually
previously allowed?

In this case, isn't it clearly required to accept TM_Deleted? The HTSU
after l1: obviously can return that, no?


I think what the problem with 5db6df0c0 here could be is that it added
+        Assert(result != TM_Updated ||
+               !ItemPointerEquals(&tp.t_self, &tp.t_data->t_ctid));
as a new restriction. That's likely because TM_Updated should indicate
an update, and that should imply we have a ctid link - and that callers
checked this previously to detect whether an update happened.  But
that's not the case here, due to the way the crosscheck stanza currently
works.

E.g. in <12, callers like ExecDelete() checked things like
-                if (ItemPointerIndicatesMovedPartitions(&hufd.ctid))
-                    ereport(ERROR,
-                            (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
-                             errmsg("tuple to be deleted was already moved to another partition due to concurrent
update")));
-
-                if (!ItemPointerEquals(tupleid, &hufd.ctid))
-                {

but that doesn't really work without exposing too much detail of the
AM to those callsites.

I wonder if we shouldn't just change the crosscheck case to set
something other than TM_Updated, as it's not really accurate to say the
tuple was updated. Either something like TM_Deleted, or perhaps a new
return code?  I think that's effectively how it was treated before the
split of HeapTupleUpdated into TM_Deleted and TM_Updated.


> So I think there are two somewhat separable issues we have to address:
> 
> 1. The crosscheck stanza can't get away with just setting result =
> TM_Updated.  Maybe the best thing is for it not to try to share code
> with what follows, but take responsibility for releasing locks and
> returning consistent data to the caller on its own.  I'm not quite
> sure what's the most consistent thing for it to return anyway, or
> how much we care about what TID is returned in this case.  All we
> really want is for an error to be raised, and I don't think the
> error path is going to look too closely at the returned TID.

If we want to go for that, perhaps the easiest way to handle this is to
have an error path at the end of heap_delete that we reach with two
gotos. Like

delete_not_ok:
        Assert(!(tp.t_data->t_infomask & HEAP_XMAX_INVALID));
        Assert(result != TM_Updated ||
               !ItemPointerEquals(&tp.t_self, &tp.t_data->t_ctid));
delete_no_ok_crosscheck:
        Assert(result == TM_SelfModified ||
               result == TM_Updated ||
               result == TM_Deleted ||
               result == TM_BeingModified);
...




> 2. Does the following stanza actually need to allow for the case
> of a deleted tuple as well as an updated one?  If so, blindly
> trying to follow the ctid link is not so cool.

Which ctid link following are you referring to? I'm not sure I quite
follow.


> I think the situation in heap_update is just the same, though
> it's unclear why the code is slightly different.  The extra
> Assert in that crosscheck stanza certainly appears 100% redundant
> with the later Assert, though both are wrong per this analysis.

I assume Haribabu, Alexander (who I unfortunately forgot to credit in
the commit message) or I added it because it's useful to differentiate
whether the assert is raised because of the crosscheck, or the "general"
code paths.   I assume it was added because calling code did check for
ctid difference to understand the difference between HeapTupleUpdated
implying a delete or an update.  But for the crosscheck case (most of?)
those wouldn't ever reach the ctid equivalency check, because of
+                    if (IsolationUsesXactSnapshot())
+                        ereport(ERROR,
+                                (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+                                 errmsg("could not serialize access due to concurrent update")));
like checks beforehand.


> In a non-assert build, this would just boil down to what TID
> is returned along with the failure result code, so it might
> be that there's no obvious bug, depending on what callers are
> doing with that TID.  That would perhaps explain the lack of
> associated field trouble reports.  These issues are pretty old
> so you'd think we'd have heard about it if there were live bugs.

I think the if (IsolationUsesXactSnapshot()) ereport(ERROR, ...)
type checks hould prevent the ctid from being relevant - unless we
somehow end up with a crosscheck snapshot without
IsolationUsesXactSnapshot() being true.


> In any case, seems like we're missing some isolation test
> coverage here ...

Indeed. I added a good number of additional EPQ tests, but they're
either largely or exclusively for READ COMMITTED...

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] [PATCH] Generic type subscripting
Next
From: James Coleman
Date:
Subject: Re: optimisation? collation "C" sorting for GroupAggregate for alldeterministic collations