Thread: Assert() failures during RI checks

Assert() failures during RI checks

From: Antonin Houska
I was trying to figure out what exactly the "crosscheck snapshot" does in the
RI checks, and hit some assertion failures:

postgres=# create table p(i int primary key);
CREATE TABLE
postgres=# create table f (i int references p on delete cascade on update cascade deferrable initially deferred);
CREATE TABLE
postgres=# insert into p values (1);
INSERT 0 1
postgres=# begin isolation level repeatable read;
BEGIN
postgres=*# table f;
 i
---
(0 rows)


In another session:

postgres=# insert into f values (1);
INSERT 0 1

Back in the first session:

postgres=*# delete from p where i=1;
TRAP: FailedAssertion("!(tp.t_data->t_infomask & HEAP_XMAX_INVALID)", File: "heapam.c", Line: 2652)

I'm not familiar enough with this code, but I wonder if it's only a matter of
incorrect assertions. When I commented some of them out, I got an error
message that makes sense to me:

postgres=*# delete from p where i=1;
2020-03-17 11:59:19.214 CET [89379] ERROR:  could not serialize access due to concurrent update
2020-03-17 11:59:19.214 CET [89379] CONTEXT:  SQL statement "DELETE FROM ONLY "public"."f" WHERE $1 OPERATOR(pg_catalog.=) "i""
2020-03-17 11:59:19.214 CET [89379] STATEMENT:  delete from p where i=1;
ERROR:  could not serialize access due to concurrent update
CONTEXT:  SQL statement "DELETE FROM ONLY "public"."f" WHERE $1 OPERATOR(pg_catalog.=) "i""


Similarly, if the test ends with an UPDATE statement, I get this failure:

postgres=*# update p set i=i+1 where i=1;
TRAP: FailedAssertion("!ItemPointerEquals(&oldtup.t_self, &oldtup.t_data->t_ctid)", File: "heapam.c", Line: 3275)

Likewise, with the Assert() statements commented out, the right thing seems to
happen:

2020-03-17 11:57:04.810 CET [88678] ERROR:  could not serialize access due to concurrent update
2020-03-17 11:57:04.810 CET [88678] CONTEXT:  SQL statement "UPDATE ONLY "public"."f" SET "i" = $1 WHERE $2 OPERATOR(pg_catalog.=) "i""
2020-03-17 11:57:04.810 CET [88678] STATEMENT:  update p set i=i+1 where i=1;
ERROR:  could not serialize access due to concurrent update
CONTEXT:  SQL statement "UPDATE ONLY "public"."f" SET "i" = $1 WHERE $2 OPERATOR(pg_catalog.=) "i""

--
Antonin Houska
Web: https://www.cybertec-postgresql.com



Re: Assert() failures during RI checks

From: Tom Lane
Antonin Houska <ah@cybertec.at> writes:
> I was trying to figure out what exactly the "crosscheck snapshot" does in the
> RI checks, and hit some assertion failures:

Yeah, your example reproduces for me.

> I'm not familiar enough with this code but I wonder if it's only about
> incorrect assertions.

Mmm ... not really.  I dug around in the history a bit, and there are
a few moving parts here.  The case that's problematic is where the
"crosscheck" stanza starting at heapam.c:2639 fires, and replaces
result = TM_Ok with result = TM_Updated.  That causes the code to
go into the next stanza, which is all about trying to follow a ctid
update link so it can report a valid TID to the caller along with
the failure result.  But in this case, the tuple at hand has never
been updated, so the assertions fire.
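
To be concrete, the sequence in question looks roughly like this
(paraphrasing heap_delete from memory, not quoting it exactly):

    if (crosscheck != InvalidSnapshot && result == TM_Ok)
    {
        /* Perform additional check for transaction-snapshot mode RI updates */
        if (!HeapTupleSatisfiesVisibility(&tp, crosscheck, buffer))
            result = TM_Updated;    /* not visible to the crosscheck snapshot */
    }

    if (result != TM_Ok)
    {
        /*
         * These assertions presume the tuple really was updated or deleted
         * by someone else, which isn't true when only the crosscheck failed.
         */
        Assert(result == TM_SelfModified ||
               result == TM_Updated ||
               result == TM_Deleted ||
               result == TM_BeingModified);
        Assert(!(tp.t_data->t_infomask & HEAP_XMAX_INVALID));
        Assert(result != TM_Updated ||
               !ItemPointerEquals(&tp.t_self, &tp.t_data->t_ctid));
        /* ... then hand the next-version TID and update xid back to the caller */
        tmfd->ctid = tp.t_data->t_ctid;
        tmfd->xmax = HeapTupleHeaderGetUpdateXid(tp.t_data);
        ...
    }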

The crosscheck stanza dates back to 55d85f42a, and at the time,
just setting result = TM_Updated was sufficient to make the
then-far-simpler next stanza do the right thing --- basically,
we wanted to just release whatever locks we hold and return the
failure result.  However, later patches added assertions to verify
that that next stanza was doing something sane ... and really, it
isn't in this case.

Another thing that is very peculiar in this area is that the initial
assertion in the second stanza allows the case of result == TM_Deleted.
It makes *no* sense to allow that if what we think we're doing is
chasing a ctid update link.  That alternative was not allowed until
really recently (Andres' commit 5db6df0c0, which appears otherwise
to be just refactoring), and I wonder how carefully it was really
thought through.

So I think there are two somewhat separable issues we have to address:

1. The crosscheck stanza can't get away with just setting result =
TM_Updated.  Maybe the best thing is for it not to try to share code
with what follows, but take responsibility for releasing locks and
returning consistent data to the caller on its own.  I'm not quite
sure what's the most consistent thing for it to return anyway, or
how much we care about what TID is returned in this case.  All we
really want is for an error to be raised, and I don't think the
error path is going to look too closely at the returned TID.

2. Does the following stanza actually need to allow for the case
of a deleted tuple as well as an updated one?  If so, blindly
trying to follow the ctid link is not so cool.

I think the situation in heap_update is just the same, though
it's unclear why the code is slightly different.  The extra
Assert in that crosscheck stanza certainly appears 100% redundant
with the later Assert, though both are wrong per this analysis.

In a non-assert build, this would just boil down to what TID
is returned along with the failure result code, so it might
be that there's no obvious bug, depending on what callers are
doing with that TID.  That would perhaps explain the lack of
associated field trouble reports.  These issues are pretty old
so you'd think we'd have heard about it if there were live bugs.

In any case, seems like we're missing some isolation test
coverage here ...

            regards, tom lane



Re: Assert() failures during RI checks

From: Andres Freund
Hi,

On 2020-03-22 15:00:51 -0400, Tom Lane wrote:
> Antonin Houska <ah@cybertec.at> writes:
> > I was trying to figure out what exactly the "crosscheck snapshot" does in the
> > RI checks, and hit some assertion failures:
> 
> Yeah, your example reproduces for me.
> 
> > I'm not familiar enough with this code but I wonder if it's only about
> > incorrect assertions.
> 
> Mmm ... not really.  I dug around in the history a bit, and there are
> a few moving parts here.  The case that's problematic is where the
> "crosscheck" stanza starting at heapam.c:2639 fires, and replaces
> result = TM_Ok with result = TM_Updated.  That causes the code to
> go into the next stanza, which is all about trying to follow a ctid
> update link so it can report a valid TID to the caller along with
> the failure result.  But in this case, the tuple at hand has never
> been updated, so the assertions fire.
> 
> The crosscheck stanza dates back to 55d85f42a, and at the time,
> just setting result = TM_Updated was sufficient to make the
> then-far-simpler next stanza do the right thing --- basically,
> we wanted to just release whatever locks we hold and return the
> failure result.  However, later patches added assertions to verify
> that that next stanza was doing something sane ... and really, it
> isn't in this case.
>
> Another thing that is very peculiar in this area is that the initial
> assertion in the second stanza allows the case of result == TM_Deleted.
> It makes *no* sense to allow that if what we think we're doing is
> chasing a ctid update link.  That alternative was not allowed until
> really recently (Andres' commit 5db6df0c0, which appears otherwise
> to be just refactoring), and I wonder how carefully it was really
> thought through.

Before that commit, HeapTupleUpdated represented both deletions and
updates. When splitting them, wherever it wasn't obvious that only one
of the two could apply, I opted to continue to allow both, since both
were allowed before. IOW, I think this case actually was allowed
previously?

In this case, isn't it clearly required to accept TM_Deleted? The HTSU
after l1: obviously can return that, no?
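
To spell that out (a rough paraphrase of heap_delete, not the exact code):

l1:
    result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);
    /*
     * If a concurrent, committed transaction deleted the tuple outright
     * (so there is no newer version to chase), this comes back as
     * TM_Deleted and eventually reaches the stanza with the assertions.
     */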


I think the problem with 5db6df0c0 here could be that it added
+        Assert(result != TM_Updated ||
+               !ItemPointerEquals(&tp.t_self, &tp.t_data->t_ctid));
as a new restriction. That's likely because TM_Updated should indicate
an update, and that should imply we have a ctid link - and that callers
checked this previously to detect whether an update happened.  But
that's not the case here, due to the way the crosscheck stanza currently
works.

E.g. in <12, callers like ExecDelete() checked things like
-                if (ItemPointerIndicatesMovedPartitions(&hufd.ctid))
-                    ereport(ERROR,
-                            (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
-                             errmsg("tuple to be deleted was already moved to another partition due to concurrent update")));
-
-                if (!ItemPointerEquals(tupleid, &hufd.ctid))
-                {

but that doesn't really work without exposing too much detail of the
AM to those callsites.

I wonder if we shouldn't just change the crosscheck case to set
something other than TM_Updated, as it's not really accurate to say the
tuple was updated. Either something like TM_Deleted, or perhaps a new
return code?  I think that's effectively how it was treated before the
split of HeapTupleUpdated into TM_Deleted and TM_Updated.


> So I think there are two somewhat separable issues we have to address:
> 
> 1. The crosscheck stanza can't get away with just setting result =
> TM_Updated.  Maybe the best thing is for it not to try to share code
> with what follows, but take responsibility for releasing locks and
> returning consistent data to the caller on its own.  I'm not quite
> sure what's the most consistent thing for it to return anyway, or
> how much we care about what TID is returned in this case.  All we
> really want is for an error to be raised, and I don't think the
> error path is going to look too closely at the returned TID.

If we want to go for that, perhaps the easiest way to handle this is to
have an error path at the end of heap_delete that we reach with two
gotos. Like

delete_not_ok:
        Assert(!(tp.t_data->t_infomask & HEAP_XMAX_INVALID));
        Assert(result != TM_Updated ||
               !ItemPointerEquals(&tp.t_self, &tp.t_data->t_ctid));
delete_not_ok_crosscheck:
        Assert(result == TM_SelfModified ||
               result == TM_Updated ||
               result == TM_Deleted ||
               result == TM_BeingModified);
...




> 2. Does the following stanza actually need to allow for the case
> of a deleted tuple as well as an updated one?  If so, blindly
> trying to follow the ctid link is not so cool.

Which ctid link following are you referring to? I'm not sure I quite
follow.


> I think the situation in heap_update is just the same, though
> it's unclear why the code is slightly different.  The extra
> Assert in that crosscheck stanza certainly appears 100% redundant
> with the later Assert, though both are wrong per this analysis.

I assume Haribabu, Alexander (who I unfortunately forgot to credit in
the commit message) or I added it because it's useful to differentiate
whether the assert is raised because of the crosscheck, or the "general"
code paths.   I assume it was added because calling code did check for
ctid difference to understand the difference between HeapTupleUpdated
implying a delete or an update.  But for the crosscheck case (most of?)
those wouldn't ever reach the ctid equivalency check, because of
+                    if (IsolationUsesXactSnapshot())
+                        ereport(ERROR,
+                                (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+                                 errmsg("could not serialize access due to concurrent update")));
like checks beforehand.


> In a non-assert build, this would just boil down to what TID
> is returned along with the failure result code, so it might
> be that there's no obvious bug, depending on what callers are
> doing with that TID.  That would perhaps explain the lack of
> associated field trouble reports.  These issues are pretty old
> so you'd think we'd have heard about it if there were live bugs.

I think the if (IsolationUsesXactSnapshot()) ereport(ERROR, ...)
type checks should prevent the ctid from being relevant - unless we
somehow end up with a crosscheck snapshot without
IsolationUsesXactSnapshot() being true.


> In any case, seems like we're missing some isolation test
> coverage here ...

Indeed. I added a good number of additional EPQ tests, but they're
either largely or exclusively for READ COMMITTED...

Greetings,

Andres Freund



Re: Assert() failures during RI checks

From: Tom Lane
Andres Freund <andres@anarazel.de> writes:
> On 2020-03-22 15:00:51 -0400, Tom Lane wrote:
>> Another thing that is very peculiar in this area is that the initial
>> assertion in the second stanza allows the case of result == TM_Deleted.

> In this case, isn't it clearly required to accept TM_Deleted? The HTSU
> after l1: obviously can return that, no?

Well, that's the question.  If it is required, then we need to fix the
code in that stanza to not be trying to chase a bogus next-tid link.

> I wonder if we shouldn't just change the crosscheck case to set
> something other than TM_Updated, as it's not really accurate to say the
> tuple was updated.

Yeah, I was wondering about giving that a new result code, too.
It would be a little bit invasive and not at all back-patchable,
but (say) TM_SerializationViolation seems like a cleaner output ---
and we could define it to not return any TID info, as TM_Deleted
doesn't.  But we also need a fix that *can* be back-patched.

>> 2. Does the following stanza actually need to allow for the case
>> of a deleted tuple as well as an updated one?  If so, blindly
>> trying to follow the ctid link is not so cool.

> Which ctid link following are you referring to? I'm not sure I quite
> follow.

The stanza right after the crosscheck one has always assumed that
it is dealing with an outdated-by-update tuple, so that chasing up
to the next TID is a sane thing to do.  Maybe that's been wrong
since day one, or maybe we made it wrong somewhere along the line,
but it seems clearly wrong now (even without accounting for the
serialization crosscheck case).  I think it just accidentally
doesn't cause problems in non-assert builds, as long as nobody
is assuming too much about which TID or XID gets returned.

> But for the crosscheck case (most of?)
> those wouldn't ever reach the ctid equivalency check, because of
> +                    if (IsolationUsesXactSnapshot())
> +                        ereport(ERROR,
> +                                (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
> +                                 errmsg("could not serialize access due to concurrent update")));
> like checks beforehand.

Yeah.  For HEAD I'm imagining that callers would just throw
ERRCODE_T_R_SERIALIZATION_FAILURE unconditionally if they
get back TM_SerializationViolation.
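
That is, something roughly like this in the callers' result switch (just a
sketch, since TM_SerializationViolation doesn't exist yet):

    case TM_SerializationViolation:
        /* crosscheck snapshot failed; no TID or xmax info to hand back */
        ereport(ERROR,
                (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
                 errmsg("could not serialize access due to concurrent update")));
        break;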

            regards, tom lane



Re: Assert() failures during RI checks

From: Andres Freund
Hi,

On 2020-03-22 18:30:04 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I wonder if we shouldn't just change the crosscheck case to set
> > something other than TM_Updated, as it's not really accurate to say the
> > tuple was updated.
>
> Yeah, I was wondering about giving that a new result code, too.
> It would be a little bit invasive and not at all back-patchable,
> but (say) TM_SerializationViolation seems like a cleaner output ---
> and we could define it to not return any TID info, as TM_Delete
> doesn't.  But we also need a fix that *can* be back-patched.

I wonder if returning TM_Invisible in the backbranches would be the
right answer. "The affected tuple wasn't visible to the relevant
snapshot" is not the worst description for a crosscheck snapshot
violation.

We would have to start issuing a better error message for it, as it's
unexpected in most places right now. It'd be kind of odd for
heap_delete/update() to error out with "attempted to delete invisible
tuple" but still return it in some cases, though.


> >> 2. Does the following stanza actually need to allow for the case
> >> of a deleted tuple as well as an updated one?  If so, blindly
> >> trying to follow the ctid link is not so cool.
>
> > Which ctid link following are you referring to? I'm not sure I quite
> > follow.
>
> The stanza right after the crosscheck one has always assumed that
> it is dealing with an outdated-by-update tuple, so that chasing up
> to the next TID is a sane thing to do.

Do you just mean the
        tmfd->ctid = tp.t_data->t_ctid;
and
        tmfd->xmax = HeapTupleHeaderGetUpdateXid(tp.t_data);
? I would not have described that as chasing, which would explain my
difficulty following.


> Maybe that's been wrong
> since day one, or maybe we made it wrong somewhere along the line,
> but it seems clearly wrong now (even without accounting for the
> serialization crosscheck case).  I think it just accidentally
> doesn't cause problems in non-assert builds, as long as nobody
> is assuming too much about which TID or XID gets returned.

It'd probably be worthwhile to rejigger those branches to something
roughly like:

    if (result != TM_Ok)
    {
        switch (result)
        {
            case TM_SelfModified:
                Assert(!ItemPointerEquals(&tp.t_self, &tp.t_data->t_ctid));
                tmfd->ctid = tp.t_data->t_ctid;
                tmfd->xmax = HeapTupleHeaderGetUpdateXid(tp.t_data);
                tmfd->cmax = HeapTupleHeaderGetCmax(tp.t_data);
                break;
            case TM_Updated:
                Assert(!ItemPointerEquals(&tp.t_self, &tp.t_data->t_ctid));
                tmfd->ctid = tp.t_data->t_ctid;
                tmfd->xmax = HeapTupleHeaderGetUpdateXid(tp.t_data);
                tmfd->cmax = InvalidCommandId;
                break;
            case TM_Deleted:
                tmfd->ctid = tp.t_self;
                tmfd->xmax = HeapTupleHeaderGetUpdateXid(tp.t_data);
                tmfd->cmax = InvalidCommandId;
                break;
            case TM_BeingModified:
                Assert(!ItemPointerEquals(&tp.t_self, &tp.t_data->t_ctid));
                tmfd->ctid = tp.t_data->t_ctid;
                tmfd->xmax = HeapTupleHeaderGetUpdateXid(tp.t_data);
                tmfd->cmax = InvalidCommandId;
                break;
            default:
                elog(ERROR, "unexpected visibility %d", result);
        }

        UnlockReleaseBuffer(buffer);
        if (have_tuple_lock)
            UnlockTupleTuplock(relation, &(tp.t_self), LockTupleExclusive);
        if (vmbuffer != InvalidBuffer)
            ReleaseBuffer(vmbuffer);
        return result;
    }


> > But for the crosscheck case (most of?)
> > those wouldn't ever reach the ctid equivalency check, because of
> > +                    if (IsolationUsesXactSnapshot())
> > +                        ereport(ERROR,
> > +                                (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
> > +                                 errmsg("could not serialize access due to concurrent update")));
> > like checks beforehand.
>
> Yeah.  For HEAD I'm imagining that callers would just throw
> ERRCODE_T_R_SERIALIZATION_FAILURE unconditionally if they
> get back TM_SerializationViolation.

I am wondering if the correct HEAD answer would be to somehow lift the
crosscheck logic out of heapam.c. ISTM that heapam.c is kind of too low
level for that type of concern. But it seems hard to do that without
performing unnecessary updates that immediately are thrown away by
throwing an error at a higher level.

Greetings,

Andres Freund



Re: Assert() failures during RI checks

From: Tom Lane
Andres Freund <andres@anarazel.de> writes:
> On 2020-03-22 18:30:04 -0400, Tom Lane wrote:
>> Yeah, I was wondering about giving that a new result code, too.
>> It would be a little bit invasive and not at all back-patchable,
>> but (say) TM_SerializationViolation seems like a cleaner output ---
>> and we could define it to not return any TID info, as TM_Delete
>> doesn't.  But we also need a fix that *can* be back-patched.

> I wonder if returning TM_Invisible in the backbranches would be the
> right answer. "The affected tuple wasn't visible to the relevant
> snapshot" is not the worst description for a crosscheck snapshot
> violation.

I'm not for that.  Abusing an existing result code is what got us
into this mess in the first place; abusing a different result code
than what we did in prior minor releases can only make things worse
not better.  (Of course, if there's no external callers of these
functions then we could get away with changing it ... but I'm not
prepared to assume that, are you?)

Also it'd mean something quite different in the direct output of
HeapTupleSatisfiesUpdate than what it would mean one level up,
which is certain to cause confusion.

I think the right thing in the back branches is to keep on returning
the "Updated" result code, but adjust the crosscheck code path so that
the TID that's sent back is always the tuple's own TID.

>> The stanza right after the crosscheck one has always assumed that
>> it is dealing with an outdated-by-update tuple, so that chasing up
>> to the next TID is a sane thing to do.

> Do you just mean the
>         tmfd->ctid = tp.t_data->t_ctid;
> and
>         tmfd->xmax = HeapTupleHeaderGetUpdateXid(tp.t_data);
> ? I would not have described that as chasing, which would explain my
> difficulty following.

The bit about returning t_ctid rather than the tuple's own TID seems
like chasing a next-tid link to me.  The Assert about xmax being valid
is certainly intended to verify that that's what is happening.  I can't
speak to what the code thinks it's returning in tmfd->xmax, because
that functionality wasn't there the last time I studied this.

> I am wondering if the correct HEAD answer would be to somehow lift the
> crosscheck logic out of heapam.c. ISTM that heapam.c is kind of too low
> level for that type of concern. But it seems hard to do that without
> performing unnecessary updates that immediately are thrown away by
> throwing an error at a higher level.

Yeah, I'd be the first to agree that the crosscheck stuff is messy.
But it's hard to see how to do better.

            regards, tom lane



Re: Assert() failures during RI checks

From: Andres Freund
Hi,

On 2020-03-23 13:54:31 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2020-03-22 18:30:04 -0400, Tom Lane wrote:
> >> Yeah, I was wondering about giving that a new result code, too.
> >> It would be a little bit invasive and not at all back-patchable,
> >> but (say) TM_SerializationViolation seems like a cleaner output ---
> >> and we could define it to not return any TID info, as TM_Delete
> >> doesn't.  But we also need a fix that *can* be back-patched.
> 
> > I wonder if returning TM_Invisible in the backbranches would be the
> > right answer. "The affected tuple wasn't visible to the relevant
> > snapshot" is not the worst description for a crosscheck snapshot
> > violation.
> 
> I'm not for that.  Abusing an existing result code is what got us
> into this mess in the first place; abusing a different result code
> than what we did in prior minor releases can only make things worse
> not better.

Well, I think TM_Invisible is much less error prone - callers wouldn't
try to chase ctid chains or such. There's really not much they can do
except to error out. It also seems semantically to be a better match than
TM_Updated.


> Of course, if there's no external callers of these
> functions then we could get away with changing it ... but I'm not
> prepared to assume that, are you?)

I'm certain we can't assume that.


> I think the right thing in the back branches is to keep on returning
> the "Updated" result code, but adjust the crosscheck code path so that
> the TID that's sent back is always the tuple's own TID.

If we do that, and set xmax to InvalidTransactionId, I think we'd likely
prevent any "wrong" chaining from outside heapam, since that already
needed to check xmax to be correct.
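
Roughly (a sketch against heap_delete's crosscheck stanza, not a finished
patch):

    if (crosscheck != InvalidSnapshot && result == TM_Ok)
    {
        /* Perform additional check for transaction-snapshot mode RI updates */
        if (!HeapTupleSatisfiesVisibility(&tp, crosscheck, buffer))
        {
            result = TM_Updated;
            /* report the tuple's own TID and no update xid/cmax, so callers
             * have nothing to chase */
            tmfd->ctid = tp.t_self;
            tmfd->xmax = InvalidTransactionId;
            tmfd->cmax = InvalidCommandId;
            /* release locks and return here, bypassing the assertions below */
            ...
        }
    }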


> >> The stanza right after the crosscheck one has always assumed that
> >> it is dealing with an outdated-by-update tuple, so that chasing up
> >> to the next TID is a sane thing to do.
> 
> > Do you just mean the
> >         tmfd->ctid = tp.t_data->t_ctid;
> > and
> >         tmfd->xmax = HeapTupleHeaderGetUpdateXid(tp.t_data);
> > ? I would not have described that as chasing, which would explain my
> > difficulty following.
> 
> The bit about returning t_ctid rather than the tuple's own TID seems
> like chasing a next-tid link to me.  The Assert about xmax being valid
> is certainly intended to verify that that's what is happening.  I can't
> speak to what the code thinks it's returning in tmfd->xmax, because
> that functionality wasn't there the last time I studied this.

We could probably get rid of tmfd->xmax. It kind of was introduced as
part of

commit 6868ed7491b7ea7f0af6133bb66566a2f5fe5a75
Author: Kevin Grittner <kgrittn@postgresql.org>
Date:   2012-10-26 14:55:36 -0500

    Throw error if expiring tuple is again updated or deleted.

It did previously exist as an explicit heap_delete/heap_update
parameter, however - and has for a long time:

commit f57e3f4cf36f3fdd89cae8d566479ad747809b2f
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   2005-08-20 00:40:32 +0000

    Repair problems with VACUUM destroying t_ctid chains too soon, and with
    insufficient paranoia in code that follows t_ctid links.  (We must do both


Since Kevin's commit, the relevant logic has been pushed down into
the AM layer as part of tableam. It is still used in one place
internally inside heap AM (basically where the check from Kevin's
commit has been moved), but that looks trivial to solve differently.


> > I am wondering if the correct HEAD answer would be to somehow lift the
> > crosscheck logic out of heapam.c. ISTM that heapam.c is kind of too low
> > level for that type of concern. But it seems hard to do that without
> > performing unnecessary updates that immediately are thrown away by
> > throwing an error at a higher level.
> 
> Yeah, I'd be the first to agree that the crosscheck stuff is messy.
> But it's hard to see how to do better.

Obviously not a short-term solution at all: I wonder if some of this
stems from implementing RI triggers partially in SQL, even though we
require semantics that aren't efficiently doable in SQL, with the
consequence of having RI-specific code in various parts of the executor
and AMs, and of needing a lot more memory, etc.

Greetings,

Andres Freund