Re: Assert() failures during RI checks - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Assert() failures during RI checks
Date
Msg-id 4963.1584916204@sss.pgh.pa.us
Whole thread Raw
In response to Re: Assert() failures during RI checks  (Andres Freund <andres@anarazel.de>)
Responses Re: Assert() failures during RI checks  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2020-03-22 15:00:51 -0400, Tom Lane wrote:
>> Another thing that is very peculiar in this area is that the initial
>> assertion in the second stanza allows the case of result == TM_Deleted.

> In this case, isn't it clearly required to accept TM_Deleted? The HTSU
> after l1: obviously can return that, no?

Well, that's the question.  If it is required, then we need to fix the
code in that stanza to not be trying to chase a bogus next-tid link.

> I wonder if we shouldn't just change the crosscheck case to set
> something other than TM_Updated, as it's not really accurate to say the
> tuple was updated.

Yeah, I was wondering about giving that a new result code, too.
It would be a little bit invasive and not at all back-patchable,
but (say) TM_SerializationViolation seems like a cleaner output ---
and we could define it to not return any TID info, as TM_Delete
doesn't.  But we also need a fix that *can* be back-patched.

>> 2. Does the following stanza actually need to allow for the case
>> of a deleted tuple as well as an updated one?  If so, blindly
>> trying to follow the ctid link is not so cool.

> Which ctid link following are you referring to? I'm not sure I quite
> follow.

The stanza right after the crosscheck one has always assumed that
it is dealing with an outdated-by-update tuple, so that chasing up
to the next TID is a sane thing to do.  Maybe that's been wrong
since day one, or maybe we made it wrong somewhere along the line,
but it seems clearly wrong now (even without accounting for the
serialization crosscheck case).  I think it just accidentally
doesn't cause problems in non-assert builds, as long as nobody
is assuming too much about which TID or XID gets returned.

> But for the crosscheck case (most of?)
> those wouldn't ever reach the ctid equivalency check, because of
> +                    if (IsolationUsesXactSnapshot())
> +                        ereport(ERROR,
> +                                (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
> +                                 errmsg("could not serialize access due to concurrent update")));
> like checks beforehand.

Yeah.  For HEAD I'm imagining that callers would just throw
ERRCODE_T_R_SERIALIZATION_FAILURE unconditionally if they
get back TM_SerializationViolation.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: Index Skip Scan
Next
From: Kirill Bychik
Date:
Subject: Re: WAL usage calculation patch