Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)
Date
Msg-id 1533665.1649962046@sss.pgh.pa.us
Whole thread Raw
In response to Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2022-04-14 12:16:45 -0400, Robert Haas wrote:
>> I got curious and looked at the underlying problem here and I am
>> wondering whether HaveRegisteredOrActiveSnapshot() is just buggy. It
>> seems to me that the code is always going to return true if there are
>> any active snapshots, and the rest of the function is intended to test
>> whether there is a registered snapshot other than the catalog
>> snapshot. But I don't think that's what this code does:
>>
>> if (pairingheap_is_empty(&RegisteredSnapshots) ||
>>     !pairingheap_is_singular(&RegisteredSnapshots))
>>      return false;
>>
>> return CatalogSnapshot == NULL;

> Certainly looks off...

Yeah, this is broken.  Whilst waiting around for a build on wrasse's
host, I reproduced the problem shown in this thread, and here's what
I see at the point of the exception:

(gdb) p RegisteredSnapshots
$5 = {ph_compare = 0x9a6000 <xmin_cmp>, ph_arg = 0x0,
  ph_root = 0xec3168 <CatalogSnapshotData+72>}
(gdb) p *RegisteredSnapshots.ph_root
$6 = {first_child = 0x2d85d70, next_sibling = 0x0, prev_or_parent = 0x0}
(gdb) p CatalogSnapshotData
$7 = {snapshot_type = SNAPSHOT_MVCC, xmin = 52155, xmax = 52155,
  xip = 0x2d855b0, xcnt = 0, subxip = 0x2de9130, subxcnt = 0,
  suboverflowed = false, takenDuringRecovery = false, copied = false,
  curcid = 0, speculativeToken = 0, vistest = 0x0, active_count = 0,
  regd_count = 0, ph_node = {first_child = 0x2d85d70, next_sibling = 0x0,
    prev_or_parent = 0x0}, whenTaken = 0, lsn = 0, snapXactCompletionCount = 1}
(gdb) p CatalogSnapshot
$8 = (Snapshot) 0xec3120 <CatalogSnapshotData>
(gdb) p *(Snapshot) (0x2d85d70-72)
$9 = {snapshot_type = SNAPSHOT_MVCC, xmin = 52155, xmax = 52155, xip = 0x0,
  xcnt = 0, subxip = 0x0, subxcnt = 0, suboverflowed = false,
  takenDuringRecovery = false, copied = true, curcid = 0,
  speculativeToken = 0, vistest = 0x0, active_count = 0, regd_count = 2,
  ph_node = {first_child = 0x0, next_sibling = 0x0,
    prev_or_parent = 0xec3168 <CatalogSnapshotData+72>}, whenTaken = 0,
  lsn = 0, snapXactCompletionCount = 0}
(gdb) p ActiveSnapshot
$10 = (ActiveSnapshotElt *) 0x0

So in fact there IS another registered snapshot, and
HaveRegisteredOrActiveSnapshot is just lying.  I think the
correct test is more nearly what we have in
InvalidateCatalogSnapshotConditionally:

    if (CatalogSnapshot &&
        ActiveSnapshot == NULL &&
        pairingheap_is_singular(&RegisteredSnapshots))
            // then the CatalogSnapshot is the only one.

Ergo, this actually is a bug in 277692220.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)
Next
From: Tom Lane
Date:
Subject: Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)