Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403) - Mailing list pgsql-hackers

From Andres Freund
Subject Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)
Date
Msg-id 20220414165837.bg22dkdzq4krff56@alap3.anarazel.de
Whole thread Raw
In response to Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Hi,

On 2022-04-14 09:54:25 -0700, Andres Freund wrote:
> On 2022-04-14 12:16:45 -0400, Robert Haas wrote:
> > I got curious and looked at the underlying problem here and I am
> > wondering whether HaveRegisteredOrActiveSnapshot() is just buggy. It
> > seems to me that the code is always going to return true if there are
> > any active snapshots, and the rest of the function is intended to test
> > whether there is a registered snapshot other than the catalog
> > snapshot. But I don't think that's what this code does:
> > 
> >      if (pairingheap_is_empty(&RegisteredSnapshots) ||
> >          !pairingheap_is_singular(&RegisteredSnapshots))
> >          return false;
> > 
> >      return CatalogSnapshot == NULL;
> 
> Certainly looks off...
> 
> 
> > I find that 'make check-world' passes with this change, which is
> > disturbing, because it also passes without this change. That means we
> > don't have any tests that reach HaveRegisteredOrActiveSnapshot() with
> > more than one registered snapshot.
> 
> Part of that is because right now the assertion is placed "too deep" -
> it should be much higher up, so it's reached even if there's not
> actually a toast datum. But there's of other bugs preventing that :(. A
> lot of bugs have been hidden by the existence of CatalogSnapshot (which
> of course isn't something one actually can rely on).
> 
> 
> > Also, unless we have plans to use HaveRegisteredOrActiveSnapshot() in
> > more places,
> 
> I think we should, but there's the other bugs that need to be fixed
> first :(. Namely that we have plenty places doing catalog accesses
> without an active or registered snapshot :(.

Ah, we actually were debating some of these issues more recently, in:
https://www.postgresql.org/message-id/20220311030721.olixpzcquqkw2qyt%40alap3.anarazel.de
https://www.postgresql.org/message-id/20220311021047.hgtqkrl6n52srvdu%40alap3.anarazel.de

It looks like the same bug, and that the patch in this thread fixes
them. And that we need to backpatch the fix, right?

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)
Next
From: Peter Geoghegan
Date:
Subject: Re: Intermittent buildfarm failures on wrasse