Hi,
On 2022-04-14 12:16:45 -0400, Robert Haas wrote:
> I got curious and looked at the underlying problem here and I am
> wondering whether HaveRegisteredOrActiveSnapshot() is just buggy. It
> seems to me that the code is always going to return true if there are
> any active snapshots, and the rest of the function is intended to test
> whether there is a registered snapshot other than the catalog
> snapshot. But I don't think that's what this code does:
>
> if (pairingheap_is_empty(&RegisteredSnapshots) ||
> !pairingheap_is_singular(&RegisteredSnapshots))
> return false;
>
> return CatalogSnapshot == NULL;
Certainly looks off...
> I find that 'make check-world' passes with this change, which is
> disturbing, because it also passes without this change. That means we
> don't have any tests that reach HaveRegisteredOrActiveSnapshot() with
> more than one registered snapshot.
Part of that is because right now the assertion is placed "too deep" -
it should be much higher up, so it's reached even if there's not
actually a toast datum. But there's of other bugs preventing that :(. A
lot of bugs have been hidden by the existence of CatalogSnapshot (which
of course isn't something one actually can rely on).
> Also, unless we have plans to use HaveRegisteredOrActiveSnapshot() in
> more places,
I think we should, but there's the other bugs that need to be fixed
first :(. Namely that we have plenty places doing catalog accesses
without an active or registered snapshot :(.
> I think we should consider revising the whole approach here. The way
> init_toast_snapshot() is coded, we basically have some code that tests
> for active and registered snapshots and finds the oldest one. We error
> out if there isn't one. And then we get to this assertion, which
> checks the same stuff a second time but with an additional check to
> see whether we ended up with the catalog snapshot. Wouldn't it make
> more sense if GetOldestSnapshot() just refused to return the catalog
> snapshot in the first place?
I'm worried that that could cause additional bugs. Consider code using
GetOldestSnapshot() to check if tuples need to be preserved or such.
Greetings,
Andres Freund