Re: Assertion failure in SnapBuildInitialSnapshot() - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Assertion failure in SnapBuildInitialSnapshot()
Date
Msg-id CAA4eK1KAwhdJdqd5tpwskh24j9tjZPqt6it0jJNMn6Xf+XfWNw@mail.gmail.com
Whole thread Raw
In response to Re: Assertion failure in SnapBuildInitialSnapshot()  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Wed, Nov 16, 2022 at 11:56 PM Andres Freund <andres@anarazel.de> wrote:
>
> On 2022-11-16 14:22:01 +0530, Amit Kapila wrote:
> > On Wed, Nov 16, 2022 at 7:30 AM Andres Freund <andres@anarazel.de> wrote:
> > >
> > > On 2022-11-15 16:20:00 +0530, Amit Kapila wrote:
> > > > On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:
> > > > > nor do we enforce in an obvious place that we
> > > > > don't already hold a snapshot.
> > > > >
> > > >
> > > > We have a check for (FirstXactSnapshot == NULL) in
> > > > RestoreTransactionSnapshot->SetTransactionSnapshot. Won't that be
> > > > sufficient?
> > >
> > > I don't think that'd e.g. catch a catalog snapshot being held, yet that'd
> > > still be bad. And I think checking in SetTransactionSnapshot() is too late,
> > > we've already overwritten MyProc->xmin by that point.
> > >
> >
> > So, shall we add the below Asserts in SnapBuildInitialSnapshot() after
> > we have the Assert for Assert(!FirstSnapshotSet)?
> > Assert(FirstXactSnapshot == NULL);
> > Assert(!HistoricSnapshotActive());
>
> I don't think that'd catch a catalog snapshot. But perhaps the better answer
> for the catalog snapshot is to just invalidate it explicitly. The user doesn't
> have control over the catalog snapshot being taken, and it's not too hard to
> imagine the walsender code triggering one somewhere.
>
> So maybe we should add something like:
>
> InvalidateCatalogSnapshot(); /* about to overwrite MyProc->xmin */
> if (HaveRegisteredOrActiveSnapshot())
>   elog(ERROR, "cannot build an initial slot snapshot when snapshots exist")
> Assert(!HistoricSnapshotActive());
>
> I think we'd not need to assert FirstXactSnapshot == NULL or !FirstSnapshotSet
> with that, because those would show up in HaveRegisteredOrActiveSnapshot().
>

In the attached patch, I have incorporated this change and other
feedback. I think this should probably help us find the reason for
this problem when we see it in the future.

-- 
With Regards,
Amit Kapila.

Attachment

pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures
Next
From: Peter Eisentraut
Date:
Subject: Re: Fix order of checking ICU options in initdb and create database