Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403) - Mailing list pgsql-hackers

From Robert Haas
Subject Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)
Date
Msg-id CA+TgmoY5A+JoVZODuLaLiu6s=sNKPdOv-Mf2K43yKfg6Cp5MDw@mail.gmail.com
In response to Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Apr 14, 2022 at 4:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> > Well, if that's true, then I agree that it's a good argument against
> > that approach. But I guess I'm confused as to why we'd end up in that
> > situation. Suppose we do these two things:
>
> > 1. Decree that SnapshotResetXmin calls InvalidateCatalogSnapshot. It's
> > the other way around right now, but that's only because we're
> > registering the catalog snapshot.
> > 2. Bomb out in GetCatalogSnapshot if you don't have an active or
> > registered snapshot already.
>
> > Is there some reason we'd need any more infrastructure than that?
>
> Yes.
>
> 1. Create snapshot 1 (beginning of transaction).
> 2. Create catalog snapshot (okay because of snapshot 1).
> 3. Create snapshot 2.
> 4. Destroy snapshot 1.
> 5. Catalog snapshot is still there and is now the oldest.

Sorry, I'm not seeing the problem. If we call SnapshotResetXmin()
after step 4, then the catalog snapshot would get invalidated under my
proposal. If we don't, then our advertised xmin has not changed and
nothing can be pruned out from under us.

> I'm basically not on board with adding complication to make the system
> less performant and more brittle, and I don't see how the direction
> you want to go isn't that.

Well ... I agree that brittle is bad, but it's not clear to me which
way is actually less brittle. As for performant, I think you might be
misjudging the situation. My original design for removing SnapshotNow
didn't even have the catalog snapshot - it just took a new snapshot
every time. That was mostly fine, but Andres found a somewhat extreme
test case where it exhibited a significant regression, so I added the
catalog snapshot stuff to work around that. So I'm not AT ALL
convinced that giving catalog snapshots longer lifetimes is a relevant
thing to do. There's some value in it if you can construct a test case
where the overall rate of snapshot taking is extremely high, but in
normal cases that isn't so. It's certainly not worth complicating the
code for backend startup or DDL commands to reduce the number of
snapshots, because you're never going to have those things happening
at a high enough rate to matter, or so I believe.

The way you get a benefit from CatalogSnapshot is to construct a
workload with a lot of really cheap SQL statements each of which has
to do a bunch of catalog lookups, and then run that at high
concurrency. The concurrency makes taking snapshots more expensive,
because the cache lines are contended, and having the same command use
the same snapshot over and over again instead of taking new ones then
brings the cost down enough to be measurable. But people run SQL
statements in a tight loop, not DDL or backend startup.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


