Re: snapshot too old issues, first around wraparound and then more. - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: snapshot too old issues, first around wraparound and then more.
Date
Msg-id CA+hUKGKQcBru4SmtxTNzA-g1L8h6AbyJw0xmJfP1hK=7kucBkg@mail.gmail.com
Whole thread Raw
In response to Re: snapshot too old issues, first around wraparound and then more.  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Wed, Apr 15, 2020 at 2:21 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Mon, Apr 13, 2020 at 2:58 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > On Fri, Apr 3, 2020 at 2:22 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > > I'm thinking of a version of "snapshot too old" that amounts to a
> > > statement timeout that gets applied for xmin horizon type purposes in
> > > the conventional way, while only showing an error to the client if and
> > > when they access literally any buffer (though not when the relation is
> > > a system catalog). Is it possible that something along those lines is
> > > appreciably better than nothing to users? If it is, and if we can find
> > > a way to manage the transition, then maybe we could tolerate
> > > supporting this greatly simplified implementation of "snapshot too
> > > old".

I rebased that patch and fleshed it out just a bit more.  Warning:
experiment grade, incomplet, inkorrect, but I think it demonstrates
the main elements of Peter's idea from last year.

For READ COMMITTED, the user experience is much like a statement
timeout, except that it can keep doing stuff that doesn't read
non-catalog data.  Trivial example: pg_sleep(10) happily completes
with old_snapshot_threshold=5s, as do queries that materialise all
their input data in time, and yet your xmin is zapped.  For REPEATABLE
READ it's obviously tied to your first query, and produces "snapshot
too old" if you repeatedly SELECT from a little table and your time
runs out.

In this version I put the check into the heapam visibility + vismap
checks, instead of in the buffer access code.  The reason is that the
lower level buffer access routines don't have a snapshot, but if you
push the check down to buffer access and just check the "oldest"
snapshot (definition in this patch, not in master), then you lose some
potential granularity with different cursors.  If you try to put it at
a higher level in places that have a snapshot and access a buffer, you
run into the problem of being uncertain that you've covered all the
bases.  But I may be underthinking that.

Quite a few unresolved questions about catalog and toast snapshots and
other special stuff, as well as the question of whether it's actually
useful or the overheads can be made cheap enough.

> Hmm.  I suppose it must be possible to put the LSN check back: if
> (snapshot->too_old && PageGetLSN(page) > snapshot->lsn) ereport(...).
> Then the granularity would be the same as today -- block level -- but
> the complexity is transferred from the pruning side (has to deal with
> xid time map) to the snapshot-owning side (has to deal with timers,
> CFI() and make sure all snapshots are registered).  Maybe not a great
> deal, and maybe not easier than fixing the existing bugs.

It is a shame to lose the current LSN-based logic; it's really rather
clever (except for being broken).

> One problem is all the new setitimer() syscalls.  I feel like that
> could be improved, as could statement_timeout, by letting existing
> timers run rather than repeatedly rescheduling eagerly, so that eg a 1
> minute timeout never gets rescheduled more than once per minute.  I
> haven't looked into that, but I guess it's no worse than the existing
> implement's overheads anyway.

At least that problem was fixed, by commit 09cf1d52.  (Not entirely
sure why I got away with not reenabling the timer between queries, but
I didn't look very hard).

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Optionally automatically disable logical replication subscriptions on error
Next
From: vignesh C
Date:
Subject: Re: [HACKERS] logical decoding of two-phase transactions