Re: snapshot too old issues, first around wraparound and then more. - Mailing list pgsql-hackers

From Andres Freund
Subject Re: snapshot too old issues, first around wraparound and then more.
Date
Msg-id 20200402005406.xtn6wgqu7veqie4p@alap3.anarazel.de
Whole thread Raw
In response to Re: snapshot too old issues, first around wraparound and then more.  (Andres Freund <andres@anarazel.de>)
Responses Re: snapshot too old issues, first around wraparound and then more.  (Peter Geoghegan <pg@bowt.ie>)
Re: snapshot too old issues, first around wraparound and then more.  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Hi,

On 2020-04-01 16:59:51 -0700, Andres Freund wrote:
> The primary issue here is that there is no TestForOldSnapshot() in
> heap_hot_search_buffer(). Therefore index fetches will never even try to
> detect that tuples it needs actually have already been pruned away.

bitmap heap scan doesn't have the necessary checks either. In the
non-lossy case it uses heap_hot_search_buffer, for the lossy case it has
an open coded access without the check (that's bitgetpage() before v12,
and heapam_scan_bitmap_next_block() after that).

Nor do sample scans, but that was "at least" introduced later.

As far as I can tell there's not sufficient in-tree explanation of when
code needs to test for an old snapshot. There's just the following
comment above TestForOldSnapshot():
 * Check whether the given snapshot is too old to have safely read the given
 * page from the given table.  If so, throw a "snapshot too old" error.
 *
 * This test generally needs to be performed after every BufferGetPage() call
 * that is executed as part of a scan.  It is not needed for calls made for
 * modifying the page (for example, to position to the right place to insert a
 * new index tuple or for vacuuming).  It may also be omitted where calls to
 * lower-level functions will have already performed the test.

But I don't find "as part of a scan" very informative. I mean, it
was explicitly not called from with the executor back then (for a while
the check was embedded in BufferGetPage()):

static void
bitgetpage(HeapScanDesc scan, TBMIterateResult *tbmres)
...
        Page        dp = BufferGetPage(buffer, NULL, NULL, BGP_NO_SNAPSHOT_TEST);


I am more than a bit dumbfounded here.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: SLRU statistics
Next
From: Peter Geoghegan
Date:
Subject: Re: snapshot too old issues, first around wraparound and then more.