[SP-]GiST IOS visibility bug (was: Why doens't GiST require super-exclusive lock) - Mailing list pgsql-hackers

From Matthias van de Meent
Subject [SP-]GiST IOS visibility bug (was: Why doens't GiST require super-exclusive lock)
Date
Msg-id CAEze2WgH13m=MDST58KLo-NkZpbwBEt4xNWcgtghWBwRj3J0+A@mail.gmail.com
Whole thread Raw
List pgsql-hackers
Hi,

In the original thread [0] we established that GIST and SP-GiST
support index-only scans, but don't have sufficient interlocking to
prevent dead tuples from being revived for their scan by vacuum. As
that thread already contains a lot of detail, I won't go into the
specifics here, but the gist (heh) of it is as follows:

(SP-)GiST index scans don't interlock with their index vacuum in a way
that guarantees that their index-only scans only return TIDs that are
known to be valid TIDs when they're returned by amgettuple, thus
allowing index vacuum to clean TIDs up and let VACUUM mark their pages
ALL_VISIBLE before the tids were returned by the index-only scan.

The keen-eyed reader may have noticed that this issue is very similar
to the issue fixed with 459e7bf8, and that's correct. Both are cases
where the visibility map was accessed to determine the visibility of a
TID of which we didn't know for sure it was still a valid TID on that
heap page.


While the thread in [0] contains a very thorough fix, and (IMO) some
good additions to how index-only scans work with table AMs, the patch
currently proposed there changes things in non-backpatchable ways. So,
attached is a set of 4 patches designed for backpatching a fix.

The patches work similar to [0]'s, but with less invasive code changes
and no meaningful ABI breaks:
index vacuum's lock requirements are increased to super-exclusive
(cleanup) locks. Additionally, index-only scans will now do visibility
checks of each tuple they want to return, before the pin on the index
page that contains those tuples is released.
The visibility check is done on a tuple-by-tuple basis, through a new
function table_index_vischeck_tuple (so as to not totally break the
semantic barrier between indexes and tables regarding visibility
checks). In the patchset of [0] I added a table AM routine to allow
tableAMs to extend this, but that's not backportable, hence this
separate thread for discussing a backpatchable bugfix.
The one exposed ABI change is done by adding a single-byte
"visrecheck" field in an alignment area in IndexScanDescData - which
historically has been the place to put new fields required for
backported bugfixes. The size of the struct doesn't change, and there
should be no backward or forward compatibility issues for extensions
that might make use of this new field as long as they don't assume the
value of the field when they haven't set it themselves.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

[0] https://postgr.es/m/flat/CAH2-Wz=PqOziyRSrnN5jAtfXWXY7-BJcHz9S355LH8Dt=5qxWQ@mail.gmail.com

Attachment

pgsql-hackers by date:

Previous
From: Jacob Champion
Date:
Subject: Re: [PoC] Federated Authn/z with OAUTHBEARER
Next
From: Matheus Alcantara
Date:
Subject: Re: extension_control_path and "directory"